Google's Gemini 2.5 Pro is out of beta.

diz@awful.systems · edited 4 months ago

lIlIlIlIlIlIl@lemmy.world · 4 months ago

Why would you think the machine that’s designed to make weighted guesses at what the next token should be would be arithmetically sound?

That’s not how any of this works (but you already knew that)

diz@awful.systems · edited 4 months ago

The funny thing is, even though I wouldn’t expect it to be, it is still a lot more arithmetically sound than whatever it is that is going on with it claiming to use a code interpreter and a calculator to double-check the result.

It is OK (7 out of 12 correct digits) at being a calculator and it is awesome at being a lying sack of shit.

lIlIlIlIlIlIl@lemmy.world · 4 months ago

lying sack of shit

Random tokens can’t lie to you, because they’re strings of text. Interpreting this as a lie is an interesting response

swlabr@awful.systems · 4 months ago

lol the corollary of this is that LLMs are incapable of producing meaningful output, you insufferable turd

lIlIlIlIlIlIl@lemmy.world · 4 months ago

I’m using it literally every single day to make huge gains. Every single day I disprove this comment.

self@awful.systems · 4 months ago

I knew you were a lying promptfondler the instant you came into the thread, but I didn’t expect you to start acting like a gymbro trying to justify their black market steroid habit. new type of AI booster unlocked!

now fuck off

GregorGizeh@lemmy.zip · edited 4 months ago

Idk, personally I kind of expect the AI makers to have at least had the sense to let their bots process math with a calculator and not guesswork. That seems like an absurdly low bar, both for testing the thing as a user and as a feature to think of.

Didn’t one model refer scientific questions to Wolfram Alpha? How do they smartly decide to do that and still not give the bots basic math processing?
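
For what it's worth, here is a minimal sketch of what "process math with a calculator and not guesswork" could look like: the arithmetic goes to an exact evaluator and the text generator never touches it. The names `exact_eval` and `_BIN_OPS` are made up for illustration, and this is an assumption about a possible design, not how Gemini or any other product actually works.

```python
import ast
import operator

# Hypothetical whitelist: only exact integer operations are allowed.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}


def _eval(node):
    """Recursively evaluate a parsed expression made of whitelisted nodes."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    # Anything else (names, calls, floats, ...) is rejected outright.
    raise ValueError("unsupported expression")


def exact_eval(expr: str) -> int:
    """Evaluate integer arithmetic exactly instead of guessing digits."""
    return _eval(ast.parse(expr, mode="eval").body)


if __name__ == "__main__":
    # Python integers are arbitrary precision, so every digit is exact.
    print(exact_eval("123456789 * 987654321"))
```

The point of the whitelist is that the evaluator either returns an exact answer or refuses; there is no "7 out of 12 digits" middle ground.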

lIlIlIlIlIlIl@lemmy.world · 4 months ago

I would not expect that.

Calculators haven’t been replaced, and the product managers of these services understand that their target market isn’t attempting to use them for things for which they were not intended.

brb, have to ride my lawnmower to work

diz@awful.systems · edited 4 months ago

Try asking Google Gemini my question a bunch of times: sometimes it gets it right, sometimes it doesn’t. It seems to be about 50/50, but I quickly ran out of free access.

And Google is planning to replace their search (which includes a working calculator) with this stuff. So it is absolutely the case that there’s a plan to replace one of the world’s most popular calculators, if not the most popular, with it.

HedyL@awful.systems · 4 months ago

Also, a lawnmower is unlikely to say: “Sure, I am happy to take you to work” and “I am satisfied with my performance” afterwards. That’s why I sometimes find these bots’ pretentious demeanor worse than their functional shortcomings.

lIlIlIlIlIlIl@lemmy.world · 4 months ago

“Pretentious” is a trait expressed by something that’s thinking. These are the most likely words that best fit the context. Your emotional engagement with this technology is weird

froztbyte@awful.systems · 4 months ago

“Pretentious” is a trait expressed by something that’s thinking

diz@awful.systems · 4 months ago

Pretentious is a fine description of the writing style, which actual humans fine-tune.

swlabr@awful.systems · 4 months ago

Given that the LLMs typically have a system prompt that specifies a particular tone for the output, I think pretentious is an absolutely valid and accurate word to use.

HedyL@awful.systems · 4 months ago

Also, these bots have been deliberately fine-tuned in a way that is supposed to sound human. Sometimes, as a consequence, I find it difficult to describe their answering style without employing vocabulary used to describe human behavior. Also, I strongly suspect that this deliberate “human-like” style is a key reason for the current AI hype. It is why many people appear to excuse the bots’ huge shortcomings. It is funny to be accused of being “emotional” when pointing out these patterns as problematic.

ebu@awful.systems · 4 months ago

“emotional”

let me just slip the shades on real quick

“womanly”

checks out