OpenAI claims new GPT-5 model boosts ChatGPT to ‘PhD level’

sabreW4K3@lazysoci.al · 1 month ago

shnizmuffin@lemmy.inbutts.lol · 1 month ago

If I asked a PhD, “How many Bs are there in the word ‘blueberry’?” they’d call an ambulance for my obvious, severe concussion. They wouldn’t answer, “There are three Bs in the word blueberry! I know, it’s super tricky!”

panda_abyss@lemmy.ca · edit-2 1 month ago

I don’t feel this is a good example of why LLMs shouldn’t be treated like PhDs.

My first interactions with GPT-5 have been pretty awful, and I’d test it, but it’s not available to me anymore.

Edit: I am not having a stroke, I’m bad at typing and autocorrect hates me

shnizmuffin@lemmy.inbutts.lol · 1 month ago

Do you smell toast?

panda_abyss@lemmy.ca · 1 month ago

BlackBerry toast

darreninthenet@piefed.social · 1 month ago

FWIW, ChatGPT 5 gets this correct

shnizmuffin@lemmy.inbutts.lol · 1 month ago

Fuckin’ does it?

darreninthenet@piefed.social · 1 month ago

It did for me 🤷🏻‍♂️

limerod@reddthat.com · 1 month ago

You appear to be using the older GPT model. The newer model calculates and answers correctly for most words, at least for the few I asked about.

mbtrhcs@feddit.org · 1 month ago

It literally says 5 in the screenshot but ok

limerod@reddthat.com · 1 month ago

I saw that. I’m using the mobile app. There’s a possibility the web version is using an inferior model.

GissaMittJobb@lemmy.ml · 1 month ago

LLMs are fundamentally unsuitable for character counting on account of how they ‘see’ the world - as a sequence of tokens, which can split words in non-intuitive ways.

Regular programs already excel at counting characters in words, and LLMs can be used to generate such programs with ease.
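As a rough illustration of the "regular program" side of that point, here is a minimal Python sketch. The function name `count_letter` is made up for this example; it simply leans on the built-in `str.count` after lowercasing both inputs, so the match is case-insensitive.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word.

    Hypothetical helper for illustration; it operates on characters,
    not on the tokens an LLM would see.
    """
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    return word.lower().count(letter.lower())


if __name__ == "__main__":
    # b-l-u-e-b-e-r-r-y contains two "b" characters.
    print(count_letter("blueberry", "b"))
```

Because the program works directly on characters, how a tokenizer would split the word never enters into it.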

itslilith@lemmy.blahaj.zone · 1 month ago

But they don’t recognize their inadequacies, instead spouting confident misinformation.

GissaMittJobb@lemmy.ml · 1 month ago

This is true. They do not think, because they are next token predictors, not brains.

With this in mind, you can still harness a few usable properties from them. Nothing like the kind of hype the techbros and VCs imagine, but a few moderately beneficial use cases exist.

itslilith@lemmy.blahaj.zone · 1 month ago

Without a doubt. But PhD-level thinking requires a kind of introspection that LLMs (currently) just don’t have. And the letter-counting thing is a funny example of that inaccuracy.

chaos@beehaw.org · 1 month ago

Tokenization is a low-level implementation detail; it shouldn’t affect an LLM’s ability to do basic reasoning. We don’t do arithmetic by counting how many neurons we can feel firing in our brain; we have higher-level concepts of numbers, and LLMs are supposed to have something similar. Plus, in the """thinking""" models, you’ll see them break up words into individual letters or even write them out in a numbered list, which should break the tokens up into individual letters as well.
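To make the "write them out in a numbered list" step concrete, here is a small Python sketch of that spelling-out procedure. The helper names `spell_out` and `count_from_spelling` are hypothetical, and this only mimics the visible output format; whether a given tokenizer really splits such a list into single-letter tokens is an assumption, not something this code checks.

```python
def spell_out(word: str) -> list[str]:
    """Write a word as a numbered list with one letter per line."""
    return [f"{i}. {ch}" for i, ch in enumerate(word, start=1)]


def count_from_spelling(word: str, letter: str) -> int:
    """Count a letter by scanning the spelled-out lines one at a time."""
    target = letter.lower()
    total = 0
    for line in spell_out(word):
        # Each line looks like "5. b"; the letter follows the first ". ".
        _, ch = line.split(". ", 1)
        if ch.lower() == target:
            total += 1
    return total


if __name__ == "__main__":
    print("\n".join(spell_out("blueberry")))
    print(count_from_spelling("blueberry", "b"))
```

Once each letter sits on its own line, the count reduces to a line-by-line comparison.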