OpenAI will not disclose GPT-5’s energy use. It could be higher than past models

Davriellelouna@lemmy.world · 2 months ago

redsunrise@programming.dev · 2 months ago

Obviously it’s higher. If it was any lower, they would’ve made a huge announcement out of it to prove they’re better than the competition.

Ugurcan@lemmy.world · edit-2 2 months ago

I’m thinking otherwise. I think GPT5 is a much smaller model - with some fallback to previous models if required.

Since it’s running on the exact same hardware with a mostly similar algorithm, using less energy would directly mean it’s a “less intense” model, which translates into an inferior quality in American Investor Language (AIL).

And 2025’s investors don’t give a flying fuck about energy efficiency.

PostaL@lemmy.world · 2 months ago

And they don’t want to disclose the energy efficiency becaaaause … ?

AnarchistArtificer@slrpnk.net · 2 months ago

Because the AI industry is a bubble that exists to sell more GPUs and drive fossil fuel demand

Hobo@lemmy.world · edit-2 2 months ago

Because, uhhh, whoa what’s that? ducks behind the podium

RobotZap10000@feddit.nl · edit-2 2 months ago

They probably wouldn’t really care how efficient it is, but they certainly would care that the costs are lower.

Ugurcan@lemmy.world · 2 months ago

I’m almost sure they’re keeping that for the Earnings call.

panda_abyss@lemmy.ca · 2 months ago

Do they do earnings calls? They’re not public.

Tollana1234567@lemmy.today · 2 months ago

Probably VC money; the investors are going to want some answers.

Sl00k@programming.dev · 2 months ago

It also has a very flexible “thinking” nature, which means far, far fewer tokens spent on most people’s responses.

Chaotic Entropy@feddit.uk · 2 months ago

I get the distinct impression that most of the focus for GPT5 was making it easier to divert their overflowing volume of queries to less expensive routes.

T156@lemmy.world · 2 months ago

Unless it wasn’t as low as they wanted it. It’s at least cheap enough to run that they can afford to drop the pricing on the API compared to their older models.

thatcrow@ttrpg.network · 2 months ago

It warms me heart to see y’all finally tune in to the scumbag tactics our abusers constantly employ.

morrowind@lemmy.ml · 2 months ago

It’s cheaper though, so very likely it’s more efficient somehow.

Sl00k@programming.dev · edit-2 2 months ago

The only accessible data come from mistral, most other ai devs are not exactly happy to share the inner workings of their tools.

Important to point out this is really only valid towards Western AI companies. Chinese AI models have mostly been open source with open papers.

aeronmelon@lemmy.world · 2 months ago

Sam Altman looks like an SNL actor impersonating Sam Altman.

Chaotic Entropy@feddit.uk · edit-2 2 months ago

“Herr derr, AI. No, seriously.”

Saledovil@sh.itjust.works · 2 months ago

It’s safe to assume that any metric they don’t disclose is quite damning to them. Plus, these guys don’t really care about the environmental impact, or what us tree-hugging environmentalists think. I’m assuming the only group they are scared of upsetting right now is investors. The thing is, even if you don’t care about the environment, the problem with LLMs is how poorly they scale.

An important concept when evaluating how something scales is marginal values, chiefly marginal utility and marginal expenses. Marginal utility is how much utility you get from one more unit of whatever. Marginal expense is how much it costs to get one more unit. And what LLMs produce is the probability that a token, T, follows a prefix, Q. So P(T|Q) (read: probability of T, given Q). This is done for all known tokens, and then based on these probabilities, one token is chosen at random. This token is then appended to the prefix, and the process repeats, until the LLM produces a sequence which indicates that it’s done talking.
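The loop described above can be sketched in a few lines of Python. This is only an illustration: `toy_model`, `sample_next`, `generate` and the `<end>` marker are made-up names, and the hard-coded probabilities stand in for what a real LLM would compute.

```python
import random


def sample_next(probs, rng):
    """Pick one token at random, weighted by P(T|Q)."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(model, prefix, rng, end_token="<end>", max_tokens=50):
    """Repeatedly sample a token and append it to the prefix."""
    out = list(prefix)
    for _ in range(max_tokens):
        probs = model(out)  # P(T|Q) for every known token T
        token = sample_next(probs, rng)
        if token == end_token:  # the model signals it is done talking
            break
        out.append(token)
    return out


def toy_model(prefix):
    """Hypothetical stand-in for an LLM: a tiny hard-coded distribution."""
    if prefix[-1] == "the":
        return {"cat": 0.6, "dog": 0.4}
    return {"<end>": 1.0}


print(generate(toy_model, ["the"], random.Random(0)))
```

Because the choice is random, the same prefix can produce different continuations from run to run unless the random generator is seeded.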

If we now imagine the best possible LLM, then the calculated value for P(T|Q) would be the actual value. However, it’s worth noting that this already displays a limitation of LLMs. Namely, even if we use this ideal LLM, we’re just a few bad dice rolls away from saying something dumb, which then pollutes the context. And the larger we make the LLM, the closer its results get to the actual value. A potential way to measure this precision would be by subtracting P(T|Q) from P_calc(T|Q), and counting the leading zeroes, essentially counting the number of digits we got right. Now, the thing is that each additional digit only provides a tenth of the utility of the digit before it, while the cost for additional digits goes up exponentially.

So, exponentially decaying marginal utility meets exponentially growing marginal expenses. Which is really bad for companies that try to market LLMs.
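A rough sketch of that argument, assuming Python. `correct_digits` counts the leading zeroes after the decimal point in the error, as described above. The factor of ten per digit on the utility side comes from the comment itself, while the `growth` factor on the cost side is a purely hypothetical number picked for illustration.

```python
import math


def correct_digits(p_true, p_calc):
    """Leading zeroes after the decimal point in |p_calc - p_true|."""
    err = abs(p_calc - p_true)
    if err == 0:
        return math.inf  # a perfect estimate gets every digit right
    return max(0, math.ceil(-math.log10(err)) - 1)


def marginal_utility(n):
    """Each extra digit is worth a tenth of the previous one."""
    return 10.0 ** -n


def marginal_cost(n, growth=10.0):
    """Assumed exponential cost per extra digit (growth is hypothetical)."""
    return growth ** n


print(correct_digits(0.3, 0.304))
for n in range(1, 5):
    ratio = marginal_utility(n) / marginal_cost(n)
    print(n, marginal_utility(n), marginal_cost(n), ratio)
```

Whatever growth factor you plug in, as long as it is above 1 the utility-to-cost ratio shrinks with every additional digit.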

Jeremyward@lemmy.world · 2 months ago

Well I mean also that they kinda suck, I feel like I spend more time debugging AI code than I get working code.

SkunkWorkz@lemmy.world · 2 months ago

I only use it if I’m stuck. Even if the AI code is wrong, it often pushes me in the right direction to find the correct solution for my problem. Like pair programming but a bit shitty.

The best way to use these LLMs with coding is to never use the generated code directly and atomize your problem into smaller questions you ask to the LLM.

fibojoly@sh.itjust.works · 2 months ago

So duck programming right?

Knock_Knock_Lemmy_In@lemmy.world · 2 months ago

And fancier intellisense

squaresinger@lemmy.world · 2 months ago

That’s actually true. I read some research on that and your feeling is correct.

Can’t be bothered to google it right now.

Sl00k@programming.dev · 2 months ago

Do you use Claude Code? It’s the only time I’ve had 90%+ success rate.

Jeremyward@lemmy.world · 2 months ago

I have, and it doesn’t, at least not on the dev-ops stuff I work on.

daveB@sh.itjust.works · 2 months ago

Is it this?

seraphine@lemmy.blahaj.zone · 2 months ago

what is that? looks funny but idk this

Sheldan@lemmy.world · 2 months ago

Screenshot from the first matrix movie with pods full of people acting as batteries

seraphine@lemmy.blahaj.zone · 2 months ago

so exactly as I guessed, thanks for the explanation

fuzzywombat@lemmy.world · 2 months ago

Sam Altman has gone into PR and hype overdrive lately. He is practically everywhere trying to distract the media from seeing the truth about LLMs. GPT-5 has basically proved that we’ve hit a wall and the belief that LLMs will just scale linearly with the amount of training data is false. He knows the AI bubble is bursting and he is scared.

Saledovil@sh.itjust.works · 2 months ago

He’s also already admitted that they’re out of training data. If you’ve wondered why a lot more websites will run some sort of verification when you connect, it’s because there’s a desperate scramble to get more training data.

rozodru@lemmy.world · 2 months ago

Bingo. If you routinely use LLMs/AI you’ve recently seen it first hand. ALL of them have become noticeably worse over the past few months. Even if simply using it as a basic tool, it’s worse. Claude, for all the praise it receives, has also gotten worse. I’ve noticed it starting to forget context or constantly contradicting itself. Even Claude Code.

The release of GPT5 is proof in the pudding that a wall has been hit and the bubble is bursting. There’s nothing left to train on and all the LLMs have been consuming each other’s waste as a result. I’ve talked about it on here several times already due to my work but companies are also seeing this. They’re scrambling to undo the fuck up of using AI to build their stuff. None of what they used it to build scales. None of it. And you go on Linkedin and see all the techbros desperately trying to hype the mounds of shit that remain.

I don’t know what’s next for AI but this current generation of it is dying. It didn’t work.

BluesF@lemmy.world · 2 months ago

I was initially impressed by the ‘reasoning’ features of LLMs, but most recently ChatGPT gave me a response to a question in which it stated five or six possible answers separated by “oh, but that can’t be right, so it must be…”, and none of them was right lmao. Thought for like 30 seconds to give me a selection of wrong answers!

Tja@programming.dev · 2 months ago

Any studies about this “getting worse” or just anecdotes? I do routinely use them and I feel they are getting better (my workplace uses Google suite so I have access to gemini). Just last week it helped me debug an ipv6 ra problem that I couldn’t crack, and I learned a few useful commands on the way.

Tollana1234567@lemmy.today · 2 months ago

MS already revealed that their AI doesn’t make money at all; in fact it’s costing too much. Of course he’s freaking out.

kescusay@lemmy.world · 2 months ago

I have to test it with Copilot for work. So far, in my experience its “enhanced capabilities” mostly involve doing things I didn’t ask it to do extremely quickly. For example, it massively fucked up the CSS in an experimental project when I instructed it to extract a React element into its own file.

That’s literally all I wanted it to do, yet it took it upon itself to make all sorts of changes to styling for the entire application. I ended up reverting all of its changes and extracting the element myself.

Suffice to say, I will not be recommending GPT 5 going forward.

Sanguine@lemmy.dbzer0.com · 2 months ago

Sounds like you forgot to instruct it to do a good job.

dindonmasker@sh.itjust.works · 2 months ago

“If you do anything else than what I asked your mother dies”

Elvith Ma'for@feddit.org · 2 months ago

“Beware: Another AI is watching your every step. If you do anything more or different than what I asked you to or touch any files besides the ones listed here, it will immediately shut down and deprovision your servers.”

discosnails@lemmy.wtf · 2 months ago

They do need to do this though. Survival of the fittest. The best model gets more energy access, etc.

kescusay@lemmy.world · 2 months ago

I’ve tried threats in prompt files, with results that are… OK. Honestly, I can’t tell if they made a difference or not.

The only thing I’ve found that consistently works is writing good old fashioned scripts to look for common errors by LLMs and then have them run those scripts after every action so they can somewhat clean up after themselves.

GenChadT@programming.dev · 2 months ago

That’s my problem with “AI” in general. It’s seemingly impossible to “engineer” a complete piece of software when using LLMs in any capacity that isn’t editing a line or two inside singular functions. Too many times I’ve asked GPT/Gemini to make a small change to a file and had to revert the request because it’d take it upon itself to re-engineer the architecture of my entire application.

hisao@ani.social · 2 months ago

I make it write entire functions for me, one prompt = one small feature or sometimes one or two functions which are part of a feature, or one refactoring. I make manual edits fast and prompt the next step. It easily does things for me like parsing obscure binary formats or threading new piece of state through the whole application to the levels it’s needed, or doing massive refactorings. Idk why it works so good for me and so bad for other people, maybe it loves me. I only ever used 4.1 and possibly 4o in free mode in Copilot.

GenChadT@programming.dev · 2 months ago

It’s an issue of scope. People often give the AI too much to handle at once, myself (admittedly) included.

kescusay@lemmy.world · 2 months ago

Are you using Copilot in agent mode? That’s where it breaks shit. If you’re using it in ask mode with the file you want to edit added to the chat context, then you’re probably going to be fine.

hisao@ani.social · 2 months ago

I’m only using it in edits mode, it’s the second of the three modes available.

kescusay@lemmy.world · 2 months ago

Yep, that’s also pretty safe.

FauxLiving@lemmy.world · 2 months ago

It’s a lot of people not understanding the kinds of things it can do vs the things it can’t do.

It was like when people tried to search early Google by typing plain language queries (“What is the best restaurant in town?”) and getting bad results. The search engine had limited capabilities and understanding language wasn’t one of them.

If you ask a LLM to write a function to print the sum of two numbers, it can do that with a high success rate. If you ask it to create a new operating system, it will produce hilariously bad results.

ayyy@sh.itjust.works · 2 months ago

You can’t blame the user when the marketing claims it’s replacing entire humans.

iopq@lemmy.world · 2 months ago

It is replacing entire humans. The thing is, it’s replacing the people you should have fired a long time ago

FauxLiving@lemmy.world · 2 months ago

I can blame the user for believing the marketing over their direct experiences.

If you use these tools for any amount of time it’s easy to see that there are some tasks they’re bad at and some that they are good at. You can learn how big of a project they can handle and when you need to break it up into smaller pieces.

I can’t imagine any sane person who lives their life guided by marketing hype instead of direct knowledge and experience.

ErmahgherdDavid@lemmy.dbzer0.com · 2 months ago

I can’t imagine any sane person who lives their life guided by marketing hype instead of direct knowledge and experience.

I mean fair enough but also… That makes the vast majority of managers, MBAs, salespeople and “normies” like your grandma and Uncle Bob insane.

Actually questioning stuff that sales people tell you and using critical thinking is a pretty rare skill in this day and age.

AlfredoJohn@sh.itjust.works · 2 months ago

That makes the vast majority of managers, MBAs, salespeople and “normies” like your grandma and Uncle Bob insane.

Correct, most of these people are insane. The average person is so fucking dumb and insane today it’s mind numbing.

Squizzy@lemmy.world · 2 months ago

We moved to m365 and were encouraged to try new elements. I gave copilot an excel sheet, told it to add 5% to each percent in column B and not to go over 100%. It spat out jumbled up data all reading 6000%.
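For reference, the requested operation is tiny when written out. A minimal sketch in Python, assuming “add 5%” means five percentage points and that column B has already been read into a plain list of numbers (`bump_percentages` is a made-up name):

```python
def bump_percentages(values, step=5.0, cap=100.0):
    """Add `step` percentage points to each value, never exceeding `cap`."""
    return [min(v + step, cap) for v in values]


column_b = [10.0, 97.5, 100.0]
print(bump_percentages(column_b))
```

If “add 5%” was instead meant as a relative increase, the `v + step` term would become `v * 1.05`, with the same `min(..., cap)` clamp.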

Vanilla_PuddinFudge@infosec.pub · 2 months ago

Ai assumes too fucking much. I’d used it to set up a new 3D printer with klipper to save some searching.

Half the shit it pulled down was Marlin-oriented then it had the gall to blame the config it gave me for it like I wrote it.

“motherfucker, listen here…”

SGforce@lemmy.ca · 2 months ago

It’s the same tech. It would have to be bigger or chew through “reasoning” tokens to beat benchmarks. So yeah, of course it is.

ZILtoid1991@lemmy.world · 2 months ago

When will genAI be so good, it’ll solve its own energy crisis?

xthexder@l.sw0.com · 2 months ago

Most certainly it won’t happen until after AI has developed a self-preservation bias. It’s too bad the solution is turning off the AI.

Saledovil@sh.itjust.works · 2 months ago

Current genAI? Never. There’s at least one breakthrough needed to build something capable of actual thinking.

Tollana1234567@lemmy.today · 2 months ago

intense electricity demands, and WATER for cooling.

Event_Horizon@lemmy.world · 2 months ago

I wonder if at this stage all the processors should simply be submerged into a giant cooling tank. It seems easier and more efficient.

IsoKiero@sopuli.xyz · 2 months ago

Or you could build the centers in colder climate areas. Here in Finland it’s common (maybe even mandatory, I’m not sure) for new datacenters to pull the heat from their systems and use that for district heating. No wasted water and at least you get something useful out of LLMs. Obviously using them as a massive electric boiler is pretty inefficient but energy for heating is needed anyways so at least we can stay warm and get 90s action series fanfic on top of that.

Knock_Knock_Lemmy_In@lemmy.world · 2 months ago

What happens to that heat in summer?

IsoKiero@sopuli.xyz · 2 months ago

There are experimental storage systems where heat is pumped to underground pools or sand, but as far as I know there are heat exchangers and radiators to the outside, so the majority of excess heat is just wasted outdoors. But the absolute majority of them are closed-loop systems, since you need something other than plain water anyway to prevent freezing in the winter.

OADINC@feddit.nl · 2 months ago

Microsoft has tried running datacenters in the sea, for cooling purposes. Microsoft blog

Tollana1234567@lemmy.today · 2 months ago

That brings in another problem: they then have to use generators or undersea cables.

potoooooooo ☑️@lemmy.world · 2 months ago

Fine: make it a data-center powered by water-wheel generators. Water powered AND cooled!

Transtronaut@lemmy.blahaj.zone · 2 months ago

If anyone has ever wondered what it would look like if tech giants went all in on “brute force” programming, this is it. This is what it looks like.

Optional@lemmy.world · 2 months ago

Photographer1: Sam, could you give us a goofier face?

*click* *click*

Photographer2: Goofier!!

*click* *click* *click* *click*

cenzorrll@piefed.ca · 2 months ago

He looks like someone in a cult. Wide open eyes, thousand yard stare, not mentally in the same universe as the rest of the world.

nialv7@lemmy.world · 2 months ago

Looks like he’s going to eat his microphone

kalleboo@lemmy.world · 2 months ago

They literally don’t know. “GPT-5” is several models, with a gating model in front to choose which model to use depending on how “hard” it thinks the question is. They’ve already been tweaking the front-end to change how it cuts over. They’re definitely going to keep changing it.
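To make the idea of a gate in front of several models concrete, here is a toy sketch in Python. Everything in it is hypothetical: the model names, the keyword heuristic and the `threshold` are invented for illustration and say nothing about how OpenAI’s actual router works.

```python
def estimate_difficulty(prompt):
    """Hypothetical heuristic: long prompts and certain keywords count as harder."""
    score = min(len(prompt.split()) / 100.0, 1.0)
    if any(word in prompt.lower() for word in ("prove", "debug", "step by step")):
        score += 0.5
    return min(score, 1.0)


def route(prompt, threshold=0.5):
    """Send 'hard' prompts to the expensive model, everything else to the cheap one."""
    if estimate_difficulty(prompt) >= threshold:
        return "large-reasoning-model"  # made-up name
    return "small-fast-model"  # made-up name


print(route("What is 2 + 2?"))
print(route("Please debug this race condition"))
```

Moving `threshold` up or down changes how much traffic reaches the expensive model, which is why per-query energy use would shift every time the cut-over is tweaked.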

cecilkorik@lemmy.ca · 2 months ago

So like, is this whole AI bubble being funded directly by the fossil fuel industry or something? Because the AI training and the instantaneous global adoption of them is using energy like it’s going out of style. Which fossil fuels actually are (going out of style, and being used to power these data centers). Could there be a link? Gotta find a way to burn all the rest of the oil and gas we can get out of the ground before laws make it illegal. Makes sense, in their traditional who gives a fuck about the climate and environment sort of way, doesn’t it?

BillyTheKid@lemmy.ca · 2 months ago

I mean, AI is using like 1-2% of human energy and that’s fucking wild.

My takeaway is we need more clean energy generation. Good thing we’ve got countries like China leading the way in nuclear and renewables!!

cecilkorik@lemmy.ca · 2 months ago

All I know is that I’m getting real tired of this Matrix / Idiocracy Mash-up Movie we’re living in.

ayyy@sh.itjust.works · 2 months ago

Yes, China is producing a lot of solar panels (a good thing!) but the percentage of renewables is actually going down. They are adding coal faster than solar.

Womble@piefed.world · 2 months ago

Do you have a source for that? Because given a ChatGPT query takes a similar amount of energy to running a hair dryer for a few seconds, I find it hard to believe.

Rimu@piefed.social · 2 months ago

a similar amount of energy to running a hair dryer

We see a lot of those kinds of comparisons. Thing is, you run a hair dryer once per day at most. Or it’s compared to a google search, often. Again, most people will do a handful of searches each day. A ChatGPT conversation can be hundreds of messages back and forth. A Claude Code session can go for hours and involve millions of tokens. An individual AI inference might be pretty tame but the quantity of them is another level.

If it was so efficient then they wouldn’t be building Manhattan-sized datacenters.

Womble@piefed.world · edit-2 2 months ago

OK, but running a hairdryer for 5 minutes is well up into the hundreds of queries, which is more than the vast majority of people will use in a week. The post I replied to was talking about it being 1-2% of energy usage, so that includes transport, heating and heavy industry. It just doesn’t pass the smell test to me that something where a week’s worth of usage is exceeded by a person drying their hair once is comparable with such vast users of energy.
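The back-of-the-envelope version of that comparison, in Python. Both inputs are assumptions rather than figures from this thread: a 1500 W hairdryer, and roughly 0.34 Wh per chat query, which is in the range commonly quoted for a short query and has not been independently verified.

```python
HAIRDRYER_WATTS = 1500.0  # assumed typical hairdryer power
MINUTES = 5.0
WH_PER_QUERY = 0.34  # assumed energy per query, unverified

# Energy used by one hair-drying session, in watt-hours.
hairdryer_wh = HAIRDRYER_WATTS * MINUTES / 60.0

# How many queries would use the same energy.
equivalent_queries = hairdryer_wh / WH_PER_QUERY

print(f"{hairdryer_wh:.0f} Wh is about {equivalent_queries:.0f} queries")
```

The result depends heavily on the per-query figure, and a long agentic coding session with millions of tokens is not the same thing as one short chat query.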

Tollana1234567@lemmy.today · 2 months ago

It’s like crypto: they wanted to make money off VC funds, and that’s probably running dry right now, and the investors are probably going to demand returns very soon. Why do you think the massive layoffs started in 2023?

TheObviousSolution@lemmy.ca · 2 months ago

When you want to create the shiniest honeypot, you need high power consumption.