LLMs are fundamentally unsuitable for character counting on account of how they ‘see’ the world - as a sequence of tokens, which can split words in non-intuitive ways.
Regular programs already excel at counting characters in words, and LLMs can be used to generate such programs with ease.
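For illustration, here is a minimal sketch of the kind of program meant here, in Python. The function name `count_letter` and the example word are my own picks, not something from the thread.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    # str.count works on characters directly, so tokenization never enters into it.
    return word.lower().count(letter.lower())


print(count_letter("strawberry", "r"))  # prints 3
```

The program works on characters rather than tokens, so the answer is the same every time.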
But they don’t recognize their inadequacies, instead spouting confident misinformation.
This is true. They do not think, because they are next-token predictors, not brains.
With this in mind, you can still get a few useful properties out of them. It’s nothing like the hype the techbros and VCs imagine, but a few moderately beneficial use cases exist.
Without a doubt. But PhD-level thinking requires a kind of introspection that LLMs (currently) just don’t have. And the letter-counting thing is a funny example of that inaccuracy.
Tokenization is a low-level implementation detail; it shouldn’t affect an LLM’s ability to do basic reasoning. We don’t do arithmetic by counting how many neurons we can feel firing in our brain. We have higher-level concepts of numbers, and LLMs are supposed to have something similar. Plus, in the “thinking” models, you’ll see them break words up into individual letters or even write them out in a numbered list, which should break the tokens up into individual letters as well.
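For comparison, the spell-it-out procedure described above can be sketched in ordinary code. This only mirrors the numbered-list steps and says nothing about what actually happens inside a model. `spell_out` and `count_in_listing` are hypothetical helper names.

```python
def spell_out(word: str) -> list[str]:
    """Return the word as a numbered list, one letter per entry."""
    return [f"{i}. {ch}" for i, ch in enumerate(word, start=1)]


def count_in_listing(listing: list[str], letter: str) -> int:
    """Tally the entries whose letter matches, reading the list item by item."""
    target = letter.lower()
    # Each entry looks like "3. r"; take the part after the number.
    return sum(1 for item in listing if item.split(". ", 1)[1].lower() == target)


listing = spell_out("strawberry")
print("\n".join(listing))
print(count_in_listing(listing, "r"))  # prints 3
```

Once every letter sits on its own line, counting is just a pass over the list.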