Thinking is what humans do. We hold concepts in our working memory and use stored memories that are related to evaluate new data and determine a course of action.
LLMs predict the next correct word in their sentence based on a statistical model. This model is developed by “training” with written data, often scraped from the internet. This creates many biases in the statistical model. People on the internet do not take the time to answer “i dont know” to questions they see. I see this as at least one source of what they call “hallucinations.” The model confidently answers incorrectly because that’s what it’s seen in training.
The internet has many sites with reams of examples of code in many programming languages. If you are working on code that is of the same order of magnitude of these coding examples, then you are within the training data, and results will generally be good. Go outside of that training data, and it just flounders. It isn’t capable and has no means of reasoning beyond its internal statistical model.
Thinking is what humans do. We hold concepts in our working memory and use stored memories that are related to evaluate new data and determine a course of action.
LLMs predict the next correct word in their sentence based on a statistical model. This model is developed by “training” with written data, often scraped from the internet. This creates many biases in the statistical model. People on the internet do not take the time to answer “i dont know” to questions they see. I see this as at least one source of what they call “hallucinations.” The model confidently answers incorrectly because that’s what it’s seen in training.
The internet has many sites with reams of examples of code in many programming languages. If you are working on code that is of the same order of magnitude of these coding examples, then you are within the training data, and results will generally be good. Go outside of that training data, and it just flounders. It isn’t capable and has no means of reasoning beyond its internal statistical model.