

It’s trivial to get LLMs to act against the instructions
There is a lot more to it than just being correct. 18000 waters may have been the actual order, because somebody decided to screw with the machine. A human who isn’t terminally autistic would reliably interpret that as a joke and simply refuse to punch it in. The LLM will likely do whatever the customer tells it to do, since it has no contextual awareness: all it has is the system prompt and whatever interaction with the user it has had so far.
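The missing piece the comment points at can be sketched as a deterministic guardrail sitting between the LLM and the register, one that refuses obviously absurd quantities instead of trusting the model's transcript. This is a hypothetical illustration; the function names, the quantity cap, and the order format are all assumptions, not any vendor's actual API.

```python
# Hypothetical sketch: a deterministic sanity check between an LLM
# order-taker and the point-of-sale system. MAX_REASONABLE_QTY and the
# {item: quantity} order format are assumptions for illustration only.

MAX_REASONABLE_QTY = 20  # assumed per-item cap for one drive-thru order

def validate_order(items: dict[str, int]) -> tuple[bool, list[str]]:
    """Return (ok, problems) for a parsed order of {item_name: quantity}."""
    problems = []
    for name, qty in items.items():
        if qty <= 0:
            problems.append(f"{name}: non-positive quantity {qty}")
        elif qty > MAX_REASONABLE_QTY:
            problems.append(
                f"{name}: {qty} exceeds cap of {MAX_REASONABLE_QTY}, "
                "escalate to a human"
            )
    return (not problems, problems)

ok, problems = validate_order({"water": 18000, "burger": 2})
print(ok)        # 18000 waters is flagged instead of being punched in
print(problems)
```

The point is not that the check is clever, but that it is outside the model: no amount of prompt manipulation by the customer can talk a hard-coded comparison into accepting 18000 waters.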
On the bright side, his family might get a solid chunk of dough instead of an abusive husband/father. If they’re lucky, that is
Fun speculation: we are CPUs for the information systems we inhabit, like scientific method, political ideologies, etc.