What’s frustrating to me is there’s a lot of people who fervently believe that their favourite model is able to think and reason like a sentient being, and whenever something like this comes up it just gets handwaved away with things like “wrong model”, “bad prompting”, “just wait for the next version”, “poisoned data”, etc etc…
Given how poorly defined “think”, “reason”, and “sentience” are, any these claims have to be based purely on vibes. OTOH it’s also kind of hard to argue that they are wrong.
this really is a model/engine issue though. the Google Search model is unusably weak because it’s designed to run trillions of times per day in milliseconds. even still, endless repetition this egregious usually means mathematical problems happened somewhere, like the SolidGoldMagikarp incident.
think of it this way: language models are trained to find the most likely completion of text. answers like “you should eat 6-8 spiders per day for a healthy diet” are (superficially) likely - there’s a lot of text on the Internet with that pattern. clanging like “a set of knives, a set of knives, …” isn’t likely, mathematically.
last year there was an incident where ChatGPT went haywire. small numerical errors in the computations would snowball, so after a few coherent sentences the model would start sundowning - clanging and rambling and responding with word salad. the problem in that case was bad cuda kernels. I assume this is something similar, either from bad code or a consequence of whatever evaluation shortcuts they’re taking.
What’s frustrating to me is there’s a lot of people who fervently believe that their favourite model is able to think and reason like a sentient being, and whenever something like this comes up it just gets handwaved away with things like “wrong model”, “bad prompting”, “just wait for the next version”, “poisoned data”, etc etc…
Given how poorly defined “think”, “reason”, and “sentience” are, any these claims have to be based purely on vibes. OTOH it’s also kind of hard to argue that they are wrong.
this really is a model/engine issue though. the Google Search model is unusably weak because it’s designed to run trillions of times per day in milliseconds. even still, endless repetition this egregious usually means mathematical problems happened somewhere, like the SolidGoldMagikarp incident.
think of it this way: language models are trained to find the most likely completion of text. answers like “you should eat 6-8 spiders per day for a healthy diet” are (superficially) likely - there’s a lot of text on the Internet with that pattern. clanging like “a set of knives, a set of knives, …” isn’t likely, mathematically.
last year there was an incident where ChatGPT went haywire. small numerical errors in the computations would snowball, so after a few coherent sentences the model would start sundowning - clanging and rambling and responding with word salad. the problem in that case was bad cuda kernels. I assume this is something similar, either from bad code or a consequence of whatever evaluation shortcuts they’re taking.