Emergence? in large language models (revised edition)
Last year I wrote a post about emergence in AI, specifically about a paper claiming evidence for a "phase transition" in large language models' ability to perform tasks they were not designed for. I found this fascinating. That paper attracted a lot of attention, and even won the best paper award at the conference where it was presented.

Well, I did not do my homework. Even before my post appeared, another paper had already called the validity of the original into question.

Are Emergent Abilities of Large Language Models a Mirage?
Rylan Schaeffer, Brando Miranda, Sanmi Koyejo

we present an alternative explanation for [the claimed] emergent abilities: that for a particular task and model family, when analyzing fixed model outputs, emergent abilities appear due to the researcher's choice of metric rather than due to fundamental changes in model behavior with scale. Specifically, nonlinear or discontinuous metrics produce apparent emergent abilities, whereas linear or continu