Posts

Showing posts from February, 2024

Emergence? in large language models (revised edition)

Last year I wrote a post about emergence in AI, specifically on a paper claiming evidence for a "phase transition" in Large Language Models' ability to perform tasks they were not designed for. I found this fascinating. That paper attracted a lot of attention, even winning an award for the best paper at the conference at which it was presented. Well, I did not do my homework. Even before my post, another paper called into question the validity of the original paper.
Are Emergent Abilities of Large Language Models a Mirage? Rylan Schaeffer, Brando Miranda, Sanmi Koyejo
we present an alternative explanation for [the claimed] emergent abilities: that for a particular task and model family, when analyzing fixed model outputs, emergent abilities appear due to the researcher's choice of metric rather than due to fundamental changes in model behavior with scale. Specifically, nonlinear or discontinuous metrics produce apparent emergent abilities, whereas linear or continu
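The metric argument in the quoted abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from either paper: the logistic curve for per-token accuracy, the midpoint at 10^9 parameters, the answer length of 10 tokens, and the function names. It shows only that a per-token accuracy that rises smoothly with scale can look like a sudden jump when scored by exact match, where every token must be correct.

```python
import math


def per_token_accuracy(log10_params: float) -> float:
    """Hypothetical smooth curve: per-token accuracy rises gradually with scale.

    The logistic form and the midpoint at 10^9 parameters are illustrative
    assumptions, not measurements.
    """
    return 1.0 / (1.0 + math.exp(-(log10_params - 9.0)))


def exact_match(p: float, length: int) -> float:
    """Probability that all `length` tokens are correct.

    Assumes tokens are scored independently, so the result is p ** length.
    This is the nonlinear metric.
    """
    return p ** length


if __name__ == "__main__":
    answer_length = 10  # hypothetical answer length in tokens
    print("log10(params)  per-token  exact-match")
    for log10_params in range(7, 14):
        p = per_token_accuracy(float(log10_params))
        em = exact_match(p, answer_length)
        print(f"{log10_params:>13}  {p:9.3f}  {em:11.4f}")
```

In the printed table the per-token column changes gradually across the whole range, while the exact-match column stays near zero for the smaller models and then climbs steeply. The underlying behaviour is the same in both columns; only the metric differs.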

Launching my book in a real physical bookshop

Physical bookstores selling physical books are in decline, sadly. Furthermore, the stores that are left are mostly big chains. Brisbane does have an independent bookstore, Avid Reader, in the West End. It is a vibrant part of the local community and has several author events every week. My daughter persuaded me to do a book launch for Condensed Matter Physics: A Very Short Introduction (Oxford UP, 2023). It is at Avid Reader on Monday, February 26, beginning at 6 pm. Most readers of this blog are not in Brisbane, but if you are, or know people who are, please encourage them to consider attending. The event is free, but participants need to register, as space is limited. I will be in conversation about the book with my friend, Dr Christian Heim, an author, composer, and psychiatrist. Like the book, the event is meant for a general audience.

The role of effective theories and toy models in understanding emergent properties

Two of the approaches to the theoretical description of systems with emergent properties that have been fruitful are effective theories and toy models. These leverage our limited knowledge of many details about a system with many interacting components.
Effective theories
An effective theory is valid at a particular range of scales. This exploits the fact that in complex systems there is often a hierarchy of scales (length, energy, time, or number). In physics, examples of effective theories include classical mechanics, general relativity, classical electromagnetism, and thermodynamics. The equations of an effective theory can be written down almost solely from consideration of symmetry and conservation laws. Examples include the Navier-Stokes equations for fluid dynamics and non-linear sigma models in elementary particle physics. Some effective theories can be derived by the “coarse-graining” of theories that are valid at a finer scale. For example, the equations of classical mechanic

Four scientific reasons to be skeptical of AI hype

The hype about AI continues, whether in business or science. Undoubtedly, there is a lot of potential in machine learning, big data, and large language models. But that does not mean that the hype is justified. It is more likely to limit real scientific progress and waste a lot of resources. My innate scepticism receives concrete support from an article from 2018 that gives four scientific reasons for concern.
Big data: the end of the scientific method? Sauro Succi and Peter V. Coveney
The article might be viewed as a response to a bizarre article in 2008 by Chris Anderson, editor-in-chief at Wired, The End of Theory: The Data Deluge Makes the Scientific Method Obsolete:
‘With enough data, the numbers speak for themselves, correlation replaces causation, and science can advance even without coherent models or unified theories’.
Here are the four scientific reasons for caution about such claims given by Succi and Coveney. (i) Complex systems are strongly correlated, hence they do not