Saturday, December 23, 2023

Niels Bohr on emergence

Until this week, I did not know that Bohr ever thought about emergence.

Ernst Mayr was one of the leading evolutionary biologists in the twentieth century and was influential in the development of the modern philosophy of biology. He particularly emphasised the importance of emergence and the limitations of reductionism. In the preface to his 1997 book, This is Biology: the Science of the Living World, Mayr recounts the development of his thinking about emergence.

At first I thought that this phenomenon of emergence, as it is now called, was restricted to the living world, and indeed, in a lecture I gave in the early 1950s in Copenhagen, I made the claim that emergence was one of the diagnostic features of the organic world. The whole concept of emergence at the time was considered to be rather metaphysical. When Niels Bohr, who was in the audience, stood up during the discussion, I was fully prepared for an annihilating refutation. However, much to my surprise, he did not at all object to the concept of emergence, but only to my notion that it provided a demarcation between the physical and the biological sciences. Citing the case of water, whose "aquosity" could not be predicted from the characteristics of its two components, hydrogen and oxygen, Bohr stated that emergence is rampant in the inanimate world. (page xii).

Later in the book Mayr pillories Bohr for his support of vitalism, including claims that vitalism has a "quantum" foundation.

Friday, December 1, 2023

Very Short Introductions Podcast on Condensed Matter Physics

The podcast episode where I talk about my book just came out.
It is available on a range of platforms, listed here, including SoundCloud and YouTube.

Wednesday, November 29, 2023

Emergence in nuclear physics

Nuclear physics exhibits many characteristics associated with emergent phenomena. These include a hierarchy of scales, effective interactions and theories, and universality.

The table below summarises how nuclear physics is concerned with phenomena that occur at a range of length and number scales. At each level of the hierarchy, there are effective interactions that are described by effective theories. Some of the biggest questions in the field concern how the effective theories that operate at each level are related to the levels above and below.

Moving up from the bottom level to the second-highest level, the relevant length scales increase from less than a femtometre to several femtometres.

The challenge in the 1950s was to reconcile the liquid drop model and the nuclear shell model. This led to the discovery of collective rotations and shape deformations. The observed small moments of inertia were explained by BCS theory. Integration of the liquid drop and shell models led to the award of the 1975 Nobel Prize in Physics to Aage Bohr, Ben Mottelson, and James Rainwater.

Since the 1980s a major challenge has been to show how the strong nuclear force between two nucleons can be derived from Quantum Chromodynamics (QCD). The figure below illustrates how the attractive interaction between a neutron and a proton can be understood in terms of the creation and destruction of a down quark-antiquark pair. The figure is taken from here.

An outstanding problem concerns the equation of state for nuclear matter, such as found in neutron stars. A challenge is to learn more about this from the neutron star mergers that are detected in gravitational wave astronomy.

Characteristics of universality are also seen in nuclear physics. Landau's Fermi liquid theory provides a basis for the nuclear shell model, which starts by assuming that nucleons can be described in terms of weakly interacting quasiparticles moving in an average potential produced by the other nucleons. The BCS theory of superconductivity can be adapted to describe the pairing of nucleons, leading to energy differences between nuclei with odd and even numbers of nucleons.
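For reference, this pairing shows up in measured nuclear masses as an odd-even staggering. A standard textbook way to quantify it (not specific to any particular model) is the three-point formula at fixed proton number Z,

Delta^(3)(N) = ((-1)^N / 2) [M(N+1,Z) + M(N-1,Z) - 2 M(N,Z)],

where M is the atomic mass. A commonly quoted empirical estimate of the resulting pairing gap is Delta ≈ 12 MeV / sqrt(A) for a nucleus with A nucleons.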

Universality is also evident in the statistical distribution of energy-level spacings in heavy nuclei. They can be described by random matrix theory, which makes no assumptions about the details of the interactions between nucleons, only that the Hamiltonian matrix is drawn from a random ensemble with the appropriate symmetry. Random matrix theory can also describe aspects of quantum chaos and the zeros of the Riemann zeta function relevant to number theory.
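A minimal numerical sketch of this universality is below (my own illustration, assuming the Gaussian orthogonal ensemble, which applies to time-reversal-invariant Hamiltonians): the nearest-neighbour spacing distribution of the eigenvalues follows the Wigner surmise, independent of any microscopic details.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
N, samples = 200, 200
spacings = []
for _ in range(samples):
    A = rng.normal(size=(N, N))
    H = (A + A.T) / np.sqrt(2)      # a random real symmetric (GOE) matrix
    E = np.linalg.eigvalsh(H)
    mid = E[N//4: 3*N//4]           # keep the middle of the spectrum, where the level density is roughly flat
    s = np.diff(mid)
    spacings.extend(s / s.mean())   # crude "unfolding": normalise by the mean spacing

s = np.linspace(0, 4, 200)
plt.hist(spacings, bins=40, density=True, alpha=0.5, label="GOE spacings")
plt.plot(s, (np.pi / 2) * s * np.exp(-np.pi * s**2 / 4), label="Wigner surmise")
plt.plot(s, np.exp(-s), "--", label="Poisson (uncorrelated levels)")
plt.xlabel("s (spacing / mean spacing)")
plt.ylabel("P(s)")
plt.legend()
plt.show()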


Wednesday, November 22, 2023

Shape memory alloys

Recently I bought a small wire of NiTinol to have fun with and use in demonstrations to kids. This video gives a spectacular demonstration and attempts to explain how it works. I did not know about their use in stents for heart surgery.


I am still struggling to understand exactly how shape-memory alloys work. According to Wikipedia

The shape memory effect occurs because a temperature-induced phase transformation reverses deformation... Typically the martensitic (low-temperature) phase is monoclinic or orthorhombic. Since these crystal structures do not have enough slip systems for easy dislocation motion, they deform by twinning—or rather, detwinning.

Martensite is thermodynamically favored at lower temperatures, while austenite (B2 cubic) is thermodynamically favored at higher temperatures. Since these structures have different lattice sizes and symmetry, cooling austenite into martensite introduces internal strain energy in the martensitic phase. To reduce this energy, the martensitic phase forms many twins—this is called "self-accommodating twinning" and is the twinning version of geometrically necessary dislocations. 

In different words, I think the essential idea may be the following. In most metals large strains are accommodated by topological defects such as dislocations. These become entangled, leading to work hardening and irreversible changes in macroscopic shape. Shape memory alloys are different because of their low-symmetry unit cell. The most natural defects are twinning domain walls; these are not topological, and so their formation is reversible.

I am looking forward to reading the book chapter Shape memory alloys by Vladimir Buljak and Gianluca Ranzi.



Another fascinating phenomenon related to shape memory is "superelasticity", which I discussed in an earlier post on organic molecular crystals, and which has recently been reviewed.

I welcome clarification of the essential physics.

Tuesday, November 14, 2023

An emergentist perspective on public policy issues that divide

How is the whole related to the parts?

Which type of economy will produce the best outcomes: laissez-faire or regulated?

Can a government end an economic recession by "stimulus" spending?  

What is the relative importance of individual agency and social structures in causing social problems such as poverty and racism?

These questions are all related to the first one. Let's look at it from an emergentist perspective, with reference to physics. 

Consider the Ising model in two or more dimensions. The presence of nearest-neighbour interactions between spins leads to emergent properties: long-range ordering of the spins, spontaneous symmetry breaking below the critical temperature, and singularities in the temperature dependence of thermodynamic properties such as the specific heat and magnetic susceptibility. Individual uncoupled spins have none of these properties. Even a finite number of coupled spins does not. (Although a large number of spins does exhibit suggestive behaviour, such as an enhancement of the magnetic susceptibility near the critical temperature.) Thus, the whole system has properties that are qualitatively different from those of the parts.

On the other hand, the properties of the parts, such as how strongly the spins couple to an external field and interact with their neighbours, influence the properties of the whole. Some details of the parts matter. Other details don't matter. Adding some interaction with spins beyond nearest neighbours does not change any of the qualitative properties, provided those longer-range interactions are not too large. On the other hand, changing from a two-dimensional rectangular lattice to a linear chain removes the ordered state. Changing to a triangular lattice with an antiferromagnetic nearest-neighbour interaction removes the ordering and there are multiple ground states. Thus, some microscopic details do matter.

For illustrative purposes, below I show a sketch of the temperature dependence of the magnetic susceptibility of the Ising model for three cases: non-interacting spins (J=0), two dimensions (d=2), and one dimension (d=1). This shows how interactions can significantly enhance/diminish the susceptibility depending on the parameter regime.
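A minimal sketch of how curves like these can be generated is below (my own illustration, in units where J = k_B = 1 and the magnetic moment is 1; the J=0 and d=1 curves are exact, while the d=2 points come from a crude Monte Carlo simulation).

import numpy as np
import matplotlib.pyplot as plt

J = 1.0
T = np.linspace(0.5, 5.0, 200)
beta = 1.0 / T

chi_free = beta                        # non-interacting spins (J = 0): Curie law
chi_1d = beta * np.exp(2 * beta * J)   # exact zero-field result for the infinite 1D chain

def chi_2d_mc(temp, L=12, sweeps=2000, burn=500, rng=np.random.default_rng(0)):
    # Crude Metropolis estimate of the susceptibility per spin for an L x L square lattice.
    s = rng.choice([-1, 1], size=(L, L))
    ms = []
    for sweep in range(sweeps):
        for _ in range(L * L):
            i, j = rng.integers(L), rng.integers(L)
            nn = s[(i+1) % L, j] + s[(i-1) % L, j] + s[i, (j+1) % L] + s[i, (j-1) % L]
            dE = 2 * J * s[i, j] * nn
            if dE <= 0 or rng.random() < np.exp(-dE / temp):
                s[i, j] *= -1
        if sweep >= burn:
            ms.append(abs(s.mean()))
    ms = np.array(ms)
    # fluctuation-dissipation estimate, using |m| to handle the symmetric finite-size state
    return L * L * (np.mean(ms**2) - np.mean(ms)**2) / temp

T_mc = np.linspace(1.5, 4.0, 12)
chi_2d = [chi_2d_mc(t) for t in T_mc]

plt.plot(T, chi_free, label="J = 0 (Curie law)")
plt.plot(T, chi_1d, label="d = 1 (exact)")
plt.plot(T_mc, chi_2d, "o-", label="d = 2 (12x12 Monte Carlo)")
plt.axvline(2 / np.log(1 + np.sqrt(2)), ls="--", color="gray", label="d = 2 critical temperature")
plt.xlabel("T / J")
plt.ylabel("susceptibility per spin")
plt.yscale("log")
plt.legend()
plt.show()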

The main point of this example is to show that to understand a large complex system we have to keep both the parts and the whole in mind. In other words, we need both microscopic and macroscopic pictures. There are two horizons, the parts and the whole, the near and the far. There is a dialectic tension between these two horizons. It is not either/or but both/and.

I now illustrate how this type of tension matters in economics and sociology, and the implications for public policy. If you are (understandably) concerned about whether Ising models have anything to do with sociology and economics, see my earlier posts about these issues. The first post introduced discrete-choice models that are essentially Ising models. A second post discussed how these models show that equilibrium may never be reached, leading to the insight that local initiatives can "nucleate" desired outcomes. A third post considered how heterogeneity can lead to qualitative changes, including hysteresis, so that the effectiveness of "nudges" can vary significantly.

A fundamental (and much debated) question in sociology is the relationship between individual agency and social structures. Which determines which? Do individuals make choices that then lead to particular social structures? Or do social structures constrain what choices individuals make? In sociology, this is referred to as the debate between voluntarism and determinism. A middle way, which privileges neither agency nor structure, is structuration, proposed by Anthony Giddens.

Social theorists who give primacy to social structures will naturally advocate solving social problems with large government schemes and policies that seek to change the structures. On the other side, those who give primacy to individual agency are sceptical of such approaches, and consider that progress can only occur through individuals, and smaller units such as families and communities, making better choices. The structure/agency divide naturally maps onto political divisions of left versus right, liberal versus conservative, and the extremes of communist and libertarian. An emergentist perspective is balanced, affirming the importance of both structure and agency.

Key concepts in economics are equilibrium, division of labour, price, and demand. These are the outcomes of many interacting agents (individuals, companies, institutions, and government). Economies tend to self-organise. This is the "invisible hand" of Adam Smith. Thus, emergence is one of the most important concepts in economics. 

A big question is how the equilibrium state and the values of the associated state variables (e.g., prices, demand, division of labour, and wealth distribution) emerge from the interactions of the agents. In other words, what is the relationship between microeconomics and macroeconomics?

What are the implications for public policy? What will lead to the best outcomes (usually assumed to be economic growth and prosperity for "all")? Central planning (or at least some government regulation) is pitted against laissez-faire. For reasons similar to the Ising and sociology cases, an emergentist perspective is that the whole and the parts are inseparable. This is why there is no consensus on the answers to specific questions such as: can government stimulus spending move an economy out of a recession? Keynes claimed it could, but the debate rages on.

An emergentist perspective tempers expectations about the impact of agency, both individuals and government. It is hard to predict how a complex system with emergent properties will respond to perturbations such as changes in government policy. This is the "law" of unintended consequences.

“The curious task of economics is to demonstrate to men how little they really know about what they imagine they can design.”

Friedrich A. Hayek, The Fatal Conceit: The Errors of Socialism

I think this cuts both ways. This is also reason to be skeptical about those (such as Hayek's disciples) who think they can "design" a better society by just letting the market run free.

Thursday, November 2, 2023

Diversity is a common characteristic of emergent properties

Consider a system composed of many interacting parts. I take the defining characteristic of an emergent property to be novelty. That is, the whole has a property not possessed by the parts alone. I argue that there are five other characteristics of emergent properties. These characteristics are common, but they are neither necessary nor sufficient for novelty.

1. Discontinuities

2. Unpredictability

3. Universality

4. Irreducibility

5. Modification of parts and their relations

I now add another characteristic.

6. Diversity

Although a system may be composed of only a small number of different components and interactions, the large number of possible emergent states that the system can take is amazing. Every snowflake is different. Water is found in 18 distinct solid states. All proteins are composed of linear chains of 20 different amino acids. Yet in the human body there are more than 100,000 different proteins and all perform specific biochemical functions. We encounter an incredible diversity of human personalities, cultures, and languages. 

A related idea is that "simple models can describe complex behaviour". Here "complex" is often taken to mean diverse. Examples include how simple Ising models with a few competing interactions can describe a devil's staircase of states or the multitude of atomic orderings found in binary alloys.

Perhaps the most stunning case of diversity is life on earth. Billions of different plant and animal species are all an expression of different sequences of just four DNA bases: A, G, T, and C.

One might argue that this diversity is just a result of combinatorics. For example, if one considers a chain of just ten amino acids, there are 20^10, or about 10^13, different possible linear sequences. But this does not mean that all these sequences will produce a functional protein, i.e., one that will fold rapidly (on the timescale of milliseconds) into a stable tertiary structure, and one that can perform a useful biochemical function.

Tuesday, October 24, 2023

Condensed matter physics in 15 minutes!

Oxford University Press has a nice podcast on Very Short Introductions. 

In each episode, an author of a specific volume has 10-15 minutes to introduce themselves and answer several questions.

What is X [the subject of the VSI]?

What got you first interested in X?

What are the key aspects of X that you would like everyone to know?

The ones I have listened to and particularly liked are Infinity, Philosophy of Science, Evangelicalism, Development, Consciousness, Behavioural Economics, and Modern China.

Tomorrow, I am recording an episode for Condensed Matter Physics: A Very Short Introduction.

Here is a practice version of the audio and the draft text is below.

I welcome feedback.

VSI Podcast 

I am Ross McKenzie. I am an Emeritus professor of physics at the University of Queensland in Brisbane, Australia. I have spent the past forty years learning, teaching, and researching condensed matter physics. I really love the Very Short Introduction series and so I am delighted to share my experience by writing Condensed Matter Physics: A Very Short Introduction.

What is condensed matter physics? It is all about states of matter. At school, you were probably taught that there are only three states of matter: solid, liquid, and gas. This is wrong. There are many more states such as liquid crystal, glass, superconductor, ferromagnet, and superfluid. New states of matter are continually, and often unexpectedly, being discovered. Condensed matter physics investigates how the distinct physical properties of states of matter emerge from the atoms of which a material is composed.

What first got me interested in condensed matter physics?

After I finished an undergraduate degree in theoretical physics in Australia in 1982, I would not have been able to answer the question, “what is condensed matter physics?”, even though it is the largest sub-field of physics. I then went to Princeton University in the USA to pursue a Ph.D. There I took an exciting course on the subject and began to interact with students and faculty working in the field.

At Princeton was Phil Anderson, who had won a Nobel Prize in physics for work in condensed matter. At the time I did not appreciate his much broader intellectual legacy. In his recent biography of Anderson, Andrew Zangwill states “more than any other twentieth-century physicist, he [Anderson] transformed the patchwork of ideas and techniques formerly called solid-state physics into the deep, subtle, and intellectually coherent discipline known today as condensed matter physics.” Several decades later, my work became richer as Anderson gave me an appreciation of the broader scientific and philosophical significance of condensed matter physics, particularly its connection to other sciences, such as biology, economics, and computer science. When do quantitative differences become qualitative differences? Can simple models describe rich and complex behaviour? What is the relationship between the particular and the universal? How is the abstract related to the concrete?

So what are the key aspects of condensed matter physics that I would like everyone to know?

First, there are many different states of matter. It is not just solid, liquid, and gas. Consider the “liquid crystals” that are the basis of LCDs (Liquid Crystal Displays) in the screens of televisions, computers, and smartphones. How can something be both a liquid and a crystal? A liquid crystal is a distinct state of matter. Solids can be found in many different states. In everyday life, ice means simply solid water. But there are in fact eighteen different solid states of water, depending on the temperature of the water and the pressure that is applied to the ice. In each of these eighteen states, there is a unique spatial arrangement of the water molecules and there are qualitative differences in the physical properties of the different solid states.

Condensed matter physics is concerned with characterising and understanding all the different states of matter that can exist. These different states are called condensed states of matter. The word “condensed” is used here in the same sense as when we say that steam condenses into liquid water. Generally, as the temperature is lowered or the pressure is increased, a material can condense into a new state of matter. Qualitative differences distinguish the many different states of matter. These differences are associated with differences in symmetry and ordering.

Second, condensed matter physics involves a particular approach to understanding properties of materials. Every day we encounter a diversity of materials: liquids, glass, ceramics, metals, crystals, magnets, plastics, semiconductors, and foams. These materials look and feel different from one another. Their physical properties vary significantly: are they soft and squishy or hard and rigid? Shiny, black, or colourful? Do they absorb heat easily? Do they conduct electricity? The distinct physical properties of different materials are central to their use in technologies around us: smartphones, alloys, semiconductor chips, computer memories, cooking pots, magnets in MRI machines, LEDs in solid-state lighting, and fibre optic cables. Why do different materials have different physical properties? 

Materials are studied by physicists, chemists, and engineers, and the questions, focus, goals, and techniques of researchers from these different disciplines can be quite different. The focus of condensed matter physics is on states of matter. Condensed matter physics as a research field is not just defined by the objects that it studies (states of matter in materials), but rather by a particular approach to the study of these objects. The aim is to address fundamental questions and to find unifying concepts and organizing principles to understand a wide range of phenomena in materials that are chemically and structurally diverse. 

The central question of condensed matter physics is, how do the properties of a state of matter emerge from the properties of the atoms in the material and their interactions? 

Let’s consider a concrete example, that of graphite and diamond. While you will find very cheap graphite in lead pencils, you will find diamonds in jewellery. Both graphite and diamond are composed solely of carbon atoms. They are both solid. So why do they look and feel so different? Graphite is common, black, soft, and conducts electricity moderately well. In contrast, diamond is rare, transparent, hard, and conducts electricity very poorly. We can zoom in down to the scale of individual atoms using X-rays and find the spatial arrangement of the carbon atoms relative to one another. These arrangements are qualitatively different in diamond and graphite. Diamond and graphite are distinct solid states of carbon. They have qualitatively different physical properties, at both the microscopic and the macroscopic scale.

Third, I want you to know about superconductivity, one of the most fascinating states of matter. I have worked on it many times over the past forty years. Superconductivity occurs in many metals when they are cooled down to extremely low temperatures, close to absolute zero (-273 ºC). In the superconducting state, a metal can conduct electricity perfectly, without generating any heat. This state also expels magnetic fields, meaning one can levitate objects, whether sumo wrestlers or trains.

The discovery of superconductivity in 1911 presented a considerable intellectual challenge: what is the origin of this new state of matter? How do the electrons in the metal interact with one another to produce superconductivity? Many of the greatest theoretical physicists of the twentieth century took up this challenge but failed. The theoretical puzzle was only solved 46 years after the experimental discovery. The theory turns out also to be relevant to liquid helium, nuclear physics, neutron stars, and the Higgs boson. New superconducting materials and different superconducting states continue to be discovered. A “holy grail” is to find a material that can superconduct at room temperature. 

I find superconductivity even more interesting when considering quantum effects. By 1930 it was widely accepted that quantum theory, in all its strangeness, describes the atomic world of electrons, protons, and photons. However, this strangeness does not show itself in the everyday world of what we can see and touch. You cannot be in two places at the same time. Your cat is either dead or alive. However, condensed matter physicists have shown that the boundary between the atomic and macroscopic worlds is not so clear cut. A piece of superconducting metal can take on weird quantum properties, just like a single atom, even though the metal is made of billions of billions of atoms. It is in two states at the same time, almost like Schrodinger’s famous cat.

Fourth, condensed matter physics is all about emergence; the whole is greater than the sum of the parts. A system composed of many interacting parts can have properties that are qualitatively different from the properties of the individual parts. Water is wet, but a single water molecule is not. Your brain is conscious, but a single neuron is not. Such emergent phenomena occur in many fields, from biology to computer science to sociology, leading to rich intellectual connections. Condensed matter physics is arguably the field with the greatest success at understanding emergent phenomena in complex systems, particularly at the quantitative level. This is not because condensed matter physicists are smarter than sociologists, economists, or neuroscientists. It is because the materials we study are much “simpler” than societies, economies, and brains. 

Finally, condensed matter physics is one of the largest and most vibrant sub-fields of physics. For example, in the past thirty years, the Nobel Prize in Physics has been awarded thirteen times for work on condensed matter. In the past twenty years, eight condensed matter physicists have received the Nobel Prize in Chemistry. 

I hope I have sparked your interest in condensed matter physics. I invite you to learn more about why I consider this field of science significant, beautiful, and profound. 

Friday, October 20, 2023

Opening the door for women in science

 I really liked reading Transcendent Kingdom by Yaa Gyasi. She is an amazing writer. I recently reread some of it for an extended family book club. Just check out some of these quotes. 

A colleague suggested I might like Lessons in Chemistry, a novel by Bonnie Garmus. I have not read the book yet, but I have watched the first two episodes of the TV version on AppleTV. I watched the first episode for free.

The show contains a good mix of humour, love of science, and feminism. The chemistry dialogue seems to be correct. The show chronicles just how awful life was in the 1950s for a young woman who aspired to be a scientist. Things have improved. But there is still a long way to go...

Tuesday, October 17, 2023

Could faculty benefit from a monastery experience?

A few months ago, the New York Times ran a fascinating Guest Opinion, Why Universities Should Be More Like Monasteries, by Molly Worthen, a historian at the University of North Carolina. She describes a popular undergraduate course, the "monk" class, at the University of Pennsylvania.

On the first day of class — officially called Living Deliberately — Justin McDaniel, a professor of Southeast Asian and religious studies, reviewed the rules. Each week, students would read about a different monastic tradition and adopt some of its practices. Later in the semester, they would observe a one-month vow of silence (except for discussions during Living Deliberately) and fast from technology, handing over their phones to him.

This got me wondering about whether universities and funding agencies might experiment with similar initiatives for faculty. It might be a bit like the Aspen Center for Physics and Gordon Research Conferences were before the internet. Faculty would surrender their phones and have the internet disabled on their computers. For one week they would be required to spend their time reading, writing, and thinking. Exercise, daydreaming, doodling, and just having fun are to be encouraged. No administrative work or grant writing. The emphasis would be on coming up with new ideas, not finishing off old work. For one hour per day, participants could meet with others and talk about their new ideas. I think it might be refreshing, reinvigorating, and highly productive (in the true sense of the word).

After trialling the program for a week, why not then try it for month-long periods?

These are the conditions under which Newton wrote the Principia and Darwin The Origin of Species. In their case, they did it over a period of several years.

What do you think?

Friday, October 13, 2023

Emergent abilities in AI: large language models

The public release of ChatGPT was a landmark that surprised many people, both in the general public and researchers working in Artificial Intelligence. All of a sudden it seemed Large Language Models had capabilities that some thought were a decade away or even not possible. It is like the field underwent a "phase transition." This idea turns out to be more than just a physics metaphor. It has been made concrete and rigorous in the following paper.

Emergent Abilities of Large Language Models

Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, William Fedus

They use the following definition, "Emergence is when quantitative changes in a system result in qualitative changes in behavior," citing Phil Anderson's classic "More is Different" article. [Even though the article does not contain the word emergence]. 

In this paper, we will consider a focused definition of emergent abilities of large language models: 

An ability is emergent if it is not present in smaller models but is present in larger models.

How does one define the "size" or "scale" of a model? Wei et al., note that "Today’s language models have been scaled primarily along three factors: amount of computation, number of model parameters, and training dataset size."

The essence of the analysis in the paper is summarised as follows.

We first discuss emergent abilities in the prompting paradigm, as popularized by GPT-3 (Brown et al., 2020). In prompting, a pre-trained language model is given a prompt (e.g. a natural language instruction) of a task and completes the response without any further training or gradient updates to its parameters.

An example of a prompt is shown below 

Brown et al. (2020) proposed few-shot prompting, which includes a few input-output examples in the model’s context (input) as a preamble before asking the model to perform the task for an unseen inference-time example. 

 The ability to perform a task via few-shot prompting is emergent when a model has random performance until a certain scale, after which performance increases to well-above random. 

An example is shown in the Figure below. The horizontal axis is the number of training FLOPs for the model, a measure of model scale. The vertical axis measures the accuracy of the model on a task, modular arithmetic, for which the model was not designed, but just given two-shot prompting. The red dashed line is the performance of a random model. The purple data is for GPT-3 and the blue for LaMDA. Note how once the model scale reaches about 10^22 training FLOPs there is a rapid onset of ability.
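For concreteness, a two-shot prompt for such a modular arithmetic task might look something like the following. (This is my own illustrative sketch of the prompting format, not an example taken from the paper.)

Q: What is (11 + 23) mod 7?
A: 6
Q: What is (15 + 4) mod 6?
A: 1
Q: What is (9 + 14) mod 5?
A:

The model is scored on whether its completion of the final line is correct; no parameters are updated.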

The Figure below summarises recent results from a range of research groups studying five different language model families. It shows eight different emergent abilities.

Wei et al., point out that "there are currently few compelling explanations for why such abilities emerge the way that they do".

The authors have encountered some common characteristics of emergent properties. They are hard to predict or anticipate before they are observed. They are often universal, i.e., they can occur in a wide range of different systems and are not particularly sensitive to the details of the components. Even after emergent properties are observed, it is still hard to explain why they occur, even when one has a good understanding of the properties of the system at a smaller scale. Superconductivity was observed in 1911 and only explained in 1957 by the BCS theory.

On the positive side, this paper presents hope that computational science and technology are at the point where AI may produce more exciting capabilities. On the negative side, there is also the possibility of significant societal risks such as having unanticipated power to create and disseminate false information, bias, and toxicity.

Aside: One thing I found surprising is that the authors did not reference John Holland, a computer scientist, and his book, Emergence.

I thank Gerard Milburn for bringing the paper to my attention.

Thursday, September 28, 2023

Gravitational waves and ultra-condensed matter physics

In 2016, when I saw the first results from the LIGO gravitational wave interferometer, my natural caution and skepticism kicked in. They had just observed one signal in an incredibly sensitive measurement. A lot of data analysis was required to extract the signal from the background noise. That signal was then fitted to the results of numerical simulations of the solutions to Einstein's gravitational field equations describing the merger of two black holes. Depending on how you count, about 15 parameters are required to specify the binary system [distance from Earth, masses, relative orientations of the orbits, ...]. The detection events involve displacements of the interferometer mirrors of the order of 10^-18 metres, a small fraction of the diameter of a proton!

What on earth could go wrong?!

After all, this was only two years after the BICEP2 fiasco, which claimed to have detected a polarization signal in the cosmic microwave background due to gravitational waves associated with cosmic inflation. The observed signal turned out to be mostly galactic dust! It led to a book by the cosmologist Brian Keating, Losing the Nobel Prize: A Story of Cosmology, Ambition, and the Perils of Science’s Highest Honor.

Well, I am happy to be wrong, if it is good for science. Now almost one hundred gravitational wave events have been observed, and one event, GW170817, has been correlated with electromagnetic observations, from gamma-rays to X-rays.

But detecting some gravitational waves is quite a long way from gravitational wave astronomy, i.e., using gravitational wave detectors as a telescope, in the same sense as the regular suite of optical, radio, and X-ray detectors. I was also skeptical about that. But it now does seem that gravitational wave detectors are providing a new window into the universe.

A few weeks ago I heard a very nice UQ colloquium by Paul Lasky, What's next in gravitational wave astronomy?

Paul gave a nice overview of the state of the field, both past and future. 

A key summary figure is below. It shows different possible futures when two neutron stars merge.

The figure is taken from the helpful review

The evolution of binary neutron star post-merger remnants: a review, Nikhil Sarin and Paul D. Lasky

A few of the things that stood out to me.

1. One stunning piece of physics is that in the black hole mergers that have been observed, the mass of the resulting black hole is about three solar masses less than the total mass of the two separate black holes. The corresponding mass energy (E=mc^2) of three solar masses is converted into gravitational wave energy within seconds. During this time the peak radiated power was more than fifty times the power of all the stars in the observable universe combined!
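A rough order-of-magnitude check of those numbers (my own sketch; the ~20 millisecond emission timescale is an assumption on my part, not a number from the talk):

M_sun, c = 1.989e30, 2.998e8
E_gw = 3 * M_sun * c**2            # ~5e47 J radiated as gravitational waves
t_burst = 0.02                     # assume the bulk is radiated over ~20 ms around merger
P_peak_estimate = E_gw / t_burst   # ~3e49 W, the scale of the reported peak luminosity
print(f"{E_gw:.1e} J, {P_peak_estimate:.1e} W")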

I have fundamental questions about a clear physical description of this energy conversion process. First, defining "energy" in general relativity is a vexed and unresolved question with a long history. Second, is there any sense in which one needs to describe this in terms of a quantum field theory: specifically, the conversion of neutron matter into gravitons?

2. Probing nuclear astrophysics in neutron stars. It may be possible to test the equation of state (the relation between pressure and density) of nuclear matter. This determines the Tolman–Oppenheimer–Volkoff limit, the upper bound on the mass of cold, non-rotating neutron stars. According to Sarin and Lasky

The supramassive neutron star observations again provide a tantalising way of developing our understanding of the dynamics of the nascent neutron star and the equation of state of nuclear matter (e.g., [37,121,127–131]). The procedure is straight forward: if we understand the progenitor mass distribution (which we do not), as well as the dominant spin down mechanism (we do not understand that either), and the spin-down rate/braking index (not really), then we can rearrange the set of equations governing the system’s evolution to find that the time of collapse is a function of the unknown maximum neutron star mass, which we can therefore infer. This procedure has been performed a number of times in different works, each arriving at different answers depending on the underlying assumptions at each of the step. The vanilla assumptions of dipole vacuum spin down of hadronic stars does not well fit the data [37,127], leading some authors to infer that quark stars, rather than hadronic stars, best explain the data (e.g., [129,130]), while others infer that gravitational radiation dominates the star’s angular momentum loss rather than magnetic dipole radiation (e.g [121,127]).

As the authors say, this is a "tantalising" prospect, but there are many unknowns. I appreciate their honesty.

3. Probing the phase diagram of Quantum Chromodynamics (QCD)

This is one of my favourite phase diagrams and I used to love to show it to undergraduates.


Neutron stars are close to the first-order phase transition associated with quark deconfinement.

When the neutron stars merge it may be that the phase boundary is crossed.

Thursday, September 14, 2023

Listing mistakes in Condensed Matter Physics: A Very Short Introduction

Someone told me that the day after your book is published you will start finding errors. They were correct.

Here are the first errors I have become aware of.

On Page 2 I erroneously state that diamond "conducts electricity and heat very poorly."

However, the truth about conduction of heat is below, taken from the opening paragraph of this paper.

Diamond has the highest thermal conductivity, L, of any known bulk material. Room-temperature values of L for isotopically enriched diamond exceed 3000 W/m-K, more than an order of magnitude higher than common semiconductors such as silicon and germanium. In diamond, the strong bond stiffness and light atomic mass produce extremely high phonon frequencies and acoustic velocities. In addition, the phonon-phonon umklapp scattering around room temperature is unusually weak.

Figure 2 on page 4 has a typo. Diamond is "hard" not "hand".

On page 82 I erroneously state that for the superfluid transition, the "critical exponent alpha was determined to have a value of -0.0127, that is to five significant figures."  The value actually has three significant figures. 

I thank my engineering friend, Dave Winn, for pointing out the first and third errors.

Please do write other errors in the comments below. This will help with future revisions.

Monday, September 11, 2023

Amazing things about Chandrasekhar's white dwarf mass limit

This is ultra-condensed matter physics!

In 1931, Subrahmanyan Chandrasekhar published a seminal paper, for which he was awarded the Nobel Prize in 1983. He showed that a white dwarf star must have a mass less than 1.4 solar masses, otherwise it will collapse under gravity. White dwarfs are compact stars for which the nuclear fuel is spent and electron degeneracy pressure prevents gravitational collapse. 

The blog Galileo unbound has a nice post about the history and the essential physics behind the paper.

There are several things I find quite amazing about Chandrasekhar's derivation and the expression for the maximum possible mass.
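A standard form of the limit is the following. (This is my reconstruction for reference, since the figure is not reproduced here; the numerical prefactor comes from the solution of the Lane-Emden equation, and mu_e is the number of nucleons per electron, approximately 2 for a carbon-oxygen white dwarf.)

M_Ch = (omega_3^0 sqrt(3 pi)/2) x M_P^3 / (mu_e m_H)^2, with omega_3^0 ≈ 2.018.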

m_H is the mass of a proton. M_P is the Planck mass. The value of the mass limit is about 1.4 solar masses.

Relativity matters

If the electrons are treated non-relativistically then there is no mass limit. However, when the star becomes dense enough the Fermi velocity of the electrons approaches the speed of light. Then relativistic effects must be included.

A macroscopic quantum effect

Degeneracy pressure is a macroscopic quantum effect. The expression above involves Planck's constant.

Quantum gravity

The mass formula involves the Planck mass, M_P. On the one hand, this phenomenon does not involve quantum gravity because there is no quantisation of the gravitational field. On the other hand, the effect does involve the interplay of gravity and quantum physics.

A "natural" scale

Formulas that involve Planck scales usually represent scales of length, energy, mass, time, and temperature that are "unreal", i.e., they are vastly different from terrestrial and astrophysical phenomena. For example, the temperature of the Hawking radiation from a black hole of one solar mass is about 60 nanoKelvin.
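That number is easy to check from the standard formula T_H = hbar c^3 / (8 pi G M k_B). A quick sketch of the arithmetic (my own, not from the post):

import math
hbar, c, G, k_B, M_sun = 1.055e-34, 2.998e8, 6.674e-11, 1.381e-23, 1.989e30
T_H = hbar * c**3 / (8 * math.pi * G * M_sun * k_B)
print(f"{T_H:.1e} K")   # about 6e-8 K, i.e. roughly 60 nanokelvin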

In contrast, the limiting mass is on the same scale as that of our own sun!

It agrees with astronomical observations

Determinations of the masses of hundreds of white dwarfs show that most have a mass of about 0.5 solar masses, and the highest observed value is about 1.3 solar masses.


Thursday, September 7, 2023

Hollywood and a Physical Review paper

I am not sure I have seen this before. If you watched the movie Oppenheimer, you may have noticed that at one point a student excitedly showed Oppenheimer the latest issue of Physical Review, and the following image flashed across the movie screen.


J. R. Oppenheimer and H. Snyder

A beautiful blog post just appeared on 3 Quarks Daily,
 

The post describes the scientific and historical significance of the paper, including how it attracted no interest for twenty years, being eclipsed by a paper in the same issue of Physical Review.

Niels Bohr and John Archibald Wheeler

Have you ever seen a Hollywood movie that explicitly showed the page of a scientific journal article?

Saturday, September 2, 2023

Condensed Matter Physics: A Very Short Introduction (hard copies) now available on Amazon USA

My book has finally been released by Amazon in the USA. I don't like Amazon but it is cheap and you can avoid shipping charges.

In Australia, Amazon has listed it under "Engineering and Transportation" and it is currently out of stock.

Wednesday, August 16, 2023

Majorana: mysterious disappearance of a particle and of credibility

The theoretical physicist Ettore Majorana mysteriously disappeared in 1938. Unfortunately, Majorana particles are also going to be associated in history with some mysterious disappearances: their own existence, the prospect of a topological quantum computer, some prominent scientists' reputations, and the credibility of Physical Review journals.

Nine years ago I expressed skepticism that there would ever be a quantum computer based on Majorana fermions. I wish I was wrong. It certainly would be cool. Things are even worse than I thought. The issues (scientific, ethical, technological, hype, ...) were recently highlighted in a strange incident involving a paper from Microsoft that was published by PRB.

I found the commentary of Vincent Mourik on the whole incident enlightening and disturbing. His description is "Here's the full background of my involvement with the recent Microsoft Quantum paper. APS pulled off an arcane unofficial off-the-record peer review already one year ago when it was presented at PRX. And then published it anyway at PRB..."

I do have concerns about the unusual practice used by PRX and the precedent of publishing a paper with incomplete details. However, my more significant concern is that, based on Vincent's report, this paper should never have been published in any self-respecting scientific journal. I fear that in order to "compete" with the luxury journals Physical Review has descended to their low scientific and ethical standards.

I thank Doug Natelson and a commenter on his blog for bringing this sorry saga to my attention.

Monday, July 31, 2023

What is a complex system?

What do we mean when we say a particular system is "complex"? We may have some intuition that it means there are many degrees of freedom and/or that it is hard to understand. "Complexity" is sometimes used as a buzzword, just like "emergence." There are many research institutes that claim to be studying "complex systems" and there is something called "complexity theory". Complexity seems to mean different things to different people.

I am particularly interested in understanding the relationship between emergence and complexity. To do this we first need to be more precise about what we mean by both terms. A concrete question is the following. Consider a system that exhibits emergent properties. Often that will be associated with a hierarchy of scales. For example: atoms, molecules, proteins, DNA, genes, cells, organs, people. The corresponding hierarchy of research fields is physics, chemistry, biochemistry, genetics, cell biology, physiology, psychology. Within physics a hierarchy is quarks and leptons, nuclei and electrons, atoms, molecules, liquid, and fluid. 

In More is Different, Anderson states that as one goes up the hierarchy the system scale and complexity increases. This makes sense when complexity is defined in terms of the number of degrees of freedom in the system (e.g., the size of the Hilbert space needed to describe the complete state of the system). On the other hand, the system state and its dynamics become simpler as one goes up the hierarchy.  The state of the liquid can be described completely in terms of the density, temperature, and the equation of state. The dynamics of the fluid can be described by the Navier-Stokes equation. Although that is hard to solve in the regime of turbulence, the system is still arguably a lot simpler than quantum chromodynamics (QCD)! Thus, we need to be clearer about what we mean by complexity.

To address these issues I found the following article very helpful and stimulating.

What is a complex system? by James Ladyman, James Lambert, and Karoline Wiesner 

It was published in 2013 in a philosophy journal, has been cited more than 800 times, and is co-authored by two philosophers of science and a physicist.

[I just discovered that Ladyman and Wiesner published a book with the same title in 2020. It is an expansion of the 2013 article.].

In 1999 the journal Science had a special issue that focussed on complex systems, with an Introduction entitled, Beyond Reductionism. Eight survey articles covered complexity in physics, chemistry, biology, earth science, and economics.

Ladyman et al. begin by pointing out how each of the authors of these articles chooses different properties with which to characterise complexity. These characteristics include non-linearity, feedback, spontaneous order, robustness and lack of central control, emergence, hierarchical organisation, and numerosity.

The problem is that these characteristics are not equivalent. If we do choose a specific definition for a complex system, the difficult problem then remains of determining whether each of the characteristics above is necessary, sufficient, both, or neither for the system to be complex (as defined). This is similar to what happens with attempts to define emergence.

Information content is sometimes used to quantify complexity. Shannon entropy and Kolmogorov complexity (Sections 3.1, 3.2) are discussed. The latter is also known as algorithmic complexity. This is the length of the shortest computer program (algorithm) that can be written to produce the entity as output. A problem with both of these measures is that they are non-computable.

Deterministic complexity is different from statistical complexity (Section 3.3). A deterministic measure treats a completely random sequence of 0s and 1s as having maximal complexity. A statistical measure treats a completely random sequence as having minimal complexity. Both Shannon and algorithmic complexity are deterministic.
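A small illustration of the kind of distinction at stake (my own sketch, not from the article): the empirical Shannon entropy per bit is easy to estimate, and it is maximal for a random string and essentially zero for a periodic one, whereas Kolmogorov complexity cannot be computed at all.

import math
import random
from collections import Counter

def shannon_entropy_per_symbol(s, block=4):
    # Empirical Shannon entropy (bits per symbol), estimated from non-overlapping blocks.
    blocks = [s[i:i + block] for i in range(0, len(s) - block + 1, block)]
    counts = Counter(blocks)
    n = len(blocks)
    return -sum((c / n) * math.log2(c / n) for c in counts.values()) / block

random.seed(0)
random_string = "".join(random.choice("01") for _ in range(10000))
periodic_string = "01" * 5000

print(shannon_entropy_per_symbol(random_string))    # close to 1 bit per symbol
print(shannon_entropy_per_symbol(periodic_string))  # 0: only one block ever appears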

Section 4 makes some important and helpful distinctions about different measures of complexity.

3 targets of measures: methods used, data obtained, system itself

3 types of measures: difficulty of description, difficulty of creation, or degree of organisation

They then review three distinct measures that have been proposed: logical depth (Charles Bennett), thermodynamic depth (Seth Lloyd and Heinz Pagels), and effective complexity (Murray Gell-Mann).

Logical depth and effective complexity are complementary quantities. The Mandelbrot set is an example of a system (a set of data) that exhibits a complex structure with a high information content. It is difficult to describe. It has a large logical depth.

Created by Wolfgang Beyer with the program Ultra Fractal 3. 

On the other hand, the effective complexity of the set is quite small since it can be generated using the simple equation

z_{n+1} = c + z_n^2

where c is a complex number and the Mandelbrot set is the set of values of c for which the iterated map remains bounded.
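The contrast is easy to see in code: the generating rule fits in a few lines, even though the resulting structure is endlessly intricate. (A minimal sketch of my own; the escape radius of 2 is the standard criterion.)

import numpy as np

def in_mandelbrot(c, max_iter=100):
    # Return True if the iteration z -> z**2 + c stays bounded for max_iter steps.
    z = 0.0 + 0.0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:   # once |z| > 2 the orbit is guaranteed to escape
            return False
    return True

# Coarse ASCII rendering of the set
for y in np.linspace(1.2, -1.2, 30):
    print("".join("#" if in_mandelbrot(complex(x, y)) else " "
                  for x in np.linspace(-2.0, 0.8, 80)))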

Ladyman et al. prefer the two definitions of a complex system below, but do acknowledge their limitations.

(Physical account) A complex system is an ensemble of many elements which are interacting in a disordered way, resulting in robust organisation and memory. 

(Data-driven account) A system is complex if it can generate data series with high statistical complexity.

What is statistical complexity? It relates to degrees of pattern and to something they refer to as causal state reconstruction. It is applied to data sets, not systems or methods. Central to their definition is the idea of the epsilon-machine, something introduced in a long and very mathematical article from 2001, Computational Mechanics: Pattern and Prediction, Structure and Simplicity, by Shalizi and Crutchfield.

The article concludes with a philosophical question. Do patterns really exist? This relates to debates about scientific realism versus instrumentalism. The authors advocate something known as "rainforest realism", which has been advanced by Daniel Dennett, Don Ross, and David Wallace. A pattern is real if one can construct an epsilon-machine that can simulate the phenomena and predict its behaviour.

I don't have a full appreciation or understanding of where the article ends up. Nevertheless, the journey there is helpful as it clarifies some of the subtleties and complexities (!) of trying to be more precise about what we mean by a "complex system".

Saturday, July 22, 2023

A few things condensed matter physics has taught me about science (and life)

We all have a worldview, some way that we look at life and what we observe. There are certain assumptions we tend to operate from, often implicitly. Arguably, our worldview is shaped by our experiences: family, friendships, education, jobs, community organisations, and our cultural context (political, economic, and social).

A significant part of my life experience has been working in universities as a condensed matter physicist and being part of a broader scientific community. Writing Condensed Matter Physics: A Very Short Introduction crystallised some of my thoughts about what CMP might mean in broader contexts. I am more aware of how my experience in CMP has had a significant influence on the way I view not just the scientific enterprise, but also broader philosophical and social issues. Here are a few concrete examples.

Complex systems. The objects studied in condensed matter physics have many interacting components (atoms). Further, there is an incredible diversity of systems (materials and phenomena) that are studied. Many different properties and parameters are needed to characterise a system and its possible states. There are many different ways of investigating each system. Similarly, almost everything else of interest in science and life is a complex system.

Emergence. This is central to CMP. The whole is greater than the sum of the parts. The whole is qualitatively different from the parts. Related features include robustness, universality, surprises, and the difficulty of making predictions. An emergent perspective can provide insights into other complex systems: from biology to psychology to politics.

Differentiation and integration. A key aspect of describing and understanding a complex system is conceptually breaking it into smaller parts (differentiation), determining how those parts interact with one another, and determining how those interacting parts combine to produce properties of the whole system (integration).

Diversity: The value of multiple perspectives and methods. Due to the complexity of condensed matter systems, multiple methods are needed to characterise their different properties. Due to emergence, there are various scales and hierarchies present. Investigating and describing the system at these different scales provides different perspectives on the system. What does the scientist do with all these different perspectives? Interpretation and synthesis are needed. That is not an easy or clearcut enterprise.

 Navigating the middle ground. The most interesting CMP occurs in an intermediate interaction regime that is challenging theoretically. Insight can be gained by considering two extremes that are more amenable to analysis: weak interaction and strong interaction. I had fun using conservative-liberal political tensions as a metaphor for divisions in the strongly correlated electron community.

The Art of Interpretation. Everything requires interpretation: a phone text message, a newspaper article, a novel, a political event, data from a science experiment, and any scientific theory. With interpretation, we assign meaning and significance to something. How we do this is complex and draws on our worldview, both explicitly and implicitly. Regardless of our best intentions, interpretation always has subjective elements.

Synthesis. Given the diversity of data, perspectives, and interpretation, it is a challenge to synthesise them into some coherent and meaningful whole. All the pieces are rarely consistent with one another. Some will be ignored, some discarded, some considered peripheral, and others central. This synthesis is also an act of interpretation.

All models are wrong but some are useful. One way to understand complex systems is in terms of "simple" models that aim to capture the essential features of certain phenomena. In CMP significant progress (and many Nobel Prizes) has resulted from the proposal and study of such models. There is a zoo of them. Many are named after their main inventor or proponent: Ising, Anderson, Hubbard, Heisenberg, Landau, BCS,... All theories in CMP are also models since they involve some level of approximation, at least in their implementation. These models are all wrong, in the sense that they fail to describe all features and phenomena of the system. But, the best models are useful. Their simplicity makes them amenable to understanding, mathematical analysis, or computer simulation. Furthermore, the models can give insight into the essential physics underlying phenomena, predict trends, or be used to analyse experimental data. 

The autonomy of academic disciplines. Reality is stratified. At each level of the hierarchy, one has unique phenomena, methods, concepts, and theories. Most of these are independent of the details of what happens at lower levels of the hierarchy. Given the richness at each level, I do not preference one discipline as more fundamental or important than the others.

Pragmatic limits to knowledge. We know so much.  We know so little. On the one hand, it is amazing to me how successful CMP has been. We have achieved an excellent understanding, at least qualitatively of many emergent phenomena in systems that are chemically and structurally complex (e.g., liquid crystals and superconductivity in crystals involving many chemical elements). On the other hand, there are systems such as glasses and cuprate superconductors that have been incredibly resistant to understanding. Good research is very hard, even for the brilliant. Gains are often incremental and small. This experience leads me to have sober expectations about what is possible, particularly as one moves from CMP to more complex systems such as human societies, national economies, and brains.

Science is a human endeavour. Humans can be clever, creative, insightful, rational, objective, cooperative, fiercely independent and capable of great things. The achievements of science are a great testimony to the human spirit. Humans can also be stubborn, egotistical, greedy, petty, irrational, ruthlessly competitive, and prone to fads, mistakes and social pressures. Science always happens in a context: social, political, cultural, and economic. Context does not determine scientific outcomes but due to human nature, it can corrupt how science is done.

The humanity of scientists leads to a lack of objectivity captured in Walter Kauzmann's maxim: people will tend to believe what they want to believe rather than what the evidence before them suggests that they should believe. My decades of experience working as a scientist leads me to scepticism about extravagant claims that some scientists make, particularly hype about the potential significance (scientific, technological, or philosophical) of their latest discovery or their field of research. Too often such claims do not stand the test of time.

Humility. This brings together practically everything above. The world is complex, people are complex, and human-world interactions are complex. It is easy to be wrong. We often have a pretty limited perspective of what is going on. 

Tuesday, July 4, 2023

Are gravity and spacetime really emergent in AdS-CFT?

There is an interesting Scientific American article by Adam Becker

What Is Spacetime Really Made Of?

Spacetime may emerge from a more fundamental reality. Figuring out how could unlock the most urgent goal in physics—a quantum theory of gravity

It considers two different approaches to quantum gravity (loop quantum gravity and the AdS-CFT correspondence beloved by string theorists). Compared to some Scientific American articles, it is moderately balanced and low on hype. The article has a nice engagement with some philosophers of physics. It is clear to me how loop quantum gravity has a natural interpretation in which gravity and spacetime are emergent. However, that is not so clear for AdS-CFT.

The following paragraph is pertinent.

But there are other ways to interpret the latest findings. The AdS/CFT correspondence is often seen as an example of how spacetime might emerge from a quantum system, but that might not actually be what it shows, according to Alyssa Ney, a philosopher of physics at the University of California, Davis. 
“AdS/CFT gives you this ability to provide a translation manual between facts about the spacetime and facts of the quantum theory,” Ney says. “That’s compatible with the claim that spacetime is emergent, and some quantum theory is fundamental.” 
But the reverse is also true, she says. The correspondence could mean that quantum theory is emergent and spacetime is fundamental—or that neither is fundamental and that there is some even deeper fundamental theory out there. Emergence is a strong claim to make, Ney says, and she is open to the possibility that it is true. “But at least just looking at AdS/CFT, I’m still not seeing a clear argument for emergence.”

Monday, June 26, 2023

What is really fundamental in science?

What do we mean when we say something in science is fundamental? When is an entity or a theory more fundamental or less fundamental than something else? For example, are quarks and leptons more fundamental than atoms? Is statistical mechanics more fundamental than thermodynamics? Is physics more fundamental than chemistry or biology? In a fractional quantum Hall state, are electrons or the fractionally charged quasiparticles more fundamental?

Answers depend on who you ask. Physicists such as Phil Anderson, Steven Weinberg, Bob Laughlin, Richard Feynman, Frank Wilczek, and Albert Einstein have different views.

In 2017-8, the Foundational Questions Institute (FQXi) held an essay contest to address the question, “What is Fundamental?” Of the 200 entries, 15 prize-winning essays have been published in a single volume. The editors give a nice overview in the Introduction.

This post is mostly about Fundamental?, the first-prize-winning essay by Emily Adlam, a philosopher of physics. She contrasts two provocative statements.

Fundamental means we have won. The job is done and we can all go home.

Fundamental means we have lost. Fundamental is an admission of defeat.

This raises the question of whether being fundamental is objective or subjective.

Examples are given from scientific history to argue that what is considered to be fundamental has changed with time. Reductionism has driven attempts to explain everything in terms of smaller and smaller entities, which are deemed 'more fundamental'. But we find that smaller does not always mean simpler.

Perhaps we should ask what needs explaining and what constitutes a scientific explanation. For example, Adlam asks whether explaining the fact that the initial state of the universe had a low entropy [the "past hypothesis"] is really possible or should be an important goal.

She draws on the distinction between objective and subjective probabilities. Probabilities in statistical mechanics are subjective: they are a statement about our ignorance of the details of the motion of individual atoms, not about any underlying randomness in nature. In contrast, probabilities in quantum theory reflect objective chance.

as realists about science we must surely maintain that there is a need for science to explain the existence of the sorts of regularities that allow us to make reliable predictions... but there is no similarly pressing need to explain why these regularities take some particular form rather than another. Yet our paradigmatic mechanical explanations do not seem to be capable of explaining the regularity without also explaining the form, and so increasingly in modern physics we find ourselves unable to explain either. 

It is in this context that we naturally turn to objective chance. The claim that quantum particles just have some sort of fundamental inbuilt tendency to turn out to be spin up on some proportion of measurements and spin down on some proportion of measurements does indeed look like an attempt to explain a regularity (the fact that measurements on quantum particles exhibit predictable statistics) without explaining the specific form (the particular sequence of results obtained in any given set of experiments). But given the problematic status of objective chance, this sort of nonexplanation is not really much better than simply refraining from explanation at all. 

Why is it that objective chances seem to be the only thing we have in our arsenal when it comes to explaining regularities without explaining their specific form? It seems likely that part of the problem is the reductionism that still dominates the thinking of most of those who consider themselves realists about science

In summary (according to the Editors), Adlam argues that "science should be able to explain the existence of the sorts of regularities that allow us to make reliable predictions. But this does not necessarily mean that it must also explain why these regularities take some particular form."

we are in dire need of another paradigm shift. And this time, instead of simply changing our attitudes about what sorts of things require explanation, we may have to change our attitudes about what counts as an explanation in the first place. 

Here, she is arguing that what is fundamental is subjective, being a matter of values and taste.

In our standard scientific thinking the fundamental is elided with ultimate truth: getting to grips with the fundamental is the promised land, the endgame of science. 

She then raises questions about the vision and hopes of scientific reductionists. 

In this spirit, the original hope of the reductionists was that things would get simpler as we got further down, and eventually we would be left with an ontology so simple that it would seem reasonable to regard this ontology as truly fundamental and to demand no further explanation. 

But the reductionist vision seems increasingly to have failed. 

When we theorise beyond the standard model [BSM] we usually find it necessary to expand the ontology still more: witness the extra dimensions required to make string theory mathematically consistent.

It is not just strings. Peter Woit has emphasised how BSM theories, such as supersymmetry, introduce many more particles and parameters.

... the messiness deep down is a sign that the universe works not ‘bottom-up’ but rather ‘top-down,’ ... in many cases, things get simpler as we go further up.

Our best current theories are renormalisable, meaning that many different possible variants on the underlying microscopic physics all give rise to the same macroscopic physical theory, known as an infrared fixed point. This is usually glossed as providing an explanation of why it is that we can do sensible macroscopic physics even without having detailed knowledge of the underlying microscopic theories. 

For example, elasticity theory, thermodynamics, and fluid dynamics all work without any knowledge of atoms, statistical mechanics, or quantum theory. (A toy illustration of a renormalisation group flow to a fixed point is sketched below.)

But one might argue that this is getting things the wrong way round: the laws of nature don’t start with little pieces and build the universe from the bottom up, rather they apply simple macroscopic constraints to the universe as a whole and work out what needs to happen on a more fine-grained level in order to satisfy these constraints.

This is rather reminiscent of Laughlin's views about what is fundamental.
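To make the language of flow to a fixed point concrete, here is a toy sketch (my own illustration, not from Adlam's essay) of the exact decimation renormalisation group for the one-dimensional Ising chain. Very different microscopic couplings flow towards the same fixed point, which is the sense in which the macroscopic description forgets the microscopic details.

import numpy as np

# A toy sketch of the exact decimation renormalisation group for the 1D Ising
# chain. Tracing out every second spin gives the recursion tanh(K') = tanh(K)^2
# for the dimensionless nearest-neighbour coupling K = J/(k_B T).

def rg_step(K):
    """One decimation step for the 1D Ising coupling."""
    return np.arctanh(np.tanh(K) ** 2)

for K0 in [0.5, 1.0, 2.0, 5.0]:              # very different microscopic couplings
    flow = [K0]
    for _ in range(8):
        flow.append(rg_step(flow[-1]))
    print(" -> ".join(f"{K:.4f}" for K in flow))

# Every starting coupling flows towards the same fixed point K* = 0: at long
# wavelengths the chain looks disordered, whatever its microscopic coupling.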

Finally, I mention two other essays that I look forward to reading as I think they make particularly pertinent points.

Marc Séguin (Chap. 6) distinguishes "between epistemological fundamentality (the fundamentality of our scientific theories) and ontological fundamentality (the fundamentality of the world itself, irrespective of our description of it)."

"In Chap. 12, Gregory Derry argues that a fundamental explanatory structure should have four key attributes: irreducibility, generality, commensurability, and fertility."

[Quotes are from the Introduction by the Editors].

Some would argue that the Standard Model is fundamental, at least at some level. But it involves 19 parameters that have to be fixed from experiment. Related questions about the Fundamental Constants have been explored in a 2007 paper by Frank Wilczek.

Again, I thank Peter Evans for bringing this volume to my attention.

Saturday, June 17, 2023

Why do deep learning algorithms work so well?

I am interested in analogies between cognitive science and artificial intelligence. Emergent phenomena occur in both, there has been some fruitful cross-fertilisation of ideas, and the extent of the analogies is relevant to debates on fundamental questions concerning human consciousness.

Given my general ignorance and confusion on some of the basics of neural networks, AI, and deep learning, I am looking for useful and understandable resources.

Related questions are explored in a nice informative article from 2017 in Quanta magazine, New Theory Cracks Open the Black Box of Deep Learning by Natalie Wolchover.

Like a brain, a deep neural network has layers of neurons — artificial ones that are figments of computer memory. When a neuron fires, it sends signals to connected neurons in the layer above. During deep learning, connections in the network are strengthened or weakened as needed to make the system better at sending signals from input data — the pixels of a photo of a dog, for instance — up through the layers to neurons associated with the right high-level concepts, such as “dog.” 

After a deep neural network has “learned” from thousands of sample dog photos, it can identify dogs in new photos as accurately as people can. The magic leap from special cases to general concepts during learning gives deep neural networks their power, just as it underlies human reasoning, creativity and the other faculties collectively termed “intelligence.” 

Experts wonder what it is about deep learning that enables generalization — and to what extent brains apprehend reality in the same way.
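For readers who, like me, want to see the bare bones, here is a minimal sketch of such a network: two layers of artificial neurons trained with plain gradient descent on the XOR problem. All the choices (layer sizes, activation function, learning rate) are arbitrary and purely illustrative; real deep networks are vastly larger, but rest on the same ingredients of layers, weights, and gradient-based updates.

import numpy as np

# A minimal sketch of a two-layer feedforward network trained by gradient
# descent on XOR. Sizes, learning rate, and epochs are illustrative only.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 8));  b1 = np.zeros(8)    # input -> hidden layer
W2 = rng.normal(0, 1, (8, 1));  b2 = np.zeros(1)    # hidden -> output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for epoch in range(5000):
    # forward pass: signals propagate up through the layers
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: connections are strengthened or weakened to reduce error
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(np.round(out, 2))    # typically close to [0, 1, 1, 0]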

The article describes work by Naftali Tishby and collaborators that provides some insight into why deep learning methods work so well. This was first described in purely theoretical terms in a 2000 preprint

The information bottleneck method, Naftali Tishby, Fernando C. Pereira, William Bialek 

The idea is that a network rids noisy input data of extraneous details as if by squeezing the information through a bottleneck, retaining only the features most relevant to general concepts.
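As I understand the original paper, this trade-off is formalised by choosing a compressed representation T of the input X that retains as much information as possible about the relevant variable Y. Schematically, one minimises

\min_{p(t|x)} \; I(X;T) \;-\; \beta\, I(T;Y)

where I denotes mutual information and the parameter \beta sets the balance between compression (small I(X;T)) and prediction (large I(T;Y)).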

Tishby was stimulated in new directions in 2014, after reading a surprising paper by the physicists David Schwab and Pankaj Mehta,

An exact mapping between the Variational Renormalization Group and Deep Learning

[They] discovered that a deep-learning algorithm invented by Geoffrey Hinton called the “deep belief net” works, in a particular case, exactly like renormalization [group methods in statistical physics]... When they applied the deep belief net to a model of a magnet at its “critical point,” where the system is fractal, or self-similar at every scale, they found that the network automatically used the renormalization-like procedure to discover the model’s state.

Although this connection was a valuable new insight, the specific case of a scale-free system is not relevant to many deep-learning situations.
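Their construction is technical, but the renormalisation side of the analogy is easy to picture. Below is a toy sketch (mine, not the Mehta-Schwab mapping itself) of majority-rule block-spin coarse-graining of a two-dimensional spin configuration; in their mapping, the hidden units of the network play a role analogous to the coarse-grained block spins.

import numpy as np

# A toy sketch of majority-rule block-spin coarse-graining of a 2D spin
# configuration. This illustrates only the renormalisation side of the
# analogy, not the Mehta-Schwab construction itself.
rng = np.random.default_rng(1)

def coarse_grain(spins):
    """Replace each 2x2 block of +/-1 spins by the sign of its sum
    (ties broken at random)."""
    L = spins.shape[0]
    blocks = spins.reshape(L // 2, 2, L // 2, 2).sum(axis=(1, 3))
    ties = (blocks == 0)
    blocks[ties] = rng.choice([-1, 1], size=ties.sum())
    return np.sign(blocks).astype(int)

# A random configuration for illustration; in practice one would use
# configurations sampled near the critical point (e.g. with Metropolis).
spins = rng.choice([-1, 1], size=(64, 64))
for level in range(1, 4):
    spins = coarse_grain(spins)
    print(f"level {level}: lattice {spins.shape}, mean spin {spins.mean():+.3f}")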

Tishby and Ravid Shwartz-Ziv discovered that 

Over the course of training, common patterns in the training data become reflected in the strengths of the connections, and the network becomes expert at correctly labeling the data, such as by recognizing a dog, a word, or a 1.

...layer by layer, the networks converged to the information bottleneck theoretical bound: a theoretical limit derived in Tishby, Pereira and Bialek’s original paper that represents the absolute best the system can do at extracting relevant information. At the bound, the network has compressed the input as much as possible without sacrificing the ability to accurately predict its label...

...deep learning proceeds in two phases: a short “fitting” phase, during which the network learns to label its training data, and a much longer “compression” phase, during which it becomes good at generalization, as measured by its performance at labeling new test data.
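The evidence for this two-phase picture comes from tracking estimates of the mutual information between each layer's activations and the input and the label during training. A crude sketch of the kind of estimate involved (my simplification, using simple binning of the activations, a choice that is known to strongly affect such estimates) is below.

import numpy as np

# A crude sketch: estimate the mutual information I(T;Y) between a layer's
# binned activations T and discrete labels Y from samples. The binning is a
# strong simplification and is purely illustrative.

def mutual_information(activations, labels, n_bins=10):
    """activations: (n_samples, n_units) floats; labels: (n_samples,) ints."""
    # discretise each unit's activation, then treat each binned activation
    # vector as a single discrete symbol t
    edges = np.linspace(activations.min(), activations.max(), n_bins + 1)
    binned = np.digitize(activations, edges[1:-1])
    _, t = np.unique(binned, axis=0, return_inverse=True)

    joint = np.zeros((t.max() + 1, labels.max() + 1))
    np.add.at(joint, (t, labels), 1.0)
    joint /= len(labels)
    p_t = joint.sum(axis=1, keepdims=True)
    p_y = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float((joint[mask] * np.log2(joint[mask] / (p_t @ p_y)[mask])).sum())

# toy usage with made-up activations that carry information about the label
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=500)
activations = labels[:, None] + 0.3 * rng.normal(size=(500, 4))
print(f"I(T;Y) is roughly {mutual_information(activations, labels):.2f} bits")

Tracking such estimates for I(X;T) and I(T;Y), layer by layer over training, is what produces the trajectories that Shwartz-Ziv and Tishby interpreted as the fitting and compression phases.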

What these new discoveries teach us about the relationship between learning in humans and in machines is contentious, and is explored briefly in the article. Although neural nets were originally inspired by the structure of the human brain, the connection with the networks used today is tenuous.

The mystery of how brains sift signals from our senses and elevate them to the level of our conscious awareness drove much of the early interest in deep neural networks among AI pioneers, who hoped to reverse-engineer the brain’s learning rules. AI practitioners have since largely abandoned that path in the mad dash for technological progress, instead slapping on bells and whistles that boost performance with little regard for biological plausibility.