Tuesday, February 24, 2026

Information theoretic measures for emergence and causality

The relationship between emergence and causation is contentious, with a long history. Most discussions are qualitative. Presented with a new system, how does one identify the microscopic and macroscopic scales that may be most useful for understanding and describing the system? Can Judea Pearl’s seminal ideas about causality be implemented practically for understanding emergence?

Broadly speaking, a weakness of discussions of emergence and causality is that it is hard to define these concepts in a rigorous and quantitative manner that makes them amenable to empirical testing, with respect to theoretical models and to experimental data. 

Fortunately, in the past decade, there have been some specific proposals to address this issue, mostly using information theory. A helpful recent review is by Yuan et al. 

“Two primary challenges take precedence in understanding emergence from a causal perspective. The first is establishing a quantitative definition of emergence, whereas the second involves identifying emergent behaviors or phenomena through data analysis.

To address the first challenge, two prominent quantitative theories of emergence have emerged in the past decade. The first is Erik Hoel et al.’s theory of causal emergence [19] whereas the second is Fernando E. Rosas et al.’s theory of emergence based on partial information decomposition [24].

Hoel et al.’s theory of causal emergence specifically addresses complex systems that are modeled using Markov chains. It employs the concept of effective information (EI) to quantify the extent of causal influence within Markov chains and enables comparisons of EI values across different scales [19,25]. Causal emergence is defined by the difference in the EI values between the macro-level and micro-level.”

One perspective on causal emergence is that it occurs when the dynamics of a system are described more efficiently by macro-level variables than by micro-level variables.
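Hoel's effective information (EI) can be computed directly for small Markov chains. Below is a minimal sketch (my own illustrative toy example, not one taken from the papers): a four-state micro chain with a noisy group of states and an absorbing state, coarse-grained into a deterministic two-state macro chain.

```python
import numpy as np

def effective_information(tpm):
    """EI of a Markov transition matrix: the mutual information between a
    uniform (maximum-entropy) intervention on X_t and the resulting X_{t+1}."""
    tpm = np.asarray(tpm, dtype=float)
    effect = tpm.mean(axis=0)  # distribution of X_{t+1} under the uniform intervention
    safe_p = np.where(tpm > 0, tpm, 1.0)          # avoid log(0); zero terms contribute 0
    safe_q = np.where(effect > 0, effect, 1.0)
    # EI = average KL divergence (in bits) of each row from the effect distribution
    kl_rows = (tpm * np.log2(safe_p / safe_q)).sum(axis=1)
    return kl_rows.mean()

# Micro scale: states 0-2 hop uniformly among themselves; state 3 is absorbing.
micro = np.array([[1/3, 1/3, 1/3, 0],
                  [1/3, 1/3, 1/3, 0],
                  [1/3, 1/3, 1/3, 0],
                  [0,   0,   0,   1]])

# Macro scale: coarse-grain {0,1,2} -> A and {3} -> B; the induced dynamics
# (A stays A, B stays B) are deterministic.
macro = np.array([[1, 0],
                  [0, 1]])

ei_micro = effective_information(micro)   # about 0.81 bits
ei_macro = effective_information(macro)   # 1 bit
causal_emergence = ei_macro - ei_micro    # positive: the macro description is causally stronger
```

Here the macro description is deterministic while the micro description is partly noisy, so the difference EI(macro) − EI(micro) is positive, which is Hoel's signature of causal emergence.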

Klein et al. used Hoel’s information-theoretic measures of causal emergence to analyse protein interaction networks (interactomes) in over 1800 species, containing more than eight million protein–protein interactions, across different scales. They showed the emergence of ‘macroscales’ that are associated with lower noise and uncertainty. The nodes in the macroscale description of the network are more resilient than those in less coarse-grained descriptions. Greater causal emergence (i.e., a stronger macroscale description) was generally seen in multicellular organisms compared to single-cell organisms. The authors quantified causal emergence in terms of mutual information (between large and small scales) and effective information (a measure of the certainty in the connectivity of a network). Philip Ball (2023) (pages 218-220) gives an account of this work in terms of the emergence of multicellularity in biological evolution. He introduced the term causal spreading (pages 225-7), arguing that over the history of evolution the locus of causation has changed.

Yuan et al. continue

"However, in Hoel’s theory of causal emergence, it is essential to establish a coarse-graining strategy beforehand. Alternatively, the strategy can be derived by maximizing the effective information (EI) [19]. However, this task becomes challenging for large-scale systems due to the computational complexity involved. To address these problems, Rosas et al. introduced a new quantitative definition of causal emergence [24] that does not depend on coarse-graining methods, drawing from partial information decomposition (PID)-related theory. PID is an approach developed by Williams et al., which seeks to decompose the mutual information between a target and source variables into non-overlapping information atoms: unique, redundant, and synergistic information [29]…"

The Figure below is taken from Rosas et al. X_t^j (j = 1, …, n) are microscopic variables that define a Markov chain. V_t is a macroscopic variable that is completely determined by the microscopic variables.

“Diagram of causally emergent relationships. Causally emergent features have predictive power beyond individual components. Downward causation takes place when that predictive power refers to individual elements; causal decoupling when it refers to itself or other high-order features.”

Rosas et al. applied the method to specific systems, including Conway’s Game of Life, Reynolds’ flocking model, and neural activity as measured by electrocorticography. More recently, it has been used to describe emergence in computer science, including the identification of modular structures. Calculations were performed for specific examples, including Ehrenfest’s urn model for diffusion, the Ising model with Glauber dynamics, and a Hopfield neural network model for associative memory.

Yuan et al. also state the following:

"The second challenge pertains to the identification of emergence from data. In an effort to address this issue, Rosas et al. derived a numerical method [24]. However, it is important to acknowledge that this method offers only a sufficient condition for emergence and is an approximate approach. Another limitation is that a coarse-grained macro-state variable should be given beforehand to apply this method."

Sas et al. recently stated

“Empirical applications of this framework to study emergence … including the study of gene regulatory networks [22], the dynamics of the human brain [23], the internal dynamics of reservoir computing [24], and the formation of useful internal representations in machine learning [25].”

Yuan et al. also discuss two significant connections between causal emergence and machine learning. First, machine learning can be used to improve calculations of causal emergence. Second, causal emergence measures can be used to better understand how machine learning works and improve it.

The work described above built on earlier work by Crutchfield, who claimed that the identification of emergence and hierarchies could be made operational, stating that “different scales are delineated by a succession of divergences in statistical complexity at lower levels.” More recently, Rupe and Crutchfield have reported progress towards identifying emergent self-organisation in a system.

Although this work on quantitative measures of emergence based on information theory represents significant progress, there are many open problems. Examples include the extension to non-Markovian systems and the development of computationally feasible methods for large systems. The latter is particularly important in physical systems where spontaneous symmetry breaking occurs, as this only happens in the thermodynamic limit of an infinite system.

There is an unrecognised similarity between the work described above and techniques recently developed to characterise phase transitions in statistical mechanics models such as the Ising model and classical dimer models. Coarse-graining (CG) is optimised by maximising the Real-Space Mutual Information (RSMI) between a spatial block and its distant environment. 

In general, maximising mutual information is notoriously hard but can be done using state-of-the-art machine learning algorithms. Gokmen et al. have developed an algorithm that they claim “can, unsupervised, construct order parameters, locate phase transitions, and identify spatial correlations and symmetries for complex and large-dimensional real-space data.” Furthermore, the optimal CG explicitly identifies the scaling operators associated with the critical point. 
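As a toy illustration of the RSMI objective (emphatically not Gokmen et al.'s machine-learning algorithm), one can enumerate a tiny one-dimensional Ising chain exactly and compare two candidate one-bit coarse-grainings of a block by how much mutual information each retains with a distant environment spin:

```python
import numpy as np
from itertools import product

K = 0.6  # nearest-neighbour Ising coupling in units of k_B T (illustrative value)

# Exact enumeration of a 4-spin open Ising chain:  s1 s2 | s3 | s4
# block = (s1, s2), buffer = s3, distant environment = s4.
configs, probs = [], []
for s in product([-1, 1], repeat=4):
    configs.append(s)
    probs.append(np.exp(K * (s[0]*s[1] + s[1]*s[2] + s[2]*s[3])))
Z = sum(probs)
probs = [w / Z for w in probs]

def mutual_info(pairs):
    """Mutual information (bits) from a dict {(a, b): prob}."""
    pa, pb = {}, {}
    for (a, b), v in pairs.items():
        pa[a] = pa.get(a, 0.0) + v
        pb[b] = pb.get(b, 0.0) + v
    return sum(v * np.log2(v / (pa[a] * pb[b]))
               for (a, b), v in pairs.items() if v > 0)

def rsmi(cg):
    """MI between a coarse-grained block variable cg(s1, s2) and the environment s4."""
    joint = {}
    for s, p in zip(configs, probs):
        key = (cg(s[0], s[1]), s[3])
        joint[key] = joint.get(key, 0.0) + p
    return mutual_info(joint)

keep_far = rsmi(lambda s1, s2: s1)   # coarse-graining that keeps the far spin
keep_near = rsmi(lambda s1, s2: s2)  # coarse-graining that keeps the near spin
# Maximising RSMI selects the rule that retains long-distance information:
# keep_near > keep_far > 0.
```

In the real method, the coarse-graining map is a parametrised neural network and the mutual information is maximised by gradient-based training; this enumeration just makes the objective concrete.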

The classical dimer model provides a stringent test as “the relevant low-energy degrees of freedom are profoundly different from the microscopic building blocks of the theory and change qualitatively throughout the phase diagram.” In other words, the emergent entities (quasiparticles such as vortices associated with the height field, which is described by a sine-Gordon field theory) are different from the dimers.

It is encouraging to see that two different scientific communities have developed similar ideas to address this challenging problem of making discussions about emergence and causality more concrete and quantitative.

Friday, February 13, 2026

A golden age for precision observational cosmology

Yin-Zhe Ma gave a nice physics colloquium at UQ last week, A Golden Age for Cosmology

I learnt a lot. Too often, colloquia are too specialised and technical for a general audience.

There are three pillars of experimental evidence for the Big Bang model: Hubble expansion of the universe, relative abundance of light nuclei due to nucleosynthesis in the first few minutes, and the Cosmic Microwave Background.

Ma showed Hubble's original data from 1929 for redshift versus distance of galaxies. There was a lot of noise in the data. Nevertheless, Hubble was right.

Big Bang Nucleosynthesis

This was first proposed in 1948 by Ralph Alpher and George Gamow. (Hans Bethe was an honorary author of the paper as a joke so that the author list would sound like the first three letters of the Greek alphabet. Gamow had a mischievous sense of humour.)

The chain of nuclear reactions that produced the lightest elements and isotopes is shown below.

Because the binding energy of 4He is so large, it could have only been formed at an extremely high temperature of about 10^10 K. (Or is the issue activation energy for formation, not binding energy?)

Detailed calculations using parameters from terrestrial nuclear physics give the observed relative abundances of the light elements. In particular, the universe is 74% hydrogen and 24% helium.
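The standard back-of-the-envelope estimate behind these abundances (a textbook argument, not from the colloquium itself): the neutron-to-proton ratio at the time of nucleosynthesis is about 1/7, and essentially every neutron ends up bound in 4He, each helium nucleus using two neutrons and two protons.

```python
# Helium mass fraction: each 4He takes 2 neutrons and 2 protons, so
# Y_p = mass in helium / total mass = 2(n/p) / (1 + n/p).
n_over_p = 1 / 7                            # neutron-to-proton ratio at nucleosynthesis
Y_helium = 2 * n_over_p / (1 + n_over_p)    # = 0.25
Y_hydrogen = 1 - Y_helium                   # = 0.75, the leftover protons
```

Getting from this crude 25% to the measured abundance requires the detailed reaction network, but the estimate shows why the answer is so robust.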

The astrophysicist's periodic table showing the origin of the different chemical elements is rather cute.


Giving credit to George Gamow

Gamow, who died in 1968, made impressive contributions to theoretical physics. His Wikipedia page is worth reading. He claimed that he predicted the Cosmic Microwave Background in the late 1940s and did not receive sufficient credit when it was discovered in 1964. The 2019 Nobel Prize citation for James Peebles also minimises Gamow's early contributions. Whether this is fair or not can be debated.

Anisotropies in the Cosmic Microwave Background

The past two decades have seen amazing advances in precision measurements of these anisotropies. The radiation is isotropic to one part in 25,000, with a temperature of 2.72548±0.00057 K.

Measurements of the anisotropies have allowed precise determinations of key cosmological parameters by fitting theoretical predictions to the data shown below from the 2018 Planck collaboration. Different peaks have different physical origins. 

The level of precision in the data is truly amazing.


The solid line is a fit to theory involving six parameters. What would Enrico Fermi say? This is not "making the tail of an elephant wiggle" because the fit parameters are all consistent with independent determinations of the cosmological parameters from Hubble expansion and the relative abundance of the light elements.

Aside. The paper from the Planck 2018 collaboration has been cited 19,000 times but has almost 200 authors. How does one use that information in evaluating individual authors in job and promotion applications? How are they to be compared to a single-author paper with 100 citations or a five-author paper with 500 citations?

Is this a golden age for cosmology? 

Yes, in terms of precision measurements. 

On the theoretical side, the golden age may have passed. It is not clear that new concepts or theories will emerge. The outstanding questions are:

What is the nature and origin of dark matter? of dark energy? 

Why is the cosmological constant so small? Why is it so fine-tuned?

Can the validity of inflation be pinned down?

Does quantum gravity matter?

A lot of smart people have spent decades on these problems and made little progress. That fact does not preclude the possibility of a theoretical breakthrough. However, it does not make me optimistic. I hope I am wrong.

Thursday, February 5, 2026

The legacy of 40 years of cuprate superconductivity

In February 1986, Bednorz and Müller made a stunning discovery: superconductivity at a temperature of 35 K in a doped copper oxide (cuprate). Arguably, this discovery changed condensed matter physics. In April 1986, they submitted their results to Z. Phys. B. Only nineteen months later, they were awarded the Nobel Prize in Physics, the shortest time ever between a discovery and the award. A nice and short review of the history is here.

One measure of my estimate of this discovery's influence is that I gave it about five pages of coverage in my Condensed Matter Physics: A Very Short Introduction. (See Chapter 5, Adventures in Flatland).

How things have developed over the past forty years, for better and worse, may be representative of how science advances: discovery by serendipity, hype about applications, unexpected secondary benefits, foundational questions, new concepts, unification, and incremental advances.

Hype about technological applications

On March 20, 1987, The New York Times had a front-page article, DISCOVERIES BRING A 'WOODSTOCK' FOR PHYSICS, by James Gleick. This followed the 1987 APS March meeting. It began 

"Physicists from three continents converged on the New York Hilton for a hastily scheduled special conference on a string of discoveries that seem certain to produce a rapid cascade of commercial applications in electricity, magnetism and electronics.There are many things we know and understand that we did not when they were first discovered."

This has largely been unfulfilled. There are a few niche applications, but cuprates are not used in electricity distribution or even in the superconducting magnets in hospital MRI machines, which are probably the main commercial application of superconductors. One of the significant obstacles is that it is hard to make wires from these materials, as they are ceramics. This is an example of the common gap between research laboratory science and commercially viable technology.

After 40 years, do we have a successful theory?

It depends on who you ask. But I would say there is a lot we do understand.

We have a phenomenological theory for all the macroscopic phenomena associated with the superconducting state: Ginzburg-Landau theory!

Properties of the superconducting state are well-described by a BCS wavefunction with a d-wave order parameter and the associated Bogoliubov quasiparticles. [This is somewhat puzzling, as in the metallic state quasi-particles are not well defined].
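For concreteness (my own illustrative snippet, with an arbitrary gap scale delta0): the d_{x^2-y^2} order parameter on a square lattice vanishes along the zone diagonals, which is why gapless Bogoliubov quasiparticles survive at the nodes even deep in the superconducting state.

```python
import numpy as np

def d_wave_gap(kx, ky, delta0=1.0):
    """d_{x^2-y^2} gap function on a square lattice (lattice constant a = 1):
    Delta(k) = (delta0 / 2) * (cos kx - cos ky)."""
    return 0.5 * delta0 * (np.cos(kx) - np.cos(ky))

antinode = abs(d_wave_gap(np.pi, 0.0))        # maximal gap, equal to delta0
node = abs(d_wave_gap(np.pi / 2, np.pi / 2))  # gap vanishes along |kx| = |ky|
```

The Bogoliubov quasiparticle energy E(k) = sqrt(xi(k)^2 + Delta(k)^2) is then gapless only where the Fermi surface crosses these nodal lines, consistent with the observed power-law (rather than activated) low-temperature properties.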

Although not everyone agrees, I think it is fair to say that the essential physics is in a one-band Hubbard model, and the key physics is:

strong electronic correlations,

a doped antiferromagnetic Mott insulator,

d-wave pairing that is "mediated" by some mixture or variant of antiferromagnetic spin fluctuations and RVB spin singlets, ...
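The smallest instance that captures this physics is the two-site Hubbard model, which can be diagonalised by hand or in a few lines (the values of t and U below are illustrative):

```python
import numpy as np

t, U = 1.0, 8.0  # hopping amplitude and on-site repulsion (illustrative values)

# Two-site, two-electron Hubbard model in the S_z = 0 sector.
# Basis: |up, dn>, |dn, up>, |updn, 0>, |0, updn>
H = np.array([[0.0, 0.0, -t,  -t ],
              [0.0, 0.0,  t,   t ],
              [-t,   t,   U,  0.0],
              [-t,   t,  0.0,  U ]])

evals = np.linalg.eigvalsh(H)  # ascending eigenvalues
E0 = evals[0]

# Exact ground-state energy of the two-site Hubbard model:
E0_exact = (U - np.sqrt(U**2 + 16 * t**2)) / 2
```

At large U/t the ground state reduces to a Heisenberg singlet with superexchange energy of order -4t^2/U: local moments, antiferromagnetic correlations, and suppressed double occupancy. This is Mott physics and a two-site "RVB singlet" in miniature.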

We certainly don't understand the cuprates at the same level as elemental superconductors. But we do understand the essential physics.

What is harder to describe and understand are the states adjacent to the superconducting state in the phase diagram: the pseudogap state and the strange metal.


Strongly correlated electron materials became a large, vibrant and unified field

Before 1986, there were small, disconnected communities intermittently interested in transition metal oxides, rare earths, Kondo impurities, Mott metal-insulator transitions, organic superconductors, heavy fermions, and quantum antiferromagnets.

The discovery of the cuprates brought together these communities as they found common interests, challenges, questions, concepts, and techniques.

The discovery of superconductivity in strontium ruthenate, alkali fullerides, iron pnictides and chalcogenides, twisted bilayer graphene and more cuprates, organic charge-transfer salts, and heavy fermions has shown how rich these systems are. The challenge is to understand the similarities and differences between these chemically and structurally diverse systems. In many of them, superconductivity is proximate to a Mott insulating state.

The unity and excitement were probably stimulated and enhanced by the activities and ideas of high-profile theorists such as Anderson, Schrieffer, Scalapino, Pines, Rice, and Varma. On the other hand, their acrimonious disagreements probably did not help.

Secondary theoretical benefits

The things I list below were not new ideas when the cuprate discovery happened. However, interest in the cuprates led them to become major research themes and ideas.

Importance of phase diagrams, including as a function of interaction parameters in toy models

Highlighting the limitations of electronic structure methods based on Density Functional Theory (DFT) with approximate exchange-correlation functionals (i.e., essentially all practical computational schemes). In the presence of strong correlations, DFT methods fail spectacularly, for example by predicting a metallic state instead of a Mott insulator.

Low dimensionality leads to qualitatively different behaviour, including the possibility of new types of order and quasiparticles. This is most dramatic in one dimension, where one has Luttinger liquids and spin-charge separation.

Spin liquids. Landau was wrong. Spontaneous symmetry breaking does not always occur in antiferromagnets.

Non-Fermi liquids. Landau was wrong. Not all metals are Fermi liquids.

Quantum criticality. Although this is a robust concept for certain toy models, whether it is relevant to the cuprates remains contentious.

Systematic improvements in approximation schemes and numerical techniques - exact diagonalisation, DMRG, DMFT, quantum Monte Carlo,...

Emergence. Chemical complexity and strong interactions can lead to new states of matter.

Secondary experimental benefits

Better probes. The desire to characterise the cuprates helped drive significant improvements in the resolution of ARPES (Angle-Resolved PhotoEmission Spectroscopy), STM (Scanning Tunnelling Microscopy), and inelastic neutron scattering. These advances have borne fruit in the study of a wide range of other materials beyond the cuprates.

Growth of single crystals. The early days of the cuprates produced a lot of junk experimental results because of the poor quality of the samples produced by "shake and bake". However, the involvement of solid-state chemists has improved things. The techniques have also led to the production of single crystals for a wide range of strongly correlated materials.

Why is there so little research on cuprates today?

Today, there is little research directly on cuprates, either theoretical or experimental. It is hard to get funding to work on them, even though there is still a lot we do not fully understand.

This is because of the problem of fashion in science. The low-hanging fruit has been picked. There is a continual stream of newly discovered materials with exotic properties, the latest being twisted bilayer van der Waals compounds.
