Tuesday, February 24, 2026

Information theoretic measures for emergence and causality

The relationship between emergence and causation is contentious, with a long history. Most discussions are qualitative. Presented with a new system, how does one identify the microscopic and macroscopic scales that may be most useful for understanding and describing the system? Can Judea Pearl’s seminal ideas about causality be implemented practically for understanding emergence?

Broadly speaking, a weakness of discussions of emergence and causality is that it is hard to define these concepts in a rigorous and quantitative manner that makes them amenable to empirical testing, with respect to theoretical models and to experimental data. 

Fortunately, in the past decade, there have been some specific proposals to address this issue, mostly using information theory. A helpful recent review is by Yuan et al. 

“Two primary challenges take precedence in understanding emergence from a causal perspective. The first is establishing a quantitative definition of emergence, whereas the second involves identifying emergent behaviors or phenomena through data analysis.

To address the first challenge, two prominent quantitative theories of emergence have emerged in the past decade. The first is Erik Hoel et al.’s theory of causal emergence [19] whereas the second is Fernando E. Rosas et al.’s theory of emergence based on partial information decomposition [24].

Hoel et al.’s theory of causal emergence specifically addresses complex systems that are modeled using Markov chains. It employs the concept of effective information (EI) to quantify the extent of causal influence within Markov chains and enables comparisons of EI values across different scales [19,25]. Causal emergence is defined by the difference in the EI values between the macro-level and micro-level.”
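To make the definition concrete, here is a minimal sketch of how EI might be computed for a small Markov chain. It assumes the common convention that EI is the average Kullback-Leibler divergence between each row of the transition matrix and the average row, with every state set with equal probability. The four-state chain, the grouping of states, and the function names are illustrative assumptions and are not taken from the papers quoted above.

```python
import numpy as np

def effective_information(tpm):
    """EI in bits of a row-stochastic transition matrix, assuming
    each state is set (intervened on) with equal probability."""
    tpm = np.asarray(tpm, dtype=float)
    effect = tpm.mean(axis=0)  # distribution over next states under uniform interventions
    # Ratio p(next | state) / p(next), left at 1 where the transition probability is 0
    ratio = np.divide(tpm, effect, out=np.ones_like(tpm), where=tpm > 0)
    kl_per_state = np.sum(tpm * np.log2(ratio), axis=1)
    return float(kl_per_state.mean())

def coarse_grain(tpm, groups):
    """Macro transition matrix for a partition of micro states,
    averaging uniformly over the micro states within each group."""
    tpm = np.asarray(tpm, dtype=float)
    macro = np.zeros((len(groups), len(groups)))
    for a, src in enumerate(groups):
        for b, dst in enumerate(groups):
            macro[a, b] = tpm[np.ix_(src, dst)].sum(axis=1).mean()
    return macro

# Hypothetical micro chain: states 0-2 hop randomly among themselves, state 3 is fixed
micro = [[1/3, 1/3, 1/3, 0.0]] * 3 + [[0.0, 0.0, 0.0, 1.0]]
macro = coarse_grain(micro, groups=[[0, 1, 2], [3]])

ei_micro = effective_information(micro)   # approximately 0.811 bits
ei_macro = effective_information(macro)   # approximately 1.0 bit
print(ei_macro - ei_micro)                # positive: the macro description has higher EI
```

In this toy case the coarse-grained chain is deterministic while the micro chain is noisy, so the difference in EI is positive, which is the situation the quoted definition calls causal emergence.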

One perspective on causal emergence is that it occurs when the macro-level dynamics of a system are described more efficiently by macro-variables than by the micro-level variables and their dynamics.

Klein et al. used Hoel’s information-theoretic measures of causal emergence to analyse protein interaction networks (interactomes) in over 1800 species, containing more than eight million protein–protein interactions, across different scales. They showed the emergence of ‘macroscales’ that are associated with lower noise and uncertainty. The nodes in the macroscale description of the network are more resilient than those in less coarse-grained descriptions. Greater causal emergence (i.e., a stronger macroscale description) was generally seen in multicellular organisms compared to single-cell organisms. The authors quantified causal emergence in terms of mutual information (between large and small scales) and effective information (a measure of the certainty in the connectivity of a network). Philip Ball (2023) (pages 218-220) gives an account of this work in terms of the emergence of multicellularity in biological evolution. He introduced the term causal spreading (pages 225-7), arguing that over the history of evolution the locus of causation has changed.

Yuan et al. continue

"However, in Hoel’s theory of causal emergence, it is essential to establish a coarse-graining strategy beforehand. Alternatively, the strategy can be derived by maximizing the effective information (EI) [19]. However, this task becomes challenging for large-scale systems due to the computational complexity involved. To address these problems, Rosas et al. introduced a new quantitative definition of causal emergence [24] that does not depend on coarse-graining methods, drawing from partial information decomposition (PID)-related theory. PID is an approach developed by Williams et al., which seeks to decompose the mutual information between a target and source variables into non-overlapping information atoms: unique, redundant, and synergistic information [29]…"
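A standard toy example of synergistic information is the XOR gate. The sketch below is only an illustration and does not implement a full PID: it computes ordinary mutual information for two independent fair bits and their XOR. The helper name and the setup are my own assumptions.

```python
import itertools
import math
from collections import Counter

def mutual_information(pairs):
    """I(A;B) in bits from a list of equally likely (a, b) outcomes."""
    n = len(pairs)
    joint = Counter(pairs)
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    # p(a,b) / (p(a) p(b)) = c * n / (count_a * count_b)
    return sum((c / n) * math.log2(c * n / (count_a[a] * count_b[b]))
               for (a, b), c in joint.items())

# Two independent fair bits and their XOR as the target
outcomes = [(x1, x2, x1 ^ x2) for x1, x2 in itertools.product((0, 1), repeat=2)]

i_x1 = mutual_information([(x1, y) for x1, _, y in outcomes])           # 0.0 bits
i_x2 = mutual_information([(x2, y) for _, x2, y in outcomes])           # 0.0 bits
i_joint = mutual_information([((x1, x2), y) for x1, x2, y in outcomes])  # 1.0 bit
print(i_x1, i_x2, i_joint)
```

Neither source alone says anything about the target, yet together they determine it. If the information atoms are assumed to be non-negative, the unique and redundant atoms must vanish here, so the whole bit is synergistic.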

The figure below is taken from Rosas et al. X_t^j (j = 1, …, n) are microscopic variables that define a Markov chain. V_t is a macroscopic variable that is completely determined by the microscopic variables.

“Diagram of causally emergent relationships. Causally emergent features have predictive power beyond individual components. Downward causation takes place when that predictive power refers to individual elements; causal decoupling when it refers to itself or other high-order features.”

Rosas et al. applied the method to specific systems, including Conway’s Game of Life, Reynolds’ flocking model, and neural activity as measured by electrocorticography. More recently, it was used to describe emergence in computer science, including the identification of modular structures. Calculations were performed for specific examples, including Ehrenfest’s urn model for diffusion, the Ising model with Glauber dynamics, and a Hopfield neural network model for associative memory.

Yuan et al. also state the following:

"The second challenge pertains to the identification of emergence from data. In an effort to address this issue, Rosas et al. derived a numerical method [24]. However, it is important to acknowledge that this method offers only a sufficient condition for emergence and is an approximate approach. Another limitation is that a coarse-grained macro-state variable should be given beforehand to apply this method."
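As I understand it, the sufficient condition compares how well the macro-variable predicts its own future with how well the individual micro-variables predict it: Psi = I(V_t; V_{t+1}) minus the sum over j of I(X_t^j; V_{t+1}), with Psi > 0 signalling emergence. The sketch below evaluates this quantity exactly for a hypothetical three-bit system whose parity is preserved with probability gamma. The system, the parameter values, and the helper names are assumptions for illustration, and in practice the mutual informations would have to be estimated from data.

```python
import itertools
import math
from collections import defaultdict

def mutual_information(joint):
    """I(A;B) in bits from a dict {(a, b): probability}."""
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

def project(joint_xy, f, g):
    """Joint distribution of (f(x_t), g(x_next)) from the joint over (x_t, x_next)."""
    out = defaultdict(float)
    for (x, y), p in joint_xy.items():
        out[(f(x), g(y))] += p
    return out

def parity(x):
    return sum(x) % 2

n, gamma = 3, 0.9  # illustrative values
states = list(itertools.product((0, 1), repeat=n))

# x_t is uniform; x_next keeps the parity of x_t with probability gamma and is
# otherwise uniform over the 2**(n-1) states with the chosen parity
joint = {}
for x in states:
    for y in states:
        keep = parity(x) == parity(y)
        joint[(x, y)] = (1 / 2**n) * (gamma if keep else 1 - gamma) / 2**(n - 1)

macro_term = mutual_information(project(joint, parity, parity))
micro_terms = [mutual_information(project(joint, lambda x, j=j: x[j], parity))
               for j in range(n)]
psi = macro_term - sum(micro_terms)
print(psi)  # approximately 0.531 bits for gamma = 0.9
```

No single bit carries any information about the future parity, so the micro terms vanish and Psi is positive. The macro-variable (here the parity) has to be supplied by hand, which is the limitation noted in the quote.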

Rosas et al. recently stated:

“Empirical applications of this framework to study emergence … including the study of gene regulatory networks [22], the dynamics of the human brain [23], the internal dynamics of reservoir computing [24], and the formation of useful internal representations in machine learning [25].”

Yuan et al. also discuss two significant connections between causal emergence and machine learning. First, machine learning can be used to improve calculations of causal emergence. Second, causal emergence measures can be used to better understand how machine learning works and improve it.

The work described above built on earlier work by Crutchfield, who claimed that the identification of emergence and hierarchies could be made operational, stating that “different scales are delineated by a succession of divergences in statistical complexity at lower levels.” More recently, Rupe and Crutchfield have reported progress towards identifying emergent self-organisation in a system.

Although this work on quantitative measures of emergence based on information theory represents significant progress, there are many open problems. Examples include the extension to non-Markovian systems and the development of computationally feasible methods for large systems. The latter is particularly important in physical systems where spontaneous symmetry breaking occurs, as this only happens in the thermodynamic limit of an infinite system.

There is an unrecognised similarity between the work described above and techniques recently developed to characterise phase transitions in statistical mechanics models such as the Ising model and classical dimer models. In these techniques, the coarse-graining (CG) is optimised by maximising the Real-Space Mutual Information (RSMI) between a spatial block and its distant environment.

In general, maximising mutual information is notoriously hard but can be done using state-of-the-art machine learning algorithms. Gokmen et al. have developed an algorithm that they claim “can, unsupervised, construct order parameters, locate phase transitions, and identify spatial correlations and symmetries for complex and large-dimensional real-space data.” Furthermore, the optimal CG explicitly identifies the scaling operators associated with the critical point. 

The classical dimer model provides a stringent test as “the relevant low-energy degrees of freedom are profoundly different from the microscopic building blocks of the theory and change qualitatively throughout the phase diagram.” In other words, the emergent entities (quasiparticles such as vortices associated with the height field, which is described by a sine-Gordon field theory) are different from the dimers.

It is encouraging to see that two different scientific communities have developed similar ideas to address this challenging problem of making discussions about emergence and causality more concrete and quantitative.
