Monday, May 30, 2016

A basic but important skill: critical reading of experimental papers

Previously, I highlighted the important but basic skill of being skeptical. Here I expand on the idea.

An experimental paper may make a claim, "We have observed interesting/exciting/exotic effect C in material A by measuring B."
How do you critically assess such claims?
Here are three issues to consider.
It is as simple as ABC!

1. The material used in the experiment may not be pure A.
Preparing pure samples, particularly "single" crystals of a specific material of known chemical composition, is an art. Any sample will be slightly inhomogeneous and will contain some chemical impurities, defects, ... Furthermore, samples are prone to oxidation, surface reconstruction, interaction with water, ... A protein may not be in its native state...
Even in an ultracold atom experiment one may have chemically pure A, but the actual density profile and temperature may not be what is thought.
There are all sorts of checks one can do to characterise the structure and chemical composition of the sample. Some people are very careful. Others are not. But, even for the careful and reputable, things can go wrong.

2. The output of the measurement device may not actually be a measurement of B.
For example, just because the ohmmeter reads an electrical resistance does not mean it is the resistance of the material in the desired current direction. All sorts of things can go wrong with resistances in the electrical contacts and in the current path within the sample.
Again there are all sorts of consistency checks one can make. Some people are very careful. Others are not. But, even for the careful and reputable, things can go wrong.

3. Deducing effect C from the data for B is rarely straightforward.
Often there is significant theory involved. Sometimes, there is a lot of curve fitting. Furthermore, one needs to consider alternative (often more mundane) explanations for the data.
Again there are all sorts of consistency checks one can make. Some people are very careful. Others are not. But, even for the careful and reputable, things can go wrong.
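A toy illustration of the curve-fitting problem (all numbers invented here, not from any real experiment): over a limited measurement range, two physically distinct models can describe the same noisy data almost equally well, so a good fit alone is weak evidence for the exotic interpretation over the mundane one.

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic "resistivity" data: true form is a power law, with 2% noise.
rng = np.random.default_rng(0)
T = np.linspace(100, 300, 12)                 # temperature (K)
rho_true = 1.0 + 0.01 * T**1.5
rho_noisy = rho_true * (1 + 0.02 * rng.standard_normal(T.size))

def power_law(T, a, b, n):
    # "Exotic" model with a fitted exponent n
    return a + b * T**n

def quadratic(T, a, b, c):
    # "Mundane" alternative: a smooth polynomial
    return a + b * T + c * T**2

p1, _ = curve_fit(power_law, T, rho_noisy, p0=[1.0, 0.01, 1.5])
p2, _ = curve_fit(quadratic, T, rho_noisy)

res1 = np.sum((rho_noisy - power_law(T, *p1))**2)
res2 = np.sum((rho_noisy - quadratic(T, *p2))**2)
# Both sums of squared residuals sit near the noise floor and are
# comparable: over this range the data alone cannot pin down the
# exponent n against a mundane quadratic.
```

The lesson is not that fitting is useless, but that claims of effect C should rest on more than one functional form fitting the data for B over a narrow range.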

Finally, one should consider whether the results are consistent with earlier work. If not, why not?

Later, I will post about critical reading of theoretical papers.

Can you think of other considerations for critical reading of experimental papers?
I have tried to keep it simple here.


  1. Some more considerations:

    1) Too few data points to make significant claims regarding having observed a particular effect. A non-critical reader might miss this and jump directly to the claimed high-impact result.

    2) History-dependence: Was a particular surprising measurement tried more than once? For measurements in one's own lab this is possible and an important test. In other cases the authors may have had only one shot, e.g. x-ray or neutron scattering experiments, and if a referee asks for an experiment to be repeated it may lead to significant delays.

    3) Papers aiming to demonstrate that effect C (determined by measurements of B) is correlated with effect E (determined by measurements of D) are particularly difficult. In both cases there can be theory involved. Moreover, it is often the case that the correlations are not "sharp", in the sense that B and D do not correspond to, for example, well-defined onset temperatures, but rather show slow temperature dependences. One example could be evidence of "nematicity" in Fe-based superconductors.

    1. NBC,
      Thanks for the comment. I agree these are all important considerations. Related to history dependence is sample, apparatus, and group-member dependence. Were the measurements repeated on different samples, using different apparatus (e.g. different cryostats or spectrometers), and by different group members? To the naive, these can seem like trivial checks, but if you talk to honest experimentalists they will tell you they can matter.

  2. Good points from NBC.

    Regarding #2: even if people are doing their due diligence in reproducibility, doing it exactly the same every time risks mistaking a path function (variable) for a state function (variable) - with obvious repercussions for the interpretation.

    How many times has the effect been reproduced, and were these reproduction experiments deliberately varied...?

    And now I'm curious to see the critical reading of theoretical papers.
    Especially if it is more general than the often used "what approximations were used in electronic structure calculations and were they valid in hindsight".

  3. Related to the points mentioned above:

    1. How was the plotted "experimental data" actually derived from the real output of the measurement device? There are many quantities (such as optical conductivity for example) that require some set of approximations or curve-fitting to obtain from the raw experimental data. In some cases, data on multiple samples, experimental apparatuses, temperature ranges, etc. are merged to construct a plot. What is the merging process?

    2a. Is the effect the right order of magnitude? This is rather simple, but always something I ask. Usually thinking about this helps to understand the results.

    2b. How big would the background be? In some cases, the desired experimental data is obtained only after subtraction of a very large background contribution. It is always a question how systematic errors in this subtraction would affect the result - and these errors may not appear simply as noise.
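The arithmetic behind point 2b is worth making explicit (the numbers below are invented for illustration): when a small signal is obtained by subtracting a large background, even a tiny systematic error in the background estimate becomes a large relative error in the inferred signal.

```python
# Toy numbers (arbitrary units): the signal is only ~5% of what is measured.
total = 1000.0                        # raw measured counts (signal + background)
background = 950.0                    # independently estimated background
signal = total - background           # inferred signal = 50

bg_error = 0.01 * background          # a mere 1% systematic error in the background
signal_rel_error = bg_error / signal  # relative error this induces in the signal
# A 1% error in the background becomes a 19% error in the signal -
# and, being systematic, it does not average away the way noise does.
```

This is why papers relying on large background subtractions deserve particular scrutiny of how the background model was obtained.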