Thursday, May 8, 2014

Resisting the temptation to make the best looking data plot

It is a fallible human tendency to want to include in a paper the most favourable comparison between your pet theory and experiment. My collaborators and I were confronted with this issue when writing our recent paper on Quantum nuclear effects in hydrogen bonding.

We calculated a particular vibrational frequency for both the hydrogen and deuterium isotopes. Experimentalists had previously reported that the ratio of these two frequencies has large and non-monotonic variations as a function of the donor-acceptor distance R. The plot below shows a comparison of our calculations [curves] to experimental data on a wide range of chemical complexes [each point is a separate compound].
I was quite happy with this result, particularly because getting the frequency ratio down to values as small as one was significant [Aside: this is an amazing thing because in most compounds the isotope frequency ratio is close to sqrt(2) ≈ 1.4, as expected from a simple harmonic oscillator analysis].
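
For readers who want to see where the sqrt(2) comes from, here is a minimal sketch of the harmonic estimate. It assumes the H and D stretches share the same force constant, so the frequency scales as one over the square root of the mass. The function names and the rounded atomic masses are my own illustrative choices and are not taken from the paper.

```python
import math

# Approximate atomic masses in atomic mass units (rounded, illustrative values).
M_H, M_D, M_O = 1.008, 2.014, 15.999


def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator."""
    return m1 * m2 / (m1 + m2)


def harmonic_isotope_ratio(m_light, m_heavy, m_partner=None):
    """Omega_light / Omega_heavy for a harmonic stretch with a fixed force constant.

    With no partner mass, the light atom is treated as moving against an
    infinitely heavy partner, so the ratio is sqrt(m_heavy / m_light).
    Otherwise the reduced masses of the two diatomic units are used.
    """
    if m_partner is None:
        return math.sqrt(m_heavy / m_light)
    mu_light = reduced_mass(m_light, m_partner)
    mu_heavy = reduced_mass(m_heavy, m_partner)
    return math.sqrt(mu_heavy / mu_light)


# Idealised case: mass 1 versus mass 2 against an infinitely heavy partner.
ideal_ratio = harmonic_isotope_ratio(1.0, 2.0)

# O-H versus O-D treated as isolated diatomic oscillators.
oh_od_ratio = harmonic_isotope_ratio(M_H, M_D, M_O)

print(f"ideal ratio: {ideal_ratio:.3f}")
print(f"O-H / O-D ratio: {oh_od_ratio:.3f}")
```

The finite mass of the oxygen pulls the harmonic estimate slightly below sqrt(2), to about 1.37. Either way, a harmonic picture with a fixed force constant cannot produce a ratio near one, which is why the small measured values are striking.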

It was tempting just to publish this plot.
But there is a problem. Most previous plots by experimentalists did not use R as the horizontal axis but Omega_H, the frequency for the H case. [For example, see the plot I featured in a post back in 2011 when I started thinking about this problem].
Below is the corresponding plot.

It is much less impressive!
Why? The problem is that for R ~ 2.5 Angstroms our theory does not give values of the frequency that agree very well with experiment, as shown in an earlier figure in the paper. We discuss some possible reasons for that.

So we decided that the best thing to do was to publish both figures, so that readers can make their own decisions about the strengths and weaknesses of our work.

Now here is another slant. The data above is for O-H...O bonds, which we focussed on in our paper. The data below is for N-H...N bonds [taken from here] and shows much clearer correlations than the data above. Again it would have been tempting to focus on that case.

I will also illustrate my point with a historically much more important example.
The figures below are also discussed in an earlier post. [The work led to a Nobel Prize]. The upper version shows a moderately impressive comparison of data with a theoretical curve. However, the main point of the paper [and the Nobel Prize for cosmic acceleration] is not the linear component [Hubble constant] but the non-linear component [acceleration]. The lower part of the figure has the linear part subtracted out and looks far less impressive. Nevertheless, the result stood the test of time and complementary measurements, as discussed in the earlier post.

In conclusion, I think it is important that we not always present our work so it appears in the best possible light.

1 comment:

  1. Quantum effects are an interesting phenomenon in hydrogen bonding systems. Last year I also published a paper titled 'Quantum Effects on Global Structure of Liquid Water'. In our paper, I used Raman spectra in the OH stretching region to discuss the quantum effect in liquid light and heavy water. I notice that the largest frequency in figure 8 of your published paper is ~3400 cm-1. However, in experimental Raman/IR/SFG spectra the largest frequency is above 3700 cm-1. Why did you not calculate the data above 3400 cm-1? Data in the range 2800~3700 cm-1 may help us experimentalists.
    Actually, some years ago I used Gaussian 03 to calculate the OH or OD frequency in various water clusters, and I found a phenomenon similar to the one in the first figure in this post. Is this phenomenon evidence of the quantum effect in hydrogen bonding systems?
    I need to read your paper carefully. It may help me understand my Raman spectra of light and heavy water.