
The top 100 most-cited papers of all time

I wrote earlier about the 50th anniversary of the Science Citation Index. Recently, Nature got together with Thomson Reuters, the publisher of the Science Citation Index (now usually known as the Web of Science), to come up with a list of the 100 most-cited papers of all time.1 It’s an interesting list, which I encourage you to take a look at. Let’s face it: top-100 lists are always fun. Who is in there? Who is not? The Nature article offers a few reflections on this. For my part, I’m going to look at what this list tells us about citation patterns in different areas of science, focusing particularly on a field I know well, namely density-functional theory, and one with which I have a tangential acquaintance, NMR.

There are, as the Nature article pointed out, a large number of papers in the top 100 from the field of density-functional theory (DFT). I may have missed some, but here are the ones I noticed: Lee, Yang and Parr (1988)2 at #7, Becke (1993)3 at #8, Perdew, Burke and Ernzerhof (1996)4 at #16, Becke (1988)5 at #25, Kohn and Sham (1965)6 at #34, Hohenberg and Kohn (1964)7 at #39, Perdew and Wang (1992)8 at #93, and Vosko, Wilk and Nusair (1980)9 at #96.

So what is DFT, anyway? One of the great problems in electronic structure calculations for molecules is electron correlation. Electrons repel, so they tend to stay away from each other. Classic methods of electronic structure calculation don’t properly take electron correlation into account. There are ways to put electron correlation back in after the fact, but they’re either not very accurate, or they take a huge amount of computing. Another problem arises because of exchange, a strange quantum mechanical effect that causes identical electrons with the same spin to stay away from each other more so than simple electrostatics alone would dictate (i.e. more than would be the case for electrons with opposite spin). DFT is based on theory developed by Kohn and his coworkers in the 1960s (papers #34 and #39 in Nature‘s list) that essentially states that there is a functional of the electron density that describes electron correlation and the exchange interaction exactly. Modern DFT is based on approximating this functional (usually using separate correlation and exchange parts) semi-empirically. Using good DFT exchange and correlation functionals allows us to do very accurate electronic structure calculations much more quickly than is the case with older methods. The one catch is that we don’t really know what the exchange and correlation functionals should be, so there’s a lot of work to be done coming up with good functionals and validating them. Nevertheless, the current crop of functionals does a pretty good job in many cases of chemical interest.
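
For readers who like to see the structure behind this description, the total electronic energy in Kohn-Sham DFT is conventionally written as follows (this is the standard textbook decomposition, not anything specific to the papers discussed here):

$$E[\rho] = T_s[\rho] + \int v_{\mathrm{ext}}(\mathbf{r})\,\rho(\mathbf{r})\,\mathrm{d}\mathbf{r} + J[\rho] + E_{\mathrm{xc}}[\rho],$$

where $T_s$ is the kinetic energy of the non-interacting reference system, the integral is the interaction with the external (nuclear) potential, $J$ is the classical Coulomb repulsion of the density with itself, and $E_{\mathrm{xc}}$ is the exchange-correlation functional that the papers in the list set out to approximate, often split as $E_{\mathrm{xc}} = E_{\mathrm{x}} + E_{\mathrm{c}}$.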

To understand the DFT citation patterns a bit better, I used the Web of Science to count up the number of times each of these papers was cited with one of the others. Here’s what I found:

            LYP 88  Becke 93   PBE 96  Becke 88    KS 65    HK 64    PW 92   VWN 80
LYP 88       48653     33303     3498     17608     3305     2917     2114     5320
Becke 93               48041     3266     11118     2718     2499     2469     4284
PBE 96                          38281      2948     5405     5040     2576     1647
Becke 88                                  27370     2734     2332     2246     5821
KS 65                                               23840    15129    2028     1955
HK 64                                                        22608    1750     1656
PW 92                                                                 13173    1260
VWN 80                                                                         12862

Hopefully the code I’m using here is clear enough: LYP 88, for example, is Lee, Yang and Parr (1988). The entries on the diagonal are the total numbers of citations to the corresponding papers. This matrix is necessarily symmetric about its diagonal, so I didn’t fill in the entries below the diagonal. Note that the total citations for each paper differ somewhat from those reported in Nature‘s spreadsheet because I performed my analysis at a later point in time, and these papers continue to accumulate citations at an astonishing rate.

A few numbers jump out from this table: The top two DFT papers, Lee, Yang and Parr (1988) and Becke (1993), are cited together with very high frequency: 68% of the papers citing Lee, Yang and Parr (1988) also cite Becke (1993). Although cited together slightly less often, Becke (1988) is also frequently co-cited with Lee, Yang and Parr (1988): 36% of the papers citing the latter also cite Becke (1988). Now if we ask how many of the papers citing Lee, Yang and Parr (1988) also cite at least one of the Becke papers, we find that an astonishing 85% do. This is, of course, not a random occurrence. One of the most popular exchange-correlation functionals around, B3LYP, combines Becke’s 1988 exchange functional, which was further studied in his 1993 paper, with the Lee, Yang and Parr correlation functional. People who use the B3LYP functional in calculations will usually cite Lee, Yang and Parr (1988) along with at least one of the Becke papers. So if one of these papers were to appear in the top-100 list, it was likely that all three would, as they do. The appearance of these papers in the top-100 list is therefore a testament to the heavy use made in the chemical literature of the exchange-correlation functionals developed by these authors. In fact, all of the DFT papers in the top-100 list describe functionals that are heavily used in applications, except for the Kohn papers, which provided the underlying theory.
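
For the curious, here is a minimal sketch (in Python; not the actual Web of Science workflow, which was done through the web interface) showing how the percentages above follow from the numbers in the table:

```python
# Reproduce the pairwise co-citation percentages quoted above from the table.

total_citations = {"LYP 88": 48653, "Becke 93": 48041, "Becke 88": 27370}

# Co-citation counts read off the table (papers citing both members of a pair)
co_cited = {("LYP 88", "Becke 93"): 33303,
            ("LYP 88", "Becke 88"): 17608}

for (a, b), n_both in co_cited.items():
    pct = 100 * n_both / total_citations[a]
    print(f"{pct:.0f}% of papers citing {a} also cite {b}")

# Note: the 85% figure (papers citing LYP 88 plus at least one Becke paper) is a
# union, which cannot be computed from pairwise counts alone; it requires a
# separate Web of Science query.
```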

One of the points made by the authors of the Nature article is that papers that describe methods get cited much more than papers that introduce new ideas into science. So why do the Kohn papers appear in this list? I would argue that this is due to a quirk of citation among people who do DFT calculations. The vast majority of citations to these papers are by people who do DFT calculations, not by people further developing the Hohenberg-Kohn-Sham theory. To fully understand how strange this is, we have to consider that the overwhelming majority of people doing DFT calculations and citing these papers use software written by someone else, usually commercial software like Gaussian. Ordinary users of a computational method don’t usually “dig down” to the theory layer in their citations. For example, the vast majority of modern quantum chemical calculations (including most DFT calculations) are based on Roothaan’s classic papers on self-consistent-field calculations.10 These papers have been cited, respectively, 4535 and 1828 times. This is an extremely high citation rate, but it’s a tiny fraction of the literature reporting calculations based on Roothaan’s algorithms. So it’s a bit strange that Kohn’s work gets cited by DFT users at this high rate, particularly since we can find other foundational papers in quantum chemistry, such as Roothaan’s, that are not cited nearly as routinely.

Now let’s contrast the citation record of DFT with that of NMR. NMR is nuclear magnetic resonance. NMR spectroscopy is used on a daily basis by every synthetic chemistry group in the world, and by many physical and analytical chemistry laboratories as well. Although they will typically back up NMR measurements with other techniques, NMR is how chemists identify the compounds they have made, and determine their structures. One would think that we would see papers that describe fundamental NMR techniques or popular experiments make this list. They don’t. There is a single NMR-related paper in the list, one that describes a software program for analyzing both crystallography and NMR data, showing up at #69. That’s it. So why is that? It’s certainly not that there are more DFT papers than there are papers that use NMR. In fact the reverse is certainly true. However, when experiments become sufficiently common, chemists stop citing their original sources. I was just looking at a colleague’s paper in which he mentioned six different NMR experiments in addition to the usual single-nucleus spectra. A literature reference was given for only one of these experiments, presumably because he felt the others were sufficiently well-known that they didn’t need references. The equivalent practice in DFT would be not to cite anything when using the B3LYP functional, on the basis that everybody knows this functional. That’s quite a difference in citation practices between two different areas of chemistry! And the fascinating thing is that these two fields have overlapping membership: There are lots of synthetic chemists who do DFT calculations to support their experimental work. And for some reason, they behave differently when describing DFT methods than when describing NMR methods.

To understand the vast difference in citation practices between these two areas, let’s look at a specific example. In many ways, two-dimensional NMR, in which signals are spread along a second dimension that encodes additional molecular information, very much parallels DFT: The two sets of methods were developed at about the same time, hardware capable of carrying them out routinely became available to ordinary chemists at about the same time, and both opened up what could be done in their respective fields. The first two-dimensional NMR experiment, COSY, was proposed in 1971 by Jean Jeener.11 It’s not entirely trivial to hunt down citations to papers in conference proceedings in the Web of Science because they are not cited in any consistent format. However, after doing a bit of work, and including the reprinting of these lecture notes in a collection a few decades later, I found approximately 352 citations to Jeener’s epoch-making paper. Compare that to the 23840 citations to the Kohn-Sham (1965) paper. One could argue that Jeener’s paper was published in an obscure venue, and that this depressed the number of citations to this paper, which is certainly plausible. Jeener’s proposal was implemented by Aue, Bartholdi and Ernst in 1976.12 That paper was cited 2919 times, which is a far cry from the number of citations accumulated by the Kohn papers, or by the “applied” DFT papers in which practical functionals are described. Kohn shared the 1998 Nobel Prize in Chemistry. Ernst was awarded the 1991 Nobel Prize in Chemistry. There are a lot of ways in which the two contributions are comparable. But not in citation counts. And clearly, it’s not a matter of the popularity of the methods: I used the ACS journal web site to see how many papers in the Journal of Organic Chemistry mentioned the COSY experiment. The Journal of Organic Chemistry is a journal that, by its nature, contains mostly papers reporting the synthesis and characterization of compounds, so it’s a good place to gauge the extent to which an experimental method is used. In that one journal alone, 6351 papers mention COSY. To be fair, some of these references will be to descendants of the original COSY experiment (of which there are many), but the very large number of COSY papers and the relatively small number of citations to the early papers on COSY still speaks to wildly different citation cultures between NMR and DFT practitioners.

None of this is intended to denigrate the work of the excellent scientists whose papers have made the top-100 list. They clearly deserve a very large pat on the back. However, it does show that we have to be extraordinarily careful in comparing citation rates even between very closely related fields. And these rates will of course also affect citation-based metrics like the h-index, perhaps not in extreme cases like the highly cited papers mentioned here, but certainly in the case of authors whose papers are well cited, if not insanely well cited.
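
As an aside, since the h-index comes up here: it is simple to compute from a list of per-paper citation counts, which is exactly why field-dependent citation rates feed straight into it. The sketch below is illustrative only; the citation counts in it are made up.

```python
# Illustrative only: computing an h-index from per-paper citation counts.

def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

print(h_index([50, 18, 9, 6, 5, 3, 1]))  # prints 5 for this made-up record
```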

In the interests of full disclosure: Axel Becke, whose name features so prominently in the top-100 list and in this blog post, supervised my senior research project when I was an undergraduate student at Queen’s. My first scientific paper was coauthored with Axel.13 In fact, I may have benefited from the higher citation rates in DFT as this paper is by far my most cited paper. I sometimes joke that my career has all been downhill since this very first scientific contribution. But to figure out if that was true, we would have to take the citation practices of the various areas I’ve worked in into account…

1R. van Noorden, B. Maher and R. Nuzzo (2014) The top 100 papers. Nature 514, 550–553.

2C. Lee, W. Yang and R. G. Parr (1988) Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density. Phys. Rev. B 37, 785–789.

3A. D. Becke (1993) Density-functional thermochemistry. III. The role of exact exchange. J. Chem. Phys. 98, 5648–5652.

4J. P. Perdew, K. Burke and M. Ernzerhof (1996) Generalized gradient approximation made simple. Phys. Rev. Lett. 77, 3865–3868.

5A. D. Becke (1988) Density-functional exchange-energy approximation with correct asymptotic behaviour. Phys. Rev. A 38, 3098–3100.

6W. Kohn and L. J. Sham (1965) Self-consistent equations including exchange and correlation effects. Phys. Rev. 140, A1133–A1138.

7P. Hohenberg and W. Kohn (1964) Inhomogeneous electron gas. Phys. Rev. 136, B864–B871.

8J. P. Perdew and Y. Wang (1992) Accurate and simple analytic representation of the electron-gas correlation-energy. Phys. Rev. B 45, 13244–13249.

9S. H. Vosko, L. Wilk and M. Nusair (1980) Accurate spin-dependent electron liquid correlation energies for local spin-density calculations — a critical analysis. Can. J. Phys. 58, 1200–1211.

10C. C. J. Roothaan (1951) New developments in molecular orbital theory. Rev. Mod. Phys. 23, 69–89; (1960) Self-consistent field theory for open shells of electronic systems. Rev. Mod. Phys. 32, 179–185.

11J. Jeener (1971) Lecture notes from the Ampère Summer School, Basko Polje, Yugoslavia. Reprinted in NMR and More in Honour of Anatole Abragam, Eds. M. Goldman and M. Porneuf, Les Éditions de Physique (1994).

12W. P. Aue, E. Bartholdi and R. R. Ernst (1976) Two-dimensional spectroscopy. Application to nuclear magnetic resonance. J. Chem. Phys. 64, 2229–2246.

13A. D. Becke and M. R. Roussel (1989) Exchange holes in inhomogeneous systems: A coordinate-space model. Phys. Rev. A 39, 3761–3767.

The Science Citation Index turns 50

The Science Citation Index, which many of you will now know as the Web of Science, turned 50 this year.1 Hopefully, you are already familiar with this tool, which allows you to, in essence, search forward in time from a particular paper. This is an important way to search the literature, and has a myriad of uses, good and bad.

I first discovered the Science Citation Index when I was a student in the 1990s. Back then, the Science Citation Index existed as a paper index. Every year, a new set of volumes would come out that contained all the citations indexed in that year. Every five years, a five-year cumulation would come out that would replace the annual volumes. You would look up a cited paper in these volumes by the lead author. If the paper had been cited in the range of time covered by a volume set, that paper would appear under this author’s name with the citing papers listed under it. If the paper was more than a few years old, you often had to go through several volumes to pick up all the citations, but it was still well worth doing when you wanted to track what had happened to an idea over time. This process is described in some detail in one of Eugene Garfield’s articles.

Of course, the Science Citation Index has other uses. It occurred to me fairly early on that I could use it to check if people were citing my papers. This was often directly useful by making me aware of related work that might not otherwise have come to my attention. Of course, there’s also an ego-stroking aspect to this exercise, at least if your papers get cited. My own work took a while to catch on, so citations were few and far between for several years.

Over the years, paper gave way to CD-ROMs, and eventually to a web interface. Computer-based tools allow for more sophisticated analyses, but the core functionality is the same: the ability to track ideas through time, and to discover inter-related papers. One of the most intriguing (and under-used) features of the web system is the “View Related Records” link, which shows papers that share citations with a chosen paper. People working on related problems are often unaware of each other (this happens quite a lot) and therefore don’t cite each other, but they are likely to cite many of the same references, so this feature is a useful way to discover another branch of research on a given problem. If you’re a young scientist starting out in a field, I would strongly suggest that you use this feature starting from a few papers that are key to your research.

What is perhaps the most remarkable aspect of the Science Citation Index is that it was largely the brainchild of one person, Eugene Garfield. No idea is without its precedents, but there is no doubt that Garfield was the prime mover behind the Science Citation Index. We all owe Dr Garfield a huge debt.

So happy 50th Science Citation Index, and many happy returns!

1Eugene Garfield (1964) “Science Citation Index”—A New Dimension in Indexing. Science 144, 649–654.

Sharing data: a step forward

You would think that scientists would be eager to share data. After all, the myth of science that is taught to students is that we build on each other’s work, so of course if we have an interesting data set, we will let anyone have it who wants it, right?

It turns out that the truth is somewhat other than we would like it to be. There are both bad and worse reasons why data is not routinely shared. Probably the worst reason of all is wanting to sit on the data so that one extracts the maximum benefit from the data set while shutting out others. A variation on this theme is only allowing people access to your data if they will agree to make you a coauthor. I once collaborated with a scientist who wanted to use a crystal structure obtained by another lab. (I will leave the names out of it since they’re not relevant.) This was in the mid-1990s when the requirement to deposit structures with the Protein Data Bank (PDB) prior to publication was not yet universal. She was told by her colleagues (and I use the word loosely here) that she could only have their coordinate files if she agreed to include them as coauthors on any paper in which those coordinates were used for the following five years. My colleague and I were astonished by this. Beyond providing coordinates from an already-published structure, they would have made no intellectual contribution to her work, and yet they wanted to be treated as coauthors for an extended period of time. This is no longer possible with protein structures due to the now-universal requirement to deposit structures at the PDB as a condition of publication, but clearly people who hold data sometimes feel this gives them power they can use to further their careers. This is just wrong.

A not-so-good reason for not sharing data is that doing so takes time. A data set that may be perfectly OK for your use may not be suitable for sharing as is. I won’t get into issues of confidentiality with human subjects because I’m not an expert in this area, but clearly anonymizing medical data prior to sharing is important, and then there’s the tricky issue of consent: If the participants in a study did not explicitly agree to have their data used in other studies, is it OK to share the data set with others, even with suitable safeguards in place to protect the privacy of the study participants? Even for data not involving human subjects, sharing data takes time because you have to make sure you provide enough information about the data set for users to be able to make sense of it. This includes (obviously) a full description of what the various data fields represent, but also the conditions under which the data were obtained, any post-processing of the data, etc. Many scientists opt to just keep their data to themselves rather than generating all the necessary metadata. This situation is made worse by the fact that one gets very little credit for putting together a usable data set: It doesn’t count as a publication, so it won’t help a student land a scholarship, tenure and promotion committees are unlikely to give a data set much weight in their deliberations, and granting agencies won’t give you a grant solely because you generate high-quality reusable data.

A significant step forward has been taken with the launch of a new online journal by the Nature Publishing Group entitled Scientific Data. (Incidentally, I learned about this new journal from an article in The Scientist.) This journal is dedicated to the publication of data sets with proper metadata so that they can be used widely. Hopefully, the clout of the Nature Publishing Group will make the various bodies that make decisions about what scientific activities are valued pay attention, and will lead to an increase in the sharing of data sets.

In case you’re wondering whether I put my money where my mouth is: My web site includes a small section of data sets that I have generated that others might find of interest. Could I do more? Sure. Making it worth my while to do so via a journal like Scientific Data might be just the push I need.

The big five-oh, Part 2

When I arrived in Lethbridge in the summer of 1995, my first job was to write an NSERC research grant proposal. This proposal used delay systems, a topic I had first explored in detail while I was a postdoc at McGill, as a connecting theme. It’s interesting to go back to this first proposal, because some of the ideas are recognizable in my current research program, but others were dropped long ago. It included a proposal to develop a detailed model of the lac operon, something I never quite got around to doing, but which is clearly related to my current research interests in gene expression. There was a proposal to work on the equivalence between various types of differential equations, including master equations, which I’m still working on. There were also some ideas for stochastic optimizers, which led to some work on the structures of ion clusters,1 but which I didn’t pursue for long.

So what did I busy myself with? My very first paper in Lethbridge was on competitive inhibition oscillations,2 a phenomenon I had first discovered in the final stages of my Ph.D. This line of thought eventually led to the discovery of sustained stochastic oscillations in this system many years later.3

I’m not going to go through all the work I’ve done since those early years, so maybe I’ll just mention a few major themes that emerged over time, and take the opportunity to formulate a bit of advice to young scientists.

I continued to be interested in model reduction, a topic I continue to work on to this day. After leaving Toronto, I had thought that I would stop working on these problems. I wasn’t sure that I had all that much more to say about the theory of slow invariant manifolds. But colleagues in the field encouraged me to keep working on these problems, and from time to time I had some new idea that I thought would contribute something to the field. I am no longer under any illusion that I’m going to stop working on these problems anytime soon. What is the lesson to young scientists here? If you work on a sufficiently interesting set of problems during your Ph.D., this work is likely to follow you throughout your career, and that’s not a bad thing.

While I was finishing my Ph.D., I remember having a talk with Ray Kapral in which I said, with the certainty that only a young, inexperienced scientist can muster, that the problems involved in modelling chemical systems with ordinary differential equations were sufficient to keep me occupied, and that I would never (I actually remember using this word) work on partial differential equation or stochastic models. By 2002, I was studying reaction-diffusion (partial differential equation) models with my then postdoc, Jichang Wang. By 2004, I was working on stochastic models with Rui Zhu, also a postdoc at the time. In fact, most of my research effort is currently directed to stochastic systems. It was silly of me to say I would never work in one modelling framework or another. What I had the wisdom to do as I matured was to pick the correct modelling paradigm at the appropriate moment to tackle the problems I wanted to solve.

One of the things that, I think, has kept my research program relevant and vital over the years is that we’ve done a lot of different things: in addition to the topics mentioned above, there were projects on dynamical systems with stochastic switching, on stochastic modelling of gene expression, on photosynthesis, and on graph-theoretical approaches to bifurcation theory, to name just a few. Most of these topics connect to each other in some way, or at least they do in my head.

Looking back on my first 50 papers, much as it’s fun to think about the research, it’s the people that stand out. I’ve worked with many fine supervisors, colleagues, postdocs and students. I have learned something from each and every one of them. In fact, if I have one piece of advice for young scientists, it’s to find good people to work with, and to pay attention to what they do and how they do it. You can’t necessarily do things exactly the same way as someone else does, but you ought to be able to derive some general lessons you can use to guide your own research career and interactions with other scientists.

Be brave in choosing research topics. Work hard. Find good people to work with. I can’t guarantee that doing these things will lead to success, but not doing them will, at best, lead to mediocrity.

1Richard A. Beekman, Marc R. Roussel and P. J. Wilson (1999) Equilibrium configurations of systems of trapped ions. Phys. Rev. A 59, 503–511. Taunia L.L. Closson and Marc R. Roussel (2009) The flattening phase transition in systems of trapped ions. Can. J. Chem. 87, 1425–1435.
2Lan G. Ngo and Marc R. Roussel (1997) A new class of biochemical oscillator models based on competitive binding. Eur. J. Biochem. 245, 182–190.
3Kevin L. Davis and Marc R. Roussel (2006) Optimal observability of sustained stochastic competitive inhibition oscillations at organellar volumes. FEBS J. 273, 84–95.

The big five-oh, Part 1

No, I haven’t turned 50 yet. However, my 50th refereed paper has now appeared in print. This therefore seems like an appropriate time to look back on some of the research I have done since I started out as a graduate student at the University of Toronto. (I had two prior papers from my undergraduate work, but these were both in areas of science I didn’t pursue.) My intention here isn’t to write a scholarly review paper, so you won’t find a detailed set of citations here. My full list of publications is, in any event, available on my web site.

My M.Sc. and Ph.D. theses were both on the application of invariant manifold theory to steady-state kinetics. I was introduced to these problems by my supervisor at the University of Toronto, Simon J. Fraser. Simon is a great person to work for. He is supportive, and full of ideas, but he also lets you pursue your own ideas. I had a great time working for him, and learned an awful lot of nonlinear dynamics from him.

One way to think of the evolution of a chemical system is as a motion in a phase space, typically a space whose axes are the concentrations of the various chemical species involved, but sometimes including other relevant variables like temperature. The phase space of a chemical system is typically very high-dimensional. The reactions that transform one species into another occur on many different time scales. The net result is that we can picture the motion in phase space as involving a hierarchy of collapse processes onto surfaces of lower and lower dimension, the fastest processes being responsible for the first collapse events, followed by slower and slower processes.1 These surfaces are invariant manifolds of the differential equations, and we developed methods to compute them. Given the equation of a low-dimensional manifold, we obtain a reduced model of the motion in phase space, i.e. one involving the few variables necessary to describe motion on this manifold.
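
For readers who would like to see this collapse in action, here is a minimal numerical illustration using a toy linear fast-slow system; the model and parameters are purely illustrative and are not taken from the papers cited here.

```python
# A two-variable model with one fast and one slow time scale. Trajectories from
# scattered initial conditions quickly fall onto a one-dimensional slow
# manifold (y ≈ x) and only then relax slowly to equilibrium.

import numpy as np
from scipy.integrate import solve_ivp

eps = 0.01  # ratio of fast to slow time scales

def rhs(t, z):
    x, y = z
    dxdt = -x                    # slow variable
    dydt = (x - y) / eps         # fast variable, collapses toward y ≈ x
    return [dxdt, dydt]

t_eval = np.linspace(0, 5, 500)
for x0, y0 in [(1.0, -1.0), (0.5, 1.5), (-1.0, 0.8)]:
    sol = solve_ivp(rhs, (0, 5), [x0, y0], t_eval=t_eval, rtol=1e-8)
    # After a transient of order eps, the trajectory lies near the slow
    # manifold y = x + O(eps), whatever the initial condition.
    gap = abs(sol.y[1, 100] - sol.y[0, 100])
    print(f"start ({x0}, {y0}): |y - x| at t = 1 is {gap:.2e}")
```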

Invariant manifold theory has been a fertile area of research for me over the years. I continue to publish in this area from time to time. In fact, one of my current M.Sc. students, Blessing Okeke, is working on a set of problems in this area. Expect more work on these problems in the future!

Toward the end of my time in Toronto, my supervisor, Simon J. Fraser, allowed me to spend some time working with Carmay Lim who, at the time, was cross-appointed to several departments at the University of Toronto, and worked out of the Medical Sciences Building. This was a very productive time, and I learned a lot from Carmay, particularly about doing research efficiently.

We worked on a set of applied problems on the lignification of wood using an interesting piece of hardware called a cellular automata machine. This was a special-purpose computer built to efficiently simulate two-dimensional cellular automata. The machine was programmed in Forth, a programming language most of you have probably never heard of, with some bits written in assembly language for extra efficiency. For a geek like me, programming this machine was great fun. I think we did some useful work, too, as our work on lignification kinetics still gets cited from time to time.

I had been to the 1992 SIAM Conference on Applications of Dynamical Systems in Snowbird which, I think, was just the second of what would become a long-lived series of conferences. There, I had discovered that there was a lot of interest in delay-differential equations (DDEs), as the tools necessary to analyze these equations were being sharpened. I had thought about the possibility of applying DDEs to chemical modelling, and decided to apply to work with Michael Mackey at McGill University, who was an expert on the application of DDEs in biological modelling. McGill was a great environment, and I learned a lot from Michael and his students. The most significant outcome of my time in Montreal was a paper published in the Journal of Physical Chemistry on the use of DDEs in chemical modelling.2
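
To give a flavour of what a DDE model looks like in practice, here is a minimal sketch of a generic delayed-negative-feedback model integrated with a naive Euler scheme and a history buffer. The model and parameter values are illustrative; they are not taken from the paper cited below.

```python
# Delayed negative feedback on the production of a species x:
#   dx/dt = k_prod / (1 + (x(t - tau)/K)^n) - k_deg * x
# With a long enough delay, the steady state destabilizes and x oscillates.

from collections import deque

k_prod, k_deg = 1.0, 0.5        # production and degradation rate constants
K, n = 1.0, 4                   # feedback threshold and steepness
tau = 5.0                       # delay
dt = 0.01
steps = round(100 / dt)

n_hist = round(tau / dt)                              # history points spanning one delay
history = deque([0.1] * n_hist, maxlen=n_hist)        # x(t) for t in [-tau, 0)
x = 0.1
for i in range(steps):
    x_delayed = history[0]                            # x(t - tau)
    dxdt = k_prod / (1 + (x_delayed / K) ** n) - k_deg * x
    history.append(x)                                 # push current value, drop oldest
    x += dxdt * dt
    if i % 2000 == 0:
        print(f"t = {i * dt:6.1f}   x = {x:.4f}")
```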

I pursued this style of modelling in a handful of papers. Eventually, I got interested in the use of delays to simplify models that can’t be described by differential equations, namely stochastic systems.3 This is another one of those ideas that I have kept following down through the years.

In my next blog post, I will reflect on some of the work I have done since arriving in Lethbridge.

1Marc R. Roussel and Simon J. Fraser (1991) On the geometry of transient relaxation. J. Chem. Phys. 94, 7106–7113.
2Marc R. Roussel (1996) The use of delay-differential equations in chemical kinetics. J. Phys. Chem. 100, 8323–8330.
3Marc R. Roussel and Rui Zhu (2006) Validation of an algorithm for delay stochastic simulation of transcription and translation in prokaryotic gene expression. Phys. Biol. 3, 274–284.

Chemists and ethics

A few weeks ago, I read a very interesting paper surveying graduate students about the ethical behavior of people around them.1 The paper is a little old, but it’s still worth a read, if only to remind ourselves that students do see things, and that what they see affects them in various ways.

Basically, the survey asked students if they had seen various kinds of misconduct. As a chemist, the following passage really struck a nerve:

Chemistry students are most likely to be exposed to research misconduct, chemistry and microbiology students are most likely to observe employment misconduct… [p. 337]

Now that bothered me. Why should there be more misconduct in chemistry than in other fields of research included in this survey? Now I know enough to understand the limitations of surveys with smallish samples, particularly surveys that rely on voluntary participation, but still, it bothered me. The paper does go on to discuss departmental characteristics that explain much of the variance between disciplines, so that these results may reflect a small-sample fluctuation at the department level. Still, it bothered me.

Reading this paper started me thinking about other things. A few years ago, I had the privilege to teach a course entitled Contemporary Chemistry. This is a regular course offering in our Department, which all undergraduate chemistry majors have to take as part of their degrees. We do a number of different things there: The departmental seminar program runs in the time slot of this course; we work on the students’ writing and oral presentation skills; and when I taught it, I introduced an ethics module. It was interesting getting students to grapple with various ethical dilemmas, including their responsibilities as members of a profession, which was a new idea for them.

While preparing for this course, I bought a book entitled The Ethical Chemist by Jeffrey Kovac.2 There are a lot of things I really liked about this book, particularly its emphasis on ethics as a practical matter: Like it or not, you’re going to run into ethical conundra, so you need the knowledge and skills to deal with them, just as you need to know how to recrystallize compounds or interpret NMR spectra. The book includes a rich variety of case studies, some of which are less straightforward than others. If you’re going to teach an ethics module to chemists, I highly recommend this book.

I did find myself in disagreement with this book in its discussion of the reporting of yields, presented as a set of case studies. Here is an excerpt from one of these case studies:

You […] have just finished the synthesis and characterization of a new compound and are working on a communication to a major journal reporting the work. While you have made and isolated the new compound, you have not yet optimized the synthetic steps, so the final yield is only 10%. From past experience, you know that you probably will be able to improve the yield to at least 50% by refining the procedure. Therefore, when writing the communication, you report the projected yield of 50% rather than the actual figure. [p. 29]

Then, when discussing this case study, Kovac makes the point that scientific papers tell a linear story, without all the twists, turns and dead-ends of laboratory research. So far, so good. However, he then writes

[…] an experienced researcher is convinced based on past history that the yield can and will be improved. Why not report the higher figure? By the time anyone reads the article it will be true. [p. 30]

I have a bunch of problems with this suggestion:

  • It may be a high probability statement that the yield can be improved to 50%, but it’s not a certainty. What if you can only get the yield up to 30%? It wouldn’t be research if we knew ahead of time what the outcome was going to be, and you really don’t know that you can get a 50% yield until you actually get a 50% yield.
  • Let’s say you are eventually successful in reaching a 50% yield. You won’t get there with the reaction conditions you put into the manuscript. Those conditions will get you 10%. Every chemist I know has, at some point or other, complained that their colleagues withhold important details when writing up their syntheses. Here, you’re not withholding information you have in hand, but the effect is the same: You can’t get a 50% yield with the conditions disclosed in your paper. You might be giving your lab an edge by optimizing the synthesis after the paper has been sent out, but if you can’t correct the synthetic conditions before the paper is published, you are going to be wasting the time of every other lab that wants to follow up on your work.
  • If you can just put a larger number in the paper without doing the work, why would you optimize the synthesis at all? To me, making up numbers because you “know” you can get there is an extraordinarily slippery slope.

It’s an excellent book, and I really don’t want to beat up on Kovac. However, I wonder how other chemists feel about this. At what point have you crossed a line from presenting your data in its best possible light to fabricating? Even if we intend to correct the reaction conditions in the galleys prior to final publication, what are we teaching our students if we tell them that we’ll embellish the results now to maximize the probability of acceptance, or to make sure we win the race with another lab, or to pad our CVs before some grant or scholarship competition?

At some point, we have to say that we’re going to hold ourselves to the highest possible standards so that our students don’t grow up in an environment where they routinely observe misconduct.1 I think we owe it to them, and we owe it to the society that pays us to do research and to teach.

1M. S. Anderson, K. S. Louis and J. Earle (1994) Disciplinary and departmental effects on observations of faculty and student misconduct. J. Higher Ed. 65, 331–350.

2J. Kovac (2004) The Ethical Chemist. Pearson Prentice Hall.

Heat and energy

A few months ago, I had a very interesting email interaction with Stephen Rader of the University of Northern British Columbia about heat, energy, and related concepts, prompted by his reading of my book. With Stephen’s permission, here is a transcript of that exchange, slightly edited so that it makes sense as a dialog:

Stephen: As I read the intro chapter to thermodynamics, I was interested to see that you define heat as an interaction between systems. I have always thought of heat (and explained it to my students) as energy — something that a system can contain different amounts of. Am I mistaken? Your definition makes it sound as though there cannot be any heat without having more than one system, which I am having a hard time wrapping my head around.

Marc: This is a subtle question. The short answer is yes, you have been mistaken, but that a lot of very bright people have also struggled with this question.

The proof that bodies don’t contain heat (which was the basis of the caloric theory) is contained in Rumford’s cannon-boring experiments. Rumford was a British physicist and professional soldier who was assigned the task of overseeing the boring of cannons in Munich at one point in his colorful career. He became interested in the amount of heat liberated by this process, and made some rough measurements of this heat. The amount of heat liberated was extremely large, and incompatible with the metal remaining in solid form had that amount of heat been held in the body prior to the act of boring. (The caloric theory would have said that boring the cannon released the caloric it previously contained. I realize that you wouldn’t have explained this experiment this way, but bear with me.) We now understand this experiment as follows: During the boring of Rumford’s cannons, work is done by the boring tool on the metal blank. The work represents a transfer of energy from some energy source turning the tool to the tool/blank/cuttings system. This energy shows up in the “products” in various ways: Cutting, which breaks metallic bonds and also tends to create crumpled cuttings, obviously takes energy, which increases the potential energy of the products in various ways (mostly, the potential energy associated with the ability of the cut surfaces to form new bonds is increased). Much of the energy not used for cutting per se ends up being stored in the tool, blank and cuttings. How is it stored? Basically, it is stored by populating higher energy levels of the (in this case) metallic lattice, which is to say that it raises the temperature of the object. If this excess energy (relative to the thermal population at ambient temperature) is subsequently transferred to another material by contact alone (to the air or to the water typically used to cool the tool and workpiece in these processes), then we can talk about heat transfer. However, the excess energy could also come out (in part) as work: I could use it to operate a heat engine or a thermoelectric generator. There is therefore no way to identify how much of the excess energy held by the body is heat. While it’s stored in a body, it’s just energy.

Now consider an ideal gas compressed reversibly and adiabatically. We’re doing work on the gas, but you’re probably familiar with the fact that the temperature of the gas will increase during this operation. The work done has been stored as energy in the gas, resulting in a temperature increase. (We could make a similar argument for a non-ideal gas, but there we would have the added complication that the energy depends on both the temperature and the volume.) One possible way for this energy to come out is as heat: If we put the gas in thermal contact with a body at a lower temperature, heat will “flow” (bad language that is a holdover from the caloric theory, along with heat “capacity”). However, because the process was reversible, we could get all the original work out by just reversing the path, returning the gas to its original state. The key point here is that energy was stored. Whether we get out heat or work or a combination of both depends on how we allow the gas to interact with its surroundings after the energy has been stored.
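
To put a number on that temperature rise (this is just the standard ideal-gas result, not anything specific to the argument above): for a reversible adiabatic compression of an ideal gas with constant heat capacities,

$$T_2 = T_1\left(\frac{V_1}{V_2}\right)^{\gamma - 1}, \qquad \gamma = \frac{C_p}{C_V},$$

so halving the volume of a monatomic ideal gas ($\gamma = 5/3$) raises its temperature by a factor of $2^{2/3} \approx 1.59$, even though no heat has been transferred.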

I hope that helps a little. As you noted yourself, it’s a hard issue to wrap your head around, especially since we have inherited a lot of unhelpful language and consequent imagery from the caloric theory. Many otherwise fine textbooks still describe these issues using inappropriate language. The idea that bodies store heat seems to be hard to extricate from the literature, and has become deeply embedded in our way of thinking. I had to write the paragraph on Rumford’s cannons above extremely carefully to avoid falling into misleading language or imagery myself.

Stephen: Do I understand correctly that the major flaw in the caloric theory is that, when we put energy into a system, we don’t know how it partitions between kinetic motions (that can be measured as an increased temperature) and other types of internal energy?

Are you, in effect, saying that we should not talk about heat except when it is transferred from one object or system to another? In other words, since the amount of available energy in a system is not defined until one tries to get it out, and how we get it out determines how much there is, that we can only talk about the energy of an object or system, rather than how much heat is in it?

I tend to think about thermodynamic properties in very concrete terms (what the atoms are doing), which probably hinders my understanding of some of these concepts.

Marc: These questions raise several interesting issues. I will deal with them one at a time.

Do I understand correctly that the major flaw in the caloric theory is that, when we put energy into a system, we don’t know how it partitions between kinetic motions (that can be measured as an increased temperature) and other types of internal energy?

I wouldn’t say so. The real problem was that various lines of evidence acquired by Rumford, Joule and others show that heat can be made (in sometimes impressive quantities) by processes that cannot be explained as involving the liberation of heat already contained within a body.

It’s important to try to tease apart the concepts of heat and temperature in our heads. They have some important connections, but they are intrinsically different. By bringing up “kinetic motions”, I suspect that you are thinking of the definition we’ve all heard of heat as kinetic energy. The trouble is that there isn’t a clean distinction between kinetic and other forms of energy in quantum mechanics, and that if we’re out of thermal equilibrium, a system can have several different “temperatures”, or none at all.

Temperature is a surprisingly difficult quantity to define, but I think that most people in the field would tend to define it these days in terms of the Boltzmann distribution: The temperature is the value T such that, at thermal equilibrium, the energy of the system is distributed among the available energy levels according to a Boltzmann distribution. Now consider a monatomic gas. Such a gas can store energy in two significant ways: translational (kinetic) energy and electronic energy. At room temperature, the gap between the highest occupied and lowest unoccupied orbitals is so large that essentially all the atoms are in their electronic ground state. From a macroscopic, thermodynamic point of view, we would tend to say that no energy is stored in the electronic energy levels (give or take the arbitrary nature of what we call zero energy, which isn’t relevant to energy storage, the latter only involving energy I could somehow extract from the system). Now imagine that I have my gas inside a perfectly optically reflective container. At some point in time, I open a small shutter, and fire a laser, whose wavelength is tuned to match an absorption wavelength of the atoms, into the container. Some of the atoms will absorb some of the photons and reach an excited state. If, before this system has a chance to equilibrate, I ask what the temperature of the system is, I now have a problem. The translational energy will still obey a Boltzmann distribution for the pre-flash temperature T. However, the electronic energy distribution does not correspond to a Boltzmann distribution with temperature T. It may correspond to a Boltzmann distribution with a much higher temperature Te. If the laser was sufficiently intense, I might even have created a population inversion and have an electronic energy distribution that corresponds to a negative absolute temperature. (Negative absolute temperatures are a strange consequence of the way we have defined our temperature scale. They are hotter than any temperature that can be described by a normal Boltzmann distribution with a positive temperature.) The system therefore has, at best, two distinct temperatures. It’s also possible to come up with distributions that can’t be described by a temperature at all. (This might require two laser pulses at two different wavelengths.) Now if we wait long enough, we will return to an equilibrium (i.e. Boltzmann) distribution in this particular system. Getting back to heat now, it’s clear that the system we have prepared with our laser pulse is “hot”: I could certainly extract energy from this system as heat. However, the system doesn’t, for the time being, have a single temperature, and the translational energy temperature grossly underestimates the energy available in this system. (Electronic energy can’t be teased apart into kinetic and potential energy contributions due to the way these quantities appear in quantum mechanics.) This is another way in which the connection between heat and temperature is problematic. Note also that in a nonequilibrium situation like this one, thermometers of different constructions would register different temperatures, unlike the situation for matter in equilibrium.
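
To make the two-temperature argument concrete, here is the standard textbook relationship implied by the Boltzmann distribution for a pair of levels (again, a general result, not something specific to this example):

$$\frac{n_2}{n_1} = \frac{g_2}{g_1}\,e^{-(E_2 - E_1)/k_B T} \quad\Longrightarrow\quad T = \frac{E_2 - E_1}{k_B \ln\!\left(g_2 n_1 / g_1 n_2\right)}.$$

For equal degeneracies, a population inversion ($n_2 > n_1$ with $E_2 > E_1$) makes the logarithm negative, which is why such a distribution formally corresponds to a negative absolute temperature.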

I’m cheating a little bit by describing a system out of equilibrium, because the classic thermodynamic theory is a theory of equilibrium states. Nevertheless, the basic problem remains that from a statistical thermodynamic standpoint, temperature measures (loosely speaking) the energy levels accessible to a body, from which we can (if we have enough information about the energy levels) compute the total energy (relative to some arbitrarily selected zero, often the ground-state energy). There is no useful microscopic construct that corresponds to heat stored.

Are you, in effect, saying that we should not talk about heat except when it is transferred from one object or system to another?

Yes.

In other words, since the amount of available energy in a system is not defined until one tries to get it out, and how we get it out determines how much there is, that we can only talk about the energy of an object or system, rather than how much heat is in it?

I’m not sure that I would say that the amount of available energy in a system is not defined since we should (give or take third-law limitations) be able to extract any energy above the ground-state energy. What I would say is that we can’t specify how much heat is in a body because you can extract energy as heat or work. Let’s go back to my monatomic gas. If we allow it to come to equilibrium (which might take a long time, but we’re patient), the translational energy will increase and the electronic energy decrease until both obey a Boltzmann distribution. At that point, I will see that that system has a single, well-defined temperature larger than the original temperature T. (This is assuming that my container is insulated and rigid.) I could extract energy from this system by putting it in thermal contact with another system at a lower temperature. Since p = nRT/V, the pressure will also have gone up, so I could also get out some of the energy I put in as work by allowing the gas to escape into a piston and using the motion of the piston to push something (e.g. turn a motor). Note that I could have achieved the same effect by heating the gas with a torch. Just because I put heat into a system doesn’t mean that I can only get heat out. Really, I’ve just increased the molecular energy, whose mean value (assuming there is a single temperature, as discussed above) is related to the observable temperature.

I tend to think about thermodynamic properties in very concrete terms (what the atoms are doing), which probably hinders my understanding of some of these concepts.

I always tell my students that it doesn’t even matter if atoms exist in pure, classical thermodynamics. In fact, Ernst Mach, who wrote some very influential material on thermodynamics, went to his grave saying that atoms were an unnecessary theoretical construct. Now, that being said, atomic theory enriches our understanding of what U and S mean since it allows us to talk in reasonably concrete terms about where the energy has gone, or what exactly it is that S measures. I’m therefore not sure that it’s your “concrete” thinking that is getting in the way. Rather, it’s the language we use to describe heat that is the problem. This language puts incorrect images into our heads that are incredibly difficult to get rid of. Worse yet, we may have had educational experiences that reinforced those images rather than pointing out their rather severe limitations.

Naturwissenschaften’s 100 most cited papers (continued)

One very interesting area of application for ideas and techniques from nonlinear dynamics is the study of biological cycles. The circadian rhythm, the internal 24 hour clock that a very large number of organisms have, has been a particular object of study over the years. Naturwissenschaften‘s list of 100 most cited papers includes a classic paper by Aschoff and Pohl on phase relations between a circadian rhythm and a zeitgeber (a stimulus that entrains the clock). The most prominent zeitgeber is of course the day-night cycle, but other stimuli can reset your circadian clock, including meal times and social interaction. In this particular paper, the authors examine the relationship between the circadian phase (e.g. the time of maximum observed activity relative to start of day) and the day length. Studies like these often use ideas from nonlinear dynamics on the entrainment of oscillators to derive insights into the workings of the clock based on how the phase changes as the difference between the natural frequency of the clock and the entraining (zeitgeber) frequency increases. In this case however, the authors focused on quantitative differences between the phase responses of different groups of organisms. We now know that there are several, evolutionarily distinct circadian oscillators operating in different groups of organisms to which the results of Aschoff and Pohl could likely be correlated.

Coupled oscillators are a recurring theme in nonlinear dynamics, and the Naturwissenschaften list also includes a paper by Hübler and Lüscher on the possibility of controlling one oscillator using a driving signal derived from a second oscillator. Although this is very much a fundamental study, this kind of work has found a number of applications over the years. I have already mentioned the use of such studies to understand biological oscillators. Coupled oscillators show up all over the place, both in natural and in engineered systems. One example that has been the focus of a lot of research is the use of coupled oscillators in secure communications. The problem here is that you want an authorized receiver to get your message, but you don’t want anyone else to be able to eavesdrop. I’m not sure what the current status of this research is, but there have been a number of proposals over the years to use coupled chaotic oscillators for this purpose. The original idea was (relatively) simple: If two chaotic oscillators have the same parameters, they can be made to synchronize by introducing a driving signal that increases with the difference between the transmitted signal and the computed signal at the receiver. Even small differences in parameters are enough to ruin the synchronization because of the sensitive dependence property of chaotic systems. If you add a low-amplitude signal to the transmitted signal, the receiver will still synchronize to the transmitted chaotic “carrier”. The computed chaotic trajectory at the receiver can be subtracted from the incoming signal, the difference then being the superimposed message. The key to make this work is to share a set of parameters for the chaotic system via a private channel. Easy in principle, but there are lots of technical conditions that have to be met to make this work, and lots of variations to be explored to find the most secure means of encoding the message within the transmitted signal.
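
To make the masking idea concrete, here is a minimal sketch using the Lorenz system, with the received signal substituted into the receiver’s equations so that the receiver synchronizes with the transmitter (one common realization of the idea). This is an illustration of the general principle only, not any of the specific schemes proposed in the literature mentioned above; the parameters and message are made up.

```python
# Chaotic masking sketch: transmit s = x + m, where x is a Lorenz variable and
# m is a low-amplitude message; the receiver synchronizes to the chaotic
# carrier and recovers the message as s - x_receiver.

import numpy as np

sigma, rho, beta = 10.0, 28.0, 8.0 / 3.0   # shared (secret) Lorenz parameters
dt, steps = 1e-3, 60000

def lorenz(x, y, z, drive_x=None):
    # If drive_x is given, it replaces x in the y and z equations (receiver mode)
    xd = drive_x if drive_x is not None else x
    return (sigma * (y - x),
            xd * (rho - z) - y,
            xd * y - beta * z)

xt, yt, zt = 1.0, 1.0, 1.0       # transmitter state
xr, yr, zr = -5.0, 5.0, 20.0     # receiver state, deliberately started far away

recovered, message = [], []
for i in range(steps):
    m = 0.05 * np.sin(2 * np.pi * 1.0 * i * dt)   # low-amplitude message
    s = xt + m                                    # masked transmitted signal

    dxr, dyr, dzr = lorenz(xr, yr, zr, drive_x=s) # receiver, driven by s
    dxt, dyt, dzt = lorenz(xt, yt, zt)            # transmitter, free-running

    xr, yr, zr = xr + dxr * dt, yr + dyr * dt, zr + dzr * dt
    xt, yt, zt = xt + dxt * dt, yt + dyt * dt, zt + dzt * dt

    recovered.append(s - xr)                      # estimate of the message
    message.append(m)

err = np.sqrt(np.mean((np.array(recovered[30000:]) - np.array(message[30000:])) ** 2))
print(f"RMS recovery error over the second half of the run: {err:.3f}")
```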

The control of a process by an oscillator sometimes also has spatial manifestations. An example of this is provided in the paper on slime-mold aggregation by Gerisch in the Naturwissenschaften top-100 list. When they are well fed, slime molds live as single cells. Starve them, and they aggregate, form a fruiting body, and disperse spores. How do they know where to go during the aggregation process? The answer turns out to involve periodic cyclic AMP (cAMP) signaling. Starving cells put out periodic pulses of cAMP. The cells don’t all signal at the same rate, and they tend to synchronize to and move toward the fastest signaler. Note again the importance of coupled oscillators: This works in part because the cells “listen” to each other’s cAMP signals and adjust the frequency of their own oscillator to match the fastest frequency they “hear”.

Well, those are the things that caught my attention in the Naturwissenschaften list. There is lots of other wonderful stuff in there. I would love to hear what caught everyone else’s fancy.

100 most cited papers from Naturwissenschaften

The journal Naturwissenschaften turns 100 this year. Naturwissenschaften translates as “The Science of Nature”. It’s a journal that publishes papers in all areas of the biological sciences, broadly conceived. As many other journals have done, Naturwissenschaften is celebrating its 100th anniversary by posting a list of its 100 most cited papers. As with all such lists, especially with generalist journals like this one, what you find interesting may be different from what I find interesting, so it’s worth taking a look at the list yourself. However, if you’re reading this blog, perhaps we share some interests.

The first thing I noticed was that the list contained several of Manfred Eigen’s papers on biological self-organization, including his classic papers on the hypercycle. These papers were intended to address the problem of how biological organisms may have gotten started. The emphasis of this work tended to be on self-replicating molecular systems, such as the hypercycle, which is a family of models consisting of networks of autocatalytic units coupled in a loop. I’m not sure how large a contribution these papers made to the problem of the origin of life, but they sure caught people’s imaginations when they were written, and they led to interesting questions about the dynamics of systems with loops which are still being actively studied. If you have never read anything about hypercycles and have an interest either in theories on the origin of life or in nonlinear dynamics, you should track down these papers and read them. They will likely seem a little dated—they were written in the 1970s—but I think they’re still interesting.

Also near the top of the list, we see a paper by Karplus and Schulz on the “Prediction of chain flexibility in proteins”. Protein dynamics is all the rage these days. Everybody wants to think about how their favorite protein moves. This wasn’t always so. In the 1980s when this paper was published, we were starting to see a steady flow of high-quality x-ray protein structures. People were making very good use of these structures to understand protein function, and of course that is still the case. However, there was a tendency for biochemists back then to think of protein structure as an essentially static thing. This tendency was so pronounced that I remember attending a seminar in the mid-1990s at which the speaker made a point of talking about how cool it was that part of his enzyme could be shown to have a large-scale motion as part of its working cycle! The Karplus and Schulz paper therefore has to be understood in this context. At the time it was written, it wasn’t so easy to recognize flexible parts of proteins, and there was a lot of skepticism that flexibility was important to protein function. Needless to say, things have changed a lot.

The Naturwissenschaften list also includes a paper by Bada and Schroeder on the slow racemization of amino acids and its use for dating fossils. Living organisms mostly use the L isomers of the amino acids. Over time, though, amino acids tend to racemize to a mixture of the L and D forms. While an organism is alive, this process is, in most tissues, completely insignificant since proteins are turned over relatively rapidly. After an organism dies, turnover stops, the D form slowly accumulates, and we can use the D to L ratio to date fossil materials. There are other interesting applications of this technique, including its use to determine the ages of recently deceased organisms using the eye lens nucleus, a structure formed in utero. I wrote about this dating technique in my book.

I’ll come back to Naturwissenschaften‘s list in a few days. There are a number of other papers in there that I think are interesting.