A few months ago, I had a very interesting email interaction with Stephen Rader of the University of Northern British Columbia about heat, energy, and related concepts, prompted by his reading of my book. With Stephen’s permission, here is a transcript of that exchange, slightly edited so that it makes sense as a dialog:
Stephen: As I read the intro chapter to thermodynamics, I was interested to see that you define heat as an interaction between systems. I have always thought of heat (and explained it to my students) as energy — something that a system can contain different amounts of. Am I mistaken? Your definition makes it sound as though there cannot be any heat without having more than one system, which I am having a hard time wrapping my head around.
Marc: This is a subtle question. The short answer is yes, you have been mistaken, but a lot of very bright people have also struggled with this question.
The proof that bodies don’t contain heat (the idea that they do was the basis of the caloric theory) is contained in Rumford’s cannon-boring experiments. Rumford was a British physicist and professional soldier who was assigned the task of overseeing the boring of cannons in Munich at one point in his colorful career. He became interested in the amount of heat liberated by this process, and made some rough measurements of this heat. The amount of heat liberated was extremely large: had that much heat been held in the body prior to the act of boring, the metal could not have existed in solid form. (The caloric theory would have said that boring the cannon released the caloric it previously contained. I realize that you wouldn’t have explained this experiment this way, but bear with me.)
We now understand this experiment as follows: During the boring of Rumford’s cannons, work is done by the boring tool on the metal blank. The work represents a transfer of energy from some energy source turning the tool to the tool/blank/cuttings system. This energy shows up in the “products” in various ways: Cutting, which breaks metallic bonds and also tends to create crumpled cuttings, obviously takes energy, which increases the potential energy of the products in various ways (mostly, the potential energy associated with the ability of the cut surfaces to form new bonds is increased). Much of the energy not used for cutting per se ends up being stored in the tool, blank and cuttings. How is it stored? Basically, it is stored by populating higher energy levels of the (in this case) metallic lattice, which is to say that it raises the temperature of the object. If this excess energy (relative to the thermal population at ambient temperature) is subsequently transferred to another material by contact alone (to the air or to the water typically used to cool the tool and workpiece in these processes), then we can talk about heat transfer.
However, the excess energy could also come out (in part) as work: I could use it to operate a heat engine or a thermoelectric generator. There is therefore no way to identify how much of the excess energy held by the body is heat. While it’s stored in a body, it’s just energy.
Now consider an ideal gas compressed reversibly and adiabatically. We’re doing work on the gas, but you’re probably familiar with the fact that the temperature of the gas will increase during this operation. The work done has been stored as energy in the gas, resulting in a temperature increase. (We could make a similar argument for a non-ideal gas, but there we would have the added complication that the energy depends on both the temperature and the volume.) One possible way for this energy to come out is as heat: If we put the gas in thermal contact with a body at a lower temperature, heat will “flow” (bad language that is a holdover from the caloric theory, along with heat “capacity”). However, because the process was reversible, we could get all the original work out by just reversing the path, returning the gas to its original state. The key point here is that energy was stored. Whether we get out heat or work or a combination of both depends on how we allow the gas to interact with its surroundings after the energy has been stored.
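As a rough numerical illustration of the compression argument above, the following sketch compresses an ideal gas reversibly and adiabatically and then reverses the path. It assumes a monatomic ideal gas (gamma = 5/3); the function names, the starting temperature and the volumes are all illustrative choices, not values from the exchange.

```python
# Sketch: reversible adiabatic compression of an ideal gas, then the reverse path.
# Assumes a monatomic ideal gas (gamma = 5/3); all numbers are illustrative.

R = 8.314462618  # molar gas constant, J/(mol K)


def adiabatic_final_temperature(t_initial, v_initial, v_final, gamma=5.0 / 3.0):
    """Temperature after a reversible adiabatic volume change (T V^(gamma-1) = const)."""
    return t_initial * (v_initial / v_final) ** (gamma - 1.0)


def work_done_on_gas(n_mol, t_initial, t_final, gamma=5.0 / 3.0):
    """With q = 0, the work done on the gas equals its energy change, n Cv (T2 - T1)."""
    cv = R / (gamma - 1.0)  # molar heat capacity at constant volume
    return n_mol * cv * (t_final - t_initial)


if __name__ == "__main__":
    t1 = 300.0  # K, hypothetical starting temperature
    # Compress 1 mol to half its volume: the work done shows up as stored energy.
    t2 = adiabatic_final_temperature(t1, v_initial=2.0, v_final=1.0)
    w_in = work_done_on_gas(1.0, t1, t2)
    # Reverse the path: the gas returns to its original state and gives the work back.
    t3 = adiabatic_final_temperature(t2, v_initial=1.0, v_final=2.0)
    w_back = work_done_on_gas(1.0, t2, t3)
    print(f"compressed:  T = {t2:.1f} K, work done on gas = {w_in:.0f} J")
    print(f"re-expanded: T = {t3:.1f} K, work done on gas = {w_back:.0f} J")
```

The two work terms cancel and the temperature returns to its starting value. Nothing in the compressed state labels the stored energy as "work" or "heat": the same hot gas could instead have been cooled by contact with a colder body.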
I hope that helps a little. As you noted yourself, it’s a hard issue to wrap your head around, especially since we have inherited a lot of unhelpful language and consequent imagery from the caloric theory. Many otherwise fine textbooks still describe these issues using inappropriate language. The idea that bodies store heat seems to be hard to extricate from the literature, and has become deeply embedded in our way of thinking. I had to write the paragraph on Rumford’s cannons above extremely carefully to avoid falling into misleading language or imagery myself.
Stephen: Do I understand correctly that the major flaw in the caloric theory is that, when we put energy into a system, we don’t know how it partitions between kinetic motions (that can be measured as an increased temperature) and other types of internal energy?
Are you, in effect, saying that we should not talk about heat except when it is transferred from one object or system to another? In other words, since the amount of available energy in a system is not defined until one tries to get it out, and how we get it out determines how much there is, that we can only talk about the energy of an object or system, rather than how much heat is in it?
I tend to think about thermodynamic properties in very concrete terms (what the atoms are doing), which probably hinders my understanding of some of these concepts.
Marc: These questions raise several interesting issues. I will deal with them one at a time.
“Do I understand correctly that the major flaw in the caloric theory is that, when we put energy into a system, we don’t know how it partitions between kinetic motions (that can be measured as an increased temperature) and other types of internal energy?”
I wouldn’t say so. The real problem was that various lines of evidence acquired by Rumford, Joule and others show that heat can be made (in sometimes impressive quantities) by processes that cannot be explained as involving the liberation of heat already contained within a body.
It’s important to try to tease apart the concepts of heat and temperature in our heads. They have some important connections, but they are intrinsically different. By bringing up “kinetic motions”, I suspect that you are thinking of the definition we’ve all heard of heat as kinetic energy. The trouble is that there isn’t a clean distinction between kinetic and other forms of energy in quantum mechanics, and that if we’re out of thermal equilibrium, a system can have several different “temperatures”, or none at all.
Temperature is a surprisingly difficult quantity to define, but I think that most people in the field would tend to define it these days in terms of the Boltzmann distribution: The temperature is the value T such that, at thermal equilibrium, the energy of the system is distributed among the available energy levels according to a Boltzmann distribution.
Now consider a monatomic gas. Such a gas can store energy in two significant ways: translational (kinetic) energy and electronic energy. At room temperature, the gap between the highest occupied and lowest unoccupied orbitals is so large that essentially all the atoms are in their electronic ground state. From a macroscopic, thermodynamic point of view, we would tend to say that no energy is stored in the electronic energy levels (give or take the arbitrary nature of what we call zero energy, which isn’t relevant to energy storage, the latter only involving energy I could somehow extract from the system). Now imagine that I have my gas inside a perfectly optically reflective container. At some point in time, I open a small shutter and fire into the container a laser whose wavelength is tuned to match an absorption wavelength of the atoms. Some of the atoms will absorb some of the photons and reach an excited state. If, before this system has a chance to equilibrate, I ask the question “What is the temperature of the system?”, I now have a problem. The translational energy will still obey a Boltzmann distribution for the pre-flash temperature T. However, the electronic energy distribution does not correspond to a Boltzmann distribution with temperature T. It may correspond to a Boltzmann distribution with a much higher temperature Te. If the laser was sufficiently intense, I might even have created a population inversion and have an electronic energy distribution that corresponds to a negative absolute temperature.
(Negative absolute temperatures are a strange consequence of the way we have defined our temperature scale. They are hotter than any temperature that can be described by a normal Boltzmann distribution with a positive temperature.) The system therefore has, at best, two distinct temperatures. It’s also possible to come up with distributions that can’t be described by a temperature at all. (This might require two laser pulses at two different wavelengths.) Now if we wait long enough, we will return to an equilibrium (i.e. Boltzmann) distribution in this particular system. Getting back to heat now, it’s clear that the system we have prepared with our laser pulse is “hot”: I could certainly extract energy from this system as heat. However, the system doesn’t, for the time being, have a single temperature, and the translational energy temperature grossly underestimates the energy available in this system. (Electronic energy can’t be teased apart into kinetic and potential energy contributions due to the way these quantities appear in quantum mechanics.) This is another way in which the connection between heat and temperature is problematic. Note also that in a nonequilibrium situation like this one, thermometers of different constructions would register different temperatures, unlike the situation for matter in equilibrium.
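To make the idea of an electronic "temperature" concrete, here is a minimal sketch that inverts the Boltzmann ratio for a two-level system. It assumes two non-degenerate levels separated by an energy gap; the gap of 3.2e-19 J (roughly 2 eV) and the population fractions are hypothetical numbers chosen only for illustration, and the function names are likewise made up for this example.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def boltzmann_ratio(delta_e, temperature):
    """Equilibrium population ratio n_upper / n_lower for two non-degenerate levels."""
    return math.exp(-delta_e / (K_B * temperature))


def effective_temperature(delta_e, n_upper, n_lower):
    """Temperature T for which n_upper / n_lower = exp(-delta_e / (k T)).

    Equal populations correspond to an infinite temperature; a population
    inversion (n_upper > n_lower) gives a negative absolute temperature.
    """
    log_ratio = math.log(n_upper / n_lower)
    if log_ratio == 0.0:
        return math.inf
    return -delta_e / (K_B * log_ratio)


if __name__ == "__main__":
    gap = 3.2e-19  # J, hypothetical electronic excitation energy (about 2 eV)
    # Before the laser pulse: the excited-state population at 300 K is negligible.
    print("equilibrium ratio at 300 K:", boltzmann_ratio(gap, 300.0))
    # After a weak pulse excites 1% of the atoms (hypothetical fraction).
    print("Te after weak pulse (K):", effective_temperature(gap, 0.01, 0.99))
    # After an intense pulse creates a population inversion (hypothetical fractions).
    print("Te with inversion (K):", effective_temperature(gap, 0.6, 0.4))
```

Under these assumptions the electronic temperature after the weak pulse is far above the translational temperature of 300 K, and the inverted population yields a negative value, while the translational distribution is still described by the original T.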
I’m cheating a little bit by describing a system out of equilibrium, because the classic thermodynamic theory is a theory of equilibrium states. Nevertheless, the basic problem remains that from a statistical thermodynamic standpoint, temperature measures (loosely speaking) the energy levels accessible to a body, from which we can (if we have enough information about the energy levels) compute the total energy (relative to some arbitrarily selected zero, often the ground-state energy). There is no useful microscopic construct that corresponds to heat stored.
“Are you, in effect, saying that we should not talk about heat except when it is transferred from one object or system to another?”
Yes.
“In other words, since the amount of available energy in a system is not defined until one tries to get it out, and how we get it out determines how much there is, that we can only talk about the energy of an object or system, rather than how much heat is in it?”
I’m not sure that I would say that the amount of available energy in a system is not defined, since we should (give or take third-law limitations) be able to extract any energy above the ground-state energy. What I would say is that we can’t specify how much heat is in a body, because we can extract energy as either heat or work. Let’s go back to my monatomic gas. If we allow it to come to equilibrium (which might take a long time, but we’re patient), the translational energy will increase and the electronic energy decrease until both obey a Boltzmann distribution at the same temperature. At that point, I will see that the system has a single, well-defined temperature larger than the original temperature T. (This is assuming that my container is insulated and rigid.) I could extract energy from this system by putting it in thermal contact with another system at a lower temperature. Since p = nRT/V, the pressure will also have gone up, so I could also get out some of the energy I put in as work by allowing the gas to expand against a piston and using the motion of the piston to push something (e.g. turn a motor). Note that I could have achieved the same effect by heating the gas with a torch. Just because I put heat into a system doesn’t mean that I can only get heat out. Really, I’ve just increased the molecular energy, whose mean value (assuming there is a single temperature, as discussed above) is related to the observable temperature.
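The equilibration described above can be sketched as a simple energy balance. The sketch assumes a monatomic ideal gas in a rigid, insulated container, and further assumes that essentially all of the absorbed excitation energy ends up as translational energy (the equilibrium electronic population at the final temperature is taken to be negligible). The excited fraction, the excitation energy, the amount of gas and the volume are hypothetical values, as are the function names.

```python
K_B = 1.380649e-23  # Boltzmann constant, J/K
R = 8.314462618     # molar gas constant, J/(mol K)


def equilibrated_temperature(t_initial, excited_fraction, delta_e):
    """Final temperature once the electronic excitation has relaxed into translation.

    Energy balance per atom: (3/2) k T_final = (3/2) k T_initial + f * delta_e,
    assuming a negligible excited-state population at T_final.
    """
    return t_initial + 2.0 * excited_fraction * delta_e / (3.0 * K_B)


def ideal_gas_pressure(n_mol, temperature, volume):
    """Ideal gas law, p = n R T / V, in SI units."""
    return n_mol * R * temperature / volume


if __name__ == "__main__":
    t0 = 300.0     # K, pre-flash temperature (hypothetical)
    f = 1.0e-3     # fraction of atoms excited by the laser (hypothetical)
    gap = 3.2e-19  # J, excitation energy, about 2 eV (hypothetical)
    volume = 0.025 # m^3, fixed because the container is rigid (hypothetical)

    t_final = equilibrated_temperature(t0, f, gap)
    p0 = ideal_gas_pressure(1.0, t0, volume)
    p_final = ideal_gas_pressure(1.0, t_final, volume)
    print(f"temperature: {t0:.1f} K -> {t_final:.1f} K")
    print(f"pressure:    {p0:.0f} Pa -> {p_final:.0f} Pa")
```

At fixed volume the pressure rises in proportion to the temperature, so the energy delivered by the laser is available afterwards either as heat (by thermal contact with something colder) or partly as work (by letting the gas push a piston).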
“I tend to think about thermodynamic properties in very concrete terms (what the atoms are doing), which probably hinders my understanding of some of these concepts.”
I always tell my students that it doesn’t even matter if atoms exist in pure, classical thermodynamics. In fact, Ernst Mach, who wrote some very influential material on thermodynamics, went to his grave saying that atoms were an unnecessary theoretical construct. Now, that being said, atomic theory enriches our understanding of what U and S mean since it allows us to talk in reasonably concrete terms about where the energy has gone, or what exactly it is that S measures. I’m therefore not sure that it’s your “concrete” thinking that is getting in the way. Rather, it’s the language we use to describe heat that is the problem. This language puts incorrect images into our heads that are incredibly difficult to get rid of. Worse yet, we may have had educational experiences that reinforced those images rather than pointing out their rather severe limitations.