
How to read a scientific paper

OK, so you’ve identified some papers that you should read (or just want to read). Now what? A scientific paper isn’t like a novel. You can’t usually make one quick pass through it and expect to get all the details. However, you may not need to. In fact, you may not need to read most of it at all. How do you decide where to focus your reading? That is the question I will tackle in this blog post.

Why are you reading this paper?

That’s the very first question you need to answer. You will read a paper very differently depending on what you are trying to get from it. Here are some reasons for reading a paper, and some notes on how that might affect your approach to the paper:

  • You’re “reading around” your research topic, to get some perspective. If that’s the case, you might be able to skip a lot of the technical stuff and focus on the parts of the paper (introduction, discussion, conclusions) where the authors explain what they have learned and how it fits into the bigger picture.
  • You’re trying to look up a particular piece of information. Depending on the nature of that piece of information, you might be able to just quickly scan the paper for it. In the end, you might look at the figures and tables, and maybe read a few paragraphs where a particular result is presented and discussed.
  • You want to find out about a technique. Unless the whole paper is about that technique, you might be able to focus relatively narrowly on the methods section and the few paragraphs where the authors explain how they used this technique and what they got from it.

Types of papers

There are lots of different types of papers, but a useful first division is to separate the primary research literature from reviews and commentaries. The primary research literature provides reports of (hopefully) new scientific studies. Reviews and commentaries, on the other hand, try to put previously published research into some sort of context. Typically, if you’re going to read a review paper, it’s because you want to get a well-rounded picture of some area of research. You would usually read a review from front to back. Because they are intended to explain and summarize some area of research, review papers are usually easier to read than the primary literature. The trade-off is that you’re getting one person’s (or a group’s) view of the field, which at times can be quite biased. In any event, reading review papers doesn’t pose the kinds of challenges or require the kinds of decisions that you face when reading the primary literature. The rest of this post focuses on issues that arise when reading papers from the primary literature.

Anatomy of a paper

A scientific paper consists of several, more-or-less standard parts. There’s a bit more variation in theoretical papers than in experimental papers, but at least some of this structure remains in the former. It’s important to understand the purpose of each of these sections in order to read papers efficiently.

All papers start with a title and authorship information. A good title ought to tell you a lot about the main message of the paper, but the truth is that there are plenty of good and useful papers with bad or useless titles. A good title might catch your attention, but an uninformative title shouldn’t necessarily stop you from reading on. In other words, don’t judge a paper by its title.

The title and authors are generally followed by an abstract (not always labeled as such) which is a (usually) one-paragraph summary of the article. Here, the authors are telling you what they think is important in their article. Always read the abstract, no matter why you’re reading the paper. The abstract helps you understand the authors’ perspective on their work, and may point out additional points of interest you can look for in the paper other than what you were hunting for.

Almost all papers start with an introduction which, again, may not be titled as such. The introduction sets the stage for the rest of the paper. It provides background, and therefore typically contains a lot of references, which can be useful when you’re coming up to speed on a topic. It explains the problem the authors hope to solve. Most introductions also provide a brief overview of the paper, right at the end before the main body of the paper starts. As a rule, you want to read the introduction, if for no other reason than that it often functions as a mini-review of the field. The one exception might be the case where you’re looking for a very specific piece of technical information, in which case you might skip the introduction. Even in the latter case, you might want to look for the overview at the end of the introduction since this might help you find what you are looking for a bit faster.

Traditionally, the introduction of an experimental paper was followed by a materials and methods section. Some journals put this section at the end of the paper rather than after the introduction, and some only contain a summary of the methods, with the details appearing in online supplementary materials. Theory papers often don’t have a methods section. Rather, the methods are discussed in the body of the paper since they’re not easily separated from other aspects of the work. You will often skip the methods section, unless you’re there to learn one of the methods used in the paper. Why? If the authors have done a good job of writing the rest of the paper, they will tell you (at least in general terms) what experiments they did as they are describing their results. Moreover, the methods often don’t make all that much sense disconnected from the reason for performing the experiment. If you find that the descriptions of the experiments in the text are not sufficient for your purposes, you can always go look at the methods later.

The main part of the paper will be taken up with a description of what was done and what we can conclude from that, possibly intermixed with a broader discussion of significance. While some journals insist on a clearly labeled Results section, others allow more latitude in dividing up this part of the paper into sections with titles that describe the topic(s) they attack. Depending on why you are reading a paper, this may be a section you read really carefully, one that you skim to find a specific fact, or one that you skip altogether. Nine times out of ten you will read most of this section, since this is where the new science is described, and reading about the advance made in a paper is the usual reason for reading the primary literature. When would you skip the results section(s)? Mostly this would be the case when you’re just acquiring background information, in which case you might just read the introduction and conclusions. Of course, a good review paper or two might be a better way to pick up background information, but there isn’t a review paper to cover every possible topic.

The conclusions section is where the authors wrap up: They present their major findings, indicate how those findings do or don’t support one or another hypothesis, and often present some idea of how they think the area of research might evolve from there. Like the introduction, this section usually tries to give us a broader context, so it typically contains quite a few references. The conclusions are often worth reading for that reason alone. More than that, they summarize the technical work that came before, often in much simpler language, so the conclusions can help you understand what, from the authors’ perspective, their work contributes to the field. You will almost always read this section carefully.

Hard papers

Many scientific papers are hard to read, for a variety of reasons. One of the things that often hampers readability is that the work presented is highly technical, so it’s hard for a non-expert to follow for lack of relevant expertise. It’s important not to get discouraged when you feel a bit lost while reading a paper. It’s a common feeling and, unless you really need to understand a paper in depth, which is sometimes necessary for key papers very close to your area of research, it’s often just fine not to get every detail. If you’re really having trouble following the detailed argument in the results section, at least try to understand the general flow of the argument. Failing that, concentrate on the introduction and conclusions, which will tell you what the authors wanted to do and what they claim to have accomplished.

Papers containing mathematical content are a very special subset of hard papers. Here’s a little secret: Unless you are reading a paper to find out how someone proved a particular theorem, you can almost always skip all the proofs. Moreover, you can often skim the equations rather than reading them carefully. A well-written mathematical paper will tell you why certain equations are being derived, which ones are important, what they tell you, and so on. There may only be one or two particularly important equations you need to look at carefully. There may not be any equations worthy of careful scrutiny. Again, it all depends on what you’re hoping to get from the paper. The conclusions obtained from the equations, which will be described in text, may be all you need to know. Alternatively, you may be interested in the final equation obtained, but not in the detailed derivation or proof. We thus come back to the point that you have to have a fairly clear idea of why you are reading a paper.

Figures and tables

Figures and tables, especially if they are accompanied by a good caption, are sometimes worthy of your attention, and sometimes much less so. In some papers, the figures practically tell the whole story: how variable A is related to variable B, the results of key experiments, and so on. They’re generally worth a look, with the caveat that it may happen that you hit a figure you simply don’t understand, perhaps because the interpretation of the figure requires technical knowledge you simply don’t have. When that happens, deal with it as you do with text you don’t understand: If it’s really important to you, work at it or find someone to explain it to you. If not, skip it.

Results tables (as opposed to tables that give the parameters of an experiment) are usually worth a look, too. Papers contain many more figures than tables. If the authors included a table, it’s probably because the data in that table are inherently important or, at least, interesting. Even if you don’t need it now, make a note of papers that contain data tables since you may need those data at some point in the future.

Summary

I would summarize this blog post as follows:

  • Think about what you want to get from a paper before you sit down to read it, and focus your reading on the part(s) of the paper most likely to give you what you want.
  • Don’t worry about parts of the paper you don’t understand, unless the part you don’t understand specifically addresses your reason for reading the paper.
  • The parts of the paper where the authors give you their perspective, mostly the introduction and conclusions, but possibly other sections with extensive interpretive comments, are often the most valuable parts of a paper.

Keeping up with the literature

If you’re a researcher, whether you’re a graduate student starting out or a seasoned scientist, keeping up with the literature is hard work. Especially in today’s world, you probably need to keep an eye not only on your own narrow specialty, but on a number of related areas of research. The problem is particularly bad for those of us who engage in multidisciplinary research. In this blog post, I’m going to suggest some techniques you can use to keep up with relevant developments. This post is mostly intended for my student readers. The experienced scientists who read this blog will have their own strategies for keeping abreast of the latest developments, and I would invite them to share their ideas here.

Before I get into the mechanics, however, let me address an important question: Why should you read scientific papers? Why not just lock yourself into your lab and devote yourself completely to your research work? Well, you could try that, but you probably wouldn’t turn out to be a very effective scientist. By reading relevant papers, you can of course avoid repeating work that has already been done. You can learn techniques that you can apply in your own research. You can learn about ideas that might impact the interpretation of your results, or even open up new directions in your own research. On a purely selfish level, having a broad awareness of the science going on around you will help you get through your comprehensive exam and oral thesis defence. So you need to read.

How, then, do you find relevant papers to read?

The references in papers

When you started your research, your supervisor probably handed you a handful of papers and told you to read those. And of course, you will read many other papers as your program unfolds. When you read a paper, you are not always expected to master every detail, but you should be on the lookout for points that are particularly relevant to your research. Often, the things that will catch your attention in a paper are not the primary results, but points that are brought up in the introduction or discussion relating the current work to some earlier research, or a method borrowed from an earlier study. You should be on the lookout for particularly relevant references, and read these. To be blunt: your supervisor expects this of you, although many won’t say so directly.

Searches and alerts

You will often need to search the literature for information on specific topics. That’s probably obvious to you. You may also discover a key author whose work is particularly close to yours, and you will likely want to see what else this person has published, which you can find out using an author search. There’s another useful type of search that you need to know about called a citation search. Suppose that you have identified a key paper, and you want to know if anyone has followed up on the ideas in this paper. A citation search tells you about any papers that cite the paper you started with, i.e. papers that include your starting paper in their list of references. A citation search is, in essence, a mechanism for following an idea forward in time.

There are a few different systems that allow you to do citation searches. I like the Web of Science, but it’s hardly the only game in town. Talk to your librarian about what tools are available on your campus.
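If it helps to see the idea concretely, here is a minimal sketch of what a citation search does, written in Python. It is a toy illustration only: the paper names and the `references` dictionary are made up, and real databases such as the Web of Science do this over enormous indexes with their own interfaces. Each paper is stored with its list of references, and the search simply finds every paper whose reference list includes the one you started with.

```python
# Toy data (entirely hypothetical): each paper maps to the papers it cites,
# i.e. its list of references.
references = {
    "paper_A": [],
    "paper_B": ["paper_A"],
    "paper_C": ["paper_A", "paper_B"],
    "paper_D": ["paper_C"],
}


def citation_search(target, references):
    """Return, in alphabetical order, the papers that cite `target`."""
    return sorted(
        paper for paper, refs in references.items() if target in refs
    )


# Which papers followed up on paper_A?
print(citation_search("paper_A", references))
```

Reading a reference list takes you backward in time from a paper. The function above runs the same links in the other direction, which is why a citation search takes you forward.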

A lot of the time, you will do a topic, author or citation search just once, and then look through the results to find a few papers that look particularly interesting. However, you may at some point have a topic that is so central to your research, or an author whose work is so relevant, or a paper that is so important to the field, that you want to know anytime a paper appears meeting one of these search criteria. In these cases, you would set up an alert in a relevant search engine. (Again, talk to your librarian about what alerting systems are available on your campus.) There are some variations on this theme, but usually an alert would be a search that is run automatically on a weekly schedule, with results (if any) emailed to you. In order to set up an alert, you usually need to set up an account with the database provider. If your institution subscribes to the database, this would normally be free.
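An alert can be thought of as a saved search plus a memory of what it has already shown you. The following Python sketch assumes that model. The names `run_alert` and `saved_search` are hypothetical, the results are made up, and reporting only the new hits is my assumption about how a typical alerting service behaves rather than a description of any particular one.

```python
def run_alert(search, already_seen):
    """Run a saved search and return only the hits not reported before.

    `search` is any function returning a list of paper identifiers;
    `already_seen` is a set that remembers earlier results between runs.
    """
    new_hits = [hit for hit in search() if hit not in already_seen]
    already_seen.update(new_hits)
    return new_hits


# Hypothetical results of the same search in two successive weeks.
weekly_results = iter([
    ["paper_1", "paper_2"],
    ["paper_1", "paper_2", "paper_3"],
])


def saved_search():
    """Stand-in for a database query; returns the next week's results."""
    return next(weekly_results)


seen = set()
print(run_alert(saved_search, seen))  # week 1: everything is new
print(run_alert(saved_search, seen))  # week 2: only the paper added since
```

In a real service, the scheduling and the email are handled for you by the database provider. All you supply is the search itself.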

Reading journals

When I was a graduate student, I used to go to the departmental library on Fridays and see what new journals had arrived. In the Chemistry library at the University of Toronto, new journals were piled on a table in the reading area. I would flip through the tables of contents of a few journals that were particularly relevant to me, and maybe browse one or two others each week as the spirit moved me. As a result, I would sometimes run into useful articles that I might not have found any other way.

I doubt that very many people browse physical journals in quite this way anymore. I certainly don’t. However, it’s useful to browse a few journals to facilitate the serendipitous discoveries of interesting work.

The best way to “browse” journals now is probably to have the journals email you their tables of contents. Almost every journal has some mechanism for this. Just go to the journal’s home page and look around for a link to their email alerting service. These services are always free. I would encourage you to get the tables of contents of a few important generalist journals (e.g. Science, Nature, Proceedings of the National Academy of Sciences) as well as a few specialist journals in your area. You don’t want to have a hundred journals send you their tables of contents, so how do you decide which ones to get? Well, what have you been reading? Your supervisor’s suggestions as well as the results of your searches will likely have turned up a few key journals in which a significant amount of work in your area is published. Get those journals to send you their tables of contents. It may only be one or two journals at first, but as you read more, you will find additional journals whose tables of contents are worth adding to your list.

Hopefully these suggestions will help you locate relevant papers. In my next blog post, I will write about how you should read a scientific paper.

“Following the treatment of”: How to avoid reinventing the wheel

In my last blog post, I wrote about clean-room writing as a way of avoiding plagiarism. In today’s post, I will talk about how you can avoid charges of plagiarism while providing background information that follows a plan established by someone else.

Here’s a common writing problem: You’re writing background material for a larger work. The background material runs to several paragraphs, and you have found a source (often a book) that explains the issue you need to include in your background material particularly well. In science, we don’t quote long passages. As we discussed earlier, plagiarism is unacceptable, and copying someone’s organization is generally a form of plagiarism. Note the word generally in the last sentence. There’s a small loophole, which you have to use carefully, but which is available to you for cases like this one.

Here’s the loophole: If you explicitly say that you’re presenting something the way it was presented elsewhere, it’s OK. To do this, we often use words like “Following the treatment of…”, or “The organization of the material in this section follows…”. The explicit acknowledgment that you are borrowing someone else’s way of organizing a certain topic (and perhaps their notation and/or terminology) makes this OK. Note that you have to be very clear that you are doing this. A simple citation won’t do here.

Now we have to be clear about something else: Explicit acknowledgment does not provide a blanket exemption from the normal rules of plagiarism.

  • You still have to write your own text, i.e. you can’t just use someone else’s words, even if you have borrowed their organization. I would still write text like this using the clean-room technique described in my previous post. The only difference is that my notes in a case like this might be a little more detailed, laying out the logical sequence of ideas to be covered. I would still avoid writing notes in complete sentences to avoid inadvertent reuse of the original author’s wording.
  • You can only do this for a well-defined portion of your work, e.g. a few paragraphs or, at most, a short section on a specific topic. You can’t compose (for example) entire chapters of a thesis this way.

If you use this loophole, it becomes much more difficult to avoid other forms of plagiarism, and of course there’s always the question of whether you have used too much of someone else’s work in constructing your own. This is therefore something that is to be done with considerable caution, and likely with someone else going through your work to see that it has been done properly. However, it’s often useful to avoid reinventing the wheel by simply acknowledging that it has been done, and then just using the darned thing.

Clean-room writing

How to commit plagiarism

Over the last few years, I have noticed more and more problems with student plagiarism. I’ve spent a lot of time thinking about where these problems come from. I don’t generally think that they are due to deliberate attempts to cheat. Rather, I think that modern tools create situations where plagiarism becomes almost inevitable unless you are both conscious of the issue and careful. In this blog post, I am going to suggest a method for avoiding plagiarism that I think most of us should adopt. It’s not a panacea, but it’s better than what most people are doing right now.

First of all, we should all agree on what plagiarism is. There are lots of sources on plagiarism, of which my favorite was produced by the Office of Research Integrity of the U.S. Department of Health and Human Services. Roughly speaking, I tend to think of plagiarism in terms of a hierarchical classification, which goes from the grossest to the most subtle forms.

The first form involves simply cutting and pasting from a source. Most people would agree that is wrong, although a remarkable number of students appear not to entirely get that. When we think of plagiarism as cheating, this is generally what we are thinking about.

The second form, which has a very similar effect, is using someone else’s words in your text. While this sounds like cutting and pasting, it often happens in a different way, when we write something while we are looking at a source, maybe so we get some details right. People who commit this form of plagiarism are often not even aware they are doing it. The net effect, though, is text that generally looks an awful lot like the original source, with only a few words changed here and there. If you don’t believe that you are prone to this problem, try reading Wikipedia’s historical summary of the third law of thermodynamics. (I picked this topic because few people know much about it. Plagiarism becomes all the more likely when writing about topics with which one is not intimately familiar.) Then immediately try to write your own text on this topic. While you are writing, keep the Wikipedia article open and look back at it for details from time to time. Most people find it very difficult to write sentences and paragraphs that differ significantly from the original text under these conditions. This is plagiarism.

If you somehow avoid writing sentences that look like those in the original text, you are likely to at least mimic the structure of the original text, with the same facts presented in the same order. This is the third and most insidious form of plagiarism. It is hard to detect, and people who commit this form of plagiarism will often deny vehemently that they have done anything wrong. However, when you do this, you have not told us what you think about a subject. You have just told us what the writer of, in this case, the Wikipedia article thinks. You may have done it in different words, but you did not organize the facts yourself, which is the key difference between original writing and plagiarism. (There are occasions when it’s appropriate to write something that is organized like someone else’s coverage of a topic. I will come back to this in a later blog post.)

Thinking about the exercise I proposed above, I would suggest that there are at least three distinct issues leading to plagiarism:

  1. Writing while looking at a source, or immediately after reading a source. It is almost impossible to come up with your own words and an original way to organize the facts when you are doing this. The perfectly good words and organization chosen by the original writer become “the obvious way” to write about a topic, and it’s virtually impossible to break out of that.
  2. Excessive reliance on a single source. If you use multiple sources to inform your thinking, the particular way any one author organized his or her writing is much less likely to have a dominant effect on how you write about something.
  3. Not having clearly distinguished research, outline, writing and revision phases in the writing process. This is perhaps the greatest failing of modern students in their approach to writing. If you start by doing some research, taking brief notes as you go, then write an outline as a way of organizing your thoughts, then write text based on your outline, and finally go through several rounds of revisions, it becomes difficult to commit plagiarism because you will really be writing about a topic from your perspective, and not from that of another writer.

How not to commit plagiarism

Having talked about these issues with many students, and given the pervasive nature of information technology, which puts sources at our fingertips almost anywhere, anytime, I have concluded that the best way to fight plagiarism is to adopt what I call clean-room writing techniques. Much of what I am going to describe is essentially the research, outline, writing, revision cycle described above. However, I think that we need to go a little further given how easy it is to unconsciously plagiarize material.

There is a similar problem in the software industry. Let’s say you want to write a piece of software that does the same thing as another existing piece of software. Because software is protected by copyright, you’re not allowed to copy someone else’s software. You may want to peek, but in the end you have to write your own, original implementation. The way this is done is to have people write the software (even if they previously peeked at the other company’s software) in a “clean room”, which is a room where you have everything you need to do your work except the other company’s software. Depending on how paranoid you are, such a room might not have a direct connection to the Internet. Sometimes, the people who peek are different from the people who write the new software. Sometimes, peeking just isn’t allowed.

To avoid plagiarism, you need to write in something like a clean-room environment. What I mean by this is that looking at your sources and writing should not occur at the same time. Your research for whatever you are writing (term paper, thesis introduction, article manuscript, …) should happen at a different time than the actual composition of text. Text should be written from notes which were not generated by simple cutting-and-pasting from a source. Yes, I know, cutting-and-pasting is quick and efficient. The problem, as explained above, is that you’re almost certain to use someone else’s words if you do that. When you take notes, write in your own words what you thought was important or interesting in some particular text you are reading.

Once you have composed a draft of a text, you can of course fact-check against the original sources. Having given the ideas your own form by writing a draft without direct access to the sources, you are much less likely to unintentionally borrow someone’s words during revision.

Note that clean-room writing does not lift the responsibility of citing your sources. Your notes should clearly link content to references, so you should be able to cite your sources as you write. Occasionally, you will need to make a note to yourself to chase down a reference later. You can still add references during the revision stage if you at least note the places in your text that will need to cite sources.

This may seem a bit radical, but I’ve seen too many students get in trouble for plagiarism over the last few years, and I know that many of them did not intend to plagiarize. It just happened, for the reasons I explained in the first part of this blog post. Eventually, you can relax this strict approach a little, but if you’re an inexperienced writer, it’s best to go into the metaphorical clean room anytime you’re writing new text.