Natural Science - Year I

Unit 2: Methods of Science

History Weblecture for Unit 2

History Lecture for Unit 2: Data, Evidence, and Experimental Design

For Class

Observational Methods

We need to describe the processes and methods that are appropriate to scientific investigation and analysis. Our quest for scientific knowledge will be dictated by what we define and accept as science, so for the sake of argument, let's start with this for a working definition:

Science is that endeavor that seeks to study, or obtain knowledge about, phenomena through the use of a systematic approach that is based on the evaluation of evidence through reason. —Joseph J. Carr, The Art of Science

This definition implies that we must first obtain data about phenomena. [Singular: phenomenon. This term comes from the Greek verb φαίνειν, or phainein, to shine or appear. A φαινόμενον is anything that manifests itself and can be observed or shown.] If we observe a phenomenon through some system or established method to collect objective information or data about it, and then evaluate that data using rational and accepted methods, the resulting information, appropriately organized and applied, eventually becomes knowledge. Notice that we have not limited the scope of the phenomena we can observe "objectively" in this definition! Here we are focussing on the methods we will use.

Now we have to look at the rest of this definition. A common systematic approach is to gather data about phenomena and then analyze the data in an attempt to find patterns or relationships that allow us to make general statements and predictions about the characteristic behavior of the phenomena. The three main types of data gathering are direct observation, surveying, and experiment. In most investigation projects, all three will play an important part. Once we have the data, we use a variety of mathematical and logical analytical methods to derive relationships. These include error analysis, statistical analysis, and dimensional analysis. For now, we will concentrate on methods of observation.

Direct Observation


Casual observation is the starting point for most scientific investigations. Anyone can do it. No matter how sophisticated the research eventually becomes, all investigations are driven by the need to answer some question that forces itself into the consciousness of one or more people from some unrelated source. Sometimes the question is general: why is grass green? how do clouds form? how old are stars? what causes cancer? Sometimes it comes from the realization that two apparently unrelated phenomena may somehow have something to do with one another: if there are more ducks on the lake in the morning, and it is easier to catch fish in the morning, is there a relationship between the ducks and the behavior of the fish? Sometimes the question comes from watching a repeatable event that captures our fancy. For example, does mixing vinegar with baking soda always cause a foaming action? Why does my family always gain weight during the Christmas holidays?

Once the question has been asked, the scientist in us can start looking for answers. We may speculate on possible solutions: grass is green because it produces chlorophyll, clouds form when there is warm, moist air available, the ducks—like us—are looking for fish in the morning, we gain weight because there are more munchies available during the Christmas season than the rest of the year. The process of recognizing that separate, specific instances share common characteristics that can be summed up in a generalization is an example of induction, reasoning from the specific case to the general rule. But induction is not enough to produce a theory. Casual observation must be followed (in science, at least) by observations specifically planned to test our generalization of natural behavior.

Planned or directed observations play an important part in several scientific endeavors, especially where the two other methods of data gathering (surveying and experimentation) are expensive or impossible. Biologists who are involved in animal behavior research or in environmental studies rely heavily on field observations, where they must alter the subject of their investigation as little as possible. Astronomers work almost exclusively with telescope observations under circumstances where they cannot increase the frequency or accuracy of observations beyond a certain level, and where controlled experiments are impossible. In such cases, data gathering becomes very difficult, because it is not always possible to establish which datum is the significant one. Moreover, because we generally do not have unlimited resources for scientific investigation, determining what to observe and how much to record requires balancing our resources against the type of question we are asking. Resource questions affect all forms of scientific research, and are at the heart of many scientific policy decisions made by government and industry.

A good rule during initial investigations, and especially during direct observation, is to record as much as possible, not only of the data involved, but of the data collection itself, and of anything that occurs during the course of the investigation. It is generally easier to eliminate irrelevant data than to repeat the observations, and keeping our observation base as broad as possible may prevent us from excluding important information too soon. We do not want to miss the visit of the gardener who sprays the grass with chemicals to make it greener. We should bother to take observations during cold weather, rather than miss the realization that clouds form when the air is cold and moist as well as warm and moist. We probably want to note that ducks like fish food, and that both ducks and fish come out in the morning to the side of the lake because the game warden feeds the fish at that time. As we learn more about the phenomenon we are studying, we may decide that recording certain kinds of information (temperature, wind direction, humidity) is more important than recording others (time of day, number of crows on the telephone wire) in collecting information about cloud formations, and thus be able to direct our resources toward improving our observation of those particular factors. But the elimination of some factors in favor of others should be the result of experience, not simple guessing.

Preparation and a willingness to investigate anything unusual are essential to scientific advances. There are occasional cases of serendipity, where a scientist realizes that what he happens to be looking at is important, even though he set out to observe something else. The discovery of radiation from uranium, the realization that electric current affects magnets, and the first sighting of supernova 1987A all occurred when their observers were looking for something else. In each case, the scientist involved was trained to be alert to the unusual, and to look carefully at it. They remembered Milligan's law: When you are taking data, if you see something funny, Record Amount of Funny. Eventually, they were all able to correlate their "something funny" with other data and publish their discoveries, which have had a profound impact on our understanding of the nature of the universe.


Surveying

Another method of data gathering, which is extremely important when experimentation cannot be done and direct observation will not give enough information, is surveying. Surveys can collect data by direct questioning, by looking at historical sources, and by following (if time permits) a particular population.

Social scientists use direct surveys to study how people respond to different situations; these studies often have practical applications to marketing products. For example, when Return of the Jedi was originally filmed, there were two endings: one in which Lando Calrissian survived the attack on the Death Star, and one in which both he and the Millennium Falcon failed to clear the blast zone in time. Audience surveys at "Sneak Previews" convinced the producers to release the film with the first and happier (but dramatically less satisfying) ending.

Historical surveys are often done when direct surveys are insufficient to establish patterns. Obviously, you must know what to look for and which facts might be needed to make a correlation between cause and effect. A review of a century's worth of newspaper articles for a riverside town might show that heavy flooding apparently occurred randomly, without a corresponding heavy rainfall. A second, more thorough survey may show that heavy flooding occurred during warm spells in springs preceded by winters with exceptionally heavy snowfalls. A reasonable conclusion for further testing would be that heavy rainfall was less likely to cause flooding than rapid snow melt. Similar studies are used to track outbreaks of epidemic diseases and changes in weather. Once the researcher has a set of correlated facts, for example, warm Pacific current temperatures and flooding in Mexican coastal towns, he can look for that information in older records and see if, in fact, similar situations have existed in the past. Such studies in the last few decades reveal that the warm Pacific current called El Niño recurs irregularly, roughly every two to seven years; this information will be useful in preparing for the next round of extreme weather.
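
The second, more thorough survey described above amounts to tallying how often floods occurred in years that shared a particular condition. Here is a minimal sketch of that tally in Python. Every record in it is invented for illustration (no real archive is being quoted), and the field layout and the function name flood_rate are assumptions made for this example.

```python
# Hypothetical yearly records for a riverside town. All values are invented
# for illustration and do not come from any real newspaper archive.
# Each record: (year, heavy_spring_rain, snowy_winter_then_warm_spring, flooded)
records = [
    (1901, True,  False, False),
    (1902, False, True,  True),
    (1903, True,  True,  True),
    (1904, False, False, False),
    (1905, True,  False, False),
    (1906, False, True,  True),
    (1907, False, False, False),
    (1908, True,  False, True),
]

def flood_rate(records, condition_index):
    """Fraction of the years showing a given condition in which a flood occurred."""
    matching = [r for r in records if r[condition_index]]
    if not matching:
        return None  # the condition never occurred, so no rate can be computed
    return sum(1 for r in matching if r[3]) / len(matching)

rain_rate = flood_rate(records, 1)   # years with heavy spring rain
melt_rate = flood_rate(records, 2)   # years with a snowy winter and a warm spring

print(f"Flood rate in heavy-rain years: {rain_rate:.2f}")
print(f"Flood rate in snowmelt years:   {melt_rate:.2f}")
```

A higher rate for one condition suggests a correlation worth testing further. It does not establish the cause by itself, and with only a handful of years the difference could easily be chance.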

Creating and following a clearly identified population through a particular course of events is an especially important type of survey for environmental and medical studies. Most medical treatments are adopted on the basis of their effectiveness, which is measured by observing the health changes in groups of individuals, some of whom receive the treatment while others do not. Insurance and finance companies base their risk assessments on carefully and deliberately collected data for different groups of individual people, stocks, and products.

As you can tell from the above examples, the collection of survey evidence is the basis of many policy decisions by commercial, industrial, government, insurance, and medical organizations. Understanding what makes a valid scientific survey (and being able to spot the fallacies in improper surveys) is an important skill for you to learn. Your legislators, your doctor, your boss, and you yourself will probably base decisions which affect your ability to work, live, and obtain medical care on information you learn from different "scientific surveys."



Experiment

You have probably all run across the prescription for a perfect experiment:

  1. Make an observation, and then propose a hypothesis (a statement) about some phenomenon.
  2. Devise an experiment to test the hypothesis.
  3. Run the experiment, and collect data.
  4. Evaluate the data and determine whether or not your hypothesis was true.
  5. Refine and rerun the experiment if necessary to get better data.

The first step of this process involves induction, recognizing the possibility that some general rule might govern the behavior of particular phenomena. The rest of the process involves deduction: if we assume the general rule, can we correctly predict the behavior of particular objects in the situations covered by the rule? A large part of experimental design involves eliminating other factors that could affect the outcome, so that we focus only on the cause-and-effect relationship or pattern described in our general rule. If we are trying to determine how much energy a chemical reaction produces, we have to isolate the reaction so that no matter or energy (heat, light, or electrical) can affect the outcome. That's why a calorimetry "bomb" is a thermos-like container that is sealed to prevent the escape or input of matter, insulated to prevent heat flow, and opaque to prevent light entering or escaping.

Although experimentation is the type of observation we most closely associate with the scientific enterprise, it is actually only a small part of the data gathering scientists do. Experiments are limited in their scope and applicability, since most experiments require artificial situations where all the contributing factors can be easily controlled by the experimenter. These limitations are not always recognized, and sometimes the results of scientific experiments assume more importance than they should.

The problem with proof

A survey (!) of the advertisements during the 1960s (after the successes of the polio vaccine, the development of the atomic bomb, and the launches of Sputnik and Echo) reveals many examples of statements claiming that experiments proved one product better than another for a particular purpose. In the mid-1960s, after the publication of the Surgeon General's report that smoking might cause cancer but could not conclusively be proven to be the cause of cancer in any specific case, the claims in advertisements for scientific certainty fell off. Some advertisements emphasized common experience rather than scientific research, with the key character claiming "I don't need a scientist to tell me..." whatever it was that the advertisers wanted you to believe. It is of course true that experiment and observation can never prove a particular claim is true for all cases at all times, although a single well-attested observation can disprove a claim.

In another case, rats given large amounts of saccharin developed cancers at a higher frequency than rats that were not given saccharin supplements in their diet. As a result, saccharin was labeled a "carcinogen" (cancer-causing agent), and many products which contained saccharin switched to other sweeteners or disappeared from the store shelves altogether. A closer reading of the study reveals, however, that the dosages of saccharin were thousands of times higher than what would be received by a normal adult human using a range of saccharin-containing products. There were also questions about the different ways in which rat and human digestive systems process saccharin. The experiment only really indicated that saccharin in very large dosages appeared to increase the risk of cancer for rats. It did not prove anything about the "normal" risk to humans, although it was interpreted that way by many people.

Isolating or eliminating significant factors by using controls

Still another problem with experimental design is the possibility of ignoring an important factor in the environment. In 1989, the researchers Stanley Pons and Martin Fleischmann announced that elevated temperatures and levels of radiation measured in their laboratory apparatus proved that they had achieved cold fusion, or nuclear fusion at room temperatures. When others tried to reproduce their experimental results, they failed. Further investigation showed that the "elevated levels of radiation" were higher than the standard references said normal radiation levels should be, but not because of any cold fusion occurring in the experimental apparatus. The building in which the experiment was conducted happened to be situated near a radon source that raised the level of radiation above normal for the entire lab. Pons and Fleischmann never bothered to measure the radiation levels before they started their experiment.

We can reduce the possibility of having an overlooked factor influence our experimental results by creating a control, or experimental case which duplicates our experimental setup except for the single variable that we are testing. For example, if we want to see whether a solution of ammonia compounds will act as a fertilizer, we might set up an experiment in which we feed a rapidly growing plant like a radish a solution including the ammonia compound to see whether its growth rate changes. If it grows more rapidly than expected, we might be tempted to assume that the fertilizer causes the growth spurt. But without a control plant, receiving the same amount of liquid but without the fertilizer, we would not know for certain whether the fertilizer is causing the sudden growth, or some other cause, such as additional sunlight or a warmer-than-normal room. If we have two plants which differ only in the amount of fertilizer given, we are more justified in assuming that differences in their growth are a result of the amount of fertilizer and not some other cause.
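
The comparison between a treated plant and its control can be sketched in a few lines of Python. The growth figures below are invented for illustration. Using several radishes in each group, rather than one plant each, is an assumption added here so that one unusual plant does not decide the result, and the function name growth_difference is also hypothetical.

```python
from statistics import mean

# Hypothetical growth in centimetres over two weeks, invented for illustration.
# Both groups get the same light, temperature, and amount of liquid.
control_growth = [4.0, 4.5, 3.5, 4.0]       # water only (the control)
fertilized_growth = [6.0, 5.5, 6.5, 6.0]    # water plus the ammonia solution

def growth_difference(treated, control):
    """Mean growth of the treated group minus mean growth of the control group."""
    return mean(treated) - mean(control)

difference = growth_difference(fertilized_growth, control_growth)
print(f"Fertilized plants grew {difference:.1f} cm more on average than the controls")
```

Because the control plants share every condition except the fertilizer, a difference between the two averages is harder to blame on extra sunlight or a warm room.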

Having an agenda to "prove" a particular outcome

Another problem with experiments is that we often prejudice the design or the outcome of the experiment by the way we phrase the hypothesis we are testing. How would you devise an experiment to test each of the following statements? What factors might you ignore?

  1. Spring floods are caused by heavy rainfall in the preceding 24-48 hours.
  2. Spring floods are not caused by heavy rainfall in the preceding 24-48 hours.
  3. Spring floods are influenced by the phase of the moon and/or the position of Mercury relative to the sun.
  4. Spring floods are caused by rapid snowmelt during unusually warm spring weather.

As we perform experiments, we need to be aware of the limitations of this method of observation.

Qualitative and Quantitative Data

Whatever method we choose, there are two major kinds of data that we can collect. Qualitative data includes descriptive information that identifies what we are looking at. Usually we do not use instruments to collect this kind of data, and we cannot easily reproduce our qualitative findings. Qualitative data about the weather might include a sense that the air feels "soft", or that the wind is "strong". Qualitative data is difficult to compare, but can be very important in helping us identify patterns or cycles that should be more closely investigated in a quantitative way.

Quantitative data is information that results from counting or measuring characteristics. Quantitative measurements of humidity, temperature, and air pressure would give us numerical information that we can plot or put in a table, and compare with other measurements. One of the advantages of quantitative data is that we can use some statistical methods to determine whether our measurements are accurate, or in error. We'll look in more detail at these methods in later units.
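
As a small preview of those statistical methods, here is a sketch in Python that summarizes repeated measurements of the same quantity. The readings are invented for illustration, and the choice of the mean and the sample standard deviation as the summary is an assumption made for this example, not the only possible approach.

```python
from statistics import mean, stdev

# Hypothetical temperature readings in degrees Celsius, taken a few minutes
# apart with the same thermometer. The values are invented for illustration.
readings = [18.0, 18.5, 17.5, 18.0, 18.0]

average = mean(readings)   # our best single estimate of the temperature
spread = stdev(readings)   # sample standard deviation: how much readings scatter

print(f"Mean reading: {average:.2f} C")
print(f"Spread (sample standard deviation): {spread:.2f} C")
```

A small spread tells us the readings agree with one another. It does not by itself show that they are accurate, since a miscalibrated thermometer can give very consistent wrong answers.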

Hypothesis, Theory, and Physical Law

Despite the fact that we don't want to prejudice our observations, we almost always start a scientific investigation with a particular goal in mind: to determine whether some statement about the behavior of nature is accurate. This assumes, of course, that we can make a test about the particular statement or hypothesis. Not all statements are testable by experiment, and some are not testable by any kind of direct observation. We can test whether different concentrations of fertilizer produce healthier tomato plants by counting the number of tomatoes and weighing them for size, but we can't test whether the tomatoes are tastier, because there will probably be a difference of opinion about what makes a tomato taste good. We can test whether one band plays louder than another, but we can't test the quality of their music.

A testable statement is called a hypothesis. It describes the expected outcome of a very specific situation. If, over time, our experience is that a set of related hypotheses accurately predicts natural behavior in a wide range of situations, we may start to talk about the set of hypotheses as a theory. The theory usually identifies some basic principle common to all the hypotheses it supports. For example, extensive experience with different forms of fertilizer has led to a theory that describes how plants absorb and use various types of minerals in their metabolic processes.

A scientific law is a statement about nature that has been extensively tested and for which no exceptions have been found. Often the law can be stated in precise mathematical terms, or by using a model. The "law" of gravity states that the force of attraction between two objects is proportional to the product of their masses and inversely proportional to the square of the distance between them:

$$F_{\text{gravity}} \propto \frac{\text{mass}_1 \times \text{mass}_2}{\text{distance}^2}$$

(The "∝" sign means "is proportional to", and is used here because there is another factor, called the gravitational constant, that has to be part of this equation for the math to work out right. We'll see why when we talk about Newton's discovery of the law later in the course.)
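
To see what the proportionality means in practice, here is a minimal sketch in Python. It assumes the standard full form of the law, force = G × mass 1 × mass 2 / distance², with G set to the approximate measured value of the gravitational constant. The function name gravitational_force and the sample masses and distances are chosen only for illustration.

```python
G = 6.674e-11  # gravitational constant in N m^2 / kg^2 (approximate value)

def gravitational_force(mass_1, mass_2, distance):
    """Force in newtons between two masses (kg) separated by a distance (m)."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return G * mass_1 * mass_2 / distance ** 2

# Two hypothetical 1000 kg objects, first 10 m apart and then 20 m apart.
f_near = gravitational_force(1000.0, 1000.0, 10.0)
f_far = gravitational_force(1000.0, 1000.0, 20.0)

print(f"Force at 10 m: {f_near:.3e} N")
print(f"Force at 20 m: {f_far:.3e} N")
print(f"Ratio of the two forces: {f_near / f_far:.1f}")
```

Doubling the distance cuts the force to one quarter, whatever the value of G. That is the pattern the proportionality expresses, while the constant only sets the size of the force in particular units.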

The difference between a hypothesis, a theory, and a physical law is a bit fuzzy, and can simply depend on how much testing has been done, how well-accepted the statement is among any particular group of people, how narrow or restrictive the statement is, and whether it can be formatted as a simple ratio of quantities. This means that some people may refer to relativity as a theory, while others give the mathematical statement that the velocity of a body with mass is always less than the speed of light, v < c, the status of physical law. One of our tasks in this course will be learning how social and cultural forces shape our acceptance of scientific observations and analysis.

Study/Discussion Questions:

This week's lab asks you to observe the weather for one week, then use your observations to predict the weather for the next few days. Even if you are not performing the lab yourself, think about how you would need to go about doing it.

Further Study/On Your Own