We need to talk a little bit about statistical significance and what it is.
In science, when an experiment is done, there is always a chance that any given result was produced by coincidence, poorly thought-out methodology, or human error. These false results can only really be compensated for in one meaningful way: quantity. You have to do the experiment over and over, and, in the case of biology, you need to look at dozens of individuals before you can claim the data has any statistical significance. Remember: significant data is not definitive data! For a set of data to be definitive you don’t just need dozens of individuals, you need dozens of experiments, and that often means thousands of individuals.
In modern physics, experiments are run millions of times before they are deemed definitive. Unfortunately, biologists aren’t always able to test an experiment millions of times and, as such, rarely have the level of certainty that physicists have. That said, we are talking about degrees of certainty, and scientists have methods for relating these levels of certainty. In an experiment, the two main factors that determine the likelihood of false results are the number of things being tested and their complexity. As the number of factors being tested increases, so does your need for test subjects, almost exponentially.
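To make that “almost exponentially” concrete, here is a rough sketch of how the number of groups grows when every level of every factor is combined with every other (a full factorial design). The factors, their levels, and the figure of 30 subjects per group are all invented for illustration; real experiments often use smarter designs that need fewer groups.

```python
from itertools import product

def full_factorial_groups(factors):
    """Return one dict per group, covering every combination of factor levels."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]

def subjects_needed(factors, per_group):
    """Total subjects if every group gets the same number of individuals."""
    return len(full_factorial_groups(factors)) * per_group

# Hypothetical plant experiment: the "none" fertilizer level acts as the control.
design = {
    "fertilizer": ["none", "low", "high"],
    "light": ["shade", "sun"],
    "water": ["dry", "wet"],
}

groups = full_factorial_groups(design)
total = subjects_needed(design, per_group=30)  # 30 per group is an assumption

print(len(groups), "groups,", total, "plants")
```

Adding one more two-level factor doubles the number of groups, which is why the subject count climbs so quickly.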
In an experiment, for each factor you’re testing you need a control, and, ideally, you’ll be testing the factor in a number of ways. Let me stop here to define some terms and give some proper examples; otherwise those unfamiliar with scientific experimentation will have a difficult time understanding what’s really going on. A control, or control group, is a group that closely resembles the treatment group in an experiment. It is chosen to act as a comparison group so that meaningful analysis can be done. Controls are always necessary because they are the comparison: without a control group you have no way of knowing whether a particular treatment had any effect. For example, if you’re experimenting with the toxicity of a given compound, the control group would be housed, raised, and handled exactly (or as closely as possible) the same as the test subjects, except that they would be given a placebo in place of the potential toxin. If the control group and the test group fail to show enough difference in their health factors, then you cannot draw any conclusions one way or the other. These health factors vary from test to test: they could be life span, percentage of fatalities in a given time period, weight, fur sheen, presence of cancer, or basically any other sort of measure you could imagine. As long as a test is relevant and gives reliable results, it’s fair game.
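One way to ask whether the control and test groups “show enough difference” is a permutation test: shuffle the group labels many times and see how often chance alone produces a gap as large as the one observed. This is only a sketch of one possible method, not the method any particular study uses, and the body weights below are made up.

```python
import random
from statistics import mean

def permutation_test(control, treated, n_shuffles=5000, seed=0):
    """Estimate how often random relabeling gives a mean gap at least as large
    as the observed one (a two-sided permutation p-value)."""
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(control) + list(treated)
    n_control = len(control)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        gap = abs(mean(pooled[n_control:]) - mean(pooled[:n_control]))
        if gap >= observed:
            at_least_as_extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_shuffles + 1)

# Invented body weights in grams for eight animals per group.
control_weights = [20.1, 19.8, 20.5, 21.0, 19.9, 20.3, 20.7, 20.2]
treated_weights = [18.2, 17.9, 18.8, 18.5, 17.5, 18.9, 18.1, 18.4]

p_value = permutation_test(control_weights, treated_weights)
print("p-value:", p_value)
```

A small p-value says the gap is unlikely to be coincidence. It says nothing about whether the methodology was sound, which is why a significant result still isn’t a definitive one.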
What I mean by “testable” here is that something is only testable if you’re able to quantify it effectively and in a largely unbiased manner. For example, if you’re going to measure something, you must measure each group the same way. Say you’re measuring pea plants. It would not be correct to measure the control group by pulling up each plant and measuring from the root tip to the tallest tip of the plant, and then measure the tested group from where the stem meets the soil to the tallest tip of the plant. That lack of consistency introduces bias into an experiment. No matter what happens in the experiment above, the control plants will look taller on paper, relative to the tested plants, than they really are. The best methods are those that account for any and all factors that might affect the results, other than those being tested.
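Here is a small simulation of that pea-plant mistake. Both groups are drawn from the same population, so the “treatment” does nothing at all, yet measuring the control group from the root tip makes a difference appear. The heights, root lengths, and group sizes are invented numbers, purely for illustration.

```python
import random
from statistics import mean

rng = random.Random(42)

def grow_plant():
    """Hypothetical plant: shoot height and root length in centimeters."""
    return {"shoot": rng.gauss(50, 5), "root": rng.gauss(12, 2)}

control = [grow_plant() for _ in range(200)]
treated = [grow_plant() for _ in range(200)]  # same population: no real effect

def soil_to_top(plant):
    return plant["shoot"]

def root_tip_to_top(plant):
    return plant["shoot"] + plant["root"]

# Both groups measured the same way: the gap is just random noise.
consistent_gap = (mean([soil_to_top(p) for p in control])
                  - mean([soil_to_top(p) for p in treated]))

# Control measured from the root tip, treated from the soil line.
biased_gap = (mean([root_tip_to_top(p) for p in control])
              - mean([soil_to_top(p) for p in treated]))

print("consistent:", round(consistent_gap, 2), "cm")
print("biased:", round(biased_gap, 2), "cm")
```

The biased gap is larger than the consistent one by roughly the average root length, and none of it comes from the plants themselves.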
To keep using the pea plants as an example, if you’re testing a fertilizer, you need to keep the untested factors the same: light exposure, water, soil composition, number of individuals per pot, presence or absence of pests damaging the plants, air humidity… I’m sure I’m missing something, and the list will depend on the experiment and the area you’re able to work in. It’s also important to tell readers whether and how methodological problems may have skewed your results.
Another important term I will talk briefly about is the double-blind trial or experiment. These are experiments where the experimenters do their best to remove any bias in the administering of a treatment: either they don’t know which test subject will get which treatment, or the tester doesn’t know enough to subconsciously give hints to the test subject. This is accomplished by having two or more groups of experimenters. The first group sets up the experiment. In the case of a blind drug trial, they would put the drug in cups for the treatment group and an identical-looking placebo in cups for the control group, labeling the cups only with names, not with which drug is which. The second group would then administer the drug without any knowledge of which is which. This way both the test subjects and the experimenters who administer the treatment are unaware of who is getting what, so no subtle factors, like a patient knowing that they are on the real drug or on the placebo, can influence the results (search “placebo effect” and “double-blind trials” for more).
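The bookkeeping behind that two-team setup can be sketched in a few lines. The function name, the subject labels, and the even 50/50 split are my own assumptions for illustration; real trials use more careful randomization schemes.

```python
import random

def blind_allocation(subject_ids, seed=None):
    """Randomly split subjects into drug and placebo groups.

    Returns two things:
      dosing_sheet: names only, safe to hand to the team giving the treatment
      sealed_key:   name -> treatment, kept by the setup team until the end
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    sealed_key = {name: "drug" for name in shuffled[:half]}
    sealed_key.update({name: "placebo" for name in shuffled[half:]})
    # Sorting by name hides the shuffle order, which would otherwise leak the groups.
    dosing_sheet = sorted(sealed_key)
    return dosing_sheet, sealed_key

# Hypothetical subject labels.
subjects = [f"subject-{i:02d}" for i in range(1, 21)]
dosing_sheet, sealed_key = blind_allocation(subjects, seed=7)

print(dosing_sheet[:3])  # the administering team sees names and nothing else
```

The key is only opened once all the measurements are in, so neither the subjects nor the people handing out cups can act on knowing who got what.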
Begin tangent: Here’s a popular example of bad research from a few years ago. Let’s be careful: I’m not blaming a child for the fear-mongering of adults, just saying that her experiment could use some work. Here is a link that contains the original experiment and a critique: http://www.snopes.com/science/microwave/plants.asp. Note: neither of these experiments is at all definitive. It’s just a useful critique of a real scare that still gets brought up occasionally by some folks who are concerned by, or outright scared of, microwaves. I failed to find a good peer-reviewed source on this, but, since microwave ovens don’t produce ionizing radiation (radiation with enough energy to strip electrons from atoms and so break chemical bonds), any potential harm is minimal, and is shared with all cooking methods that involve heating your food. End tangent.
I’m doubtless missing many, perhaps hundreds of, important points. Scientific methodology is not something you can effectively cover in a single post, and neither is its sister discipline: how to look at, interpret, and draw conclusions from experimental data. It takes years to learn these skills, and I’ll likely be writing more posts about scientific methods beyond this series. If you have a specific question, I’ll do my best to answer it, and I will likely compile the answers into a more elegant post later on.
Next time I’ll finally get to talk about the G.E. Séralini “affair.”