Pros and Pitfalls of Observational Research


Observational Research

Say, for example, we observe that some beer brands are much more popular in certain parts of the country than in others. Or perhaps we find that some fashion brands are preferred by younger women and others by older women. Let's also imagine an agricultural experiment conducted to ascertain whether one fertilizer will produce higher yields than another. In this experiment, two kinds of fertilizer are applied to two varieties of soybeans under three levels of soil compaction and three levels of watering. The plants are grown under controlled greenhouse conditions.

Now back to our consumer examples. Why the connection between beer brand and region? Climate? Tradition? Or simply distribution? Some combination of the three, plus other factors? In our fashion example, income could be a factor underlying the age effect we observe. Or our result may merely stem from the fact that different brands are designed for and marketed to women of different ages.  These are examples of non-experimental research, also known as observational research.

Our agricultural experiment, on the other hand, is just that - an experiment. Experiments employ statistical designs in which subjects (e.g., soybean seeds) are experimentally assigned to one or more treatment conditions (e.g., fertilizer, soil compaction). The experimental design is created when the research is being planned and the laboratory-like conditions of the greenhouse in our illustration are intended to minimize the effects of other variables, such as temperature and soil composition, that can influence soybean yields.

In plain statisticspeak, yield is the dependent variable and type of fertilizer, soybean variety, soil compaction and watering are the independent variables. We are trying to explain or predict yield with these four independent variables. However, our principal focus is the effects of the fertilizers, and if we find that plants given one of the fertilizers have higher average yields, we can be more confident that this fertilizer is truly more effective than if we had simply visited farms, taken note of how each farmer was growing his crop and measured yield after the fact. The latter would have been observational, non-experimental research. We would have had no control over variables that affect yield and would be less confident that any differences in yield we observe resulted from the type of fertilizer applied.
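To make the greenhouse design concrete, its treatment combinations can be enumerated. The following is a minimal sketch in Python that assumes a full factorial layout, in which every level of every factor is crossed. The level labels (such as "fertilizer_A" or "low") are placeholders, since the article gives only the number of levels per factor.

```python
from itertools import product

# Factors and their levels. The labels are hypothetical; only the counts
# (2 x 2 x 3 x 3) come from the article.
factors = {
    "fertilizer": ["fertilizer_A", "fertilizer_B"],
    "variety": ["variety_1", "variety_2"],
    "compaction": ["low", "medium", "high"],
    "watering": ["low", "medium", "high"],
}

# A full factorial design crosses every level of every factor.
design = [dict(zip(factors, combo)) for combo in product(*factors.values())]

print(len(design))  # 36 treatment combinations
print(design[0])    # the first combination in the list
```

Because each fertilizer appears with every combination of the other three factors, a difference in average yield between the fertilizers cannot be explained by variety, compaction or watering.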

Experiments among consumers

Why don't we conduct experiments among consumers? We do! A taste test is one example and another is conjoint analysis in which a choice experiment is administered to a sample of consumers. In conjoint, the treatments are product features shown to respondents in experimentally-designed combinations and sequences. Respondents choose which product they prefer in each set of combinations (tasks) they are shown.

Experimental research can be expensive, however, and this reduces how often it is used. Experiments have another limitation - they are artificial and it may not be clear how well our conclusions can be generalized to real-world conditions. Furthermore, rough directional implications may be all we need to make our decisions and it might not be necessary to make strictly scientific inferences regarding causation. This is one reason a sizable chunk of marketing research is purely qualitative.

Conclusions about causation

However, there are situations in which we do draw conclusions about causation, at least implicitly, and these conclusions play a central role in our decision-making. As noted, most often we do this with observational data in marketing research, not with experiments. Examples are plentiful and include cross tabulations of selected questions with respondent demographics, preferred brand, purchase frequency and attitudes. We do this when only knowing the "what" is not enough and we want to try to understand why consumers are behaving as they do.

We make our causal deductions based upon associations in these situations. But there are risks to this. "Correlation does not imply causation" is drilled into future statisticians in the classroom when they are cautioned about the post hoc ergo propter hoc fallacy, an example of which is concluding that the rooster's crowing caused the sun to rise. Four main reasons can be responsible for an association between one variable and another: causation, chance, bias and confounding.


As the word suggests, causation means that one variable causes or influences another and that there is a cause-and-effect relationship between two variables. We may claim explicitly, for example, that we believe some consumers are buying Brand X because they trust its quality or we may only imply that such a causal relationship exists.


Chance associations are flukes (i.e., they occur by chance alone). Significance testing provides guidelines we can use to diminish the risk of making a causal connection that is actually the result of sampling error. Inferential statistics, as many readers know, is a lengthy subject and several assumptions come into play. Even when these assumptions, such as probability sampling and measurement without error, are met, sample size has a major impact on our calculations. Trivial differences may be flagged as statistically significant if our sample size is very large. Conversely, with small samples, large and substantive differences may not meet conventional cutoffs (e.g., 5 percent) and be deemed not statistically significant. Inferential statistics can only reduce the risk of being fooled by chance. It is also employed in experimental research.
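The effect of sample size on significance can be illustrated with a short calculation. The sketch below uses a pooled two-proportion z-test, which is one common choice and not a method the article prescribes. The percentages and sample sizes are invented for illustration.

```python
import math

def two_proportion_p_value(p1, p2, n1, n2):
    """Two-sided p-value for a pooled two-proportion z-test."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical case 1: a trivial 1-point gap with very large samples.
large_n = two_proportion_p_value(0.51, 0.50, 50_000, 50_000)

# Hypothetical case 2: a substantial 15-point gap with small samples.
small_n = two_proportion_p_value(0.60, 0.45, 40, 40)

print(f"1-point gap, n = 50,000 per group: p = {large_n:.4f}")
print(f"15-point gap, n = 40 per group:    p = {small_n:.4f}")
```

Under these assumed figures the 1-point gap falls below the 5 percent cutoff while the 15-point gap does not, which is why the size of a difference should be judged separately from its p-value.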


Bias is a thorny topic and can influence both experimental and observational research. In survey research, for instance, our respondents may differ substantially and systematically from our target population in ways that distort conclusions we make about them. Bias can be a very serious problem and safeguards must be put in place to reduce the possibility that bias is contaminating our research.


Confounding is often very hard to spot. A confounder is a variable that is associated with the apparent cause and is itself a true cause of the outcome, so the variable we actually measured only appears to influence that outcome. As an example, imagine a hypothetical correlation between pizza consumption and traffic accidents. How could eating pizza cause traffic accidents? One plausible explanation is that pizza is frequently consumed in tandem with alcohol. A variable we hadn't thought of (alcohol) is correlated with pizza consumption and is the true cause of the increased risk of accidents. Pizza is guilty by association!
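The pizza example can be simulated to show how an association appears and then vanishes once the hidden variable is taken into account. This is a minimal sketch in which every probability is made up: alcohol raises both the chance of eating pizza and the chance of an accident, while pizza itself has no effect on accidents.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

def simulate(n=100_000):
    """Generate (alcohol, pizza, accident) records; all rates are hypothetical."""
    rows = []
    for _ in range(n):
        alcohol = rng.random() < 0.30
        # Pizza is more likely when drinking, but has no effect on accidents.
        pizza = rng.random() < (0.70 if alcohol else 0.20)
        # Accident risk depends on alcohol only.
        accident = rng.random() < (0.10 if alcohol else 0.02)
        rows.append((alcohol, pizza, accident))
    return rows

def accident_rate(rows, pizza, alcohol=None):
    """Accident rate for a pizza group, optionally within one alcohol group."""
    subset = [r for r in rows
              if r[1] == pizza and (alcohol is None or r[0] == alcohol)]
    return sum(r[2] for r in subset) / len(subset)

rows = simulate()

# Ignoring alcohol, pizza eaters look more accident-prone.
crude_gap = accident_rate(rows, True) - accident_rate(rows, False)

# Comparing like with like, the gap should be close to zero in each group.
gap_drinkers = accident_rate(rows, True, True) - accident_rate(rows, False, True)
gap_non_drinkers = accident_rate(rows, True, False) - accident_rate(rows, False, False)

print(f"crude gap:               {crude_gap:.3f}")
print(f"gap among drinkers:      {gap_drinkers:.3f}")
print(f"gap among non-drinkers:  {gap_non_drinkers:.3f}")
```

Splitting the data by the suspected confounder, as done here, is the simplest form of adjustment. It only works when the confounder has been thought of and measured, which is the hard part in practice.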

Admittedly, the foregoing is a silly example but hopefully will demonstrate how badly we can be led astray by mere associations. When experimentation is not possible or required, statistical control is often used as a compromise. Statistical control employs multivariate analysis to simultaneously adjust for the possible effects of exogenous variables such as respondent demographics and prior category usage. Propensity score analysis is an extension of this idea that is gaining popularity in marketing research, though it should be stressed that neither statistical control nor propensity score analysis should be considered substitutes for experimentation.

Interactions and multicollinearity

Interactions and multicollinearity are two other subjects related to our discussion, so I'd like to briefly mention them. Each of these topics fills many textbooks.

An interaction is present when the relationship between two variables depends on a third variable. For instance, we may observe that category usage declines with age but much more so among women than men. This result would suggest an age-by-gender interaction is present.
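A simple way to see the age-by-gender interaction is to compare the age-related decline within each gender. The cell means below are invented purely for illustration, and the age bands are assumptions as well.

```python
# Hypothetical mean category usage (occasions per month) by gender and age group.
usage = {
    ("women", "18-34"): 8.0,
    ("women", "55+"): 3.0,
    ("men", "18-34"): 7.0,
    ("men", "55+"): 6.0,
}

def age_decline(gender):
    """Drop in mean usage from the younger to the older group for one gender."""
    return usage[(gender, "18-34")] - usage[(gender, "55+")]

decline_women = age_decline("women")
decline_men = age_decline("men")

# The interaction contrast is the difference between the two declines.
# A value of zero would mean age has the same effect for both genders.
interaction = decline_women - decline_men

print(decline_women, decline_men, interaction)
```

With real survey data this contrast would be estimated with sampling error, so it would also need to pass the chance and business-relevance checks discussed earlier.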

Multicollinearity, meaning highly correlated predictors (independent variables), can lead to unstable or nonsensical results. When the correlations are very high it becomes difficult to isolate the effects of the individual predictors, and when predictors are perfectly correlated it isn't mathematically possible because any number of solutions fit the data equally well. Multicollinearity can be a serious complication in key driver analysis, an example being customer satisfaction research where we try to uncover the variables that most impact overall satisfaction with a company.
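One widely used diagnostic for this problem, which the article does not name, is the variance inflation factor (VIF). In the special case of a model with exactly two predictors it reduces to 1 / (1 - r^2), where r is the correlation between them. The sketch below covers only that two-predictor case, and the function name is my own.

```python
def vif_two_predictors(r):
    """Variance inflation factor for a model with exactly two predictors
    whose correlation is r: VIF = 1 / (1 - r**2)."""
    if abs(r) >= 1:
        raise ValueError("perfectly correlated predictors cannot be separated")
    return 1.0 / (1.0 - r ** 2)

# Show how the inflation grows as the predictors become more alike.
for r in (0.0, 0.5, 0.9, 0.99):
    print(f"r = {r:.2f}  VIF = {vif_two_predictors(r):.1f}")
```

The VIF is the factor by which the variance of a coefficient estimate is inflated relative to uncorrelated predictors, so values climbing toward infinity as r approaches 1 mirror the point that the separate effects eventually cannot be isolated at all.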

Concluding thoughts

These concepts can be difficult to grasp at first but a basic understanding of them is essential to sound research. "Sound" does not mean perfect, however, as any research in any field will have flaws. Research, like most things in life, requires trade-offs, and we should define our objectives concretely and realistically during the planning phase. Though our discussion has highlighted quantitative consumer survey research, the fundamental issues we've covered apply to any kind of research, including qualitative research and Big Data analytics. It's vital that we appreciate the strengths and limitations of observational research versus experimentation when we are designing our research or interpreting data already collected.

In marketing research it usually is not obligatory to prove a causal relationship, and it can be argued this is seldom feasible. Often it will be enough to treat the results as exploratory findings that may suggest some marketing action, but we should not fall into the trap of making important decisions based on flimsy evidence. Moreover, we should remember that even if our client agrees 100% with our conclusions and recommendations, this doesn't necessarily mean we've made the right ones! While certainly encouraging, our client's endorsement does not constitute validation.

Some homework for you

To begin putting these concepts into action, give the following exercise a try.  Pull out a report you've already prepared and delivered to your client, preferably one that's at least six months old so you'll be able to look at it from a relatively fresh perspective. Read through it and take note of any causal links you've made in your analyses, even tacitly. There probably will be quite a few! For each one you've spotted, ask yourself these questions:

  1. Was the association real and not just a fluke?
  2. Was it strong enough to be meaningful from a business point of view?
  3. Did I really have evidence to support the causal link I made? Were there other factors that might have caused this association instead?
  4. Is my conclusion consistent with other information I have? With theory? With common sense?
  5. Did I draw the right business implications from this finding?

I've tested this exercise on myself several times and...ouch!


  1. An earlier version of this article, entitled "Forget exact science: Drawing conclusions from observational research," appeared in Quirk's on July 22, 2013.
  2. While some marketing researchers may associate the term observational research with ethnography, its meaning is actually much broader.
  3. Correlational research is another term sometimes used to mean observational or non-experimental research. This term is ambiguous, however, so I personally tend to avoid it. For example, correlational research might also imply correlation analysis of experimental data.
  4. There is quasi-experimental research, in addition. See for a brief overview of quasi-experimental research and propensity score analysis.
  5. I offer a few tips and guidelines on how to think like a researcher in Quant or qual, let’s go back to the basics:

Bio: Kevin Gray is president of Cannon Gray, a marketing science and analytics consultancy.

Original. Reposted with permission.