Developing a research hypothesis

Steve Simon



Dear Professor Mean, I want to do some research, but the hospital won’t approve anything until I have a protocol with a research hypothesis. I’m not sure why a research hypothesis is important. Can you help? -- Little Linda

Dear Little,

Think of it as job security for your local statistician.

Short answer

A research hypothesis provides clarity. A problem has to be stated clearly before it can be solved. The research hypothesis will also provide direction for writing the rest of your protocol.

There are several steps that you should follow:

  1. Identify the four components that most research hypotheses have.
  2. Select between a one sided and a two sided hypothesis.
  3. Use your hypothesis to guide the writing of your research protocol.

Stating a hypothesis

Ideally, your research hypothesis should be specified prior to the collection of any data. An exception would be an exploratory study. For example, if you are investigating the cause of poor morale among health care providers, you may not have enough information to state anything more precise than a broad range of factors that might influence morale.

In general, a hypothesis will have four major components. Not every hypothesis can be fit into this framework, of course, but knowledge of these four components might help you if you have an incompletely formed hypothesis.

The first component is the subject group. In other words, who are you interested in studying? Subjects could be patients, their parents, or the health care providers.

The second component is the treatment or exposure. In other words, what is being done to part or all of your subject group? A treatment implies an action on your part, such as providing information or applying a new therapy. An exposure, on the other hand, implies some action that you do not control, such as lead poisoning or premature birth.

The third component is the outcome measure. In other words, how will the effect of the treatment or exposure be assessed? It is very important that the outcome measure be defined precisely and unambiguously. For example, if your outcome is breast feeding rates, you should use standard definitions of breast feeding, such as those provided by the World Health Organization.

The fourth component is the control group. In other words, who are you comparing to? It is important for the control group to be as similar as possible to those who receive a treatment or exposure.

As mentioned earlier, not every research hypothesis will have all four components. For example, a cross-over design involves applying both a new treatment and a standard treatment to the same patients. For this study, the hypothesis would not involve a separate control group. Correlational studies look at relationships within a single group, such as a study of the factors that cause medication errors. This type of study would not have a treatment/exposure. The structure mentioned here, however, is still useful for developing most research hypotheses.

[]{#OneSided}One sided versus two sided hypotheses

During the planning of your research, you need to specify whether you plan to use a one sided or two sided hypothesis. A two sided hypothesis states that there is a difference between the treatment/exposure group and the control group, but does not specify in advance what direction you think this difference will be. A one sided hypothesis states a specific direction (e.g., increase).

If you expect that a change in either direction is possible and that changes in either direction are interesting, then you should use a two sided hypothesis.

If changes in one direction are uninteresting and unpublishable, then use a one sided hypothesis. Also, if a change in the unexpected direction is equivalent in practice to no change, then use a one sided hypothesis.

The best example of this is a comparison of a new therapy to an existing therapy where the new therapy is much more expensive. Your only concern is to show that the new therapy is better. If it turns out that the new therapy is equal to or worse than the standard therapy, you will not adopt it.
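Here is a minimal sketch of how the choice plays out at analysis time. It assumes, purely for illustration, that your analysis boils down to a z statistic where positive values favor the new therapy. The function name `p_values` and the example value of 1.8 are made up for this sketch and do not come from a real study.

```python
from statistics import NormalDist


def p_values(z):
    """Return (two sided, one sided) p-values for a z statistic.

    Assumption: positive z favors the new therapy, and the one sided
    hypothesis is "the new therapy is better".
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    two_sided = 2 * (1 - std_normal.cdf(abs(z)))
    one_sided = 1 - std_normal.cdf(z)
    return two_sided, one_sided


# A modest result in the expected direction.
two, one = p_values(1.8)
print(f"z = +1.8: two sided p = {two:.3f}, one sided p = {one:.3f}")

# The same size of result in the unexpected direction.
two, one = p_values(-1.8)
print(f"z = -1.8: two sided p = {two:.3f}, one sided p = {one:.3f}")
```

Notice that with a one sided hypothesis, a result in the unexpected direction produces a large p-value no matter how extreme it is. That is exactly the "equivalent in practice to no change" situation described above.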

Some important issues involving the control group

With a treatment, where you intervene, it is often possible to select those patients who receive the treatment through the use of randomization. Randomization promotes comparability because, on average, subjects who are randomly selected to receive the treatment will be comparable to subjects who do not receive the treatment.
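As a rough illustration, simple randomization into two equal sized groups could be sketched like this. The function name `randomize`, the subject ID numbers, and the seed are all hypothetical. A real trial would typically use a more careful scheme (blocking, stratification, concealed allocation) worked out with your statistician.

```python
import random


def randomize(subject_ids, seed=None):
    """Randomly split subjects into a treatment group and a control group.

    This is plain (unblocked, unstratified) randomization, shown only
    as a sketch. The seed makes the assignment list reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }


# Twenty hypothetical subjects, numbered 1 through 20.
groups = randomize(range(1, 21), seed=2024)
print("Treatment:", groups["treatment"])
print("Control:  ", groups["control"])
```

Because chance alone decides who ends up in each group, neither you nor the patient can steer the sicker (or healthier) subjects toward one arm of the study.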

When you have an exposure instead, it is often difficult to ensure that the subjects without the exposure are comparable to the exposed subjects. Sometimes matching will help, but you should only use matching for very important prognostic variables. For example, birth weight plays a major role in infant mortality, so it is often helpful to match your exposure group to your control group on the basis of birth weight. Matching, however, will often present difficult logistics, especially when the pool of control subjects is not much larger than the pool of exposed subjects.
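To see why a small control pool causes trouble, here is a sketch of one simple way to match on birth weight. It pairs each exposed infant with the closest unused control, but only if the two weights are within a tolerance (a "caliper"). The greedy approach, the 100 gram caliper, and all of the subject IDs and weights are assumptions chosen for illustration. They are not a recommendation for your study.

```python
def match_on_birth_weight(exposed, controls, caliper=100):
    """Greedily pair each exposed subject with the nearest unused control.

    exposed, controls: dicts mapping subject ID -> birth weight in grams.
    caliper: largest acceptable difference in grams (an assumed value).
    Returns a dict mapping exposed ID -> matched control ID.
    Exposed subjects with no control inside the caliper stay unmatched.
    """
    available = dict(controls)  # copy, so the caller's dict is untouched
    pairs = {}
    for exposed_id, weight in exposed.items():
        if not available:
            break  # the control pool is used up
        nearest = min(available, key=lambda c: abs(available[c] - weight))
        if abs(available[nearest] - weight) <= caliper:
            pairs[exposed_id] = nearest
            del available[nearest]  # each control is used at most once
    return pairs


# Hypothetical data: three exposed infants, three potential controls.
exposed = {"E1": 2500, "E2": 3200, "E3": 1500}
controls = {"C1": 2550, "C2": 3150, "C3": 3900}

pairs = match_on_birth_weight(exposed, controls)
print("Matched pairs:", pairs)
print("Unmatched exposed:", sorted(set(exposed) - set(pairs)))
```

In this made up example the low birth weight infant has no suitable control. With a control pool no larger than the exposed group, you should expect to lose subjects this way.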

What are your next steps?

Other important issues to be considered in your protocol are

  1. determination of the sample size,
  2. identification of potential confounding variables, and
  3. what efforts at blinding will be used, if any.

Once you have a well defined research hypothesis, though, these details will fall into place. Hah, hah, did I really say that? The rest of the protocol is still pretty darn hard, but it would have been impossible if you didn’t have that research hypothesis.

To determine an appropriate sample size, you need a research hypothesis, an estimate of the standard deviation of your outcome measure, and an assessment of how much change is considered clinically relevant. Hey, you’re already a third of the way there! Finding a standard deviation requires either reviewing previous research on that outcome measure or running a pilot study. The clinically relevant difference is a judgment that is based solely on medical knowledge. Your statistician cannot tell you what a clinically relevant difference would be.
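Here is a rough sketch of how those three ingredients combine. It uses the standard normal approximation for comparing the means of two equal sized groups, n = 2 (z_alpha + z_beta)^2 sd^2 / difference^2 per group. The function name, the defaults of a 5% significance level and 80% power, and the example numbers are all assumptions for illustration. Your statistician may well use a more refined calculation.

```python
import math
from statistics import NormalDist


def sample_size_per_group(sd, difference, alpha=0.05, power=0.80,
                          two_sided=True):
    """Approximate sample size per group for comparing two means.

    sd: estimated standard deviation of the outcome measure.
    difference: the clinically relevant difference.
    Uses the normal approximation, so it slightly understates the
    sample size needed for a t-test in small studies.
    """
    std_normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_beta = std_normal.inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * sd / difference) ** 2
    return math.ceil(n)  # round up to a whole subject


# Hypothetical inputs: standard deviation of 10 units,
# clinically relevant difference of 5 units.
print("Two sided:", sample_size_per_group(sd=10, difference=5))
print("One sided:", sample_size_per_group(sd=10, difference=5,
                                          two_sided=False))
```

With these made up numbers, the two sided hypothesis needs 63 subjects per group and the one sided hypothesis needs 50. Halving the clinically relevant difference would roughly quadruple both numbers, which is why that medical judgment matters so much.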

Confounding variables are those variables which are related to your outcome measure and which may differ between your treatment/exposure group and your control group. Assessment of potential confounding variables is especially important when you cannot randomize.

Blinding means hiding information about the treatment/exposure from the patients, their parents, and any health care professional who interacts with the patients and their parents. Blinding is useful when it can be done, but blinding is not always possible. For example, in a comparison of a rectally administered drug to an orally administered drug, patients usually figure out quickly which group they are in. But even when the patients themselves know which group they are assigned to, you can sometimes still use blinding for laboratory personnel and for interviewers.


Little Linda needs to include a research hypothesis in her grant proposal, but doesn’t know what it should say. Professor Mean explains that you should develop a hypothesis to give your research clarity. There are four components in most research hypotheses:

  1. a subject group,
  2. a treatment or exposure,
  3. an outcome measure, and
  4. a control or comparison group.

Other important issues to keep in mind while developing a research hypothesis:

  1. Use a one sided hypothesis when changes in the opposite direction are uninteresting.
  2. Randomization helps ensure that you have a comparable control group.
  3. Use the research hypothesis to guide the determination of sample size, the identification of confounding variables, and the efforts to blind information.

Annotated Bibliography

This site provides information about evidence-based medicine, but much of the material is still relevant to developing research protocols. The four components to a research hypothesis come from this site.

Massey, V.H. (1995) Nursing Research, Second Edition, Springhouse PA: Springhouse Corporation

This book provides a “how to” framework for conducting research in an easy to skim outline format. The book includes topics on ethics, literature review, sampling techniques, data analysis, and presentation of research results. The sections that deal with planning are the best parts of this book.

Lang, T.A. and Secic, M. (1997) How to Report Statistics in Medicine. Annotated Guidelines for Authors, Editors, and Reviewers, Philadelphia, PA: American College of Physicians.

It seems ironic to recommend a book on writing the final results, but it helps to start out with your goal in mind. If you think about the information that belongs in your research paper, then you will have a good idea of what you need to specify during the planning stages of your research. This book also uses an easy to skim outline format, but it has significant narrative text under each outline element.

You can find an earlier version of this page on my original website.