Non-random samples

Steve Simon


[StATS]: Non-random samples (March 25, 2005)

Someone sent me an email asking about a project that involved interviews of women at higher levels of management in an organization. This is a rather small group, and might require a non-random selection process. What are the limitations of a non-random sample?

It helps to remember the definition of a random sample. This is a subset of a population chosen so that every individual in the population has an equal probability of being in the subset. (Strictly speaking, a simple random sample also requires that every possible subset of the same size be equally likely.) This is an ideal that is almost never achieved, even with the most careful research.
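To make the definition concrete, here is a minimal sketch in Python of drawing a simple random sample from a complete list of the population. The list of 40 managers, the sample size of 10, and the function name simple_random_sample are all made up for illustration; the only real requirement is that you can enumerate everyone in the population before you draw.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every member is equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, n)

# Hypothetical sampling frame: a complete list of the 40 women in upper management.
population = [f"manager_{i:02d}" for i in range(1, 41)]

# Each woman has a 10/40 = 25% chance of ending up in the sample.
sample = simple_random_sample(population, 10, seed=1)
print(sample)
```

The hard part in practice is rarely the drawing itself. It is getting a complete list, and then getting everyone you selected to participate.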

If your sample is not random, you have to ask yourself, “What type of women would be left out, or would have a much lower probability of being in my sample?” It could be that women who work at remote sites, work at home, or work odd hours are the ones you would have little or no chance of selecting.

Next, you have to make an assessment (usually a subjective assessment) of how the outcome measure might change in the group of women who are underrepresented or unrepresented in your sample.

In medicine, we often require some level of stability in the lives of the patients that we study. They have to live in the same place for at least six months or a year. So patients who move a lot are much less likely to be in our sample. What sort of person moves a lot? Often (but not always) it is someone who is living on the lower rungs of the socioeconomic ladder. Adults in this group are often marginally employed and have to go where the jobs are. Children may have to move because they are shuttled from relative to relative as a result of the difficult economic circumstances of their caregivers.

When you tend to exclude patients in poorer socioeconomic circumstances, you are excluding patients who have, in general, a poorer prognosis. As a result, you may end up looking at your results through rose-colored glasses: the outcomes in your sample will look better than they would in the full population.
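A little arithmetic shows how large the rose-colored effect might be. The sketch below, in Python, uses numbers that are entirely hypothetical: I assume that 30% of the population moves frequently, that the rate of good outcomes is 80% among stable patients and 60% among frequent movers, and that a residency requirement cuts the frequent movers down to 5% of the sample. Your own subjective assessments would replace these values.

```python
def weighted_mean(groups):
    """Average outcome across groups, given (share, mean_outcome) pairs."""
    total_share = sum(share for share, _ in groups)
    return sum(share * mean for share, mean in groups) / total_share

# Hypothetical rates of a good outcome in each group (assumed, not real data).
STABLE_RATE = 0.80
MOVER_RATE = 0.60

# In the full population, assume 70% are stable and 30% move frequently.
population_rate = weighted_mean([(0.70, STABLE_RATE), (0.30, MOVER_RATE)])

# With a residency requirement, assume movers make up only 5% of the sample.
sample_rate = weighted_mean([(0.95, STABLE_RATE), (0.05, MOVER_RATE)])

# A positive value means the sample paints too optimistic a picture.
bias = sample_rate - population_rate

print(f"population: {population_rate:.2f}, sample: {sample_rate:.2f}, bias: {bias:+.2f}")
```

Under these assumptions the sample overstates the rate of good outcomes by about five percentage points. Trying a few different values for the underrepresented group is a simple way to see how sensitive your conclusions are to the people you could not reach.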

Whether you choose a random or a non-random sample depends on the balance between the difficulty and cost of obtaining a random sample and the limitations that you have to accept with a non-random sample. This is often a difficult choice, and it depends quite a bit on your subjective assessments and your values. Of course, when a random sample is impossible, your choice is made for you.

I have some definitions of various types of samples on my web pages.

You can find an earlier version of this page on my website.