Someone wrote into the MedStats listserv asking about a process that they had chosen to select “important” articles in a particular research area. This was, I presume, a qualitative summary of interesting results in a broad medical area rather than a quantitative synthesis of all available research addressing a specific medical treatment. The reason I suspect this is that the person mentioned that they had used the statistical significance of the studies as a filter and eliminated any negative studies from further consideration.
The normal goal in any systematic overview is to find all the available research, not just the research that meets some statistical criteria. To select only positive studies in a systematic overview is a guarantee to get biased results. In fact, the folks who do systematic overviews go to great lengths to avoid selecting studies with a particular statistical outcome, going as far as to blind the reviewers as to the results section while they are determining if a study fits the entry criteria.
If the goal is to identify important findings, then filtering out the negative results is still a bad idea, as several people pointed out. Sometimes the negative studies are very important as well.
So how should you go about identifying important studies? There are several definitions of "important" but most of them require human judgment. One possible quantitative definition would be that a study is important if a systematic overview without that study reaches a different conclusion than a systematic overview that includes that study. In other words, a study that turns the consensus from efficacy to lack of efficacy would be important (as would be the reverse). By this definition, a study with statistical significance could be important (the first definitive proof of a new treatment), or it could be trivial (yet another study supporting an already accepted new treatment). A study without statistical significance could be important (the first real evidence that a well-accepted practice might be worthless) or it could be trivial (yet another nail in the coffin of a treatment that most people have already abandoned). I mention efficacy here, but you can substitute safety and get pretty much the same results.
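This definition amounts to a leave-one-out sensitivity analysis on a meta-analysis. Here is a minimal sketch of that idea, using a simple fixed-effect inverse-variance pooling of hypothetical log odds ratios; the function names and the data in the usage example are invented for illustration, and a real systematic overview would involve far more than this calculation.

```python
import math

def pooled_estimate(effects, ses):
    """Fixed-effect inverse-variance pooled effect and its standard error."""
    weights = [1.0 / se**2 for se in ses]
    total = sum(weights)
    est = sum(w * e for w, e in zip(weights, effects)) / total
    return est, math.sqrt(1.0 / total)

def is_significant(est, se, z=1.96):
    """True if the 95% confidence interval excludes zero (no effect)."""
    return est - z * se > 0 or est + z * se < 0

def influential_studies(effects, ses):
    """Indices of studies whose removal flips the overall conclusion."""
    base = is_significant(*pooled_estimate(effects, ses))
    flips = []
    for i in range(len(effects)):
        rest_e = effects[:i] + effects[i + 1:]
        rest_s = ses[:i] + ses[i + 1:]
        if is_significant(*pooled_estimate(rest_e, rest_s)) != base:
            flips.append(i)
    return flips

# Hypothetical data: one large positive study and two small equivocal ones.
# Dropping the first study flips the pooled result from significant to not.
print(influential_studies([0.5, 0.1, 0.05], [0.2, 0.3, 0.3]))  # [0]
```

By this criterion, the first study would count as important and the other two would not, regardless of the significance of each study on its own.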
So by that logic, it appears that using ANY filter on statistical significance is going to be worthless. Instead, a filtering process should only incorporate scientific and medical considerations. Possibly you could include sample size in your filter, since any study with ten patients total is unlikely to shift the research consensus. You could also filter out those studies that use surrogate outcomes. Both of these filters have their own problems, of course, but they are more defensible than the filter of statistical significance.
You could also use other people’s judgments as a filter. If it gets reported in the New York Times, it is probably important. Or you could see how often the article is blogged. Or you could only include articles that are published in journals that have an impact factor above a certain threshold. This also leads to problems with bias, most notably an English language bias, but you can still mount a defense of these approaches.
I suspect, though, that there is no automated way to screen out unimportant studies from important studies, other than reading all the papers and making a subjective judgment yourself.
You can find an earlier version of this page on my original website.