PMean: And the least important variable is…

Steve Simon


Categories: Blog post
Tags: Human side of statistics, Linear regression

I heard a story a long time ago

A statistician was asked to analyze some data about an industrial process, and there were about a dozen or so independent variables that affected the outcome. So the statistician did some sort of stepwise regression or R-squared calculation and came up with an ordering for all the independent variables. The most important variable was the one with the largest correlation or the first variable entered in the stepwise model (I’m not sure which

The statistician reviewed each variable in order starting with the most important variable. It was rather dull

At this point the engineers in the room burst into laughter. It turns out that water was the most important variable. If you had even a small amount of water in the raw material

If a variable has very little variability in it by design
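
A quick simulation can make the restriction of range idea concrete. This is only a sketch with made-up numbers, not data from the story: the slope of -20, the noise level, the two ranges for water content, and the function name `water_outcome_correlation` are all assumptions chosen for illustration. The same strong effect of water is simulated twice, once with water held in a very narrow range and once with water allowed to vary widely.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def water_outcome_correlation(water_max, n=5000, seed=0):
    """Simulate an outcome that depends strongly on water content.

    Hypothetical model: outcome = -20 * water + other influences + noise.
    Only the range of water (0 to water_max) changes between runs;
    the true effect of water is identical in both.
    """
    rng = random.Random(seed)
    water = [rng.uniform(0.0, water_max) for _ in range(n)]
    outcome = [
        -20.0 * w + rng.gauss(0.0, 1.0) + rng.gauss(0.0, 1.0)
        for w in water
    ]
    return pearson(water, outcome)


# Water tightly controlled by design: almost no variability.
r_restricted = water_outcome_correlation(water_max=0.01)

# Water allowed to vary across a wide range.
r_wide = water_outcome_correlation(water_max=1.0)

print(f"correlation with restricted range: {r_restricted:.3f}")
print(f"correlation with wide range:       {r_wide:.3f}")
```

In the restricted run the correlation comes out close to zero even though water has exactly the same effect on the outcome as in the wide run, so a ranking based on correlation alone would put water near the bottom of the list.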

Now whenever I hear a story like this

How do you avoid saying something so stupid that everyone laughs at you? Well

What do you do if you recognize that you have a restriction of range problem? Well

And if anyone knows the source of this story or can point me to a reference