Poisson regression model

Steve Simon

1999-09-21

This page is currently being updated from the earlier version of my website. Sorry that it is not yet fully available.

Dear Professor Mean

Dear Denied,

I always distrust reviewers who insist on a specific statistical method. They probably used this technique for their dissertation and think that everyone else should follow their pioneering lead. As the saying goes, when your only tool is a hammer, everything looks like a nail.

Is your data a nail or not? Well

Poisson distributions have three special problems that make traditional (i.e.

  1. The Poisson distribution is skewed; traditional regression assumes a symmetric distribution of errors.
  2. The Poisson distribution is non-negative; traditional regression might sometimes produce predicted values that are negative.
  3. For the Poisson distribution

In contrast

Alternatives to Poisson regression

There are at least two good alternatives to the Poisson regression model. The negative binomial distribution is also a good model for counts, and you can derive it quite naturally as an extension of the Poisson distribution: placing a gamma prior distribution on the mean parameter of the Poisson yields a negative binomial. The negative binomial distribution has a variance that is larger than the mean. In contrast

There are also models that incorporate Poisson probabilities but then allow the probability of a zero to be a bit larger or a lot larger than what Poisson might determine. These are sometimes called ZIP (Zero Inflated Poisson) models. Think of this as a mixture distribution where you choose zero with a certain probability and a Poisson random variable otherwise. This is also a quite natural extension of the Poisson distribution.
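The mixture described above can be sketched in a few lines of Python. This is only an illustration, not a fitting procedure: the function names, the parameter name `extra_zero_prob`, and the values `mu = 2.0` and `extra_zero_prob = 0.3` are all assumptions chosen for the example.

```python
import math

def poisson_pmf(k, mu):
    """Ordinary Poisson probability of observing the count k."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

def zip_pmf(k, mu, extra_zero_prob):
    """Zero-inflated Poisson probability of observing the count k.

    With probability extra_zero_prob the outcome is forced to zero;
    otherwise the count is drawn from a Poisson distribution with mean mu.
    (extra_zero_prob is a hypothetical name used for this sketch.)
    """
    poisson_part = (1.0 - extra_zero_prob) * poisson_pmf(k, mu)
    if k == 0:
        return extra_zero_prob + poisson_part
    return poisson_part

# Illustrative values only, not taken from any data set.
mu, extra_zero_prob = 2.0, 0.3

p0_poisson = poisson_pmf(0, mu)
p0_zip = zip_pmf(0, mu, extra_zero_prob)
total = sum(zip_pmf(k, mu, extra_zero_prob) for k in range(60))

print(f"P(0) under Poisson: {p0_poisson:.4f}")
print(f"P(0) under ZIP:     {p0_zip:.4f}")
print(f"ZIP probabilities sum to about {total:.6f}")
```

The probability of a zero is always at least as large under the ZIP model as under the plain Poisson with the same mean parameter, which is the whole point of the extension.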

If you’re just starting out

Further reading

Poisson regression is a special case of the Generalized Linear Model. This model deserves the name “Generalized” because it also includes traditional regression and logistic regression under its umbrella. If you want to understand Poisson regression

The classic reference book is McCullagh P. and Nelder

Can you use SAS or SPSS?

SAS has a procedure GENMOD that will compute a generalized linear model. SPSS does not yet have a module for generalized linear models, but can fit a Poisson regression using the GENLOG procedure. There are a few tricks that you need to worry about in SPSS if your independent variable is continuous or if you have zero counts for some of your data. Details can be found at the SPSS web site:

http://www.spss.com/tech/answer/result.cfm?tech_tan_id=100006204

What if my data is a rate and not a count?

Poisson regression can also be used to analyze rate data. Rates are simply counts divided by a measure like area or time. For example, infection rates are often measured as a number per patient day of exposure. To fit a model using rates

Single group count

Suppose we have a hospital floor and we count a total of 25 nosocomial infections in a month. Our best estimate

For the Poisson distribution

Second

There are at least two other approaches, based on research by Daly ([PubMed citation](http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=1424580&dopt=Citation)) and Byar (I could not find a good citation).

Single group rate

Let’s make the example a bit more complicated. Suppose there were a total of 500 patient days of exposure during that month. Then the rate of nosocomial infections would be 0.05 per patient day or 50 per thousand patient days. We can get confidence intervals for this rate by simply adjusting the confidence intervals above in proportion. So the exact confidence interval for the rate would be

16.2 / 500 = 0.0324 or 32 per thousand patient days

and

36.9 / 500 = 0.0738 or 74 per thousand patient days.
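If you want to check these numbers yourself, here is a minimal Python sketch. It assumes the "exact" interval is the usual one obtained by inverting the Poisson tail probabilities, solved here by simple bisection with the standard library only. The function names are my own for this illustration.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total

def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi]; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def exact_poisson_ci(count, alpha=0.05):
    """Confidence limits for a Poisson mean, given one observed count."""
    # Lower limit: the mean for which P(X >= count) equals alpha / 2.
    if count == 0:
        lower = 0.0
    else:
        lower = bisect(
            lambda mu: 1.0 - poisson_cdf(count - 1, mu) - alpha / 2.0,
            0.0, float(count))
    # Upper limit: the mean for which P(X <= count) equals alpha / 2.
    search_limit = count + 10.0 * math.sqrt(count + 1.0) + 10.0
    upper = bisect(
        lambda mu: poisson_cdf(count, mu) - alpha / 2.0,
        float(count), search_limit)
    return lower, upper

infections = 25       # count from the example above
patient_days = 500    # exposure from the example above

lower, upper = exact_poisson_ci(infections)
print(f"Count: {lower:.1f} to {upper:.1f}")
print(f"Rate per thousand patient days: "
      f"{1000 * lower / patient_days:.0f} to {1000 * upper / patient_days:.0f}")
```

Dividing both limits by the 500 patient days works because the exposure is treated as a fixed constant, so it rescales the interval without changing its coverage.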

Two group counts

Suppose we have two groups and we measure a count on both groups. How would you test whether these groups are similar? To make this question more precise

and that these two variables are independent. We want to test the hypothesis

There are several approaches that work well. The normal approximation statistic

tests the hypothesis that

You can also get a normal approximation for the term

which has an approximate standard deviation of

so the test statistic

tests the hypothesis that

You can also use a conditional argument to show that

is a binomial proportion and the hypothesis

is equivalent to the hypothesis that the two counts come from the same Poisson distribution.
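The following Python sketch illustrates two of these ideas. Be warned that it rests on my own assumptions about the standard textbook forms: the normal approximation statistic is taken to be the difference in counts divided by the square root of their sum, and the conditional test treats the first count, given the total, as binomial with probability one half. The counts 25 and 40 are hypothetical and are not from any study.

```python
import math

def compare_two_counts(x1, x2):
    """Compare two independent Poisson counts under equal means.

    Returns the normal approximation z statistic, its two-sided p-value,
    and a two-sided p-value from the conditional binomial test.
    """
    total = x1 + x2

    # Normal approximation: under the null, x1 - x2 has mean zero and
    # a variance estimated by x1 + x2.
    z = (x1 - x2) / math.sqrt(total)
    p_normal = math.erfc(abs(z) / math.sqrt(2.0))

    # Conditional argument: given the total, x1 is binomial(total, 0.5)
    # under the null. Double the smaller tail and cap the result at 1.
    smaller = min(x1, x2)
    tail = sum(math.comb(total, k) for k in range(smaller + 1)) / 2 ** total
    p_exact = min(1.0, 2.0 * tail)

    return z, p_normal, p_exact

# Hypothetical counts, for illustration only.
z, p_normal, p_exact = compare_two_counts(25, 40)
print(f"z = {z:.3f}")
print(f"normal approximation p-value = {p_normal:.4f}")
print(f"conditional binomial p-value = {p_exact:.4f}")
```

The two p-values should be close when the counts are reasonably large. With small counts, the conditional binomial version is the safer choice because it does not lean on the normal approximation.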

Two group rates

Suppose you have two counts

You can rely on the normal approximation to the Poisson distribution again.

or you can use a log transformation. Interestingly

The last term is a constant and does not affect measures of variability. Thus

Finally

implies that

Conditioning on the total

Example: In a study of ankle sprains

You can compute the rates per thousand exposures as

41 / 30.724 = 1.33 and

27 / 13.767 = 1.96.

The difference in rates is 0.63 per thousand exposures and the standard error is 0.43.

http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=18523571
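The arithmetic in this example can be reproduced with a short Python sketch. It assumes the standard error comes from the normal approximation, where the variance of each Poisson count is estimated by the count itself. The variable names and the generic group labels are mine. The log rate ratio at the end is an extra illustration of the log transformation approach, using the common approximation that its variance is the sum of the reciprocals of the two counts.

```python
import math

# Counts and exposures (in thousands) from the example above.
events_1, exposure_1 = 41, 30.724
events_2, exposure_2 = 27, 13.767

rate_1 = events_1 / exposure_1
rate_2 = events_2 / exposure_2
diff = rate_2 - rate_1

# Each exposure is a fixed constant, so the variance of a rate is the
# variance of the count (estimated by the count) over exposure squared.
se_diff = math.sqrt(events_1 / exposure_1 ** 2 + events_2 / exposure_2 ** 2)
z = diff / se_diff

# Log scale: the exposures drop out of the variance entirely.
log_ratio = math.log(rate_2 / rate_1)
se_log_ratio = math.sqrt(1.0 / events_1 + 1.0 / events_2)

print(f"rates per thousand exposures: {rate_1:.2f} and {rate_2:.2f}")
print(f"difference = {diff:.2f}, standard error = {se_diff:.2f}, z = {z:.2f}")
print(f"log rate ratio = {log_ratio:.3f}, standard error = {se_log_ratio:.3f}")
```

Notice that the standard error on the log scale depends only on the two counts. This is why the exposure term can be ignored when you measure variability after a log transformation.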

Determining sample sizes

[To be added]

Further reading

Exact confidence interval for Poisson count. Tomas Aragon and Travis Porco. Accessed on October 29

Confidence Intervals for the Mean of a Poisson Distribution. P.D. M. Macdonald. Accessed on October 29

Summary

Denied Denise had a manuscript rejected. The reviewers suggested that she use Poisson regression. Professor Mean explains that you should consider using Poisson regression when you are trying to predict a count or a rate.

Resources

Determining the size of a total purchasing site to manage the financial risks of rare costly referrals: computer simulation model. M. O. Bachmann

Influence of changing travel patterns on child death rates from injury: trend analysis. C. DiGuiseppi

Generalized Linear Models. McCullagh

Risk ratio and rate ratio estimation in case-cohort designs: hypertension and cardiovascular mortality. E. G. Schouten

Criticism of a hierarchical model using Bayes factors. J. H. Albert. Statistics in Medicine 1999: 18(3); 287-305. [Medline](http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=10070675&dopt=Abstract)

Permutation Tests for Joinpoint Regression with Applications to Cancer Rates. Hyune-Ju Kim

Exact confidence interval for Poisson count. Tomas Aragon

Coping with extra Poisson variability in the analysis of factors influencing vaginal ring expulsions [letter; comment]. CG Demetrio, MS Ridout. Statistics in Medicine 1994: 13(8); 873-76. [Medline](http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=8047741&dopt=Abstract)

The comparison of two poisson-distributed observations. K Detre. Biometrics 1970: ?(?); 851-54.

Regression analyses of counts and rates: poisson

Poisson regression analysis in clinical research. F Kianifard

Maximum (Max) and Mid-P Confidence Intervals and p Values for the Standardized Mortality and Incidence Ratios. Pandurang M. Kulkarni, Ram C. Tripathi

Estimating the ratio of two Poisson rates. Robert Price. Computational Statistics & Data Analysis 2000: 34(?); 345-56.

The application of Poisson random-effects regression models to the analyses of adolescents' current level of smoking. Ohidul Siddiqui. Preventive Medicine 1999: 29(?); 92-101. [Medline](http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=10446034&dopt=Abstract)

Negative binomial and mixed Poisson regression. JF Lawless. The Canadian Journal of Statistics 1987: 15(3); 209-25.

Linear and nonlinear techniques for the deconvolution of hormone time-series. G. De Nicolao

Additional power computations for designing comparative Poisson trials. C. C. Brown

Power computations for designing comparative poisson trials. M Gail. Biometrics 1974: 30(?); 231-37.

A more powerful test for comparing two Poisson means. K. Krishnamoorthy

Power in comparing Poisson means: I. One-sample test. LS Nelson. Journal of Quality Technology 1991: 23(1); 68-70.

Power in comparing Poisson means: II. Two-sample test. LS Nelson. Journal of Quality Technology 1991: 23(2); 163-66.

Sample size for Poisson regression. DF Signorini. Biometrika 1991: 78(2); 446-50.

Application of sample survey methods for modelling ratios to incidence densities. L. M. Lavange

You can find an earlier version of this page on my original website.