Displaying tables of percentages

Steve Simon

2002-11-06

Dear Professor Mean, My colleagues and I argue over the most appropriate way to display tables of percentages. Must the row or column always add to 100%? Also, in cases where it is difficult to know which variable is dependent, how does one decide the best way to present the results? – Garrulous Gail

Dear Garrulous,

When you are deciding how to display a two by two (or larger) table, you have several options. No single approach is correct all the time, and some of the choices reflect subjective judgment. But here are some rules I use.

1. Never display more than one type of number in a table.

Statistical software like SPSS can produce counts

Present a single summary statistic in the table if at all possible. If you need to display two summary statistics (for example

2. Row percentages are usually best.

Row percentages are the percentages you compute by dividing each count by the row total. Row percentages place the comparison between two numbers within a single column

If you find that cell percentages make the most sense

3. Place the treatment/exposure variable as rows and outcome variable as columns.

This relates to the above item. You usually are interested in the probability of an outcome like death or disease

4. If one variable has a lot more levels than the other variable, place that variable in rows.

A table that is tall and thin is usually easier to read than a table that is short and wide. It is easier to scroll up and down rather than left and right. For a really large number of levels

5. Whenever you report percentages

A change on the order of tenths of a percent is almost never interesting or important. Displaying that tenth of a percent makes it harder to manipulate the numbers to see the big picture.
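To see what dropping the tenths looks like in practice, here is a minimal sketch. It assumes that whole percentages are the level of precision you want, and the counts are invented for illustration; `whole_percentages` is a hypothetical helper, not something from a statistics package.

```python
def whole_percentages(counts):
    """Convert counts to percentages of their total, rounded to whole numbers."""
    total = sum(counts)
    # round() with no second argument returns an integer
    # (exact .5 ties go to the nearest even number).
    return [round(100 * n / total) for n in counts]


# Hypothetical counts, invented for illustration.
two_groups = whole_percentages([90, 70])
print(two_groups)  # 56.25 and 43.75 become 56 and 44

thirds = whole_percentages([20, 20, 20])
print(thirds, sum(thirds))  # each 33.33... becomes 33, so the total is 99
```

The second call shows a side effect of rounding: the rounded values do not have to total exactly 100.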

6. Don’t worry about whether your percentages add up to 99% or 101%.

First of all

7. When in doubt

Pick out the one that gives the clearest picture of what is really happening. Don’t rely on the first draft of your table

Examples

A simple fictitious example will help illustrate these points.

We classify people by their income (rich/poor) and also by their attitude (happy/miserable). There are

This figure shows column percentages. We compute this by dividing each number by the column total.

We see for example that only 25% of all happy people are rich. This is a conditional probability and is usually written as P[Rich | Happy]. Read the vertical bar as “given.” So this probability is read as the probability of being rich given that you are happy.
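If you want to check a column percentage by hand, the calculation can be sketched as below. The counts are hypothetical (they are not the figures from the table above); I picked them only so that P[Rich | Happy] comes out to the same 25%. The function name `column_percentages` is my own.

```python
# Hypothetical counts, invented for illustration.
# Rows are income, columns are attitude.
counts = {
    "rich": {"happy": 30, "miserable": 10},
    "poor": {"happy": 90, "miserable": 70},
}


def column_percentages(table):
    """Divide each count by its column total and scale to percent."""
    columns = next(iter(table.values())).keys()
    col_totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        r: {c: 100 * n / col_totals[c] for c, n in row.items()}
        for r, row in table.items()
    }


col_pct = column_percentages(counts)

# P[Rich | Happy]: among the happy people, the share who are rich.
p_rich_given_happy = col_pct["rich"]["happy"]
print(f"P[Rich | Happy] = {p_rich_given_happy:.0f}%")
```

Each column of `col_pct` adds up to 100%, because every count in a column is divided by that column's total.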

This figure shows row percentages. We compute this by dividing each number by the row total.
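The same calculation for row percentages can be sketched like this. Again, the counts are hypothetical and are not the figures from the table; `row_percentages` is a name I made up for the example.

```python
# Hypothetical counts, invented for illustration.
# Rows are income, columns are attitude.
counts = {
    "rich": {"happy": 30, "miserable": 10},
    "poor": {"happy": 90, "miserable": 70},
}


def row_percentages(table):
    """Divide each count by its row total and scale to percent."""
    result = {}
    for row_label, row in table.items():
        row_total = sum(row.values())
        result[row_label] = {c: 100 * n / row_total for c, n in row.items()}
    return result


row_pct = row_percentages(counts)

# P[Happy | Rich]: among the rich people, the share who are happy.
p_happy_given_rich = row_pct["rich"]["happy"]
print(f"P[Happy | Rich] = {p_happy_given_rich:.0f}%")
```

With these made-up counts, each row of `row_pct` adds up to 100%, and the two numbers you compare (happy among the rich, happy among the poor) sit in the same column.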

We see

Notice the distinction between the two probabilities. Only a few happy people are rich

This figure shows cell percentages. We compute this by dividing each number by the grand total. Each percentage represents the probability of having two conditions. For example

The table above shows a good format for combining two numbers in a single table.

This is an alternate way of displaying cell percentages.
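For completeness, here is a minimal sketch of how cell percentages are computed: every count is divided by the grand total. The counts are hypothetical, not the ones in the figures, and `cell_percentages` is my own name for the helper.

```python
# Hypothetical counts, invented for illustration.
# Rows are income, columns are attitude.
counts = {
    "rich": {"happy": 30, "miserable": 10},
    "poor": {"happy": 90, "miserable": 70},
}


def cell_percentages(table):
    """Divide each count by the grand total and scale to percent."""
    grand_total = sum(n for row in table.values() for n in row.values())
    return {
        r: {c: 100 * n / grand_total for c, n in row.items()}
        for r, row in table.items()
    }


cell_pct = cell_percentages(counts)

# P[Rich and Happy]: the share of everyone who is both rich and happy.
p_rich_and_happy = cell_pct["rich"]["happy"]
print(f"P[Rich and Happy] = {p_rich_and_happy:.0f}%")
```

Here it is the whole table, rather than any single row or column, that adds up to 100%.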

If we had six categories for attitude rather than just two

Notice that this table would not require any sideways scrolling.
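If your data starts out in the wide layout, flipping it to the tall one is a small job. The sketch below assumes six attitude levels whose names, like the counts, are invented for illustration; `to_tall` is a hypothetical helper.

```python
# Hypothetical attitude levels and counts, invented for illustration.
attitude_levels = ["ecstatic", "happy", "content", "neutral", "unhappy", "miserable"]
wide = {
    "rich": [5, 12, 10, 6, 4, 3],
    "poor": [8, 30, 40, 35, 25, 22],
}


def to_tall(wide_table, levels):
    """Transpose so the variable with many levels runs down the rows."""
    return [
        (level, *(wide_table[group][i] for group in wide_table))
        for i, level in enumerate(levels)
    ]


tall = to_tall(wide, attitude_levels)

# Print six short rows instead of two long ones.
print(f"{'attitude':<12}" + "".join(f"{group:>6}" for group in wide))
for level, *values in tall:
    print(f"{level:<12}" + "".join(f"{v:>6}" for v in values))
```

The printed table has six rows and two columns of counts, so it grows downward as you add levels rather than sideways.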

Summary

  1. Never display more than one type of number in a table.
  2. Row percentages are usually best.
  3. Place the treatment/exposure variable as rows and outcome variable as columns.
  4. If one variable has a lot more levels than the other variable, place that variable in rows.
  5. Whenever you report percentages
  6. Don’t worry about whether your percentages add up to 99% or 101%.
  7. When in doubt

You can find an earlier version of this page on my original website.