I’ve spent a bit of time trying to learn how to run a program called BUGS. The acronym stands for Bayes Using Gibbs Sampling. Here is my first serious attempt to run a BUGS program.

There are several web pages about this program, most notably The BUGS Project Welcome Page at

That page had links to several resources about BUGS, including a recently published book by George G. Woodworth, Biostatistics: A Bayesian Introduction. The author placed a PDF version of the Appendix describing how to use WinBUGS, the Windows version of BUGS on the web at

There is also a version of BUGS that works under R, called BRugs. A nice description of BRugs appears at the website devoted to the book by Andrew Gelman, John B. Carlin, Hal S. Stern, and Donald B. Rubin, Bayesian Data Analysis.”

In particular, the page

offers some very useful advice.

Dr. Woodworth starts with the simplest of possible examples, the beta-binomial model. To fit this, you need to prepare a model statement

`MODEL { p ~ dbeta(a,b) x ~ dbin(p,n) }`

This model specifies that the number of successes x is represented by a binomial distribution with probability of success p and number of trials n. It also specifies a prior distribution for p which is beta with parameters alpha and beta equal to a and b, respectively.

You also need a data statement

`DATA list(a=1,b=1,x=11,n=25)`

This data list provides information about the observed data. You collected data on 25 subjects and 11 of them were classified as success. It also provides specifications for the prior distribution. The values a=1 and b=1 produce a beta(1,1) distribution which is also known as a uniform distribution. This represents a “flat” or uninformative prior. Data analysts select a flat prior when they have little information prior to the start of the study about the true value of p or if they prefer to let the observed data provide most of the weight in the resulting posterior distribution.

The steps in running BUGS are

- check your syntax,
- load the data set,
- compile your model,
- generate initial values,
- monitor the nodes,
- update the model,
- exclude the early samples,
- calculate summary statistics.

Here are the eight steps in a BUGS model using the beta-binomial model..

**Step 1. check your syntax**. In WinBUGS, you need to open the
* Specification Tool* dialog box

by selecting * Model | Specification* from the menu. Highlight any
set of characters inside the curly braces of the MODEL statement and
then click on the

*button. In BRugs, you need to store the model as a text file and then call the function:*

**check model**`checkModel("BBModel.txt")`

In either case, you should get a message that reads

`model is syntactically correct`

**Step 2. load the data set**. In WinBUGS, highlight the word “list”
and click on the * load data* button. In BRugs, you need to store
the data as a text file and then call the function

`modelData("BBData.txt")`

In either case, you should get a message that reads

`data loaded`

**Step 3. compile your model**. In WinBUGS, click on the * compile*
button. In BRugs, call the function

modelCompile()

You will see a message

model compiled

**Step 4. Generate initial values**. The simulation methods in BUGS
require starting values. For simple models, BUGS can generate those
starting values for you, but for more complex models, you have to
provide starting values. The beta-binomial model is simple enough that
BUGS can do this.

In WinBUGs, click on the * gen inits* button and in BRugs, call the
function

`modelGenInits()`

You will see a message

`initial values generated, model initialized`

**Step 5. monitor the nodes**.

In WinBUGS, open the * Sample Monitor Tool* dialog box

by selecting * Inference | Samples* from the menu. Type “p” in
the

*field and click on the*

**node***button. In BRugs, call the function*

**set**`samplesSet("p")`

You will get the diagnostic message

`monitor set for variable 'p'`

**Step 6. update the model**. In WinBUGS, open the * Update Tool*
dialog box

by selecting * Model | Update* from the menu. For this analysis,
5,000 is a reasonable value for the

*field. Click on the*

**updates***button after you have made this change. In BRugs, call the function*

**update**`modelUpdate(5000)`

You will see the diagnostic message

`5000 updates took 0 s`

That was fast!

**Step 7. exclude the early values**. The Markov Chain is dependent on
the initial starting value and sometimes takes a while to settle down
to a process that reflects the true probability distribution (this is
called a steady state). The early values in the Markov Chain may thus
exhibit some artefactual patterns that can provide misleading results.
You will often get better simulation results if you ignore the first
portion of the chain. In WinBUGS, adjust the * Sample Monitor Tool*
dialog box to exclude the first thousand samples by changing the

*field from 1 to 1000. In BRugs, call the function*

**beg**`samplesSetBeg(1000)`

**Step 8. calculate summary statistics**. In WinBUGS, click on the
* stats* button in the

*dialog box. In BRugs, call the function*

**Sample Monitor Tool**samplesStats(“p”)

This produces the output

` mean sd MC_error val2.5pc median val97.5pc start sample p 0.4421 0.09435 0.001237 0.2634 0.441 0.6286 1000 5000`

**What do these numbers mean**? In this example, you observed x=11
successes in a sample of size n=25. The classical estimate of the
probability of success would be 11/25=0.44. In the Bayesian example,
your prior distribution is beta(1,1). A simple algebraic solution for
the posterior distribution is easy enough to find (it is a beta
distribution with alpha=12 and beta=15), but using WinBUGS or BRugs,
you can also illustrate how a simulation could produce roughly the
same answer.

The value labeled “mean” represents the mean of the posterior distribution. The simulation estimates this value to be 0.4421. If you used the algebraic solution, you would get a posterior mean of 0.4444. The value labeled “sd” represents the standard deviation of the posterior distribution.

The value labeled “MC_error” is the uncertainty associated with this particular simulation. If this value is much smaller than the standard deviation of the posterior distribution, then your simulation has done well. If it is not at least an order of magnitude smaller, then you should consider changing the simulation (e.g., increasing the number of samples).

The values labeled “val2.5pc” and “val97.5pc” represent an interval that contains 95% of the posterior distribution. This interval is called the Credible Interval (CrI) and represents the Bayesian alternative to a confidence interval. The simulated values 0.2634 and 0.6286 show that there is a fair amount of uncertainty about the true proportion p, which is hardly surprising given the small sample size. These values for the CrI computed by the simulation compare nicely with the exact values from the algebraic solution (0.2659 and 0.6308).

The value labeled “median” represents the median of the posterior distribution. In this example, the posterior distribution is almost symmetric, so this value differs only slightly from the mean of the posterior distribution. The simulated value of the median (0.4410) and the exact value from the algebraic solution (0.4431) are very close.

The values labeled “start” and “sample” simply remind you what you set for the beginning and end of the Markvo chain simulation

**Combining these steps**. In BRugs, you can combine these steps by
using the function BRugsFit. Here’s an example:

`BBOutput <- BRugsFit(modelFile="BBModel.txt",data="BBData.txt", numChains=1,parametersToSave="p",nBurnin=1000,nIter=5000) `

The diagnostic messages produced by this function are shown below:

`model is syntactically correct data loaded model compiled initial values generated, model initialized 1000 updates took 0 s deviance set monitor set for variable 'p' 4000 updates took 0 s`

The output from this function is stored in BBOutput. When you list BBOutput, you get the following information:

`> BBOutput Stats mean sd MC_error val2.5pc median val97.5pc start sample p 0.4428 0.09491 0.001376 0.2626 0.4416 0.6296 1001 5000 $DIC Dbar Dhat DIC pD x 4.641 3.677 5.605 0.9639 total 4.641 3.677 5.605 0.9639`

Notice that the output here is slightly different than the output shown above. The mean, for example, is 0.4428 rather than 0.4421. This should not be too surprising. The BUGS software computes estimates using a simulation, and these change each time you run the program. The difference between the two estimates is about the same order of magnitude as the Monte Carlo error.

Also notice that the BRgusFit function, by default, includes an additional set of statistics known as the Deviance Information Criterion. This statistic is useful for comparing models of differing complexity. There is no way to interpret this value for a single model in isolation.

**Further steps.** There are additional graphical and statistical
summaries available in both WinBUGS and BRugs. There are also
diagnostic tools to help you determine whether the markov chain has
settled down sufficiently to provide good estimates.