10.4 Poisson Quantities of Interest

10.4.1 Expected Counts

Our primary quantity of interest is the expected count (in this case, the expected number of executive orders) given certain values of the covariates:

\(\mathbf E(Y | X) = \lambda = \exp(\mathbf x_i' \beta)\)

  • This means that to get our quantities of interest, we compute \(\exp(\mathbf x_i' \hat \beta)\) after setting specific values for \(X\), such as the means of the covariates or the observed values.

Like other glm models, we can also use predict to do this for us by setting type = "response", or use the prediction function from the prediction package.

  • For example, let's find the average expected number of executive orders for unified vs. divided government, holding other covariates at their observed values. We will do this manually and using prediction.
X <- model.matrix(fit)
X[, "divided"] <- 0
B <- coef(fit)
eta <- X %*% B
expcount <- exp(eta)
avg.exp.count.0 <- mean(expcount)
avg.exp.count.0 
[1] 265.7753
X <- model.matrix(fit)
X[, "divided"] <- 1
B <- coef(fit)
eta <- X %*% B
expcount <- exp(eta)
avg.exp.count.1 <- mean(expcount)
avg.exp.count.1
[1] 414.1552
avg.exp.counts <- prediction(fit, at=list(divided=c(0, 1)), type = "response")
summary(avg.exp.counts)
 at(divided) Prediction     SE     z          p lower upper
           0      265.8  2.862 92.85  0.000e+00 260.2 271.4
           1      414.2 15.759 26.28 3.224e-152 383.3 445.0

We can also find the differences in expected counts by subtracting the above estimates from each other, or computing this directly through margins:

avg.exp.count.1 - avg.exp.count.0
[1] 148.3798
avg.diff <- margins(fit, variables="divided", change = c(0, 1), vce = "delta")
summary(avg.diff)
  factor      AME      SE      z      p    lower    upper
 divided 148.3798 16.6768 8.8974 0.0000 115.6938 181.0658

Just like in logit and probit, we have the same options for calculating uncertainty: Delta Method, Simulation, and Bootstrap.
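As a refresher, the simulation approach draws coefficient vectors from their asymptotic sampling distribution and recomputes the quantity of interest for each draw. The sketch below illustrates this on simulated data (a hypothetical Poisson outcome with a binary divided covariate, since the bolton data are not reproduced here), not on the model above:

```r
## Simulation-based uncertainty for a difference in average expected counts.
## Toy data: a hypothetical Poisson outcome with a binary "divided" covariate.
set.seed(123)
n <- 200
divided <- rbinom(n, 1, 0.5)
y <- rpois(n, lambda = exp(1 + 0.4 * divided))
toy.fit <- glm(y ~ divided, family = "poisson")

## 1. Draw coefficients from MVN(beta-hat, vcov(beta-hat))
library(MASS)
sims <- mvrnorm(1000, mu = coef(toy.fit), Sigma = vcov(toy.fit))

## 2. For each draw, compute the difference in average expected counts
X0 <- X1 <- model.matrix(toy.fit)
X0[, "divided"] <- 0
X1[, "divided"] <- 1
diffs <- apply(sims, 1, function(b) mean(exp(X1 %*% b)) - mean(exp(X0 %*% b)))

## 3. Point estimate and percentile-based 95% confidence interval
mean(diffs)
quantile(diffs, c(0.025, 0.975))
```

The same recipe applies to the fit model above: swap in its coefficients, variance-covariance matrix, and model matrix.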

10.4.2 Sidenote: Multiplicative coefficient interpretation

For Poisson, changes in \(x\) will have a multiplicative change in \(y\):

Recall this rule for exponents, which the derivation below follows: \(z^{a+b} = z^a * z^b\)

\[\begin{align*} \mathbf E(Y | X) &= e^{\alpha + x_1 \beta_1 + x_2 \beta_2}\\ &= e^{\alpha} * e^{x_1 \beta_1} *e^{x_2 \beta_2} \end{align*}\]

For example, compare \(x_1\) to \(x_1 + 1\):

\[\begin{align*} \mathbf E(Y | X) &= e^{\alpha + (x_1 + 1) \beta_1 + x_2 \beta_2}\\ &= e^{\alpha} * e^{x_1 \beta_1}* e^{\beta_1} *e^{x_2 \beta_2} \end{align*}\]

We’ve now multiplied the outcome by \(e^{\beta_1}\).

Here's an example using a bivariate model:

fit.biv <- glm(allnoncerm_eo ~ divided ,
           family = "poisson", 
           data = subset(bolton, year < 1945))

## Let's calculate yhat using predict for divided = 0 or 1
yhats <- predict(fit.biv, data.frame(divided = c(0, 1)), type="response")
yhats
       1        2 
280.6471 282.8333 
## Manual
yhatx0 <- exp(coef(fit.biv)[1] + coef(fit.biv)[2]*0)
yhatx1 <- exp(coef(fit.biv)[1] + coef(fit.biv)[2]*1)
yhatx1
(Intercept) 
   282.8333 
## Multiplicative interpretation 
yhatx0*(exp(coef(fit.biv)["divided"]))
(Intercept) 
   282.8333 

10.4.3 Incidence Rate Ratios

Relatedly, just as in logistic regression we could exponentiate the coefficients to generate estimated odds ratios, in Poisson regression we can exponentiate the coefficients to get "incidence rate ratios."

When we consider a one-unit change in the independent variable \(x\), this amounts to saying

  • \(\hat\beta = \log \hat \lambda_{x + 1} - \log \hat \lambda_{x} = \log \frac{\hat \lambda_{x + 1}}{\hat \lambda_{x}}\)

If we exponentiate, the \(\exp\) and \(\log\) cancel:

  • \(\exp(\hat \beta) = \exp(\log \frac{\hat \lambda_{x + 1}}{\hat \lambda_{x}})\)
    • \(\exp(\hat \beta) = \frac{\hat \lambda_{x + 1}}{\hat \lambda_{x}}\)
    • This quantity represents a ratio of the expected counts or "rates"
    • For a one-unit change in \(x\), the expected count is estimated to change by a factor of \(\exp(\hat \beta)\)
      • This can be converted to a percent change interpretation by taking \((IRR - 1) \times 100\)

For example, the incidence rate ratio for executive orders in a year going from unified to divided government is:

exp(coef(fit)["divided"])
divided 
1.55829 

We can see how this works out using the quantities calculated above. Multiplying the expected count when divided = 0 by this ratio gives us the expected count when divided = 1.

exp(coef(fit)["divided"])*avg.exp.count.0
 divided 
414.1552 
avg.exp.count.1
[1] 414.1552

Note how this matches a percent change interpretation:

((avg.exp.count.1 - avg.exp.count.0)/ avg.exp.count.0)* 100
[1] 55.82903
(exp(coef(fit)["divided"])-1)*100
 divided 
55.82903 

For a one-unit change going from unified to divided government, we see a 55.8% increase in executive orders during this period.

10.4.4 Where Poisson is poisonous:

Recall the detail from when we specified the pmf: the mean and variance are the same. When we have covariates \(X\), this means we assume the conditional mean and variance are the same:

  • \(\lambda = \mathbf E[Y_i |x_i] = Var(Y_i | x_i)\); the mean and the variance are both \(\lambda\) in the distribution.
  • Often, our data are "overdispersed" or "underdispersed", violating this assumption.

We can investigate this in our example. First, let’s look at the raw mean and variance of the outcome.

## whoa! very different, 
## but the good news is we care about the *conditional* mean and variance
## not these raw values
mean(subset(bolton$allnoncerm_eo, bolton$year < 1945))
[1] 280.975
var(subset(bolton$allnoncerm_eo, bolton$year < 1945))
[1] 9011.256

We can conduct a test of overdispersion in our model using dispersiontest from the AER package. If we have a significant result and an estimated dispersion \(>\) 1, this would suggest overdispersion.

library(AER)
dispersiontest(fit)

    Overdispersion test

data:  fit
z = 3.6675, p-value = 0.0001225
alternative hypothesis: true dispersion is greater than 1
sample estimates:
dispersion 
  11.02527 
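The intuition behind this kind of diagnostic can be seen by hand: the sum of squared Pearson residuals divided by the residual degrees of freedom estimates the dispersion, and should be near 1 if the Poisson assumption holds. A sketch on simulated data (hypothetical, since the bolton data are not loaded here), comparing an equidispersed and an overdispersed outcome:

```r
## Pearson-based dispersion estimate: near 1 under a correct Poisson model,
## well above 1 for overdispersed data.
set.seed(42)
n <- 500
x <- rnorm(n)

## Equidispersed outcome: drawn from an actual Poisson
y.pois <- rpois(n, lambda = exp(1 + 0.5 * x))
fit.pois <- glm(y.pois ~ x, family = "poisson")
disp.pois <- sum(residuals(fit.pois, type = "pearson")^2) / df.residual(fit.pois)
disp.pois

## Overdispersed outcome: negative binomial draws have variance > mean
y.over <- rnbinom(n, mu = exp(1 + 0.5 * x), size = 1)
fit.over <- glm(y.over ~ x, family = "poisson")
disp.over <- sum(residuals(fit.over, type = "pearson")^2) / df.residual(fit.over)
disp.over
```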

To check for overdispersion, we can also look at the standardized residuals of the model.

## By hand
prvals <- predict(fit, type = "response") # predicted values
res <- subset(bolton$allnoncerm_eo, bolton$year < 1945) - prvals # residual y - predicted values
sres <- res/sqrt(prvals) # standardized residual

## automatic in R
sres <- residuals(fit, type = "pearson") # equivalent to the manual calculation

We can plot the standardized residuals against the predicted values from our regression. Here, we don't want many residuals that exceed +/- 2.

## don't want values above 2 or below -2
plot(prvals, sres,
     ylab = "standardized residuals", xlab = "predicted values")
abline(h = c(-2, 0, 2), lty = c(2, 1, 2))

We are in danger!! This model suffers from overdispersion. We have two options:

  • Check our model. Do we think it is misspecified in terms of the covariates? Is it suffering from omitted variable bias? We can change the specification and re-run the test. Again, the overdispersion is about the conditional mean and variance, so changing the model may change the diagnostics.
  • Use a different distribution. Because this assumption is often violated, we tend to use one of the following for count data
    • Overdispersed poisson (quasipoisson)
    • Negative binomial regression

More on this in the next section.
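As a quick preview, both alternatives are small changes in R. The sketch below uses simulated overdispersed counts (hypothetical data, since the next section covers these models properly):

```r
## Simulated overdispersed counts (variance exceeds the mean)
set.seed(1)
x <- rnorm(300)
y <- rnbinom(300, mu = exp(1 + 0.5 * x), size = 2)

## Quasipoisson: same coefficient estimates as Poisson, but standard
## errors are scaled by an estimated dispersion parameter
fit.qp <- glm(y ~ x, family = "quasipoisson")
summary(fit.qp)$dispersion

## Negative binomial: models the extra variance directly
library(MASS)
fit.nb <- glm.nb(y ~ x)
```

Note that quasipoisson changes only the uncertainty estimates, while negative binomial regression specifies a different distribution for the outcome.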