9.3 Running multinomial logit in R

The multinomial specification will estimate coefficients that are relative to a baseline level of the outcomes. We should be thoughtful about what we choose as the baseline so that the coefficients are useful to us. Here, the author chooses to use “other” as the baseline.

In R, we can easily change the baseline level of a factor variable. So first, we should check the class of the outcome variable and convert it into a factor variable if needed.

class(ker$genderimmedambcat3)
[1] "factor"

We can then adjust the baseline level using the relevel command as has the ref argument for specifying a reference category.

## Sets a level as the base category for the factor
ker$genderimmedambcat3 <- relevel(ker$genderimmedambcat3, ref = "other")

Let’s run a simple regression model with just female as a covariate to see how the regression output differs from other models we have used thus far.

## install.packages("nnet")
library(nnet)
fit0 <- multinom(genderimmedambcat3 ~ factor(female), data = ker)
# weights:  18 (10 variable)
initial  value 3153.496666 
iter  10 value 2849.378831
final  value 2837.879460 
converged

You can see that there is now a set of coefficients– Intercept and for female, for each outcome category-baseline comparison. We also do not automatically have the p-values for the output.

We can find the p-values the same way we have done before by calculating z-scores through the division of the coefficients over the standard errors. Once we have the z-scores we can use pnorm the same way we did in the ordinal section.

z <- summary(fit0)$coefficients/summary(fit0)$standard.errors
p <- as.matrix(2*(pnorm(abs(z), lower.tail = F)))
p
                              (Intercept) factor(female)1
deputy/regidor ballot access 8.040554e-39      0.10711432
senate ballot access         1.096025e-57      0.03803764
mayor/gov ballot access      1.119624e-33      0.06783023
cabinet/party leader         2.110040e-54      0.65008684
bur appt                     2.383587e-26      0.01908118

As you can see, the output here for the coefficients is already messy, and we only have one covariate!! Just imagine how messy it can get with several covariates. Often, because of this, researchers move to present results visually instead. The practice problems will include a chance to replicate one of the author’s visuals from the JOP paper.

9.3.1 Multinomial Quantities of Interest

Like the previous models, we can calculate predicted probabilities at specific values of our \(X\) covariates. What differs here is just the specific form of this function.

Predicted probabilities of the outcome being in a particular outcome category \(C_j\):

  • \(Pr(Y_i = C_j | X) = \frac{\exp(X\hat \beta_j)}{1 + \sum_{j=1}^{J-1} \exp(X \hat \beta_j)}\)
    • The numerator is very similar to the numerator for when we have a binary logistic regression
    • The denominator sums up \(\exp(X \hat \beta_j)\) where \(\hat \beta_j\) represents the set of coefficients for each outcome category except for the baseline (i.e., \(J- 1\) means 1 less than the total number of outcome categories).
    • Recall that in the baseline category \(J\), \(\hat \beta_J\) is forced to 0 to help with the estimation or “identification” of the other coefficients relative to that category. Well why bring that up now? Well, \(exp(X \hat \beta_J) = exp(0) = 1\). When we estimate probabilities in the baseline category, the numerator will just be 1.

Let’s take an example of finding the predicted probability of \(C_j\) = Senate ballot access when \(x_i\) is set to be female = 1.

## Create the model matrix
Xfemale <- cbind(1, 1) # 1 for intercept, 1 for female

## Extract coefficients and give them informative labels
Bsenate <- coef(fit0)[2,] # 2nd row of coefficients
Bdep <- coef(fit0)[1,]
Bmayor <- coef(fit0)[3,]
Bcabinet <- coef(fit0)[4,]
Bbur <- coef(fit0)[5,]

## Probability of senate ballot access for female
exp(Xfemale %*% Bsenate)/ (1 + exp(Xfemale %*% Bsenate) + exp(Xfemale %*% Bdep) +
                             exp(Xfemale %*% Bmayor) + exp(Xfemale %*% Bcabinet) +
                             exp(Xfemale %*% Bbur))
           [,1]
[1,] 0.09846309

Let’s do the same for female = 0. We just need to change X.

Xnfemale <- cbind(1, 0)
exp(Xnfemale %*% Bsenate)/ (1 + exp(Xnfemale %*% Bsenate) + exp(Xnfemale %*% Bdep) +
                             exp(Xnfemale %*% Bmayor) + exp(Xnfemale %*% Bcabinet) +
                             exp(Xnfemale %*% Bbur))
           [,1]
[1,] 0.06829273

We might also be interested in the probability of \(C_j\) = other. The syntax here will look slightly different because “other” was the baseline category. The denominator stays the same, but the numerator is just 1.

## Manual- probability of other for female (the base category)
1/ (1 + exp(Xfemale %*% Bsenate) + exp(Xfemale %*% Bdep) +
                             exp(Xfemale %*% Bmayor) + exp(Xfemale %*% Bcabinet) +
                             exp(Xfemale %*% Bbur))
          [,1]
[1,] 0.3538449

Just like in the other models, we can rely on outside packages, too. For some models, these packages are not going to have full capabilities– they might not be able to calculate standard errors, for example. These are “living packages” so you can always check the documentation and update the packages to see if new capabilities have been added.

library(prediction)
p.senate <- prediction(fit0, at  = list(female = 1))
mean(p.senate$`Pr(senate ballot access)`)
[1] 0.09846309
mean(p.senate$`Pr(other)`)
[1] 0.3538449

We can use margins to calculate the difference in predicted probabilities, for example, between female=1 and female=0. We should specify the category for which we want this comparison. It appears we need to set vce = booststrap for this to work.

library(margins)
m.senate <- margins(fit0, variables = "female", category = "senate ballot access", 
                    vce="bootstrap", change=c(0,1))
summary(m.senate)

Of course, we could also do this manually, have a quantity of interest party, and run the bootstrap ourselves.