7.2 QOI at Designated Values
Usually in social science we have hypotheses about how the predicted probabilities change as one or more of our independent variables change. We now turn to calculating predicted responses at specific values of the independent variables.
Recall that in linear regression we sometimes wanted to calculate a specific estimated value of \(\hat Y_i\) when we set \(X\) at particular values. (E.g., what value do we estimate for \(Y\) when \(X_1 = 2\) and \(X_2 = 4\)?)
- In OLS, this would be \(\hat Y_i = \hat \alpha + 2*\hat \beta_1 + 4*\hat \beta_2\)
Here, we can do the same for GLMs by setting specific values for \(X\) when we apply the \(Link^{-1}\) response function.
- E.g., what is the predicted probability of \(Y_i = 1\) when \(X_1 = 2\) and \(X_2 = 4\)?
- In logistic regression, \(\hat{\pi_i} = \frac{\exp(\hat \alpha + 2*\hat \beta_1 + 4*\hat \beta_2)}{1 + \exp(\hat \alpha + 2*\hat \beta_1 + 4*\hat \beta_2)}\)
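As a quick stand-alone numeric example (the coefficient values here are made up purely for illustration), suppose \(\hat \alpha = -1\), \(\hat \beta_1 = 0.5\), and \(\hat \beta_2 = 0.25\):

## Made-up coefficients for illustration: alpha = -1, beta1 = 0.5, beta2 = 0.25
eta <- -1 + 2 * 0.5 + 4 * 0.25   # linear predictor at X1 = 2, X2 = 4
exp(eta) / (1 + exp(eta))        # inverse logit "by hand": about 0.73
plogis(eta)                      # same answer with R's built-in inverse logit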
Example 1
An example using the Conflict data, with different approaches in R.
## Predicted probability when Allies = 1, and all other covariates = 0
allies1 <- predict(out.logit,
                   newdata = data.frame(MajorPower = 0,
                                        Contiguity = 0,
                                        Allies = 1,
                                        ForeignPolicy = 0,
                                        BalanceOfPower = 0,
                                        YearsSince = 0),
                   type = "response")
allies1
          1
0.002632504
## for Allies = 1; be careful to match the order of the coefficients
X  <- cbind(1, 0, 0, 1, 0, 0, 0)
Bh <- coef(out.logit)

## Approach 1
plogis(X %*% Bh)
            [,1]
[1,] 0.002632504
## Approach 2
exp(X %*% Bh) / (1 + exp(X %*% Bh))
            [,1]
[1,] 0.002632504
Example 2
A second example, keeping \(X\) at observed values. Here the manual approach is easier given the limitations of predict (at least until we learn a new package). Now we are estimating \(N\) predicted probabilities, so we take the mean to get the average estimate.
## for Allies = 1; be careful to match the order of the coefficients
X <- model.matrix(out.logit)
X[, "Allies"] <- 1   # change Allies to 1, leave everything else as is
Bh <- coef(out.logit)

## Approach 1
mean(plogis(X %*% Bh))
[1] 0.002759902
## Approach 2
mean(exp(X %*% Bh) / (1 + exp(X %*% Bh)))
[1] 0.002759902
This is the average predicted probability of having a dispute when the dyad states are Allies, holding other covariates at their observed values.
Here is a brief video with a second example of the process above, leading into the discussion of marginal effects below. It uses the anes data from Banda and Cassese in section 6.
7.2.1 Marginal Effects
Recall, in linear regression a one-unit change in \(X_k\) is associated with a \(\hat \beta_k\) change in \(Y\) no matter where we are in the domain of \(X_k\). (The slope of a line is constant!)
The catch for glm’s, again, is that our linear predictor (\(\eta\)) is often not in the units of \(Y\) that we want. E.g., in logistic regression, a one-unit change in \(X_k\) is associated with a \(\hat \beta_k\) change in logits (the log-odds), not in \(Pr(Y_i = 1)\).
- Recall, for logit and probit, this takes us into an S-curve for \(Pr(Y_i = 1)\) instead of a line
- Well, the slope of an S-curve is not constant: where we are in \(X\) determines how much change we get in the predicted probability of \(Y_i = 1\) (see the quick illustration after this list).
- Therefore, to understand the marginal effect in glm’s we have to set \(X\) to particular values and be careful about the values we select.
- By “careful,” we mean choosing sensible, theoretically informed values of interest.
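As a quick stand-alone illustration of the non-constant slope (the evaluation points below are arbitrary spots on the logistic curve):

## The same one-unit step moves the predicted probability by different amounts
plogis(1) - plogis(0)   # about 0.23 near the middle of the S-curve
plogis(4) - plogis(3)   # about 0.03 out in the tail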
You can generate predictions based on any values. Here are three common approaches for understanding the marginal effect of a particular variable \(X_k\).
- Marginal effects at the mean
- Average marginal effects
- Marginal effects at representative values
Wait, what do we mean by marginal effects?
- For a discrete (categorical/factor) variable \(X_k\), this will be the change in predicted probability associated with a one-unit change in \(X_k\).
- For continuous variables \(X_k\), this technically is the instantaneous rate of change (the change in probability associated with a very small change in \(X\)).
- Usually, instead of estimating this (what is a very small change, anyway?), we will set a specific amount of change by hand. Often, this is called a “discrete change” or “first difference” effect; see the sketch below.
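For a continuous covariate, here is a minimal sketch of that discrete-change calculation, assuming the out.logit model from above; the one-standard-deviation step size is an illustrative choice, not a rule.

Bh <- coef(out.logit)
Xlo <- Xhi <- model.matrix(out.logit)      # covariates held at observed values
step <- sd(Xlo[, "BalanceOfPower"])        # illustrative step size: 1 sd
Xhi[, "BalanceOfPower"] <- Xhi[, "BalanceOfPower"] + step
## average change in predicted probability for a 1-sd increase
mean(plogis(Xhi %*% Bh)) - mean(plogis(Xlo %*% Bh))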
7.2.2 Marginal effects at the mean
In this approach, when we calculate the difference in predicted probability resulting from a one-unit change in \(X_k\), we set all other covariates \(X_j\) for \(j \neq k\) at their mean values.
- This gives us 1 estimate for the difference in predicted probability
- When can this be problematic? (think categorical variables)
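A minimal sketch of a marginal effect at the mean for Allies, again assuming the out.logit model from above. Note that the means of the 0/1 covariates describe no actually observable dyad, which is exactly the concern raised in the bullet above.

Bh   <- coef(out.logit)
xbar <- colMeans(model.matrix(out.logit))     # all covariates at their means
x1 <- x0 <- xbar
x1["Allies"] <- 1
x0["Allies"] <- 0
plogis(sum(x1 * Bh)) - plogis(sum(x0 * Bh))   # one estimate of the difference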
7.2.3 Marginal effects at representative values
In this approach, when we calculate the difference in predicted probability resulting from a one-unit change in \(X_k\), we set all other covariates \(X_j\) for \(j \neq k\) at values that are of theoretical interest. This could be the mean value, modal value, or some other value that makes sense for our research question.
- This gives us 1 estimate for the difference in predicted probability
- Depending on your research question, there may/may not be a particularly interesting set of representative values on all of your covariates.
The example above where we held all other covariates at zero would be an example of calculating marginal effects at representative values.
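Here is a minimal sketch at one representative profile, assuming the out.logit model from above; the particular values chosen are purely illustrative, and the order must match the coefficients, as in Example 1.

Bh <- coef(out.logit)
## order: Intercept, MajorPower, Contiguity, Allies, ForeignPolicy,
##        BalanceOfPower, YearsSince (illustrative values throughout)
x1 <- c(1, 0, 1, 1, 0, 0.5, 0)   # contiguous dyad with Allies = 1
x0 <- c(1, 0, 1, 0, 0, 0.5, 0)   # same profile with Allies = 0
plogis(sum(x1 * Bh)) - plogis(sum(x0 * Bh))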
7.2.4 Average marginal effects
In this approach, when we calculate the difference in predicted probability resulting from a one-unit change in \(X_{ik}\), we hold all covariates \(X_{ij}\) for \(j \neq k\) at their observed values.
- This gives us \(N\) estimates for the difference in predicted probability
- We report the average of these estimates
Here is an example for average marginal effects. Let’s say we are interested in the difference in the probability of a dispute for Allies vs. non-Allies, holding all other covariates at their observed values. We can do this manually or with predict.
## Extract beta coefficients
Bh <- coef(out.logit)

## Set Allies to 1, hold all other covariates as observed
X1 <- model.matrix(out.logit)
X1[, "Allies"] <- 1

## Set Allies to 0, hold all other covariates as observed
X0 <- model.matrix(out.logit)
X0[, "Allies"] <- 0

pp1 <- mean(plogis(X1 %*% Bh))
pp0 <- mean(plogis(X0 %*% Bh))
pp1 - pp0
[1] -0.0009506303
This represents the average difference in the predicted probability of having a dispute for dyads that are Allies vs. not Allies.
7.2.5 The prediction and margins packages
There are functions that can make this easier, so long as you understand what they are doing. One package, developed by Dr. Thomas Leeper, is prediction. A second is margins. Documentation is available here and here. It is always important to understand what’s going on in a package because, for one, it’s possible that the package will stop being updated, and you will have to find an alternative solution.
We will focus on the prediction package first. The prediction function generates specific quantities of interest. An advantage it has over the built-in predict function is that it makes it easier to “hold all other variables at observed values.” In the prediction function, you specify the designated values for particular variables, and then by default it assumes you want to hold all other variables at their observed values. Here is an example of generating predicted probabilities for Allies = 1 and Allies = 0. It will generate the summary means of these two predictions.
## install.packages("prediction")
library(prediction)
## By default, covariates stay at observed values unless specified
prediction(out.logit, at = list(Allies = c(0, 1)),
type = "response")
Data frame with 200000 predictions from
glm(formula = Conflict ~ MajorPower + Contiguity + Allies + ForeignPolicy +
BalanceOfPower + YearsSince, family = binomial(link = "logit"),
data = mids)
with average predictions:
Allies x
0 0.003711
1 0.002760
## compare with the manually calculated values above
pp0
[1] 0.003710532
pp1
[1] 0.002759902
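For completeness, the margins package mentioned above automates the average-marginal-effects calculation for every covariate at once. A minimal sketch (not run here):

## install.packages("margins")
library(margins)
summary(margins(out.logit))   # average marginal effects for each covariate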
7.2.6 QOI Practice Problems
1. Conduct the following regression using glm:
\(Pr(Conflict_i = 1 | X) = logit^{-1}(\alpha + \beta_1 * Allies_i + \beta_2 * MajorPower_i + \beta_3 * ForeignPolicy_i)\)
2. What is the predicted probability of entering a dispute when the dyad includes a major power, holding all other covariates at observed values?
3. Repeat the previous exercise, but now use probit. How similar/different are the predicted probability estimates?
Try on your own, then expand for the solution.
## Problem 1
out.logit2 <- glm(Conflict ~ Allies + MajorPower + ForeignPolicy, data = mids,
                  family = binomial(link = "logit"))

## Problem 2
library(prediction)
prediction(out.logit, at = list(MajorPower = 1),
           type = "response")
Data frame with 100000 predictions from
 glm(formula = Conflict ~ MajorPower + Contiguity + Allies + ForeignPolicy +
    BalanceOfPower + YearsSince, family = binomial(link = "logit"),
    data = mids)
with average prediction:
 MajorPower        x
          1 0.007745
## Problem 3
out.probit <- glm(Conflict ~ Allies + MajorPower + ForeignPolicy, data = mids,
                  family = binomial(link = "probit"))
prediction(out.probit, at = list(MajorPower = 1),
           type = "response")
Data frame with 100000 predictions from
 glm(formula = Conflict ~ Allies + MajorPower + ForeignPolicy,
    family = binomial(link = "probit"), data = mids)
with average prediction:
 MajorPower        x
          1 0.01451
## Manual approach
X <- model.matrix(out.probit)
X[, "MajorPower"] <- 1
Bhat <- coef(out.probit)
mean(pnorm(X %*% Bhat))
[1] 0.01450613