7.5 Additional R shortcuts
There are a few R packages that help generate these quantities of interest AND estimate uncertainty. If you understand what is going on underneath the packages, then you should feel free to use them to avoid manually coding up each process.
7.5.1 Prediction
We’ve already used the prediction package, but now let’s add uncertainty calculation to this. This package is useful for generating confidence intervals around single quantities of interest. The package below is better for constructing confidence intervals around the difference between quantities of interest.
library(prediction)
<- prediction(out.logit, at = list(Allies = c(0, 1)),
preout type = "response", calculate_se = T)
summary(preout)
at(Allies) Prediction SE z p lower upper
0 0.003711 0.0002270 16.343 4.878e-60 0.003266 0.004156
1 0.002760 0.0003135 8.804 1.322e-18 0.002145 0.003374
I believe prediction
relies on the delta method for uncertainty. The others below will have options for simulations or bootstrapped standard errors.
7.5.2 Margins
Thomas Leeper developed a package called margins
which is modeled after the Stata margins
command. Here is some documentation for this package.
We first fit a regression model like normal.
<- glm(Conflict ~ MajorPower + Contiguity + Allies + ForeignPolicy +
out + YearsSince, data=mids, family=binomial(link = "logit")) BalanceOfPower
Open the package and use the margins
command. It is similar to prediction
but we will specify the uncertainty with vce
and tell it which variable we want the marginal effect, and what type of change in that variable we want to calculate the effect.
## install.packages("margins")
library(margins)
Warning: package 'margins' was built under R version 4.0.2
## Difference in Allies 0 vs. 1 holding other covariates at 0
## Using Delta Method for uncertainty
<- margins(out, vce = "delta",
marg1 at = list(MajorPower=0, Contiguity=0,
ForeignPolicy=0, BalanceOfPower=0, YearsSince=0),
variables = "Allies",
change = c(0, 1),
type="response")
Warning in check_values(data, at): A 'at' value for 'BalanceOfPower' is outside
observed data range (0.000173599997651763,1)!
summary(marg1)
factor MajorPower Contiguity ForeignPolicy BalanceOfPower YearsSince AME
Allies 0.0000 0.0000 0.0000 0.0000 0.0000 -0.0010
SE z p lower upper
0.0004 -2.3910 0.0168 -0.0018 -0.0002
Let’s try a second example, shifting the uncertainty estimate and the X covariates.
## Difference in Allies 0 vs. 1 holding MajorPower at 1, other covariates at observed values
## Using simulations for uncertainty
<- margins(out, vce = "simulation",
marg2 at = list(MajorPower =1),
variables = "Allies",
change = c(0, 1),
type="response")
summary(marg2)
factor MajorPower AME SE z p lower upper
Allies 1.0000 -0.0020 0.0010 -1.9576 0.0503 -0.0041 0.0000
## Manual equivalent of point estimate
<- model.matrix(out)
X.marg1 "Allies"] <-1
X.marg1[, "MajorPower"] <- 1
X.marg1[,
<- model.matrix(out)
X.marg0 "Allies"] <-0
X.marg0[, "MajorPower"] <- 1
X.marg0[,
<- coef(out)
BH <- mean(plogis(X.marg1 %*% BH) - plogis(X.marg0 %*% BH))
ame
## Compare
ame
[1] -0.002039019
summary(marg2)$AME
Allies
-0.002039019
7.5.3 Zelig
Zelig has a lot of built-in plotting functions. This can be a great package if you want to present results the way they have set up the package. They make it slightly harder to extract specific elements from the results. Documentation here.
## install.packages("Zelig")
library(Zelig)
<- zelig(Conflict ~ MajorPower + Contiguity + Allies + ForeignPolicy +
z.out + YearsSince, data=mids, model = "logit", cite = F)
BalanceOfPower
## Use function setx to designate values of X
## Let's set everything to 0
<- setx(z.out, MajorPower=0, Contiguity=0, Allies=0,
set.x ForeignPolicy=0, BalanceOfPower=0, YearsSince=0)
<- sim(z.out, set.x, num = 100) # should do more like 1000
s.out1
## Expected value and 95% CI
s.out1
sim x :
-----
ev
mean sd 50% 2.5% 97.5%
[1,] 0.003755489 0.001072985 0.003499368 0.002113449 0.006454072
pv
0 1
[1,] 1 0
## Change Allies to 1, and let's look at the difference between Allies vs. Non-Allies
<- setx1(z.out, MajorPower=0, Contiguity=0, Allies=1,
set.x1 ForeignPolicy=0, BalanceOfPower=0, YearsSince=0)
## Let's look at the QOI that is the difference between these
<- sim(z.out, set.x, set.x1, num = 50)
s.out2
## Get mean and CI's from distribution
mean(s.out2$get_qi(qi="fd", xvalue="x1"))
[1] -0.0009859548
quantile(s.out2$get_qi(qi="fd", xvalue="x1"), c(0.025, 0.975))
2.5% 97.5%
-0.0018093458 -0.0002380693
7.5.4 Using expand.grid
When we use the predict
function in R, we specify a “`newdata” dataframe to indicate for which values of \(X\) we want to estimate values of the outcome. When we do this, we are actually building a dataframe, like the below:
<- data.frame(MajorPower = 0, Allies=1, Contiguity = 0)
df df
MajorPower Allies Contiguity
1 0 1 0
In cases where we want to make predictions for multiple values of a given variable, we can feed a vector into the data.frame:
<- data.frame(MajorPower = 0, Allies=c(0,1), Contiguity = 0)
df df
MajorPower Allies Contiguity
1 0 0 0
2 0 1 0
However, this becomes more tedious when we want to estimate for combinations of different variables (e.g., MajorPower
and Allies
as 0 or 1). You have to tell R precisely which rows should include which values. This means indicating four separate values for each variables to get all possible combinations.
<- data.frame(MajorPower = c(0, 0, 1,1), Allies=c(0,1, 0, 1), Contiguity = 0)
df df
MajorPower Allies Contiguity
1 0 0 0
2 0 1 0
3 1 0 0
4 1 1 0
What expand.grid
does is let you more quickly indicate that you want to estimate for all possibles combinations of the values of the variables you supply. E.g.:
<- expand.grid(MajorPower = c(0, 1), Allies=c(0,1), Contiguity = 0)
df df
MajorPower Allies Contiguity
1 0 0 0
2 1 0 0
3 0 1 0
4 1 1 0
This can be a useful shortcut when you need to look at combinations of variables that have many different values.