6.3 Probit Regression
Probit regression is very similar to logit regression, except that it uses a different link function to map the linear predictor to the outcome. Both the logit and probit links are suitable for binary outcomes with a Bernoulli distribution. Like the logit, the inverse probit link restricts our predicted probabilities to lie between 0 and 1.
- \(\pi_i = Pr(Y_i = 1| X_i) = \Phi(\mathbf{x}_i'\beta)\)
- \(\eta_i = \Phi^{-1}(\pi_i) = \mathbf{x}_i'\beta\)
Here, our coefficients \(\hat \beta\) represent changes in “probits,” or changes in “z-score” units. We use the standard Normal CDF (\(\Phi()\)), available as pnorm() in R, to transform the linear predictor back into a probability, specifically, the probability that \(Y_i\) is 1.
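To make the link and its inverse concrete, here is a minimal sketch in base R. The values of the linear predictor are illustrative assumptions, not estimates from any model in this section.

```r
# Linear predictor values on the probit ("z-score") scale (illustrative values)
eta <- c(-1.96, -1, 0, 1, 1.96)

# Inverse link: the standard Normal CDF maps eta to probabilities in (0, 1)
p <- pnorm(eta)

# Link: the Normal quantile function maps the probabilities back to eta
eta_back <- qnorm(p)

# Side-by-side comparison of the two scales
round(cbind(eta, p, eta_back), 3)
```

A linear predictor of 0 corresponds to a probability of 0.5, and equal steps in the linear predictor produce unequal steps in the probability because \(\Phi\) is nonlinear.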
Let’s fit our binary model with probit. We just need to change the link function: we fit using glm, setting family = binomial(link="probit").
out.probit <- glm(partbinary ~ female + edu + age + sexism, data=anes,
                  family = binomial(link="probit"))
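The anes data are not reproduced here, so the following is a minimal, self-contained sketch of the same glm call on simulated data. The data frame sim, the variables y and x, and the true coefficients (-0.5 and 1) are all hypothetical choices for illustration.

```r
# Simulate a binary outcome from a known probit model (all values hypothetical)
set.seed(123)
n <- 1000
x <- rnorm(n)
eta <- -0.5 + 1.0 * x                        # true linear predictor
y <- rbinom(n, size = 1, prob = pnorm(eta))  # Bernoulli draws with Pr(y = 1) = Phi(eta)
sim <- data.frame(y = y, x = x)

# Fit the probit model; only the link argument differs from a logit fit
fit <- glm(y ~ x, data = sim, family = binomial(link = "probit"))

# Estimates should land near the true values of -0.5 and 1
coef(fit)
```

Because the outcome was generated through pnorm(), the estimated coefficients are on the same z-score scale as the values used in the simulation.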
Let’s apply the equation tool to this:
## After installing the package once, load it in each new R session
library(equatiomatic)
## Will output in latex code, though see package for details on options
extract_eq(out.probit, wrap = TRUE, terms_per_line = 3)
\[ \begin{aligned} P( \operatorname{partbinary} = \operatorname{1} ) &= \Phi[\alpha + \beta_{1}(\operatorname{female}) + \beta_{2}(\operatorname{edu})\ + \\ &\qquad\ \beta_{3}(\operatorname{age}) + \beta_{4}(\operatorname{sexism})] \end{aligned} \]
The summary output includes the probit coefficients, standard errors, z-scores, and p-values.
summary(out.probit)
Call:
glm(formula = partbinary ~ female + edu + age + sexism, family = binomial(link = "probit"),
data = anes)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.6343   0.3188   0.4470   0.6361   1.2477
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.603661   0.184864   3.265  0.00109 **
female      -0.202300   0.083407  -2.425  0.01529 *
edu          0.179264   0.027611   6.493 8.44e-11 ***
age          0.005145   0.002257   2.280  0.02261 *
sexism      -0.898871   0.186443  -4.821 1.43e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1361.5 on 1584 degrees of freedom
Residual deviance: 1250.6 on 1580 degrees of freedom
(355 observations deleted due to missingness)
AIC: 1260.6
Number of Fisher Scoring iterations: 5
We can interpret the sign and significance of the coefficients similarly to OLS. They just aren’t in units of \(Y\). In next week’s section, we will discuss in detail how to generate quantities of interest from this output.
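As a brief illustration of why the coefficients are not in units of \(Y\), the sketch below plugs the estimates from the summary output into the Normal CDF by hand. The covariate profile (edu = 3, age = 40, sexism = 0.5) is a hypothetical assumption chosen only for illustration; the actual scales of these variables in the anes data are not shown in this section.

```r
# Probit coefficients copied from the summary(out.probit) output
beta <- c("(Intercept)" = 0.603661, female = -0.202300, edu = 0.179264,
          age = 0.005145, sexism = -0.898871)

# Hypothetical covariate profile (values are assumptions, not from the data)
profile_female <- c("(Intercept)" = 1, female = 1, edu = 3, age = 40, sexism = 0.5)

# Linear predictor on the probit scale, then the implied probability
eta_female <- sum(beta * profile_female[names(beta)])
p_female <- pnorm(eta_female)

# Same profile with female set to 0
profile_male <- replace(profile_female, "female", 0)
eta_male <- sum(beta * profile_male[names(beta)])
p_male <- pnorm(eta_male)

# The coefficient on female shifts eta; the change in probability is smaller
c(eta_diff = eta_female - eta_male, p_diff = p_female - p_male)
```

The difference in the linear predictor equals the coefficient on female, while the difference in probability depends on where the other covariates place the profile along the Normal CDF.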