12.4 Additional considerations for fixed effects
12.4.1 Binary dependent variables
You should use caution when adding fixed effects to binary models, such as the logit. They can suffer from the “incidental parameters” problem: each unit's effect is estimated from only a handful of observations, and that estimation error carries over into the other coefficients. With a large number of fixed effects (some would say even a moderately large number), you can instead use a linear probability model or the conditional logit model.
For a resource on fitting the conditional logit in R, see here.
- The xtlogit, fe command in Stata fits this type of model
- Note that computing marginal effects and predictions from this model can be more difficult
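As a rough illustration of the two alternatives, here is a minimal sketch in R. It assumes simulated data (the names `sim`, `id`, `x`, and `y` are made up for this example) and uses `clogit()` from the survival package, a recommended package that ships with most R installations. This is one way to fit a conditional logit, not necessarily the approach in the resource linked above.

```r
library(survival)  # provides clogit() and strata()

set.seed(123)
n_units   <- 200   # hypothetical number of units
n_periods <- 5     # hypothetical number of periods per unit

# Simulated panel: a unit effect that is correlated with x
sim   <- data.frame(id = rep(seq_len(n_units), each = n_periods))
alpha <- rnorm(n_units)
sim$x <- rnorm(nrow(sim)) + 0.5 * alpha[sim$id]
sim$y <- rbinom(nrow(sim), 1, plogis(alpha[sim$id] + 1 * sim$x))

# Linear probability model with unit fixed effects (dummy for each unit)
fit_lpm <- lm(y ~ x + factor(id), data = sim)

# Conditional logit: strata(id) conditions out the unit effects
# instead of estimating one parameter per unit
fit_clogit <- clogit(y ~ x + strata(id), data = sim)

coef(fit_lpm)[["x"]]     # change in Pr(y = 1) per unit of x
coef(fit_clogit)[["x"]]  # change in log-odds per unit of x
```

The two coefficients are on different scales (probability versus log-odds), so they should not be compared directly. Units whose outcome never changes contribute nothing to the conditional logit estimate.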
12.4.2 Two-way fixed effects
We might be concerned not just about unobserved unit effects but also about unobserved time effects: \[\begin{align*} Y_{it} = \beta_0 + \beta_1x_{it}+ (c_i + v_t + \epsilon_{it}). \end{align*}\]
Here, we may include fixed effects for both units and time. Warning: estimates from two-way fixed effects models can be difficult to interpret.
Alternatively, instead of including dummies for each time period \(t\), people will sometimes include a “time trend.” This could be a linear time trend, for example, a year variable treated as numeric.
\[\begin{align*} Y_{it} = \alpha_i + \beta_1 x_{it}+ \beta_2 year_{t} + \epsilon_{it}. \end{align*}\]
Or this could be a polynomial, such as a cubic time trend:
\[\begin{align*} Y_{it} = \alpha_i + \beta_1 x_{it}+ \beta_2 year_{t} + \beta_3 year_{t}^2 + \beta_4year_{t}^3 + \epsilon_{it}. \end{align*}\]
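The three ways of handling time described above (period dummies, a linear trend, and a cubic trend) can be sketched with `lm()`. This is only an illustration on simulated data: the data frame `panel` and its columns are hypothetical, and the year variable is centered (`year_c`) as an assumption of this sketch, to keep the squared and cubed terms on a manageable scale.

```r
set.seed(42)
n_units <- 30  # hypothetical number of units

# Simulated panel with a unit effect and a true linear trend
panel       <- expand.grid(year = 2001:2010, unit = seq_len(n_units))
unit_effect <- rnorm(n_units)
panel$x <- rnorm(nrow(panel)) + unit_effect[panel$unit]
panel$y <- 0.5 * panel$x + unit_effect[panel$unit] +
  0.1 * (panel$year - 2001) + rnorm(nrow(panel))

# Year as a centered numeric variable for the trend models
panel$year_c <- panel$year - mean(panel$year)

# Unit dummies plus a dummy for each year (two-way fixed effects)
fit_dummies <- lm(y ~ x + factor(unit) + factor(year), data = panel)

# Unit dummies plus a linear time trend
fit_linear <- lm(y ~ x + factor(unit) + year_c, data = panel)

# Unit dummies plus a cubic time trend
fit_cubic <- lm(y ~ x + factor(unit) + year_c + I(year_c^2) + I(year_c^3),
                data = panel)

# Compare the coefficient on x across the three specifications
sapply(list(dummies = fit_dummies, linear = fit_linear, cubic = fit_cubic),
       function(fit) coef(fit)[["x"]])
```

The trend models use far fewer parameters than the year dummies, at the cost of assuming that time effects follow a smooth functional form.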
12.4.3 First Differences
The within- and LSDV estimators are not the only way to incorporate fixed effects into our models. Another type of estimation is first differences.
\(\Delta Y_{it} = \beta \Delta x_{it} + \Delta \epsilon_{it}\) where, for example, \(\Delta Y_{it} = Y_{it} - Y_{it- 1}\). Instead of demeaning over all time periods, we subtract the value from the previous period.
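To see where the differenced equation comes from, here is a short derivation. It assumes the one-way model \(Y_{it} = \alpha_i + \beta x_{it} + \epsilon_{it}\), written for period \(t\) and for period \(t-1\), with the second line subtracted from the first:
\[\begin{align*} Y_{it} &= \alpha_i + \beta x_{it} + \epsilon_{it} \\ Y_{it-1} &= \alpha_i + \beta x_{it-1} + \epsilon_{it-1} \\ Y_{it} - Y_{it-1} &= (\alpha_i - \alpha_i) + \beta (x_{it} - x_{it-1}) + (\epsilon_{it} - \epsilon_{it-1}) \\ \Delta Y_{it} &= \beta \Delta x_{it} + \Delta \epsilon_{it}. \end{align*}\]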
- Both fixed effects and first differences remove unobserved heterogeneity (i.e., \(\alpha_i - \alpha_i = 0\))
- Requires variation over time: if \(x_{it}\) does not change between \(t-1\) and \(t\), then \(x_{it} - x_{it-1} = 0\) and the variable falls out
- When \(T = 2\), fixed effects = first differences
- When \(T > 2\), fixed effects \(\neq\) first differences
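- Under the standard assumptions (notably strict exogeneity of \(x_{it}\)), both are unbiased and consistent
- When the \(\epsilon_{it}\) are serially uncorrelated, \(SE(\hat \beta_{FD}) > SE(\hat \beta_{FE})\), so fixed effects is more efficient
- If instead the differenced errors \(\Delta \epsilon_{it}\) are serially uncorrelated (for example, when \(\epsilon_{it}\) follows a random walk), FD is preferred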
- In general, hopefully both produce very similar results
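The claim that the two estimators coincide when \(T = 2\) but not when \(T > 2\) can be checked numerically. The sketch below uses simulated data and a helper function, `compare_fe_fd()`, that is made up for this illustration. It computes the within estimator by demeaning and the first-difference estimator by differencing, both without an intercept.

```r
# Hypothetical helper: simulate a panel and return the within (FE)
# and first-difference (FD) estimates of the slope on x
compare_fe_fd <- function(n_periods, n_units = 100) {
  id    <- rep(seq_len(n_units), each = n_periods)  # sorted by unit, then period
  alpha <- rnorm(n_units)                           # unobserved unit effects
  x <- rnorm(length(id)) + alpha[id]
  y <- 2 * x + alpha[id] + rnorm(length(id))

  # Within estimator: subtract each unit's mean, then regress
  x_dm <- x - ave(x, id)
  y_dm <- y - ave(y, id)
  b_fe <- coef(lm(y_dm ~ 0 + x_dm))[["x_dm"]]

  # First differences: subtract the previous period within each unit
  # (the first period of each unit has no lag, so it becomes NA and is dropped by lm)
  dx <- ave(x, id, FUN = function(v) c(NA, diff(v)))
  dy <- ave(y, id, FUN = function(v) c(NA, diff(v)))
  b_fd <- coef(lm(dy ~ 0 + dx))[["dx"]]

  c(fe = b_fe, fd = b_fd)
}

set.seed(2024)
t2 <- compare_fe_fd(n_periods = 2)  # FE and FD should match
t3 <- compare_fe_fd(n_periods = 3)  # FE and FD should differ slightly

all.equal(t2[["fe"]], t2[["fd"]])
t3
```

Note that this sketch omits the intercept from the differenced regression so that the comparison with the within estimator is exact. Some software includes an intercept in first-difference models by default, which would change the estimate.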
I recommend watching this video by Ben Lambert, who provides an overview of the differences between fixed effects, first differences, and pooled OLS.
12.4.4 Additional Models in R
## first differences
fit.fd <- plm(inv ~ value + capital, data = Grunfeld, model = "fd",
              index = c("firm", "year"),
              effect = "individual")
coef(fit.fd)["capital"]
capital
0.2917667
## firm and year effects
fit.twoway <- plm(inv ~ value + capital, data = Grunfeld, model = "within",
                  index = c("firm", "year"),
                  effect = "twoways")
coef(fit.twoway)["capital"]
capital
0.3579163