12.4 Additional considerations for fixed effects

12.4.1 Binary dependent variables

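You should use caution when adding fixed effects to binary models such as the logit. These models can suffer from the “incidental parameters” problem: with a fixed number of time periods, the number of unit intercepts to estimate grows with the number of units, so the maximum likelihood estimates are not consistent. With a large number of fixed effects (some would say even a moderate number), you can instead use a linear probability model or the conditional logit model.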

For a resource on fitting the conditional logit in R, see here.

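In Stata, the xtlogit command with the fe option fits this type of model.
Note that computing marginal effects and predicted probabilities from this model can be more difficult, because the unit-specific intercepts are conditioned out rather than estimated.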
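As a rough illustration of the two alternatives, the sketch below fits a linear probability model with unit dummies and a conditional logit to simulated data. Everything here is an assumption made for illustration: the data are simulated, the variable names (id, t, x, y) are hypothetical, and clogit() from the survival package is only one of several ways to fit a conditional logit in R.

```r
library(survival)  # provides clogit() and strata()

set.seed(42)
n_units   <- 200
n_periods <- 5

# Hypothetical panel: one row per unit-period
d <- data.frame(
  id = rep(seq_len(n_units), each = n_periods),
  t  = rep(seq_len(n_periods), times = n_units)
)
c_i <- rnorm(n_units)                       # unobserved unit effect
d$x <- rnorm(nrow(d)) + 0.5 * c_i[d$id]     # regressor correlated with the unit effect
d$y <- rbinom(nrow(d), 1, plogis(d$x + c_i[d$id]))  # true logit slope is 1

# Linear probability model with unit fixed effects (LSDV)
fit.lpm <- lm(y ~ x + factor(id), data = d)

# Conditional logit: unit intercepts are conditioned out via strata()
fit.clogit <- clogit(y ~ x + strata(id), data = d)

coef(fit.lpm)["x"]     # change in probability per unit of x
coef(fit.clogit)["x"]  # change in log-odds per unit of x
```

The two coefficients are on different scales (probabilities versus log-odds), so they should be compared in sign and substantive interpretation rather than in magnitude. Units whose outcome never changes contribute nothing to the conditional logit estimate.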

12.4.2 Two-way fixed effects

We might be concerned not just about unobserved unit effects but also unobserved time effects \[\begin{align*} Y_{it} = \beta_0 + \beta_1x_{it}+ (c_i + v_t + \epsilon_{it}). \end{align*}\]

Here, we may include fixed effects for both units and time. Warning: the coefficients from two-way fixed effects models can be difficult to interpret.

Alternatively, instead of including dummies for each time period \(t\), people will sometimes include a “time trend.” This could be a linear time trend, for example, a year variable treated as numeric.

\[\begin{align*} Y_{it} = \alpha_i + \beta_1 x_{it}+ \beta_2 year_{t} + \epsilon_{it}. \end{align*}\]

Or this could be a polynomial, such as a cubic time trend:

\[\begin{align*} Y_{it} = \alpha_i + \beta_1 x_{it}+ \beta_2 year_{t} + \beta_3 year_{t}^2 + \beta_4year_{t}^3 + \epsilon_{it}. \end{align*}\]
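The time-trend specifications can be sketched in R with plm and the Grunfeld data. This is a minimal sketch rather than a recommended specification: the column name trend is hypothetical, and it is built as a separate numeric variable (years since the first year in the data) so that it is treated as a number rather than as a set of year dummies.

```r
library(plm)
data("Grunfeld", package = "plm")

# Numeric trend: 0 in the first year, 1 in the second, and so on
Grunfeld$trend <- Grunfeld$year - min(Grunfeld$year)

## unit fixed effects plus a linear time trend
fit.linear <- plm(inv ~ value + capital + trend,
                  data = Grunfeld, model = "within",
                  index = c("firm", "year"),
                  effect = "individual")

## unit fixed effects plus a cubic time trend
fit.cubic <- plm(inv ~ value + capital + trend + I(trend^2) + I(trend^3),
                 data = Grunfeld, model = "within",
                 index = c("firm", "year"),
                 effect = "individual")

coef(fit.linear)
coef(fit.cubic)
```

Shifting the year so the trend starts at zero does not change the slope estimates in the linear case, and it keeps the squared and cubed terms on a manageable scale.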

12.4.3 First Differences

The within- and LSDV estimators are not the only way to incorporate fixed effects into our models. Another type of estimation is first differences.

\(\Delta Y_{it} = \beta \Delta x_{it} + \Delta \epsilon_{it}\) where, for example, \(\Delta Y_{it} = Y_{it} - Y_{it- 1}\). Instead of demeaning over all time periods, we subtract the previous period's value.
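To see why differencing removes the unit effect, here is a short derivation, assuming the one-way model with a single regressor and a unit effect \(\alpha_i\) that does not change over time. Write the model for period \(t\) and for period \(t-1\), then subtract the second from the first:

\[\begin{align*} Y_{it} &= \alpha_i + \beta x_{it} + \epsilon_{it} \\ Y_{it-1} &= \alpha_i + \beta x_{it-1} + \epsilon_{it-1} \\ Y_{it} - Y_{it-1} &= (\alpha_i - \alpha_i) + \beta (x_{it} - x_{it-1}) + (\epsilon_{it} - \epsilon_{it-1}) \\ \Delta Y_{it} &= \beta \Delta x_{it} + \Delta \epsilon_{it}. \end{align*}\]

Each unit loses its first observation, since there is no earlier period to subtract.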

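Both approaches remove the unobserved unit heterogeneity (i.e., \(\alpha_i - \alpha_i = 0\)).
Both require variation over time within units: if \(x_{it} = x_{it-1}\) in every period, then \(x_{it} - x_{it-1} = 0\) and the variable drops out.
When \(T = 2\), the fixed effects and first-difference estimates are identical.
When \(T > 2\), fixed effects \(\neq\) first differences in general.
Under the standard assumptions (notably strict exogeneity), both are unbiased and consistent.
When the errors \(\epsilon_{it}\) are serially uncorrelated, fixed effects is more efficient: \(SE(\hat \beta_{FD}) > SE(\hat \beta_{FE})\).
When the differenced errors \(\Delta \epsilon_{it}\) are serially uncorrelated (that is, \(\epsilon_{it}\) follows a random walk), first differences is preferred.
In general, we hope the two approaches produce very similar results.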
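The \(T = 2\) equivalence can be checked numerically. The sketch below uses simulated data and base R only; the variable names and the data-generating process are assumptions made for illustration. Both models are fit without an intercept so that they correspond to the same one-way specification.

```r
set.seed(1)
n  <- 100
id <- rep(seq_len(n), each = 2)    # two rows per unit
t  <- rep(1:2, times = n)          # periods 1 and 2

a <- rnorm(n)                      # unobserved unit effect
x <- rnorm(2 * n) + a[id]          # regressor correlated with the unit effect
y <- 2 * x + a[id] + rnorm(2 * n)

## within (fixed effects) estimator: demean by unit
x_w  <- x - ave(x, id)
y_w  <- y - ave(y, id)
b_fe <- coef(lm(y_w ~ x_w - 1))

## first-difference estimator: period 2 minus period 1
dx   <- x[t == 2] - x[t == 1]
dy   <- y[t == 2] - y[t == 1]
b_fd <- coef(lm(dy ~ dx - 1))

all.equal(unname(b_fe), unname(b_fd))
```

With two periods, each demeaned value is plus or minus half of the first difference, so the two regressions use the same information and return the same slope. With more than two periods this no longer holds.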

I recommend watching this video by Ben Lambert, who provides a summary of the differences between fixed effects, first differences, and pooled OLS.

12.4.4 Additional Models in R

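## load the plm package and the Grunfeld investment data
library(plm)
data("Grunfeld", package = "plm")

## first differences
fit.fd <- plm(inv ~ value + capital, data = Grunfeld, model = "fd",
              index = c("firm", "year"),
              effect = "individual")
coef(fit.fd)["capital"]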

  capital 
0.2917667

## firm and year effects
fit.twoway <- plm(inv~value+capital, data = Grunfeld, model = "within",
                  index = c("firm", "year"),
                  effect = "twoways")
                  
coef(fit.twoway)["capital"]

  capital 
0.3579163
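As a cross-check on the two-way model, the within estimate should match a least squares dummy variable (LSDV) regression that includes a dummy for every firm and every year. The sketch below assumes the plm package and its Grunfeld data are available; the object name fit.lsdv is hypothetical.

```r
library(plm)
data("Grunfeld", package = "plm")

## two-way within estimator
fit.twoway <- plm(inv ~ value + capital, data = Grunfeld, model = "within",
                  index = c("firm", "year"),
                  effect = "twoways")

## the same model estimated with explicit firm and year dummies
fit.lsdv <- lm(inv ~ value + capital + factor(firm) + factor(year),
               data = Grunfeld)

## the slope on capital should agree up to numerical precision
all.equal(unname(coef(fit.twoway)["capital"]),
          unname(coef(fit.lsdv)["capital"]))
```

The LSDV fit also reports the individual firm and year coefficients, which the within estimator sweeps out, at the cost of estimating many more parameters.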