12.6 Summing Up

In your own research you might ask the following questions:

Are my observations independent? Do I have constant error variance?
- If independent, you might be fine with a standard model
- If you are worried about the error variance, some type of robust standard errors may be enough. See Gary King’s video for guidance and section 8.5.1 of the course notes.
Alternatively, is there some grouping structure to my data?
- What are the different levels?
Are observations repeated over time?
- Large N, small T: generally considered panel data
- Large T, smaller N: generally considered time-series cross-sectional (TSCS) data

Then, you might consider the following choices if you have a grouping structure:

Consider clustering standard errors by group in a standard model
- See guidance here, here, and here. For R code, see section 8.5.1.
Or, incorporate the grouping structure into a multilevel random effects model
- Do you meet the random effects assumptions?
- Do you need random effects? Perhaps use the ICC to help with this.
- Should you add varying slopes?
Or, incorporate fixed effects for the groups and/or time periods
- How much data do you have within groups?
- Do you need to model “level-2” factors?
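
To make the ICC question above concrete: the ICC can be read as the share of the total outcome variance that lies between groups rather than within them. The sketch below is only an illustration, written in plain Python rather than the R used elsewhere in these notes. It assumes the one-way ANOVA estimator of the ICC; the function name `icc_oneway` and the toy data are hypothetical. In practice you would more likely compute the ICC from the variance components of a fitted random-intercept model.

```python
from statistics import mean


def icc_oneway(groups):
    """One-way ANOVA estimate of the ICC from {group_id: [outcomes]}.

    Assumes at least two groups and at least one group with more
    than one observation. Group sizes may differ.
    """
    k = len(groups)
    sizes = [len(values) for values in groups.values()]
    n_total = sum(sizes)
    grand_mean = sum(sum(values) for values in groups.values()) / n_total

    # Between-group and within-group sums of squares.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    ss_within = sum(
        (y - mean(values)) ** 2
        for values in groups.values()
        for y in values
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # "Average" group size; equals the common size when groups are balanced.
    n0 = (n_total - sum(s * s for s in sizes) / n_total) / (k - 1)

    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


# Hypothetical toy data: three groups whose means differ a lot.
toy = {"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]}
print(round(icc_oneway(toy), 3))  # prints 0.897
```

A value near 0 suggests that group membership explains little of the variation, while a value closer to 1 suggests strong clustering that the model should account for. Note that this particular estimator can be negative in small samples when groups differ less than chance alone would predict.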

What we have not yet discussed are the following issues. See links for further study.

Using fixed effects models for causal inference
- See Imai and Kim (2019), Imai and Kim (2020)
- And R packages PanelMatch and wfe
Incorporating dynamics into longitudinal data: \(Y_{it} = \alpha_i + \rho Y_{i,t-1} + \beta x_{it} + \epsilon_{it}\)
- Today's outcome is a function of the past outcome, modified by new information. Consider including the lagged outcome if you think past outcomes influence future outcomes.
- \(\rho\) is an “autocorrelation” term such that \(|\rho| < 1\).
- Problem: Unless \(\rho = 0\), a correlation is created between the regressors and the error term \(\rightarrow\) strict exogeneity is violated \(\rightarrow\) Nickell bias. In random effects models, \(Y_{i,t-1}\) is also correlated with any group-level effects \(\alpha_i\).
- Video explanation.
- The concern is greatest in samples with small \(T\).
- There is considerable debate about including a lagged dependent variable (LDV) and about “dynamic” models in general. See:
  - Achen, C. H. (2001). Why lagged dependent variables can suppress the explanatory power of other independent variables.
  - Keele, L. and Kelly, N. J. (2005). Dynamic models for dynamic theories: The ins and outs of lagged dependent variables.
  - Wilkins, A. (2017). To lag or not to lag?: Re-evaluating the use of lagged dependent variables in regression analysis.
  - Arellano-Bond and Anderson-Hsiao estimators
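
The small-\(T\) concern above can be checked with a quick simulation. The sketch below is only an illustration in plain Python (not the R used elsewhere in these notes), and it makes several assumptions: there is no covariate \(x_{it}\), both \(\alpha_i\) and \(\epsilon_{it}\) are standard normal, and the true \(\rho\) is 0.5. The function names `simulate_panel` and `within_rho` are hypothetical. It simulates the dynamic model, removes the unit effects by demeaning within each unit (the fixed effects "within" estimator), and compares the estimate of \(\rho\) in a short panel with the estimate in a long one.

```python
import random


def simulate_panel(n_units, n_periods, rho, rng, burn_in=20):
    """Simulate y_it = alpha_i + rho * y_{i,t-1} + eps_it with no covariates.

    Returns one list per unit holding n_periods + 1 outcomes, so that
    n_periods (lagged, current) pairs are available for estimation.
    """
    panel = []
    for _ in range(n_units):
        alpha = rng.gauss(0, 1)
        y = alpha / (1 - rho)  # start at the unit's long-run mean
        series = []
        for t in range(burn_in + n_periods + 1):
            y = alpha + rho * y + rng.gauss(0, 1)
            if t >= burn_in:  # discard the burn-in draws
                series.append(y)
        panel.append(series)
    return panel


def within_rho(panel):
    """Within estimate of rho: demean y_it and y_{i,t-1} by unit, then OLS."""
    num = 0.0
    den = 0.0
    for series in panel:
        lagged, current = series[:-1], series[1:]
        lag_mean = sum(lagged) / len(lagged)
        cur_mean = sum(current) / len(current)
        for y_lag, y_now in zip(lagged, current):
            num += (y_lag - lag_mean) * (y_now - cur_mean)
            den += (y_lag - lag_mean) ** 2
    return num / den


rng = random.Random(12345)  # fixed seed so the run is reproducible
true_rho = 0.5
est_short = within_rho(simulate_panel(300, 5, true_rho, rng))
est_long = within_rho(simulate_panel(300, 50, true_rho, rng))
print(f"true rho = {true_rho}")
print(f"within estimate, T = 5:  {est_short:.3f}")
print(f"within estimate, T = 50: {est_long:.3f}")
```

Under these assumptions the short-panel estimate should fall well below the true \(\rho\), while the long-panel estimate should be much closer to it. That downward pull in short panels is the Nickell bias, and it does not go away by adding more units.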