12.6 Summing Up
In your own research you might ask the following questions:
- Are my observations independent? Do I have constant error variance?
  - If the observations are independent, a standard model may be fine.
  - If you are worried about the error variance, some type of robust standard errors may be enough. See Gary King’s video for guidance and section 8.5.1 of the course notes.
- Alternatively, is there some grouping structure to my data?
  - What are the different levels?
  - Are observations repeated over time?
    - Large N, small T: generally considered panel data
    - Large T, smaller N: generally considered time-series cross-sectional (TSCS) data
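
To make the constant-error-variance question concrete, here is a minimal sketch that compares the classical standard error of a bivariate OLS slope with a heteroskedasticity-robust (HC1) one. The data are simulated, the function name `ols_slope_se` is hypothetical, and the choice of HC1 is an assumption for illustration, not necessarily the variant used elsewhere in these notes.

```python
import math
import random


def ols_slope_se(x, y):
    """Return (slope, classical SE, HC1 robust SE) for y = a + b*x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical SE: assumes a single error variance shared by all observations.
    sigma2 = sum(e * e for e in resid) / (n - 2)
    se_classical = math.sqrt(sigma2 / sxx)
    # HC1: sandwich estimator with an n / (n - k) correction (k = 2 here).
    meat = sum((d * e) ** 2 for d, e in zip(dx, resid))
    se_robust = math.sqrt(n / (n - 2) * meat) / sxx
    return slope, se_classical, se_robust


# Simulated data (an assumption for illustration): error spread grows with x.
rng = random.Random(0)
x = [rng.uniform(0.0, 10.0) for _ in range(2000)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, xi ** 2 / 10.0) for xi in x]

slope, se_classical, se_robust = ols_slope_se(x, y)
print(f"slope={slope:.3f}  classical SE={se_classical:.4f}  robust SE={se_robust:.4f}")
```

When the two standard errors differ noticeably, as they should in this simulated example, that is a hint that the constant-variance assumption deserves a closer look.
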
Then, you might consider the following choices if you have a grouping structure:
- Consider clustering standard errors by group in a standard model
- Or, incorporating the grouping structure into a multilevel random effects model
  - Do you meet the random effects assumptions?
  - Do you need random effects? The intraclass correlation coefficient (ICC) can help with this.
  - Should you add varying slopes?
- Or, incorporating fixed effects for the groups and/or time
  - How much data do you have within groups?
  - Do you need to model “level-2” (group-level) factors? Group fixed effects absorb any variable that is constant within a group.
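
As a rough illustration of the ICC check mentioned above, the sketch below computes a one-way ANOVA estimate of the ICC for balanced groups. This is a swapped-in shortcut: in a multilevel workflow the ICC is usually computed from the variance components of a fitted random-intercept model. The function name `icc_oneway` and the toy data are hypothetical.

```python
def icc_oneway(groups):
    """One-way ANOVA ICC; `groups` is a list of equally sized lists of outcomes."""
    k = len(groups[0])  # observations per group
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes balanced groups")
    n_groups = len(groups)
    grand_mean = sum(sum(g) for g in groups) / (n_groups * k)
    group_means = [sum(g) / k for g in groups]
    # Mean square between groups and mean square within groups.
    ms_between = k * sum((m - grand_mean) ** 2 for m in group_means) / (n_groups - 1)
    ms_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    ) / (n_groups * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Toy data (hypothetical): three groups whose means are far apart.
toy = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(f"ICC = {icc_oneway(toy):.3f}")
```

An ICC near zero suggests little of the outcome variance sits between groups, so a random intercept may add little. A value near one suggests observations within a group are close to interchangeable.
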
What we have not yet discussed are the following issues. See links for further study.
- Using fixed effects models for causal inference, specifically
  - See Imai and Kim (2019), Imai and Kim (2020)
  - And the R packages PanelMatch and wfe
- Incorporating dynamics into longitudinal data: \(Y_{it} = \alpha_i + \rho Y_{i,t-1} + \beta x_{it} + \epsilon_{it}\)
  - Today’s outcome is a function of the past outcome, modified by new information. Consider including a lagged dependent variable (LDV) if you think past outcomes influence future outcomes.
  - \(\rho\) is an “autocorrelation” term such that \(|\rho| < 1\).
  - Problem: including \(Y_{i,t-1}\) creates correlation between a regressor and the error term \(\rightarrow\) strict exogeneity violated \(\rightarrow\) Nickell bias in fixed effects models. In random effects models, \(Y_{i,t-1}\) is also correlated with any group-level effects \(\alpha_i\).
    - Video explanation.
    - The concern is greatest in samples with small \(T\).
  - There is a lot of debate about the inclusion of an LDV and “dynamic” models in general. See, for example:
    - Achen, C. H. (2001). Why lagged dependent variables can suppress the explanatory power of other independent variables
    - Keele, L. and Kelly, N. J. (2005). Dynamic models for dynamic theories: the ins and outs of lagged dependent variables
    - Wilkins, A. (2017). To Lag or Not to Lag?: Re-Evaluating the Use of Lagged Dependent Variables in Regression Analysis
  - Arellano-Bond and Anderson-Hsiao estimators (instrumental-variable approaches to dynamic panel models)
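
The small-\(T\) concern can be seen in a short simulation. The sketch below generates data from the dynamic model above with \(\beta = 0\) (a simplification, so there is no \(x_{it}\)), estimates \(\rho\) with the within (fixed effects) estimator, and compares a short panel with a long one. The function name `within_rho`, the sample sizes, and the true \(\rho = 0.5\) are all illustrative assumptions.

```python
import random


def within_rho(n_units, n_periods, rho, rng):
    """Within (fixed effects) estimate of rho on simulated dynamic panel data."""
    num = 0.0
    den = 0.0
    for _ in range(n_units):
        alpha = rng.gauss(0.0, 1.0)  # unit-specific intercept
        # Burn in so each series starts near its stationary distribution.
        y = alpha / (1.0 - rho)
        for _ in range(50):
            y = alpha + rho * y + rng.gauss(0.0, 1.0)
        series = [y]
        for _ in range(n_periods):
            series.append(alpha + rho * series[-1] + rng.gauss(0.0, 1.0))
        lagged, current = series[:-1], series[1:]
        # Demeaning within each unit removes alpha but ties the lag to the error.
        lag_mean = sum(lagged) / n_periods
        cur_mean = sum(current) / n_periods
        num += sum((l - lag_mean) * (c - cur_mean) for l, c in zip(lagged, current))
        den += sum((l - lag_mean) ** 2 for l in lagged)
    return num / den


rng = random.Random(12345)
true_rho = 0.5
rho_small_t = within_rho(n_units=2000, n_periods=5, rho=true_rho, rng=rng)
rho_large_t = within_rho(n_units=200, n_periods=100, rho=true_rho, rng=rng)
print(f"true rho={true_rho}  T=5 estimate={rho_small_t:.3f}  T=100 estimate={rho_large_t:.3f}")
```

With only five periods the estimate should fall well below the true value even though there are many units, while the long panel should land close to it. This is the pattern Nickell bias predicts: adding units does not help, adding time periods does.
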