Section 13 Choose Your Own Adventure

For the final project, you will design and carry out a research project independently.

Below are some tips for working with new data.

How to explore new data

  1. When you encounter a new dataset, try to also identify the codebook that goes along with it.
    • This should have information describing what the variables are in the data and what the values mean.
    • It should also help clarify what unit each row of the dataset represents (e.g., is each row a survey respondent, an election, a Member of Congress, or a country in a particular year?).
  2. When you download data, identify the file type (e.g., .csv, .dta, .RData).
    • Recall that QSS section 1.3.5 (chapter 1, pg. 20) goes over how to read different file types into R with functions like read.csv, read.dta from the foreign package (loaded with library(foreign)), etc.
  3. Once you have loaded the data into RStudio, explore it.
    • How many rows are there (nrow)? How many columns (ncol)?
    • You can View() the data or look at the first few rows with head()
      • Are most variables numeric? Or, does it seem like you have factor and character variables?
    • See if the data seem to match up with what the codebook suggests should be there.
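
The steps above can be sketched in R as follows. This is only an illustration under stated assumptions: the toy data frame, its variable names (respondent_id, age, party), and the temporary file are all made up so that the example runs on its own. With a real dataset, you would replace the file path with the location of the file you downloaded.

```r
# Hypothetical toy data standing in for a dataset you downloaded
toy <- data.frame(
  respondent_id = 1:5,
  age = c(34, 51, 27, 45, 62),
  party = c("Dem", "Rep", "Ind", "Dem", "Rep"),
  stringsAsFactors = FALSE
)

# Write the toy data to a temporary .csv so there is a file to read in
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# Read in the .csv file
# (for a .dta file, you could instead use read.dta() after library(foreign);
#  for an .RData file, you could use load())
mydata <- read.csv(csv_path, stringsAsFactors = FALSE)

# How many rows and columns?
nrow(mydata)
ncol(mydata)

# Look at the first few rows
head(mydata)

# What type is each variable (numeric, character, factor)?
sapply(mydata, class)

# Check that the values match what the codebook says should be there
table(mydata$party)
```

If the number of rows, the variable types, or the values in a table() do not match the codebook, that is a sign to double-check that you downloaded the right file and read it in with the right function.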