Section 2 R Overview
This course will primarily use R for analysis, though we will briefly discuss a few areas where Stata may be more efficient.
Learning to program in R is not a primary goal of this course, but as you work through it you will pick up and practice many R skills.
For those brand new to R, I strongly recommend you complete the following tutorials prior to or at the beginning of the course.
Goal
By the end of the first week of the course, you should have R and RStudio (both free) installed on your computer and feel comfortable using R as a calculator and loading datasets into R.
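As a rough illustration of what "using R as a calculator and loading datasets" can look like, here is a minimal sketch. The file and the objects `x`, `example_path`, and `scores` are made up for this example; in the course you would point `read.csv()` at a real data file instead.

```r
# --- R as a calculator ---
2 + 3 * 4          # arithmetic follows the usual order of operations
sqrt(16)           # built-in math functions
x <- c(4, 8, 15)   # store a vector of numbers in an object called x
mean(x)            # average of the values stored in x

# --- Loading a dataset ---
# To keep the example self-contained, first write a tiny made-up CSV
# to a temporary file, then read it back in as a data frame.
example_path <- tempfile(fileext = ".csv")   # hypothetical file location
write.csv(data.frame(id = 1:3, score = c(90, 85, 77)),
          example_path, row.names = FALSE)

scores <- read.csv(example_path)   # load the CSV into R
head(scores)                       # look at the first rows
str(scores)                        # check variable names and types
```

If you can run these lines in the RStudio console without errors, you are in good shape for the first week.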
R and RStudio Installation
- This video from Christopher Bail explains the R and RStudio installation process. This involves
  - going to CRAN, selecting the link that matches your operating system, and following the installation instructions, and
  - visiting RStudio and following the download and installation instructions.
- R is the statistical software and programming language used for analysis. RStudio provides a convenient user interface for running R code. You do not need RStudio to use R, but it is free and can make your life easier!
- After installing R and RStudio, you can also follow along with Christopher Bail’s R Basics and Data Wrangling videos to learn the basic functionality of R.
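One quick way to check that the installation worked is to open RStudio and type a line or two into the console. The sketch below is only a suggestion, and the object name `installed_version` is made up for this example.

```r
# Ask R which version is installed; any sensible version string
# here means R itself is working.
installed_version <- R.version.string
print(installed_version)

# A trivial calculation confirms the console is evaluating code.
1 + 1
```

If both lines return output rather than an error, R is installed and RStudio is able to find it.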
Supplemental Resources
To supplement the above resources, I recommend exploring one of the following:
- Kosuke Imai’s book Quantitative Social Science is a great resource. The first chapter provides a written overview of installing R and the basic functions of R and RStudio, including loading data and packages into R. Data for the book is available at the bottom of this page. Alternate coding in tidyverse for the book is available here.
- If you have difficulty following Chris Bail’s or Kosuke Imai’s installation instructions, try the first couple of videos by Rutgers Data Librarian Ryan Womack, which also start from the point of installation. They are here. He goes at a slow pace and codes along in the videos. He also has a number of videos on more advanced data analysis topics.
- R for Data Science is another great resource and focuses on “tidyverse” code in R, which can be particularly helpful in data wrangling and visualization.
Note: Much of the code used in the course will rely on “base R” functions (functions that come with a standard R installation). People have also developed tidyverse packages that can be easily installed in R. These packages supplement base R tools with alternative functions and a syntax based on a particular design philosophy, grammar, and data structure that their developers find preferable to base R. Using base R vs. tidyverse is often just a matter of personal taste. Either is fine to use in this course, and you will get exposure to code that relies on both.
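To make the base R vs. tidyverse distinction concrete, here is a small sketch that computes the same summary both ways using `mtcars`, a dataset that ships with R. This example is not taken from the course materials. It assumes R 4.1 or later (for the `|>` pipe), and the tidyverse half runs only if the `dplyr` package is already installed.

```r
# Base R: average miles per gallon (mpg) by number of cylinders (cyl)
base_result <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(base_result)

# tidyverse (dplyr): the same summary written as a pipeline of steps.
# Guarded so the script still runs if dplyr has not been installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  tidy_result <- mtcars |>
    dplyr::group_by(cyl) |>
    dplyr::summarise(mpg = mean(mpg))
  print(tidy_result)
} else {
  message("dplyr is not installed; install.packages(\"tidyverse\") adds it.")
}
```

Both versions return one row per cylinder group with the same averages. The difference is mainly in how the steps are written: base R nests everything in one function call, while the tidyverse version reads top to bottom as a sequence of operations.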
This is a lot of information to digest all at once. Don’t worry. No one remembers everything. Plan on going back to these resources often throughout the course and beyond. We will have office hours the first week of the course to help troubleshoot issues.