4.2 Boxplots

A video explainer of the code for boxplots and barplots is available below. The video covers only the code, so use the notes and lecture discussion for additional context. (On YouTube, you can speed up playback to 1.5x or 2x.)

Let’s load the data! Note that the data file is in .RData format instead of .csv. This means that instead of read.csv, we need a function designed for the .RData format: load. It works the following way:

load("status.RData")

After running the above code, an object named status will show up in your R environment. We can look at its first few rows with head.
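To see what load does in isolation, here is a minimal sketch that saves a small made-up data frame to a temporary .RData file and then restores it. The object name toy and its values are hypothetical and only stand in for a real data set.

```r
# Toy stand-in for a saved data set (hypothetical values, not the survey data)
toy <- data.frame(id = 1:3, score = c(0.25, 0.50, 1.00))

# Save it to a temporary .RData file
path <- tempfile(fileext = ".RData")
save(toy, file = path)

# Remove the object, then restore it with load()
rm(toy)
loaded_names <- load(path)   # load() returns the names of the restored objects

loaded_names                 # the character vector "toy"
exists("toy")                # TRUE: the object is back in the environment
```

Because load restores objects under the names they were saved with, you do not assign its result to a new name the way you would with read.csv.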

head(status)
##         condition male   econcon
## 2        Concrete    1 0.7500000
## 3     Self-Esteem    1 1.0000000
## 4         Placebo    1 0.6666667
## 5     Self-Esteem    0 0.2500000
## 6     Self-Esteem    0 1.0000000
## 7 Social Approval    0 0.8333333

The data include the following variables:

  • condition: Placebo, Concrete, Self-Esteem, Social Approval, Conspicuous Consumption
  • male: 1 = male; 0 = otherwise
  • econcon: Economic views. Numeric variable from 0 to 1, with higher values reflecting more conservative views

4.2.1 Data Summary: Boxplot

A boxplot characterizes the distribution of a continuous numeric variable in a single display.

  • Features: box, whiskers, outliers
  • We will supply the function with a column in our data, and the boxplot displays the distribution of that variable.

Figure from Will Lowe
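The numbers behind the box, whiskers, and outliers can be inspected with boxplot.stats. The sketch below uses a short made-up vector (not the survey data) and assumes the default rule, in which points more than 1.5 times the length of the box beyond the box are drawn as outliers.

```r
# Hypothetical values chosen so that one point sits far from the rest
x <- c(0.10, 0.25, 0.40, 0.50, 0.55, 0.60, 0.75, 3.00)

# boxplot.stats() returns the numbers that boxplot() draws
s <- boxplot.stats(x)

s$stats  # lower whisker end, lower hinge, median, upper hinge, upper whisker end
s$out    # points drawn individually as outliers (here, the value 3)
```

The middle element of s$stats is the median, the second and fourth mark the edges of the box, and the first and last mark the ends of the whiskers.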

Here is an example of the boxplot using our econcon variable.

  • We have added a title and y-axis label to the plot through the main and ylab arguments. Play around with changing the words in those arguments.

boxplot(status$econcon,
        main="Economic Views in the Survey Sample",
        ylab="Economic Views")

After you execute the plot code, a preview of the plot should appear in the bottom-right window of RStudio.

Boxplots are also useful for summarizing data across multiple distributions. The syntax is boxplot(y ~ x, data = d), where y is the numeric variable and x is the grouping variable:

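boxplot(econcon ~ condition, data=status,
        main="Economic Views by Experimental Condition",
        ylab="Economic Views",
        # names relabels the boxes in the order they are drawn,
        # so it must follow the order of the condition groups
        names = c("Placebo", "Concrete", "Conspicuous",
                  "Self-Esteem", "Social"),
        xlab = "Experimental Condition",
        # one color per box: the first red, the other four blue
        col = c("red3", rep("dodgerblue", 4)))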

The additional arguments are just aesthetics. Play around with different settings.

  • For example, can you change the code to make the first two boxes red? Colors are supplied as a vector using the col = argument.
    • To see the color names that R recognizes, run colors() in your R console.
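As a sketch of how the col vector lines up with the boxes, the example below uses made-up data with three groups (the data frame d, its columns, and the group labels are hypothetical). It shows that boxes follow the order of the factor levels and that the color vector needs one entry per box.

```r
# Made-up data with three groups (hypothetical, not the survey data)
set.seed(1)
d <- data.frame(
  y = runif(60),
  x = factor(rep(c("A", "B", "C"), each = 20), levels = c("C", "A", "B"))
)

# Boxes are drawn in the order of the factor levels, not the order in the data
levels(d$x)

# Build one color per box: rep() repeats a value, c() glues the pieces together
box_cols <- c(rep("gray70", 2), "red3")
length(box_cols) == nlevels(d$x)   # TRUE: one color for each box

# Check that every name is a color R recognizes
all(box_cols %in% colors())

# boxplot() also returns the plotted summaries; b$names gives the box order
b <- boxplot(y ~ x, data = d, col = box_cols,
             main = "Toy example", ylab = "y", xlab = "x")
b$names
```

Here the last box is red because "red3" is the third entry of box_cols and "B" is the third level of x.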

How should we interpret these results? Does status or social approval motivation, specifically, influence economic views? What about other potential motivations?
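One way to back up a visual reading of a grouped boxplot is to compute the group medians directly. The sketch below uses a tiny made-up data frame: the values and the "Treatment" label are hypothetical and are not the survey results, and only the column names mirror those in status.

```r
# Made-up stand-in data (hypothetical values, not the survey results)
toy <- data.frame(
  condition = rep(c("Placebo", "Treatment"), each = 4),
  econcon   = c(0.25, 0.50, 0.50, 0.75,   # Placebo
                0.50, 0.75, 0.75, 1.00)   # Treatment
)

# Median of econcon within each condition
med <- tapply(toy$econcon, toy$condition, median)
med

# Difference in medians between the two groups
med["Treatment"] - med["Placebo"]
```

The same tapply call should work on the real data by swapping toy for status, which gives the values of the thick median lines in the grouped boxplot.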