10.5 Application: Health Savings Study

For a video explainer of the code in this section, see below. (On YouTube, you can speed up the playback to 1.5x or 2x.)

Health Savings Experiment (Dupas and Robinson 2013): A field experiment in rural Kenya in which the researchers randomly varied access to four innovative saving technologies and observed the impact on asset accumulation.

1. Start with a research question

Can savings technologies help people accumulate assets?

2. Develop a theory of how the world works

Providing people with a safe place to store money will help them save.

3. Construct “null” and “alternative” hypotheses

What are our null/alternative hypotheses?
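One way to write these down, consistent with the two-sided test reported later in this section (the symbols $\mu_{\text{safebox}}$ and $\mu_{\text{encouragement}}$ are notation assumed here for the average amount saved in each condition, not notation from the original study):

```latex
\begin{align*}
H_0 &: \mu_{\text{safebox}} - \mu_{\text{encouragement}} = 0 \\
H_A &: \mu_{\text{safebox}} - \mu_{\text{encouragement}} \neq 0
\end{align*}
```

The null hypothesis says the Safe Box makes no difference to average savings; the alternative says it makes some difference, in either direction.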

4. Carry out a test of the hypothesis, such as a difference-in-means.

Individuals in all study arms were encouraged to save for health and were asked to set a health goal for themselves at the beginning of the study.
In the first treatment group (Safe Box), respondents were given a box locked with a padlock, along with the key.
The dependent variable is the amount saved after 12 months (fol2_amtinvest).

We will compare average savings between treatment conditions (a difference in means).

rosca <- read.csv("https://raw.githubusercontent.com/ktmccabe/teachingdata/main/rosca.csv",
                  stringsAsFactors = TRUE)

## Compare means
mean.safebox <- mean(rosca$fol2_amtinvest[rosca$safe_box == 1], na.rm = TRUE)
mean.encouragement <- mean(rosca$fol2_amtinvest[rosca$encouragement == 1], 
                           na.rm = TRUE)
diff.means <- mean.safebox - mean.encouragement
diff.means

## [1] 150.3816

5. Calculate the uncertainty around this estimate.

To get uncertainty when calculating a difference in means, we can use the t.test function in R.

The first input is the vector of values from one group.
The second input is the vector of values from the other group.

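To get the t-statistic, under the hood of the function, R estimates the standard error of the difference in means using the standard deviation of the outcome in each group and each group's sample size (the number of people in each condition). The t-statistic is the difference in means divided by this standard error.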
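To make that arithmetic concrete, here is a minimal sketch that computes the standard error, t-statistic, degrees of freedom, and p-value by hand and compares them with t.test. The vectors x and y are hypothetical toy values made up for illustration (they are not the rosca data), and the object names are assumptions of this sketch.

```r
## Hypothetical toy data (NOT the rosca data), for illustration only
x <- c(520, 310, 150, 800, 275, 640)
y <- c(200, 180, 330, 90, 410, 260, 150)

## Each group's contribution to the variance of the difference in means
vx <- var(x) / length(x)
vy <- var(y) / length(y)

## Standard error of the difference in means
se <- sqrt(vx + vy)

## t-statistic: difference in means divided by its standard error
t.stat <- (mean(x) - mean(y)) / se

## Welch-Satterthwaite degrees of freedom (used by the Welch t-test)
df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))

## Two-sided p-value from the t distribution
p.value <- 2 * pt(-abs(t.stat), df = df)

## Compare with R's built-in function
check <- t.test(x, y)
all.equal(t.stat, unname(check$statistic))
all.equal(df, unname(check$parameter))
all.equal(p.value, check$p.value)
```

Each all.equal comparison should return TRUE, since the default t.test (the Welch version shown in the output below) uses these same formulas.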

## Compare amount saved for those in Safe Box vs. 
## Encouragement Only conditions
test <- t.test(rosca$fol2_amtinvest[rosca$safe_box == 1],
               rosca$fol2_amtinvest[rosca$encouragement == 1])

test

## 
##  Welch Two Sample t-test
## 
## data:  rosca$fol2_amtinvest[rosca$safe_box == 1] and rosca$fol2_amtinvest[rosca$encouragement == 1]
## t = 2.1083, df = 150.38, p-value = 0.03666
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##    9.445604 291.317636
## sample estimates:
## mean of x mean of y 
##  408.2150  257.8333

6. Decide whether you can reject or fail to reject the hypothesis of no difference

We can extract the group means, the p-value of the difference and confidence interval of the difference.

test$estimate

## mean of x mean of y 
##  408.2150  257.8333

test$conf.int

## [1]   9.445604 291.317636
## attr(,"conf.level")
## [1] 0.95

test$p.value

## [1] 0.03666403

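Was the treatment significant? We say a result is statistically significant if the p-value is small, such as less than 0.05. We also use this criterion to decide whether to reject the null hypothesis. Here the p-value is about 0.037, which is below 0.05, and the 95 percent confidence interval does not include 0, so we reject the null hypothesis of no difference.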
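The decision rule can be sketched in a few lines. To keep the example self-contained, the p-value and confidence interval are copied from the output reported above rather than pulled from the test object; the names alpha, reject.null, and ci.excludes.zero are assumptions of this sketch.

```r
## Values copied from the t.test output reported in this section
p.value <- 0.03666403
conf.int <- c(9.445604, 291.317636)

## Conventional significance threshold
alpha <- 0.05

## Reject the null hypothesis if the p-value is below the threshold
reject.null <- p.value < alpha
reject.null

## Equivalent check: does the 95 percent confidence interval exclude 0?
ci.excludes.zero <- conf.int[1] > 0 | conf.int[2] < 0
ci.excludes.zero
```

With the object from earlier in this section, the same comparison would be test$p.value < 0.05.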

In a “t test”, the t-statistic plays the role of the z-score. It is a ratio: the estimated difference in means divided by its standard error. The t-statistic and z-score differ slightly in how we calculate the corresponding p-value (the t distribution has heavier tails than the normal distribution), but with a large enough sample size, the two p-values are very similar. The t.test function in R calculates the p-value for you.
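As a rough illustration of how close the two approaches are here, the sketch below computes a two-sided p-value from the t distribution and from the standard normal distribution, using the rounded t-statistic and degrees of freedom copied from the output above. The object names are assumptions of this sketch.

```r
## Rounded values copied from the t.test output reported in this section
t.stat <- 2.1083
df <- 150.38

## Two-sided p-value using the t distribution (what t.test reports)
p.t <- 2 * pt(-abs(t.stat), df = df)

## Two-sided p-value treating the statistic as a z-score
p.z <- 2 * pnorm(-abs(t.stat))

c(t = p.t, z = p.z)
```

The normal-based p-value comes out slightly smaller because the normal distribution has thinner tails, but with roughly 150 degrees of freedom both fall below 0.05.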