10.5 Application: Health Savings Study

Health Savings Experiment (Dupas and Robinson 2013): A field experiment in rural Kenya in which the researchers randomly varied access to four innovative savings technologies and observed the impact on asset accumulation.

1. Start with a research question

Can savings technologies help people accumulate assets?

2. Develop a theory of how the world works

Providing people with a safe place to store money will help them save.

3. Construct “null” and “alternative” hypotheses

What are our null/alternative hypotheses?

4. Carry out a test of the hypothesis, such as a difference-in-means.

  • Individuals in all study arms were encouraged to save for health and were asked to set a health goal for themselves at the beginning of the study.
  • In the first treatment group (Safe Box), respondents were given a box locked with a padlock and the key.
  • The dependent variable is the amount saved after 12 months (fol2_amtinvest).

We will compare average savings between the Safe Box and Encouragement Only conditions (a difference in means).

## Load the data
rosca <- read.csv("https://raw.githubusercontent.com/ktmccabe/teachingdata/main/rosca.csv",
                  stringsAsFactors = TRUE)
## Compare means
## Mean amount saved in the Safe Box and Encouragement Only conditions
mean.safebox <- mean(rosca$fol2_amtinvest[rosca$safe_box == 1], na.rm = TRUE)
mean.encouragement <- mean(rosca$fol2_amtinvest[rosca$encouragement == 1],
                           na.rm = TRUE)
## Difference in means
diff.means <- mean.safebox - mean.encouragement
diff.means
## [1] 150.3816

5. Calculate the uncertainty around this estimate.

To get uncertainty when calculating a difference in means, we can use the t.test function in R.

  • The first input is the vector of values from one group.
  • The second input is the vector of values from the other group.

To get the t-statistic, under the hood R estimates the standard error of the difference in means using the standard deviation of the outcome in each group and each group's sample size (the number of people in each condition). The t-statistic is the difference in means divided by this standard error.

## Compare amount saved for those in Safe Box vs. 
## Encouragement Only conditions
test <- t.test(rosca$fol2_amtinvest[rosca$safe_box == 1],
               rosca$fol2_amtinvest[rosca$encouragement == 1])

test
## 
##  Welch Two Sample t-test
## 
## data:  rosca$fol2_amtinvest[rosca$safe_box == 1] and rosca$fol2_amtinvest[rosca$encouragement == 1]
## t = 2.1083, df = 150.38, p-value = 0.03666
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##    9.445604 291.317636
## sample estimates:
## mean of x mean of y 
##  408.2150  257.8333
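
The standard error calculation that t.test performs can be sketched by hand. The example below is a minimal illustration, not the study's analysis: it uses simulated data (the vectors group.a and group.b and their means and standard deviations are hypothetical stand-ins, not values from rosca) and assumes the default Welch version of the test, which allows the two groups to have different variances.

```r
## Hypothetical data: simulated amounts, NOT the rosca data
set.seed(1234)
group.a <- rnorm(80, mean = 400, sd = 600)  # stand-in for a treatment arm
group.b <- rnorm(75, mean = 250, sd = 300)  # stand-in for a comparison arm

## Difference in means
diff.ab <- mean(group.a) - mean(group.b)

## Standard error of the difference (Welch: unequal variances)
se.ab <- sqrt(var(group.a) / length(group.a) +
              var(group.b) / length(group.b))

## t-statistic: difference in means divided by its standard error
t.byhand <- diff.ab / se.ab

## Compare with what t.test reports
check <- t.test(group.a, group.b)
t.byhand
unname(check$statistic)

## 95% confidence interval by hand, using the degrees of freedom from t.test
ci.byhand <- diff.ab + c(-1, 1) * qt(0.975, df = unname(check$parameter)) * se.ab
ci.byhand
as.numeric(check$conf.int)
```

The hand-computed t-statistic and confidence interval should match the values stored in check$statistic and check$conf.int.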

6. Decide whether you can reject or fail to reject the hypothesis of no difference

We can extract the group means, the p-value of the difference and confidence interval of the difference.

test$estimate
## mean of x mean of y 
##  408.2150  257.8333
test$conf.int
## [1]   9.445604 291.317636
## attr(,"conf.level")
## [1] 0.95
test$p.value
## [1] 0.03666403

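Was the treatment effect statistically significant? We say a result is significant if the p-value is small, such as less than 0.05. We also use this criterion to decide whether to reject the null hypothesis. Here the p-value is about 0.037, so we reject the null hypothesis of no difference at the 0.05 level.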
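
The decision rule can also be written out as a short check. This is a sketch that copies the p-value and confidence interval from the t.test output above; the 0.05 threshold stored in alpha is the conventional choice assumed here, and the object names are illustrative.

```r
## Values copied from the t.test output above
p.value <- 0.03666403
conf.int <- c(9.445604, 291.317636)

## Conventional significance threshold (an assumption, not set by the study)
alpha <- 0.05

## Reject the null if the p-value is below the threshold
reject.null <- p.value < alpha
reject.null

## Equivalent check: does the 95% confidence interval exclude 0?
ci.excludes.zero <- conf.int[1] > 0 | conf.int[2] < 0
ci.excludes.zero
```

With the test object from earlier in the session, the same comparison could be written as test$p.value < 0.05.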

  • In a “t test”, the t-statistic plays the role of the z-score: it is the ratio of the estimated difference in means to its standard error. The t-statistic and z-score differ slightly in how the corresponding p-value is calculated (using a t distribution rather than the standard normal distribution), but with a large enough sample size the two give very similar answers. The t.test function in R calculates the p-value for you.
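
To see how close the two approaches are here, the sketch below computes a two-sided p-value from the t distribution and from the standard normal distribution. It plugs in the rounded t-statistic and degrees of freedom printed in the t.test output above, so the result matches the reported p-value only approximately; the object names are illustrative.

```r
## Rounded values copied from the t.test output above
t.stat <- 2.1083
deg.free <- 150.38

## Two-sided p-value using the t distribution (what t.test uses)
p.t <- 2 * pt(-abs(t.stat), df = deg.free)

## Two-sided p-value treating the statistic as a z-score
p.z <- 2 * pnorm(-abs(t.stat))

p.t
p.z
```

Because the t distribution has slightly heavier tails than the normal, p.t is a little larger than p.z, and the gap shrinks as the degrees of freedom grow.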