10.5 Application: Health Savings Study
Health Savings Experiment (Dupas and Robinson 2013): A field experiment in rural Kenya in which the researchers randomly varied access to four innovative savings technologies and observed the impact on asset accumulation.
1. Start with a research question
Can savings technologies help people accumulate assets?
2. Develop a theory of how the world works
Providing people with a safe place to store money will help them save.
3. Construct “null” and “alternative” hypotheses
What are our null/alternative hypotheses?
4. Carry out a test of the hypothesis, such as a difference-in-means.
- Individuals in all study arms were encouraged to save for health and were asked to set a health goal for themselves at the beginning of the study.
- In the first treatment group (Safe Box), respondents were given a box locked with a padlock, along with the key.
- The dependent variable is the amount saved after 12 months, fol2_amtinvest.
We will compare average savings between treatment conditions (a difference in means).
rosca <- read.csv("https://raw.githubusercontent.com/ktmccabe/teachingdata/main/rosca.csv",
                  stringsAsFactors = T)
## Compare means
mean.safebox <- mean(rosca$fol2_amtinvest[rosca$safe_box == 1], na.rm=T)
mean.encouragement <- mean(rosca$fol2_amtinvest[rosca$encouragement == 1],
                           na.rm=T)
diff.means <- mean.safebox - mean.encouragement
diff.means
## [1] 150.3816
5. Calculate the uncertainty around this estimate.
To get uncertainty when calculating a difference in means, we can use the t.test function in R.
- The first input is the vector of values from one group.
- The second input is the vector of values from the other group.
To get the t-statistic, under the hood, R estimates the standard error of the difference in means using the standard deviation of the outcome in each group and each group's sample size (the number of people in each condition).
## Compare amount saved for those in Safe Box vs.
## Encouragement Only conditions
test <- t.test(rosca$fol2_amtinvest[rosca$safe_box == 1],
               rosca$fol2_amtinvest[rosca$encouragement == 1])
test
##
## Welch Two Sample t-test
##
## data: rosca$fol2_amtinvest[rosca$safe_box == 1] and rosca$fol2_amtinvest[rosca$encouragement == 1]
## t = 2.1083, df = 150.38, p-value = 0.03666
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 9.445604 291.317636
## sample estimates:
## mean of x mean of y
## 408.2150 257.8333
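To make the "under the hood" step concrete, here is a minimal sketch that computes a Welch t-statistic by hand and checks it against t.test. It uses simulated data rather than the rosca data, and the names x, y, se, t.manual, and t.builtin are made up for illustration.

```r
## Simulated savings for two hypothetical groups (not the rosca data)
set.seed(123)
x <- rnorm(100, mean = 400, sd = 500)  # stand-in for the Safe Box group
y <- rnorm(120, mean = 250, sd = 300)  # stand-in for the Encouragement group

## Standard error of the difference in means:
## each group's variance divided by its sample size, summed, then square-rooted
se <- sqrt(var(x) / length(x) + var(y) / length(y))

## t-statistic: the difference in means divided by its standard error
t.manual <- (mean(x) - mean(y)) / se

## Compare with the statistic that t.test reports (Welch test by default)
t.builtin <- unname(t.test(x, y)$statistic)
all.equal(t.manual, t.builtin)
```

The last line should return TRUE, since the default t.test for two samples uses this same unequal-variance standard error.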
6. Decide whether you can reject or fail to reject the hypothesis of no difference
We can extract the group means, the p-value of the difference and confidence interval of the difference.
test$estimate
## mean of x mean of y
## 408.2150 257.8333
test$conf.int
## [1] 9.445604 291.317636
## attr(,"conf.level")
## [1] 0.95
test$p.value
## [1] 0.03666403
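These pieces of output are linked: the confidence interval is the difference in means plus or minus a critical value times the standard error. As a rough check, the sketch below rebuilds the interval from the rounded numbers printed in the t.test output. The names t.stat, deg.free, se, crit, and ci are illustrative, and small discrepancies from rounding are expected.

```r
## Values copied from the printed t.test output (rounded)
diff.means <- 408.2150 - 257.8333
t.stat <- 2.1083
deg.free <- 150.38

## Back out the standard error from t = difference / standard error
se <- diff.means / t.stat

## Critical value for a 95% interval from the t distribution
crit <- qt(0.975, df = deg.free)

## Should land close to the interval reported by test$conf.int
ci <- c(diff.means - crit * se, diff.means + crit * se)
ci
```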
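Was the treatment effect significant? We say a result is statistically significant if the p-value is small, conventionally less than 0.05. We use the same criterion to decide whether to reject the null hypothesis. Here, the p-value of about 0.037 is below 0.05, so we reject the null hypothesis of no difference in average savings between the two conditions.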
- In a “t test”, the t-statistic plays the role of the z-score. It is the ratio of the estimate (here, the difference in means) to its standard error. The t-statistic and z-score differ slightly in how we calculate the corresponding p-value, but with a large enough sample size, the two are very similar. The t.test function in R calculates the p-value for you.
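To illustrate how close the two approaches are in this case, the sketch below computes a two-sided p-value from the reported t-statistic using both the t distribution and the standard normal (z) distribution, then applies the 0.05 threshold. The inputs are the rounded values from the printed output, the names t.stat, deg.free, p.t, and p.z are illustrative, and 0.05 is a convention rather than a hard rule.

```r
## Rounded values from the printed t.test output
t.stat <- 2.1083
deg.free <- 150.38

## Two-sided p-value using the t distribution
p.t <- 2 * pt(-abs(t.stat), df = deg.free)

## Two-sided p-value treating the statistic as a z-score
p.z <- 2 * pnorm(-abs(t.stat))

## Side-by-side comparison; the t-based value should be slightly larger
c(t = p.t, z = p.z)

## Decision at the conventional 0.05 level
p.t < 0.05
```

The t-based p-value should be close to the one stored in test$p.value, with the z-based value only slightly smaller, because the t distribution has heavier tails that matter less as the sample grows.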