8.5 Step 3: Iterate and Compare Models
When building predictive models, researchers often want to minimize the Root Mean Squared Error (RMSE): the magnitude of the typical prediction error, that is, the distance between the actual value of our outcome and the predicted value.
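To make the definition concrete, here is a minimal sketch that computes the RMSE by hand. The `toy` data frame and its values are invented for illustration and are not the baseball data. The sketch also compares the result with `sigma()`, which for an `lm` fit reports the residual standard error: the same idea, but dividing by the residual degrees of freedom rather than the number of observations, so the two numbers are close without being identical.

```r
# Toy data, invented for illustration only (not the baseball data)
toy <- data.frame(
  x = c(1, 2, 3, 4, 5, 6),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)
)

toy_fit <- lm(y ~ x, data = toy)

# Prediction errors: actual outcome minus the model's predicted value
errors <- toy$y - predict(toy_fit)

# RMSE: square the errors, average them, take the square root
rmse <- sqrt(mean(errors^2))

# Residual standard error: divide by the residual degrees of freedom
# (n - 2 here) instead of n
rse <- sqrt(sum(errors^2) / df.residual(toy_fit))

rmse
rse
sigma(toy_fit)  # matches rse
```

With a large sample and few predictors, the difference between the two quantities is small, which is why `sigma()` is a convenient stand-in for RMSE when comparing models.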
Example: Let’s compare the RMSE from two different models:
## Predicting Runs Scored with OBP
fit <- lm(RS ~ OBP, data = baseball)
sigma(fit)
## [1] 39.82189
## Predicting Runs Scored with Batting Average
fit2 <- lm(RS ~ BA, data = baseball)
sigma(fit2)
## [1] 51.48172
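The Oakland A’s noticed that OBP was a more precise predictor than BA, and RMSE gives us one way to assess this: the OBP model's typical prediction error is about 40 runs, compared with about 51 runs for the BA model.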
8.5.1 Regression with Multiple Predictors
You can also add more than one predictor to a regression by joining the predictors with the + sign in the model formula.
## Predicting Runs Scored with OBP and Slugging Percentage
fit3 <- lm(RS ~ OBP + SLG, data = baseball)
sigma(fit3)
## [1] 25.12196
Look how the RMSE dropped again, from about 40 with OBP alone to about 25 with OBP and SLG together, improving our prediction.
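When there are several candidate models, the fit-and-compare step can be repeated in a loop rather than typed out one model at a time. The following is a rough sketch on simulated data; the data frame `toy`, the variables `x1` and `x2`, and the list name `candidates` are all hypothetical and stand in for whatever outcome and predictors you are comparing.

```r
# Simulated data, invented for illustration only (not the baseball data)
set.seed(1)
toy <- data.frame(x1 = rnorm(50), x2 = rnorm(50))
toy$y <- 3 + 2 * toy$x1 - 1.5 * toy$x2 + rnorm(50)

# Candidate models, written as formulas
candidates <- list(
  x1_only = y ~ x1,
  x2_only = y ~ x2,
  both    = y ~ x1 + x2
)

# Fit each model and collect its sigma()
errors <- sapply(candidates, function(f) sigma(lm(f, data = toy)))

# Smallest typical prediction error first
sort(errors)
```

Keeping the formulas in a named list makes it easy to add another candidate later and rerun the same comparison.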