7.2 Process of Prediction
Predict (estimate/guess) some unknown using information we have – and do so as accurately and precisely as possible.
- Choose an approach
- Using an observed (known) measure as a direct proxy to predict an outcome
- Using one or more observed (known) measures in a regression model to predict an outcome
- (Beyond the course) Using a statistical model to select the measures to use for predicting an outcome
- Assess accuracy and precision
- Prediction error: \(Prediction - Truth\)
- Bias: Average prediction error: \(\text{mean}(Prediction - Truth)\)
- A prediction is 'unbiased' if the bias is zero (that is, if the prediction is correct on average)
- Root-mean-squared error: \(\sqrt{\text{mean}((Prediction - Truth)^2)}\)
- Like 'absolute' error: roughly the average magnitude of the prediction error, with large errors weighted more heavily because they are squared
- The typical distance the prediction is from the truth
- Confusion Matrix
- A cross-tab of predictions you got correct vs. predictions you got wrong (misclassified)
- Gives you true positives and true negatives vs. false positives and false negatives
- Iterate to improve the prediction/classification
- Often, we repeat the three steps above until we are confident in our method for predicting.
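The accuracy and precision measures listed above can be sketched in a few lines of Python. This is only an illustration: the function names (`bias`, `rmse`, `confusion_matrix`) and all of the numbers are made up for the example, and the confusion matrix assumes a binary (True/False) outcome.

```python
import math

def prediction_errors(predictions, truths):
    """Prediction error for each case: prediction minus truth."""
    return [p - t for p, t in zip(predictions, truths)]

def bias(predictions, truths):
    """Average prediction error: mean(prediction - truth)."""
    errors = prediction_errors(predictions, truths)
    return sum(errors) / len(errors)

def rmse(predictions, truths):
    """Root-mean-squared error: sqrt(mean((prediction - truth)^2))."""
    errors = prediction_errors(predictions, truths)
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))

def confusion_matrix(predicted, actual):
    """Count correct and incorrect predictions for a True/False outcome."""
    counts = {"true_positive": 0, "false_positive": 0,
              "true_negative": 0, "false_negative": 0}
    for p, a in zip(predicted, actual):
        if p and a:
            counts["true_positive"] += 1
        elif p and not a:
            counts["false_positive"] += 1
        elif not p and a:
            counts["false_negative"] += 1
        else:
            counts["true_negative"] += 1
    return counts

# Hypothetical numeric predictions (made-up values, for illustration only)
predictions = [52.0, 48.0, 55.0, 45.0]
truths = [50.0, 50.0, 50.0, 50.0]
print("bias:", bias(predictions, truths))
print("rmse:", rmse(predictions, truths))

# Hypothetical binary classifications (made-up values)
predicted = [True, True, False, False, True]
actual = [True, False, False, True, True]
print(confusion_matrix(predicted, actual))
```

In the numeric example the errors cancel out, so the bias is zero even though no single prediction is exactly right. The RMSE is what reveals how far off a typical prediction is, which is why it helps to look at both.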
Eventually, after you have tested the approach and are satisfied with the accuracy, you may start applying it to new data for which you do not know the right answer.
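As a rough sketch of the full process with a regression approach, the example below fits a one-predictor least-squares line to cases where the outcome is known, checks bias and RMSE on those cases, and then applies the fitted line to a new case whose outcome is unknown. The helper names (`fit_line`, `predict`) and every data value are hypothetical, chosen only for illustration.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict(intercept, slope, x):
    """Predicted outcome for each value of the observed measure."""
    return [intercept + slope * xi for xi in x]

# Step 1: choose an approach (here, a regression on one observed measure).
# Made-up data in which the outcome is known.
x_known = [1.0, 2.0, 3.0, 4.0, 5.0]
y_known = [2.0, 4.1, 5.9, 8.2, 9.8]
intercept, slope = fit_line(x_known, y_known)

# Step 2: assess accuracy and precision where the truth is known.
fitted = predict(intercept, slope, x_known)
errors = [p - t for p, t in zip(fitted, y_known)]
in_sample_bias = sum(errors) / len(errors)
in_sample_rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))
print("bias:", in_sample_bias, "rmse:", in_sample_rmse)

# Step 3 would be to revise the approach and repeat if the error is too large.

# Finally: apply the method to new data where the outcome is unknown.
x_new = [6.0]
new_predictions = predict(intercept, slope, x_new)
print("prediction for new case:", new_predictions)
```

Note that the error measures here are computed on the same cases used to fit the line, so they may understate how large the error will be on genuinely new data.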