17.6. Summary

The bias-variance trade-off lets us describe more precisely the modeling phenomena that we have seen in this chapter: under-fitting relates to model bias, and over-fitting relates to model variance. In Figure 17.4, the \(x\)-axis measures model complexity and the \(y\)-axis measures the components of risk: model bias squared and model variance. Notice that as model complexity increases, model bias decreases and model variance increases. In terms of test error, we have seen this error first decrease and then increase, as the growth in model variance comes to outweigh the decrease in model bias. To select a useful model, we must strike a balance between model bias and model variance.

Fig. 17.4 Bias-variance Trade-off Diagram. As model complexity increases, model variance increases and model bias decreases. In the other direction, model variance decreases and model bias increases as model complexity decreases.
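
The pattern in the diagram can be checked with a small simulation. The sketch below is illustrative only and is not taken from the chapter's own examples: the cubic population process `true_f`, the noise level, the sample size, and the helper names (`solve`, `fit_poly`, `bias_variance`) are all assumptions. It repeatedly draws a training set, fits polynomials of several degrees by least squares, and estimates model bias squared and model variance over a grid of test points.

```python
import random

def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_poly(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest power first) via the normal equations."""
    k = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve(a, b)

def predict(coefs, x):
    return sum(c * x ** i for i, c in enumerate(coefs))

def true_f(x):
    # Hypothetical population process, assumed for this sketch only.
    return x ** 3 - x

def bias_variance(degree, n=30, sigma=0.3, reps=300, seed=0):
    """Estimate (bias squared, variance), each averaged over a grid of test points."""
    rng = random.Random(seed)
    grid = [-0.9 + 0.2 * i for i in range(10)]
    preds = [[] for _ in grid]
    for _ in range(reps):
        # Each repetition is a fresh training set drawn from the same process.
        xs = [rng.uniform(-1, 1) for _ in range(n)]
        ys = [true_f(x) + rng.gauss(0, sigma) for x in xs]
        coefs = fit_poly(xs, ys, degree)
        for p, x0 in zip(preds, grid):
            p.append(predict(coefs, x0))
    bias_sq = 0.0
    variance = 0.0
    for p, x0 in zip(preds, grid):
        mean = sum(p) / len(p)
        bias_sq += (mean - true_f(x0)) ** 2
        variance += sum((v - mean) ** 2 for v in p) / len(p)
    return bias_sq / len(grid), variance / len(grid)

for degree in (0, 1, 3, 5):
    b2, var = bias_variance(degree)
    print(f"degree={degree}  bias^2={b2:.4f}  variance={var:.4f}")
```

Under these assumptions, the estimated bias squared should be largest for the constant model and close to zero once the degree reaches 3, while the estimated variance should grow with the degree. The normal equations are used here only to keep the sketch dependency-free; for higher degrees a numerically stabler solver would be a better choice.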

Collecting more observations reduces bias if the model can fit the population process exactly. If the model is inherently incapable of modeling the population (as in the example above), even infinite data cannot get rid of model bias. Collecting more data also reduces model variance. One recent trend is to select a model with low bias and high intrinsic variance (such as a neural network) but collect many data points so that the model variance is low enough to make accurate predictions. While effective in practice, collecting enough data for these models tends to require large amounts of time and money.
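
A short simulation can make the effect of sample size concrete. The sketch below rests on assumptions that are not from the chapter: a hypothetical quadratic process `true_f`, a deliberately misspecified straight-line model, a single test point `x0 = 0`, and the helper names `fit_line` and `bias_variance_at`. It estimates model bias squared and model variance of the fitted line for several training-set sizes.

```python
import random

def fit_line(xs, ys):
    """Simple least-squares line; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def true_f(x):
    # Hypothetical quadratic process; a line cannot represent it exactly.
    return x ** 2

def bias_variance_at(x0, n, sigma=0.5, reps=300, seed=0):
    """Estimate (bias squared, variance) of the line's prediction at x0."""
    rng = random.Random(seed)
    preds = []
    for _ in range(reps):
        # Draw a fresh training set of size n and refit the line.
        xs = [rng.uniform(-1, 1) for _ in range(n)]
        ys = [true_f(x) + rng.gauss(0, sigma) for x in xs]
        intercept, slope = fit_line(xs, ys)
        preds.append(intercept + slope * x0)
    mean = sum(preds) / reps
    variance = sum((p - mean) ** 2 for p in preds) / reps
    return (mean - true_f(x0)) ** 2, variance

for n in (20, 200, 1000):
    b2, var = bias_variance_at(0.0, n)
    print(f"n={n:5d}  bias^2={b2:.4f}  variance={var:.5f}")
```

With these settings, the estimated variance should shrink roughly in proportion to 1/n, while the estimated bias squared should stay near the same value for every n, since no amount of data lets a line bend into a parabola.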

Creating more features, whether useful or not, typically increases model variance. Models with many parameters have many possible combinations of parameters and therefore have higher variance than models with few parameters. On the other hand, adding a useful feature to the data, such as a quadratic feature when the underlying process is quadratic, reduces bias. Even adding a useless feature rarely increases bias.
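
The cost of a useless feature can be illustrated with the smallest possible case. In the sketch below, everything is an assumption made for illustration: the population process is a constant (2.0) plus noise, the feature `x` carries no information about `y`, and the helper names (`fit_line`, `summarize`, `compare`) are hypothetical. It compares an intercept-only model with a line that also uses `x`, estimating bias squared and variance of each prediction at a test point.

```python
import random

def fit_line(xs, ys):
    """Simple least-squares line; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def summarize(preds, truth):
    """Return (bias squared, variance) for a list of predictions."""
    mean = sum(preds) / len(preds)
    variance = sum((p - mean) ** 2 for p in preds) / len(preds)
    return (mean - truth) ** 2, variance

def compare(x0=0.9, n=20, sigma=1.0, reps=1000, seed=0):
    rng = random.Random(seed)
    truth = 2.0  # assumed constant process: y does not depend on x
    const_preds, line_preds = [], []
    for _ in range(reps):
        xs = [rng.uniform(-1, 1) for _ in range(n)]
        ys = [truth + rng.gauss(0, sigma) for _ in xs]
        # Model 1: intercept only (the sample mean of y).
        const_preds.append(sum(ys) / n)
        # Model 2: intercept plus the useless feature x.
        intercept, slope = fit_line(xs, ys)
        line_preds.append(intercept + slope * x0)
    return summarize(const_preds, truth), summarize(line_preds, truth)

(const_b2, const_var), (line_b2, line_var) = compare()
print(f"intercept only:  bias^2={const_b2:.4f}  variance={const_var:.4f}")
print(f"with useless x:  bias^2={line_b2:.4f}  variance={line_var:.4f}")
```

Under these assumptions, both models should show bias squared near zero, but the model with the extra feature should show clearly higher variance, because its fitted slope chases noise in each training set.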

Being aware of the bias-variance trade-off can help… And techniques such as the train-test split, cross-validation, and regularization are designed to ameliorate this issue.