Linear models let us model relationships between features. We discussed the simple linear model and extended it to the multiple linear model. Along the way, we used mathematical techniques that are widely useful in modeling—we used calculus to minimize loss for the simple linear model and matrix geometry for the multiple linear model.
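Both fitting techniques can be sketched in a few lines. This is a minimal illustration on simulated data (the data and coefficients here are hypothetical, chosen only for the demo): the calculus approach gives closed-form formulas for the simple linear model, and the matrix approach solves the same least squares problem by projecting the response onto the column space of the design matrix.

```python
import numpy as np

# Simulated data: y is roughly 2 + 3x plus noise (hypothetical values)
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(0, 1, 50)

# Simple linear model: setting the derivatives of the squared loss
# to zero yields the familiar closed-form slope and intercept.
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

# Multiple linear model: the least squares fit projects y onto the
# column space of the design matrix X (here just an intercept column
# and x, so it reproduces the simple model).
X = np.column_stack([np.ones_like(x), x])
theta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(intercept, slope)
print(theta)  # same values as [intercept, slope]
```

Both routes solve the identical minimization, so the two sets of estimates agree; the matrix form is simply the one that scales to many explanatory variables.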
Linear models may seem simple, but they are used for all sorts of tasks today. They are flexible enough to let us include categorical features as well as nonlinear transformations of variables, such as log-transformations, polynomials, and ratios. Linear models have the advantage of being broadly interpretable for non-technical people, yet sophisticated enough to capture many common patterns in data.
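To make this concrete, here is a small sketch (with simulated, hypothetical data) of how a log-transformed numeric feature and a categorical feature both fit into a single linear model: the transformation becomes a column of the design matrix, and the categorical feature becomes dummy (one-hot) columns for all but one of its levels.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
area = rng.uniform(50, 500, n)    # numeric feature
group = rng.integers(0, 3, n)     # categorical feature with 3 levels

# Hypothetical true model: linear in log(area), with level-specific shifts
y = 1.0 + 2.0 * np.log(area) + np.array([0.0, 5.0, -3.0])[group] \
    + rng.normal(0, 0.5, n)

# Design matrix: intercept, log-transformed area, and dummy columns
# for levels 1 and 2 (level 0 serves as the baseline).
dummies = (group[:, None] == np.array([1, 2])).astype(float)
X = np.column_stack([np.ones(n), np.log(area), dummies])

theta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(theta)  # roughly [1.0, 2.0, 5.0, -3.0]
```

The model is still linear in its coefficients, which is what matters for least squares; the nonlinearity lives entirely in how we construct the features.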
It can be tempting to throw all of the variables available to us into a model to get the “best fit possible”. But we should keep in mind the geometry of least squares when fitting models. Recall that \(p\) explanatory variables can be thought of as \(p\) vectors in \(n\)-dimensional space; if these vectors are highly correlated, then the projection of the response onto the space they span will be similar to its projection onto smaller spaces spanned by fewer of these vectors. This implies that:
- Adding more variables may not provide a large improvement in the model
- Interpretation of the coefficients can be difficult
- Several models can be equally effective in predicting/explaining the response variable
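These consequences are easy to see in a small simulation (the data here are hypothetical). When two explanatory variables are nearly collinear, very different coefficient vectors produce almost identical fitted values, which is why the individual coefficients are hard to interpret even though the fit itself is stable.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(0, 1, n)
x2 = x1 + rng.normal(0, 0.01, n)   # nearly identical to x1 (highly correlated)
y = 3.0 * x1 + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x1, x2])
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ theta

# Shift weight from x1 to x2: a noticeably different coefficient vector...
alt = np.array([theta[0], theta[1] - 1.0, theta[2] + 1.0])
alt_fitted = X @ alt

# ...yet the fitted values barely change, because x1 and x2 point in
# almost the same direction, so the projection is nearly unchanged.
print(np.max(np.abs(fitted - alt_fitted)))
```

Only the combined effect of the correlated columns is well determined by the projection; how that effect is split between them is not, which is exactly the interpretability problem noted above.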
If we are concerned with making inferences, where we want to interpret and understand the model, then we should err on the side of simpler models. On the other hand, if our primary concern is the predictive ability of the model, then we tend not to concern ourselves with interpreting its coefficients.
In this chapter, we used linear models in a descriptive way—we looked for patterns in the data that we already have. In the next chapter, we see how to use linear models and simulation techniques to make inferences about the population and predictions for future observations.