15.6. Summary

Linear models let us describe relationships between variables for the first time in this book. We discussed the simple linear model and extended it to the multiple linear model. Along the way, we used mathematical techniques that are widely useful in modeling: calculus to minimize the loss for the simple linear model, and matrix geometry for the multiple linear model. We concluded the chapter by introducing one-hot encoding, a feature engineering technique that lets us fit models on categorical data.
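As a compact recap, the following sketch puts these three ideas side by side. It is only an illustration under stated assumptions: the data frame `df` and its columns `x`, `group`, and `y` are made-up toy values rather than data from the chapter, and dropping one category with `drop_first=True` is one common way to keep the one-hot columns from being linearly dependent with the intercept column.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not from the chapter): one numeric feature,
# one categorical feature, and an outcome.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "group": ["a", "b", "a", "c", "b", "c"],
    "y": [2.1, 4.3, 6.2, 9.0, 10.1, 13.2],
})
x = df["x"].to_numpy()
y = df["y"].to_numpy()

# Simple linear model: the closed-form solution obtained by setting the
# derivatives of mean squared error to zero.
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

# One-hot encode the categorical feature. One category is dropped (an
# assumption of this sketch) so the columns stay linearly independent
# of the intercept column.
dummies = pd.get_dummies(df["group"], drop_first=True, dtype=float)

# Multiple linear model: design matrix with an intercept column, then a
# least squares fit.
X = np.column_stack([np.ones(len(df)), x, dummies.to_numpy()])
theta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Geometric view: the residuals are orthogonal to every column of X.
residuals = y - X @ theta
```

In this sketch, `X.T @ residuals` is numerically zero, which is the orthogonality condition behind the geometric derivation of the multiple linear model.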

Linear models may seem simple, but they remain widely used because they are interpretable enough for non-technical people to understand, yet sophisticated enough to capture many common patterns in data. Data scientists use linear models to measure the size of an effect, to make predictions, and to calibrate scientific instruments (Chapter 12).

In this chapter, we used linear models in a descriptive way: we looked for patterns in the data that we already have. In the next chapter, we’ll see how to use linear models and simulation techniques to make inferences about the population.