# 4. Modeling and Summary Statistics

> Essentially, all models are wrong, but some are useful.
>
> George Box

A *model* is an idealized representation of a system. You probably use models
all the time. For instance, a weather forecast is a model. A weather forecast
uses past weather, current conditions, and the physics of the atmosphere to
make predictions about the future. Models don’t always match reality, as you’ve
experienced if you’ve been surprised by rain or snow. And even the most
complicated models of weather can’t make precise predictions more than a few
weeks into the future. Still, weather forecasts are useful enough that we check
the forecast before heading outside each day.

In Chapter 3, we introduced the urn model. Like all models, the urn model is a simplified version of a system: it treats the chance process that generates data like draws of marbles from an urn. In this chapter, we introduce another kind of model, the constant model. While the urn model creates simulated data, the constant model takes a data sample and tries to describe the signal in the data by removing the random variation in the sample. This process is called fitting a model to data. Although the constant model is simple, it serves as a useful building block for the more complex models that appear later in the book.

For example, the constant model lets us explain model fitting from the
perspective of *loss minimization*, a technique that connects summary
statistics like the mean and median to more complex models. It also gives us a
first look at randomness and signal in a sample, fundamental parts of modeling
that we address in a later chapter.
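To make the link between loss minimization and summary statistics concrete, here is a minimal sketch. It assumes a small made-up sample and two hypothetical helper functions, `mean_squared_loss` and `mean_absolute_loss`, which are illustrative names and not taken from this book. It searches a grid of candidate constants for the one with the smallest average loss and compares the result with the sample mean and median.

```python
import statistics

# Made-up numbers for illustration only; not a dataset from this book.
sample = [1.0, 2.0, 2.0, 3.0, 12.0]


def mean_squared_loss(theta, data):
    """Average squared difference between the constant theta and each value."""
    return sum((y - theta) ** 2 for y in data) / len(data)


def mean_absolute_loss(theta, data):
    """Average absolute difference between the constant theta and each value."""
    return sum(abs(y - theta) for y in data) / len(data)


# Candidate constants: 0.0, 0.1, ..., 12.0 (an arbitrary grid for the sketch).
candidates = [i / 10 for i in range(121)]

# Pick the candidate with the smallest average loss under each loss function.
best_squared = min(candidates, key=lambda t: mean_squared_loss(t, sample))
best_absolute = min(candidates, key=lambda t: mean_absolute_loss(t, sample))

print("squared loss minimizer:", best_squared)
print("sample mean:           ", statistics.mean(sample))
print("absolute loss minimizer:", best_absolute)
print("sample median:          ", statistics.median(sample))
```

For this sample, the grid search lands on 4.0 under squared loss, which is the sample mean, and on 2.0 under absolute loss, which is the sample median. A grid search is used here only because it is easy to read; it is not meant as the way to fit a model in practice.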

We’ll begin by introducing the constant model through a dataset of bus stop wait times.