12.5. In Conclusion
In this chapter, we replicated Barkjohn’s analysis. We created a model that corrects PurpleAir measurements so that they closely match AQS measurements. The accuracy of this model enables the PurpleAir sensors to be included on official US government maps, like the AirNow Fire and Smoke map. Importantly, this model gives people timely and accurate measurements of air quality.
We also applied many concepts covered in the book thus far.
We used pandas code extensively throughout this analysis (and even a bit of SQL too).
Data wrangling, exploratory data analysis, and data visualization were major parts of the analysis. We used these concepts to find and correct numerous issues like granularity, missing data points, and even duplicated data values.
Finally, we applied modeling concepts to create our final correction model. We reviewed loss functions and fit two constant models to the data. And we found that linear models sufficiently reduce model error for real-world use.
At this point in the book, we encourage you to take stock of what you’ve learned thus far. Pat yourself on the back—you’ve already come a long way! The principles and techniques we’ve covered here are useful for nearly every type of data analysis, and you can readily start applying them towards analyses of your own.