22. Dimensionality Reduction and PCA

Data visualization techniques let us search for relationships between the variables in a dataset. However, such patterns are hard to find in data with many columns, since most visualization techniques can plot only a few variables at once. To address this, we turn to dimensionality reduction techniques, which enable us to explore and analyze data that have many columns. Broadly speaking, these techniques summarize the data using a few columns, which we can then explore with standard data visualization techniques. This chapter introduces principal component analysis (PCA), a widely used technique for dimensionality reduction.
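To make the idea of "summarizing the data using a few columns" concrete, here is a minimal sketch that reduces a 10-column table to 2 columns. It rests on several assumptions not taken from the text: the data are synthetic, the variable names are illustrative, and the reduction is computed with NumPy's singular value decomposition, which is one common way to carry out PCA.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): 100 rows and 10 columns
# that are really driven by 2 hidden factors plus a little noise.
hidden = rng.normal(size=(100, 2))
mixing = rng.normal(size=(2, 10))
X = hidden @ mixing + 0.01 * rng.normal(size=(100, 10))

# Center each column so the summary describes variation around the mean.
X_centered = X - X.mean(axis=0)

# Singular value decomposition of the centered data.
# The rows of Vt are the principal directions, ordered by importance.
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Keep the first two directions: each row is now summarized by 2 numbers.
X_reduced = X_centered @ Vt[:2].T

# Fraction of the total variance retained by the 2-column summary.
explained = (s[:2] ** 2).sum() / (s ** 2).sum()

print(X_reduced.shape)
print(round(explained, 4))
```

The two columns of `X_reduced` can be drawn as an ordinary scatter plot, which is what makes a many-column dataset explorable with standard visualization tools. Because the synthetic data were built from two hidden factors, the two-column summary here retains nearly all of the variance. Real datasets will generally retain less.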