11.6. Creating Plots Using plotly

In this section, we cover the basics of the plotly Python package. plotly is the main tool we use in this book to create plots.

The plotly package has several advantages over other plotting libraries. First, it creates interactive plots rather than static images. When you create a plot in plotly, you can pan and zoom to see parts of the plot that are too small to see normally. You can also hover over plot elements, like the symbols in a scatter plot, to see the raw data values. Second, it can save plots using the SVG file format, which means that images appear sharp even when zoomed in. If you’re reading this chapter in a PDF or paper copy of the book, we used this feature to render plot images. Finally, it has a simple API for creating basic plots, which helps when you’re doing exploratory analysis and want to quickly create many plots.

We’ll go over the fundamentals of plotly in this section. We recommend using the official plotly documentation if you encounter something that isn’t covered here [1].

11.6.1. Figure and Trace Objects

Every plot in plotly is wrapped in a Figure object. Figure objects keep track of what plots to draw. For instance, a single Figure can draw a scatter plot on the left and a line plot on the right. Figure objects also keep track of the plot layout, which includes the size of the plot, title, legend, and annotations.

Let’s look at an example using the dataset of dog breeds.

import pandas as pd

dogs = pd.read_csv('data/akc.csv').dropna()
dogs
                      breed     group  score  longevity  ...    size  weight  height  repetition
2                  Brittany  sporting   3.54      12.92  ...  medium    16.0    48.0        5-15
3             Cairn Terrier   terrier   3.53      13.84  ...   small     6.0    25.0       15-25
5    English Cocker Spaniel  sporting   3.33      11.66  ...  medium    14.0    41.0        5-15
...                     ...       ...    ...        ...  ...     ...     ...     ...         ...
82              Bullmastiff   working   1.64       7.57  ...   large    52.0    65.0       40-80
83                  Mastiff   working   1.57       6.50  ...   large    79.0    76.0      80-100
85            Saint Bernard   working   1.42       7.78  ...   large    70.0    67.0       40-80

43 rows × 12 columns

The plotly.express module provides a concise API for making plots.

import plotly.express as px

We use plotly.express below to make a scatter plot of weight against height for the dog breeds. Notice that the return value from px.scatter() is a Figure object.

fig = px.scatter(dogs, x='height', y='weight', 
                 width=350, height=250)

# fig is a plotly Figure object:
fig.__class__
plotly.graph_objs._figure.Figure

Displaying a Figure object renders it to the screen.

fig
../../_images/viz_plotly_12_0.svg

This particular Figure holds one plot, but Figure objects can hold any number of plots. Below, we create a faceted figure with three scatter plots, one for each dog size.

fig = px.scatter(dogs, x='height', y='weight',
                 facet_col='size',
                 width=650, height=250)
fig.update_layout(margin=dict(t=30))
fig
../../_images/viz_plotly_14_0.svg

These three plots are stored in Trace objects. However, we don’t usually manipulate Trace objects manually. Instead, plotly provides functions that automatically create faceted subplots, like the px.scatter function we used here. Now that we have seen how to make a simple plot, we next show how to modify plots.

11.6.2. Modifying Layout

We often need to change the figure’s layout. For instance, we might want to adjust the figure margins or change the axis range. To do this, we can use the Figure.update_layout() method. Let’s look at an example of a scatter plot where the title is cut off because the plot doesn’t have large enough margins.

fig = px.scatter(dogs, x='weight', y='longevity',
                 title='Smaller dogs live longer',
                 width=350, height=250)
fig
../../_images/viz_plotly_18_0.svg

We can adjust the margin to give enough space for the title.

fig = px.scatter(dogs, x='weight', y='longevity',
                 title='Smaller dogs live longer',
                 width=350, height=250)

fig.update_layout(margin=dict(t=30))
fig
../../_images/viz_plotly_20_0.svg

The .update_layout() method lets us modify any property of a layout. This includes the plot title (title), margins (margin dictionary), and whether to display a legend (showlegend). The plotly documentation has the full list of layout properties [2].

Figure objects also have .update_xaxes() and .update_yaxes() methods, which work like .update_layout(). These two methods let us modify properties of the axes, like the axis limits (range), number of ticks (nticks), and axis label (title). Below, we adjust the range of the x-axis.

fig = px.scatter(dogs, x='weight', y='longevity',
                 width=350, height=250)

fig.update_xaxes(range=[-5, 40])
fig
../../_images/viz_plotly_22_0.svg

The plotly package comes with many plotting functions; we describe several of them in the next section.

11.6.3. Plotting Functions

The plotly plotting functions include line plots, scatter plots, bar plots, box plots, and histograms. The API is similar for each type of plot. The dataframe is the first argument. Then, we can specify a column of the dataframe to place on the x-axis and a column to place on the y-axis using the x and y keyword arguments.

run = pd.read_csv('data/cherryBlossomMen.csv')
medians = run.groupby('year')[['time']].median().reset_index()
medians
     year    time
0    1999  5057.0
1    2000  5102.5
2    2001  5218.0
...   ...     ...
11   2010  5813.0
12   2011  5757.0
13   2012  5248.0

14 rows × 2 columns

# x and y are names of columns in the input dataframe
px.line(medians, x='year', y='time',
        width=350, height=250)
../../_images/viz_plotly_27_0.svg

lifespans = dogs.groupby('size')['longevity'].mean().reset_index()

# x and y work the same for other plotting methods, like px.bar
px.bar(lifespans, x='size', y='longevity',
       width=350, height=250)
../../_images/viz_plotly_28_0.svg

Plotting functions in plotly also take arguments for making facet plots. We can facet using color on the same plot (color argument), or facet into multiple subplots (facet_col and facet_row). Below are examples of each.

fig = px.scatter(dogs, x='height', y='weight', color='size',
                 width=350, height=250)
fig
../../_images/viz_plotly_30_0.svg

fig = px.histogram(dogs, x='longevity', facet_col='size',
                   width=600, height=200)
fig.update_layout(margin=dict(t=30))
../../_images/viz_plotly_31_0.svg

For the complete list of plotting functions, see the main documentation for plotly [1] or plotly.express [3], the submodule of plotly that we use in the book.

To add context to a plot, we use plotly’s annotation methods; these are described next.

11.6.4. Annotations

To add annotations to a Figure, we use the Figure.add_annotation() method. Annotations have text and an arrow. The location of the arrow is set using the x and y parameters, and we can shift the location of the text from its default position using the ax and ay parameters.

fig = px.scatter(dogs, x='weight', y='longevity',
                 width=350, height=250)

fig.add_annotation(text='Chihuahuas live 16.5 years on average!',
                   x=2, y=16.5,
                   ax=30, ay=5,
                   xshift=3,
                   xanchor='left')
fig
../../_images/viz_plotly_34_0.svg

This section covered the basics of creating plots using the plotly Python package. We introduced the Figure object, which is the object plotly uses to store plots and their layouts. We covered the basic plot types that plotly makes available, and a few ways to customize plots by adjusting the layout and axes, and by adding annotations. In the next section, we’ll compare plotly to other common tools for creating visualizations in Python.


[1] https://plotly.com/python/

[2] https://plotly.com/python-api-reference/generated/plotly.graph_objects.Layout.html

[3] https://plotly.com/python-api-reference/plotly.express.html