11.6. Creating Plots Using plotly
In this section, we cover the basics of the plotly Python package,
the main tool we use in this book to create plots.

The plotly package has several advantages over other plotting libraries.
First, it creates interactive plots rather than static images.
When you create a plot in plotly, you can pan and zoom to see parts of the
plot that are too small to see normally.
You can also hover over plot elements, like the symbols in a scatter plot, to
see the raw data values.
Second, it can save plots using the SVG file format, which means that
images appear sharp even when zoomed in. If you’re reading this chapter
in a PDF or paper copy of the book, we used this feature to render plot images.
Finally, it has a simple API for creating basic plots, which helps when
you’re doing exploratory analysis and want to quickly create many plots.

We’ll go over the fundamentals of plotly in this section.
We recommend using the official plotly documentation if you encounter
something that isn’t covered here [1].
11.6.1. Figure and Trace Objects
Every plot in plotly is wrapped in a Figure object.
Figure objects keep track of what plots to draw.
For instance, a single Figure can draw a scatter plot on the left and
a line plot on the right.
Figure objects also keep track of the plot layout, which includes the
size of the plot, title, legend, and annotations.
Let’s look at an example using the dataset of dog breeds.
import pandas as pd

dogs = pd.read_csv('data/akc.csv').dropna()
dogs
| | breed | group | score | longevity | ... | size | weight | height | repetition |
|---|---|---|---|---|---|---|---|---|---|
| 2 | Brittany | sporting | 3.54 | 12.92 | ... | medium | 16.0 | 48.0 | 5-15 |
| 3 | Cairn Terrier | terrier | 3.53 | 13.84 | ... | small | 6.0 | 25.0 | 15-25 |
| 5 | English Cocker Spaniel | sporting | 3.33 | 11.66 | ... | medium | 14.0 | 41.0 | 5-15 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 82 | Bullmastiff | working | 1.64 | 7.57 | ... | large | 52.0 | 65.0 | 40-80 |
| 83 | Mastiff | working | 1.57 | 6.50 | ... | large | 79.0 | 76.0 | 80-100 |
| 85 | Saint Bernard | working | 1.42 | 7.78 | ... | large | 70.0 | 67.0 | 40-80 |

43 rows × 12 columns
The plotly.express module provides a concise API for making plots.
import plotly.express as px
We use plotly.express below to make a scatter plot of weight against height for the dog breeds.
Notice that the return value from px.scatter() is a Figure object.
fig = px.scatter(dogs, x='height', y='weight',
                 width=350, height=250)
# fig is a plotly Figure object:
fig.__class__
plotly.graph_objs._figure.Figure
Displaying a Figure object renders it to the screen.
fig
This particular Figure holds one plot, but Figure objects can hold any number
of plots. Below, we create a faceted figure made up of three scatter plots.
fig = px.scatter(dogs, x='height', y='weight',
                 facet_col='size',
                 width=650, height=250)
fig.update_layout(margin=dict(t=30))
fig
Each of these three plots is stored in a Trace object.
However, we don’t usually manipulate Trace objects manually.
Instead, plotly provides functions that automatically create
faceted subplots, like the px.scatter function we used here.
Now that we have seen how to make a simple plot, we next show how to modify plots.
11.6.2. Modifying Layout
We often need to change the figure’s layout.
For instance, we might want to adjust the figure margins or change the axis range.
To do this, we can use the Figure.update_layout() method.
Let’s look at an example of a scatter plot where the title is cut off because
the plot doesn’t have large enough margins.
fig = px.scatter(dogs, x='weight', y='longevity',
                 title='Smaller dogs live longer',
                 width=350, height=250)
fig
We can adjust the margin to give enough space for the title.
fig = px.scatter(dogs, x='weight', y='longevity',
                 title='Smaller dogs live longer',
                 width=350, height=250)
fig.update_layout(margin=dict(t=30))
fig
The .update_layout() method lets us modify any property of a layout.
This includes the plot title (title), margins (margin dictionary),
and whether to display a legend (showlegend).
The plotly documentation has the full list of layout properties [2].
Figure objects also have .update_xaxes() and .update_yaxes() methods,
which are similar to .update_layout(). These two methods let us modify
properties of the axes, like the axis limits (range), number of ticks
(nticks), and axis label (title). Below, we adjust the range of the x-axis.
fig = px.scatter(dogs, x='weight', y='longevity',
                 width=350, height=250)
fig.update_xaxes(range=[-5, 40])
fig
The plotly package comes with many plotting functions; we describe several of them in the next section.
11.6.3. Plotting Functions
The plotly plotting functions include line plots, scatter plots, bar plots,
box plots, and histograms.
The API is similar for each type of plot.
The dataframe is the first argument.
Then, we can specify a column of the dataframe to place on the x-axis
and a column to place on the y-axis using the x and y keyword arguments.
run = pd.read_csv('data/cherryBlossomMen.csv')
medians = run.groupby('year')[['time']].median().reset_index()
medians
| | year | time |
|---|---|---|
| 0 | 1999 | 5057.0 |
| 1 | 2000 | 5102.5 |
| 2 | 2001 | 5218.0 |
| ... | ... | ... |
| 11 | 2010 | 5813.0 |
| 12 | 2011 | 5757.0 |
| 13 | 2012 | 5248.0 |

14 rows × 2 columns
# x and y are names of columns in the input dataframe
px.line(medians, x='year', y='time',
        width=350, height=250)
lifespans = dogs.groupby('size')['longevity'].mean().reset_index()
# x and y work the same for other plotting methods, like px.bar
px.bar(lifespans, x='size', y='longevity',
       width=350, height=250)
Plotting functions in plotly also accept arguments for making facet plots.
We can facet using color on the same plot (color argument), or
facet into multiple subplots (facet_col and facet_row). Below are examples of each.
fig = px.scatter(dogs, x='height', y='weight', color='size',
                 width=350, height=250)
fig
fig = px.histogram(dogs, x='longevity', facet_col='size',
                   width=600, height=200)
fig.update_layout(margin=dict(t=30))
For the complete list of plotting functions, see the main documentation for
plotly [1] or plotly.express [3], the submodule of plotly that we
use in the book.
To add context to a plot, we use the plotly annotation methods; these are described next.
11.6.4. Annotations
To add annotations to a Figure, we use the Figure.add_annotation() method.
Annotations have text and an arrow. The location of the arrow
is set using the x and y parameters, and we can shift the
location of the text from its default position
using the ax and ay parameters.
fig = px.scatter(dogs, x='weight', y='longevity',
                 width=350, height=250)
fig.add_annotation(text='Chihuahuas live 16.5 years on average!',
                   x=2, y=16.5,
                   ax=30, ay=5,
                   xshift=3,
                   xanchor='left')
fig
This section covered the basics of creating plots using the plotly Python
package. We introduced the Figure object, which is the object plotly
uses to store plots and their layouts.
We covered the basic plot types that plotly makes available, and
a few ways to customize plots by adjusting the layout and axes, and by
adding annotations.
In the next section, we’ll compare plotly to other common tools for creating
visualizations in Python.