# coding: utf-8

# # Bokeh Tutorial
#
# # 10. High Level Charts

# This section covers the `bokeh.charts` interface, a high-level API that is especially useful for exploratory data analysis (for instance, in a Jupyter notebook). It provides functions for quickly producing many standard chart types, often with a single line of code. We will look at the following types in this notebook:
#
# * [Scatter Plot](#Scatter-Plot)
# * [Bar Chart](#Bar-Chart)
# * [Histogram](#Histogram)
# * [Box Plot](#Box-Plot)

# In[1]:

from bokeh.io import output_notebook, show
output_notebook()

# # Scatter Plot
#
# A high-level scatter plot is provided by `bokeh.charts.Scatter`.
#
# For this section, we will use the "iris" data set. First, let's import it and take a look at a few rows:

# In[2]:

from bokeh.sampledata.iris import flowers
flowers.head()

# In[3]:

from bokeh.charts import Scatter

# A basic scatter chart takes the data (in this case a pandas DataFrame) as the first argument, and takes the `x` and `y` coordinates for the scatter as the names of columns in the data.

# In[4]:

p = Scatter(flowers, x='petal_length', y='petal_width')
show(p)

# By passing a column name for the `color` parameter, you can make `Scatter` automatically color the markers according to the groups in that column. Let's also add a legend by specifying its location as the value of the `legend` parameter (in this case `"top_left"`).

# In[5]:

p = Scatter(flowers, x='petal_length', y='petal_width', color='species', legend='top_left')
show(p)

# By passing a column name for the `marker` parameter, you can make `Scatter` automatically vary the marker shapes according to the groups in that column. Let's try that as an exercise.

# In[6]:

# EXERCISE: vary the marker shape by passing a column name as the `marker` keyword argument

# # Bar Chart
#
# A high-level bar chart is provided by `bokeh.charts.Bar`.
#
# For this section, we will use the "autompg" data set.
# Let's import it and take a quick look:

# In[7]:

from bokeh.sampledata.autompg import autompg
autompg.head()

# In[8]:

from bokeh.charts import Bar

# A basic bar chart takes the data (again a DataFrame) as the first argument, as well as column names or values for:
#
# * `label` - a column whose groups label the x-axis
# * `values` - a column of values to aggregate within each group, giving the bar heights
# * `agg` - the name of the aggregation to perform over the values (e.g., `"mean"`, `"max"`)
#
# A simple example that also specifies some other properties, such as `title` and `legend`, is shown below:

# In[9]:

p = Bar(autompg, label='cyl', values='mpg', agg='max',
        title="Max MPG by CYL", legend=None, tools='crosshair')
show(p)

# By passing another column name as the `group` parameter, the aggregations can be further subdivided by the groups in that column, and the bars are grouped visually. The example below demonstrates this, and also adds a legend by specifying its location:

# In[10]:

p = Bar(autompg, label='yr', values='mpg', agg='median', group='origin',
        title="Median MPG by YR, grouped by ORIGIN", legend='top_left', tools='crosshair')
show(p)

# Similarly, bars for subgroups can be stacked visually by providing a column name for the `stack` parameter. Let's try that as an exercise.

# In[11]:

# EXERCISE: change the chart above to stack the bars with title "Median MPG by YR, stacked by ORIGIN"

# # Histogram
#
# A high-level histogram is provided by `bokeh.charts.Histogram`.
#
# For this section, we will construct our own synthetic data set that has values generated from two different probability distributions.
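#
# A common pair of distributions for this kind of synthetic data is the normal and the lognormal, which are closely related: a lognormal value is the exponential of a normally distributed value, so it is always positive. The snippet below is a rough sketch of that relationship using only Python's standard `random` and `math` modules. The names `MU`, `SIGMA`, `N`, and the fixed seed are assumptions made for illustration and are not part of the tutorial's data.

```python
import math
import random

# Assumed parameters, chosen only for illustration.
MU, SIGMA = 0.0, 0.5
N = 1000

rng = random.Random(42)  # fixed seed so reruns give the same samples

# Draw from a normal distribution, then exponentiate each value
# to obtain a lognormally distributed sample.
normal_values = [rng.gauss(MU, SIGMA) for _ in range(N)]
lognormal_values = [math.exp(v) for v in normal_values]

# Normal values fall on both sides of zero; lognormal values never do.
has_both_signs = min(normal_values) < 0 < max(normal_values)
all_positive = all(v > 0 for v in lognormal_values)
print(has_both_signs, all_positive)
```

# Note that `MU` and `SIGMA` here describe the underlying normal distribution, not the mean and standard deviation of the lognormal values themselves. Because of the exponentiation, a lognormal sample is skewed to the right, which is the shape a histogram of such data would be expected to show.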
# In[12]:

import pandas as pd
import numpy as np

# build some distributions
mu, sigma = 0, 0.5
normal = pd.DataFrame({'value': np.random.normal(mu, sigma, 1000), 'type': 'normal'})
lognormal = pd.DataFrame({'value': np.random.lognormal(mu, sigma, 1000), 'type': 'lognormal'})

# create a pandas data frame
df = pd.concat([normal, lognormal])
df[995:1005]

# In[13]:

from bokeh.charts import Histogram

# A basic histogram takes the data as the first argument, and a column name as the `values` parameter. Optionally, you can also specify the number of bins to use by giving a value for the `bins` parameter. The example below shows the distribution of ***all*** the values (both the "normal" and "lognormal" values).

# In[14]:

hist = Histogram(df, values='value', bins=30)
show(hist)

# It's also possible to generate multiple histograms at once by grouping the data. The column to group by is specified by the `color` parameter (and the histogram for each group is automatically given a different color). Let's try that as an exercise.

# In[15]:

# EXERCISE: generate histograms for each "type" of distribution, and add a legend to the top left.

# # Box Plot
#
# A high-level box plot is provided by `bokeh.charts.BoxPlot`.
#
# For this section, we will use the "iris" data set again.

# In[16]:

from bokeh.charts import BoxPlot

# A basic box plot takes the data as the first argument, as well as column names for:
#
# * `label` - a column whose groups label the x-axis
# * `values` - a column of values whose distribution is summarized for each group
#
# A simple example that also specifies some other properties, such as `title`, `color`, and the axis labels, is shown below:

# In[17]:

p = BoxPlot(flowers, label='species', values='petal_width', tools='crosshair',
            color='#aa4444', xlabel='', ylabel='petal width, cm',
            title='Distributions of petal widths')
show(p)

# Instead of using a single color, the box-and-whisker groups can be colored according to the groups in one of the columns. This is done by passing a column name as the `color` parameter.
# Let's try that as an exercise.

# In[18]:

# EXERCISE: color the boxes by "species" and add a legend to the top left

# ---
#
# # Further reading
#
# http://nbviewer.jupyter.org/github/bokeh/bokeh/tree/0.11.1/examples/charts/file/
#
# http://nbviewer.jupyter.org/github/bokeh/bokeh/tree/0.11.1/examples/howto/charts/
#
# http://nbviewer.jupyter.org/github/bokeh/bokeh-demos/blob/master/presentations/2016-03-pydata-strata/notebooks/Charts.ipynb
#
# http://nbviewer.jupyter.org/github/bokeh/bokeh-demos/blob/master/presentations/2016-03-pydata-strata/notebooks/Charts%20Demo.ipynb