from bokeh.charts import Donut, show, output_notebook, vplot
from bokeh.charts.utils import df_from_json
from bokeh.sampledata.olympics2014 import data
from bokeh.sampledata.autompg import autompg

output_notebook()

import pandas as pd

Generic Examples¶

Values with implied index¶

d = Donut([2, 4, 5, 2, 8])
show(d)

Values with Explicit Index¶

d = Donut(pd.Series([2, 4, 5, 2, 8], index=['a', 'b', 'c', 'd', 'e']))
show(d)

Autompg Data¶

Take a look at the data¶

autompg.head()

Simple example implies count when object or categorical¶

d = Donut(autompg.cyl.astype(str))
show(d)

Equivalent with columns specified¶

d = Donut(autompg, label='cyl', agg='count')
show(d)
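
As a rough illustration of what "count" means in the two calls above, the sketch below computes the per-label counts directly with pandas. The `toy` frame and its values are made up for illustration and stand in for `autompg`; this shows only the aggregation, not how the chart is drawn.

```python
import pandas as pd

# Hypothetical stand-in for autompg; values are invented for illustration
toy = pd.DataFrame({'cyl': [4, 4, 6, 8, 8, 8],
                    'mpg': [30.0, 28.0, 21.0, 15.0, 14.0, 13.0]})

# Counting a string-typed series: one count per distinct label
implied = toy.cyl.astype(str).value_counts().sort_index()

# Counting rows per label from the full table
explicit = toy.groupby('cyl').size()

# Both routes give the same counts, in the same label order
counts_match = list(implied.values) == list(explicit.values)
print(implied.to_dict())
print(counts_match)
```

Either way, each distinct `cyl` label ends up with a single number, its row count.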

Given an indexed series of data pre-aggregated¶

d = Donut(autompg.groupby('cyl').displ.mean())
show(d)

Equivalent with columns specified¶

d = Donut(autompg, label='cyl',
           values='displ', agg='mean')
show(d)

Given a multi-indexed series of data pre-aggregated¶

Because the series carries no record of how it was aggregated, we must pass the aggregation name to the chart with hover_text for use in the tooltip; otherwise the tooltip will just say "value".

d = Donut(autompg.groupby(['cyl', 'origin']).displ.mean(), hover_text='mean')
show(d)

Column Labels Produce a Slightly Different Result¶

With the pre-aggregated series input in the previous example, the chart does not have the original values, so it cannot size the Cyl wedges by the mean displacement for each Cyl and then size the Origin wedges proportionally inside each Cyl wedge. The column-labeled example below has the original values and can perform this sizing correctly, so it is the preferred form for any aggregated values.

d = Donut(autompg, label=['cyl', 'origin'],
           values='displ', agg='mean')
show(d)
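
To see why the original values matter, here is a minimal pandas sketch on made-up data (the `toy` frame is hypothetical, not `autompg`). It does not reproduce what the chart does internally; it only shows that once the data is reduced to one mean per (cyl, origin) pair, the mean per cyl can no longer be recovered exactly when the groups have different sizes.

```python
import pandas as pd

# Hypothetical data: three rows for origin 1, one row for origin 2
toy = pd.DataFrame({
    'cyl':    [4, 4, 4, 4],
    'origin': [1, 1, 1, 2],
    'displ':  [100.0, 100.0, 100.0, 200.0],
})

# Pre-aggregated, multi-indexed series: one mean per (cyl, origin) pair
pre_agg = toy.groupby(['cyl', 'origin']).displ.mean()

# With only the pair means, averaging them ignores how many rows each had
mean_of_means = pre_agg.groupby(level='cyl').mean()

# With the raw rows, every row counts equally toward the mean per cyl
true_mean = toy.groupby('cyl').displ.mean()

print(mean_of_means.loc[4])  # 150.0
print(true_mean.loc[4])      # 125.0
```

The two results differ because the pre-aggregated series no longer knows that origin 1 contributed three rows and origin 2 only one.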

The spacing between each donut level can be altered¶

By default, this is applied to only the levels other than the first.

d = Donut(autompg, label=['cyl', 'origin'],
           values='displ', agg='mean', level_spacing=0.15)
show(d)

Can specify the spacing for each level¶

This is applied to each level individually, including the first.

d = Donut(autompg, label=['cyl', 'origin'],
           values='displ', agg='mean', level_spacing=[0.8, 0.3])
show(d)

Olympics Example¶

Take a look at source data¶

print(data.keys())
data['data'][0]

Look at table formatted data¶

# use the df_from_json utility to convert the json/dict data to a dataframe
df = df_from_json(data)
df.head()

Prepare the data¶

This data is in a "pivoted" format, and since the charts interface is built around referencing columns, it is more convenient to de-pivot the data.

We will keep only the countries with the highest medal totals, sort them by total medals, and then use pandas.melt to de-pivot the data.

# keep countries with more than 8 total medals and sort by total medals
df = df[df['total'] > 8]
# DataFrame.sort was removed from pandas; sort_values is its replacement
df = df.sort_values("total", ascending=False)
olympics = pd.melt(df, id_vars=['abbr'],
                   value_vars=['bronze', 'silver', 'gold'],
                   value_name='medal_count', var_name='medal')
olympics.head()
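
For readers unfamiliar with de-pivoting, the following sketch applies the same `pd.melt` arguments to a tiny made-up table (the `wide` frame and its country codes are hypothetical). Each medal column becomes rows of (abbr, medal, medal_count), which is the column-per-variable layout the chart's `label` and `values` arguments refer to.

```python
import pandas as pd

# Hypothetical "pivoted" table: one column per medal type
wide = pd.DataFrame({
    'abbr':   ['AAA', 'BBB'],
    'bronze': [1, 4],
    'silver': [2, 5],
    'gold':   [3, 6],
})

# De-pivot: one row per (country, medal type) combination
tall = pd.melt(wide, id_vars=['abbr'],
               value_vars=['bronze', 'silver', 'gold'],
               value_name='medal_count', var_name='medal')

print(tall.shape)          # (6, 3)
print(list(tall.columns))  # ['abbr', 'medal', 'medal_count']
```

Two countries with three medal columns each yield six rows, and the medal totals are unchanged by the reshaping.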

# original example
d0 = Donut(olympics, label=['abbr', 'medal'], values='medal_count',
           text_font_size='8pt', hover_text='medal_count')
show(d0)