“Fruit” example (from Hybrid Sankey diagrams paper)¶

This notebook gives a fairly complicated example of building a Sankey diagram from the sample “fruit” database used in the paper Hybrid Sankey diagrams: Visual analysis of multidimensional data for understanding resource use.

For more explanation of the steps and concepts, see the tutorials.

[1]:

from floweaver import *

Load the dataset:

[2]:

dataset = Dataset.from_csv('fruit_flows.csv', 'fruit_processes.csv')

This made-up dataset describes flows from farms to consumers:

[3]:

dataset._flows.head()

[3]:

	source	target	material	time	value
0	farm1	eat1	apples	2011-08-01	2.720691
1	eat1	landfill Cambridge	apples	2011-08-01	1.904484
2	eat1	composting Cambridge	apples	2011-08-01	0.816207
3	farm1	eat1	apples	2011-08-02	8.802195
4	eat1	landfill Cambridge	apples	2011-08-02	6.161537

Additional information is available in the process dimension table:

[4]:

dataset._dim_process.head()

[4]:

	type	location	function	sector
id
inputs	stock	*	inputs	NaN
farm1	process	Cambridge	small farm	farming
farm2	process	Cambridge	small farm	farming
farm3	process	Ely	small farm	farming
farm4	process	Ely	allotment	farming

We’ll also define some partitions that will be useful:

[5]:

farm_ids = ['farm{}'.format(i) for i in range(1, 16)]

farm_partition_5 = Partition.Simple('process', [('Other farms', farm_ids[5:])] + farm_ids[:5])
partition_fruit = Partition.Simple('material', ['bananas', 'apples', 'oranges'])
partition_sector = Partition.Simple('process.sector', ['government', 'industry', 'domestic'])

Now define the Sankey diagram definition.

Process groups represent sets of processes in the underlying database. The underlying processes can be specified as a list of ids (e.g. ['inputs']) or as a Pandas query expression (e.g. 'function == "landfill"').
Waypoints allow extra control over the partitioning and placement of flows.

[6]:

nodes = {
    'inputs':     ProcessGroup(['inputs'], title='Inputs'),
    'compost':    ProcessGroup('function == "composting stock"', title='Compost'),
    'farms':      ProcessGroup('function in ["allotment", "large farm", "small farm"]', farm_partition_5),
    'eat':        ProcessGroup('function == "consumers" and location != "London"', partition_sector,
                               title='consumers by sector'),
    'landfill':   ProcessGroup('function == "landfill" and location != "London"', title='Landfill'),
    'composting': ProcessGroup('function == "composting process" and location != "London"', title='Composting'),

    'fruit':        Waypoint(partition_fruit, title='fruit type'),
    'w1':           Waypoint(direction='L', title=''),
    'w2':           Waypoint(direction='L', title=''),
    'export fruit': Waypoint(Partition.Simple('material', ['apples', 'bananas', 'oranges'])),
    'exports':      Waypoint(title='Exports'),
}

The ordering defines how the process groups and waypoints are arranged in the final diagram. It is structured as a list of vertical layers (from left to right), each containing a list of horizontal bands (from top to bottom), each containing a list of process group and waypoint ids (from top to bottom).

[7]:

ordering = [
    [[], ['inputs', 'compost'], []],
    [[], ['farms'], ['w2']],
    [['exports'], ['fruit'], []],
    [[], ['eat'], []],
    [['export fruit'], ['landfill', 'composting'], ['w1']],
]

Bundles represent flows in the underlying database:

[8]:

bundles = [
    Bundle('inputs', 'farms'),
    Bundle('compost', 'farms'),
    Bundle('farms', 'eat', waypoints=['fruit']),
    Bundle('farms', 'compost', waypoints=['w2']),
    Bundle('eat', 'landfill'),
    Bundle('eat', 'composting'),
    Bundle('composting', 'compost', waypoints=['w1', 'w2']),
    Bundle('farms', Elsewhere, waypoints=['exports', 'export fruit']),
]

Finally, the process groups, waypoints, bundles and ordering are combined into a Sankey diagram definition (SDD). When applied to the dataset, the result is a Sankey diagram!

[9]:

sdd = SankeyDefinition(nodes, bundles, ordering,
                       flow_partition=dataset.partition('material'))
weave(sdd, dataset) \
    .to_widget(width=570, height=550, margins=dict(left=70, right=90))

[9]:

“Fruit” example (from Hybrid Sankey diagrams paper)¶

Navigation

Related Topics