{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# \"Fruit\" example (from Hybrid Sankey diagrams paper)\n", "\n", "This notebook builds a fairly complex Sankey diagram from the sample \"fruit\" database used in the paper [Hybrid Sankey diagrams: Visual analysis of multidimensional data for understanding resource use](https://doi.org/10.1016/j.resconrec.2017.05.002).\n", "\n", "For more explanation of the steps and concepts, see the [tutorials](../tutorials/index.ipynb)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from floweaver import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load the dataset:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset = Dataset.from_csv('fruit_flows.csv', 'fruit_processes.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This made-up dataset describes flows from farms to consumers:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset._flows.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Additional information is available in the process dimension table:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dataset._dim_process.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll also define some partitions that are used in the diagram:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "farm_ids = ['farm{}'.format(i) for i in range(1, 16)]\n", "\n", "# Show the first five farms individually and group the remaining ten as 'Other farms'\n", "farm_partition_5 = Partition.Simple('process', [('Other farms', farm_ids[5:])] + farm_ids[:5])\n", "partition_fruit = Partition.Simple('material', ['bananas', 'apples', 'oranges'])\n", "partition_sector = Partition.Simple('process.sector', ['government', 'industry', 'domestic'])" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "Now define the nodes of the Sankey diagram: process groups and waypoints.\n", "\n", "- Process groups represent sets of processes in the underlying database. The underlying processes can be specified as a list of ids (e.g. `['inputs']`) or as a pandas query expression (e.g. `'function == \"landfill\"'`).\n", "- Waypoints allow extra control over the partitioning and placement of flows." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nodes = {\n", " 'inputs': ProcessGroup(['inputs'], title='Inputs'),\n", " 'compost': ProcessGroup('function == \"composting stock\"', title='Compost'),\n", " 'farms': ProcessGroup('function in [\"allotment\", \"large farm\", \"small farm\"]', farm_partition_5),\n", " 'eat': ProcessGroup('function == \"consumers\" and location != \"London\"', partition_sector,\n", " title='consumers by sector'),\n", " 'landfill': ProcessGroup('function == \"landfill\" and location != \"London\"', title='Landfill'),\n", " 'composting': ProcessGroup('function == \"composting process\" and location != \"London\"', title='Composting'),\n", "\n", " 'fruit': Waypoint(partition_fruit, title='fruit type'),\n", " # direction='L': these waypoints carry flows running from right to left\n", " 'w1': Waypoint(direction='L', title=''),\n", " 'w2': Waypoint(direction='L', title=''),\n", " 'export fruit': Waypoint(Partition.Simple('material', ['apples', 'bananas', 'oranges'])),\n", " 'exports': Waypoint(title='Exports'),\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The ordering defines how the process groups and waypoints are arranged in the final diagram. It is structured as a list of vertical *layers* (from left to right), each containing a list of horizontal *bands* (from top to bottom), each containing a list of process group and waypoint ids (from top to bottom)." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ordering = [\n", " [[], ['inputs', 'compost'], []],\n", " [[], ['farms'], ['w2']],\n", " [['exports'], ['fruit'], []],\n", " [[], ['eat'], []],\n", " [['export fruit'], ['landfill', 'composting'], ['w1']],\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Bundles represent sets of flows in the underlying database. Each bundle runs from a source to a target, optionally via waypoints:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bundles = [\n", " Bundle('inputs', 'farms'),\n", " Bundle('compost', 'farms'),\n", " Bundle('farms', 'eat', waypoints=['fruit']),\n", " Bundle('farms', 'compost', waypoints=['w2']),\n", " Bundle('eat', 'landfill'),\n", " Bundle('eat', 'composting'),\n", " Bundle('composting', 'compost', waypoints=['w1', 'w2']),\n", " Bundle('farms', Elsewhere, waypoints=['exports', 'export fruit']),\n", "]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, the process groups, waypoints, bundles and ordering are combined into a Sankey diagram definition (SDD). Applying the SDD to the dataset with `weave` produces the Sankey diagram." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sdd = SankeyDefinition(nodes, bundles, ordering,\n", " flow_partition=dataset.partition('material'))\n", "weave(sdd, dataset) \\\n", " .to_widget(width=570, height=550, margins=dict(left=70, right=90))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 1 }