Visualise the population of Switzerland using Python and bokeh

As some sort of simple introduction to displaying data on a map we can visualize the population of Switzerland. For this we need data on population per canton and the map to plot the data on. Fortunatley both are freely available.

Getting the map

Shapefiles with the data on cantons and much more can be downloaded from https://shop.swisstopo.admin.ch/en/products/landscape/boundaries3D

Population data

The population data is available in the official report at: https://www.bfs.admin.ch/bfs/en/home/statistics/population.gnpdetail.2017-0586.html unfortunatley only as a pdf. But it is relativly quick o copy-paste it to a csv file.

Imports

In [1]:
#basic infrastructure
import os
import numpy as np
import pandas as pd

#plotting
from bokeh.models import ColumnDataSource, HoverTool, LinearColorMapper, ColorBar, AdaptiveTicker
from bokeh.io import show, output_file, output_notebook
from bokeh.plotting import figure, save
from bokeh import palettes

#colormaps for bokeh
import colorcet as cc

#read shapefiles
import shapefile

#Data path
path = '/Users/erikfrojdh/Dropbox/Data/'

Reading population data

After manually copy pasting the data into a text file we can read it using pandas and read_csv.
In [2]:
#Reading data
df = pd.read_csv(os.path.join(path,'bevolkerung.csv'), 
                 index_col=0,skipinitialspace=True)

#Create a dict to look up values
pop = df.to_dict('index')

#display sorted values
df.sort_values('Total', ascending=False)
Out[2]:
Total Mann Frau Schweizer Ausländer
Kanton
Zürich 1487969 739814 748155 1092631 395338
Bern 1026513 503789 522724 861614 164899
Vaud 784822 385389 399433 520957 263865
Aargau 663462 333364 330098 499712 163750
St. Gallen 502552 251526 251026 382829 119723
Genève 489524 237112 252412 292641 196883
Luzern 403397 200897 202500 329264 74133
Ticino 354375 172877 181498 254828 99547
Valais 339176 168072 171104 260444 78732
Fribourg 311914 156334 155580 242087 69827
Basel-Landschaft 285624 140142 145482 221990 63634
Thurgau 270709 136199 134510 204378 66331
Solothurn 269441 134300 135141 210240 59201
Graubünden 197550 98853 98697 160932 36618
Basel-Stadt 193070 93212 99858 124026 69044
Neuchâtel 178567 87312 91255 132878 45689
Schwyz 155863 79852 76011 123597 32266
Zug 123948 62684 61264 89809 34139
Schaffhausen 80769 40020 40749 59889 20880
Jura 73122 36158 36964 62471 10651
Appenzell Ausserrhoden 54954 27778 27176 46044 8910
Nidwalden 42556 21795 20761 36521 6035
Glarus 40147 20329 19818 30650 9497
Obwalden 37378 18965 18413 31892 5486
Uri 36145 18427 17718 31850 4295
Appenzell Innerrhoden 16003 8237 7766 14230 1773

Reading shape data

To read the shapefile we can use the shapefile package. Each canton is stored with a record and a shape. Using iterShapeRecords we can iterate over them to extract the name and the points for the shape.
In [3]:
#assuming the data was exctracted to path
data_pathname = os.path.join(path, 
                             'BOUNDARIES_2017/DATEN/swissBOUNDARIES3D/'\
                             'SHAPEFILE_LV95_LN02/swissBOUNDARIES3D_1_3_TLM_KANTONSGEBIET.shp')

data = shapefile.Reader(data_pathname)

#lists to hold results
x = []
y = []
canton_names = []
population = []

#iterate over records and shapes
for sr in data.iterShapeRecords(): 
    #The names sometimes include special characters 
    #in this case they are returned as byte strings instead
    #of strings, so before adding we try to decode
    try:
        name = sr.record[18].decode('cp1252')
    except AttributeError:
        name = sr.record[18]
        
    #Each canton can contain multiple closed loops 
    #In order to plot we first need to extract them
    #To get the right color scale we also need one to 
    #duplicate the population note for the canton
    for i, start in enumerate(sr.shape.parts):
        try:
            stop = sr.shape.parts[i+1]
        except IndexError:
            stop = len(sr.shape.points)
        #append x,y and lookup population
        x.append( [p[0] for p in sr.shape.points[start:stop]] )
        y.append( [p[1] for p in sr.shape.points[start:stop]] )
        canton_names.append(name)
        population.append(pop[name]['Total'])
From the data we can now create a ColumnDataSource which will be used to create the patches. Then we set up a color_mapper to translate the population into colors. Since the colormaps included in bokeh supports relatively few numbers we use colorcet to set up.
In [4]:
#data source
source = ColumnDataSource(data=dict(
    x=x, y=y,
    name=canton_names,
    population = population
))

#colormap from colorcet, between 0 and the maximum population
color_mapper = LinearColorMapper(
    palette=cc.blues,
    low = 0,
    high = df.Total.max()
)

#tools visible in the figure
tools = "pan,wheel_zoom,reset,hover,save"
In [7]:
fig = figure(
    title="Population per canton", 
    tools=tools,
    x_axis_location=None, 
    y_axis_location=None,
    match_aspect=True
)
fig.grid.grid_line_color = None
fig.patches('x', 'y', source=source,
          fill_color={'field': 'population','transform': color_mapper},
          fill_alpha=0.8, line_color="black", line_width=0.3)

#setup tooltip
hover = fig.select_one(HoverTool)
hover.point_policy = "follow_mouse"
hover.tooltips = [("Canton", "@name"),("Population", "@population"),
                  ("(E, N)", "($x, $y)")]

output_file('test.html')
show(fig)

Matplotlib rendering

Depending on your preferences or output format sometimes matplotlib renders nicer images.
In [6]:
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon, Rectangle
from matplotlib.collections import PatchCollection
import seaborn as sns
sns.set()
sns.set_style('white')
sns.set_context('talk', font_scale=1.1)

patches = []
for xx,yy in zip(x,y):
    xy = list(zip(xx,yy))
    polygon = Polygon(xy, True)
    patches.append(polygon)
    
p = PatchCollection(patches, cmap='Reds', alpha=1, 
                    linestyle = '-', 
                    linewidth = 1,
                    edgecolor = 'Black')
p.set_array(np.array(population))
fig, ax = plt.subplots(1, figsize = (14,7))
ax.add_collection(p)
# ax.plot(xy)
ax.autoscale_view()
ax.set_aspect('equal')
ax.grid(True)
plt.show()

One thought on “Visualise the population of Switzerland using Python and bokeh

  1. Thank you very much for sharing. After a lot of searching, it was exactly what i was looking for !!!!. Have you ever consider to post this in stackoverflow.com? I think you should.

Leave a Reply

Your email address will not be published. Required fields are marked *