Unified Data Layers

Original author: Sina Kashuk, modified by Dahn Jahn

This notebook uses the H3-Pandas API to simplify uber/h3-py-notebooks/notebooks/unified_data_layers.ipynb

Summary

One of the applications of hexagons is to be able to combine different datasets with different geographic shapes and forms. In this tutorial we are going through an example of how to bring the US census data, NYC 311 Noise complaints, and Digital Elevation Model to the hexagon aperture and then how to visualize the data to gain insight.

Data

  • POLYGON: Census Tract Data [Source]

  • POINT: NYC 311 noise complaints [Source]

  • RASTER: NYC Digital Elevation Model [Source]

[1]:
pip install h3pandas
Requirement already satisfied: h3pandas in /home/dahn/miniconda3/lib/python3.9/site-packages (0.2.1+5.g23ac830.dirty)
Requirement already satisfied: shapely in /home/dahn/miniconda3/lib/python3.9/site-packages (from h3pandas) (1.7.1)
Requirement already satisfied: pandas in /home/dahn/miniconda3/lib/python3.9/site-packages (from h3pandas) (1.2.5)
Requirement already satisfied: typing-extensions in /home/dahn/miniconda3/lib/python3.9/site-packages (from h3pandas) (3.10.0.0)
Requirement already satisfied: h3 in /home/dahn/miniconda3/lib/python3.9/site-packages (from h3pandas) (3.7.3)
Requirement already satisfied: numpy in /home/dahn/miniconda3/lib/python3.9/site-packages (from h3pandas) (1.21.0)
Requirement already satisfied: geopandas in /home/dahn/miniconda3/lib/python3.9/site-packages (from h3pandas) (0.9.0)
Requirement already satisfied: pyproj>=2.2.0 in /home/dahn/miniconda3/lib/python3.9/site-packages (from geopandas->h3pandas) (3.1.0)
Requirement already satisfied: fiona>=1.8 in /home/dahn/miniconda3/lib/python3.9/site-packages (from geopandas->h3pandas) (1.8.20)
Requirement already satisfied: attrs>=17 in /home/dahn/miniconda3/lib/python3.9/site-packages (from fiona>=1.8->geopandas->h3pandas) (21.2.0)
Requirement already satisfied: certifi in /home/dahn/miniconda3/lib/python3.9/site-packages (from fiona>=1.8->geopandas->h3pandas) (2021.5.30)
Requirement already satisfied: click>=4.0 in /home/dahn/miniconda3/lib/python3.9/site-packages (from fiona>=1.8->geopandas->h3pandas) (7.1.2)
Requirement already satisfied: cligj>=0.5 in /home/dahn/miniconda3/lib/python3.9/site-packages (from fiona>=1.8->geopandas->h3pandas) (0.7.2)
Requirement already satisfied: click-plugins>=1.0 in /home/dahn/miniconda3/lib/python3.9/site-packages (from fiona>=1.8->geopandas->h3pandas) (1.1.1)
Requirement already satisfied: six>=1.7 in /home/dahn/miniconda3/lib/python3.9/site-packages (from fiona>=1.8->geopandas->h3pandas) (1.16.0)
Requirement already satisfied: munch in /home/dahn/miniconda3/lib/python3.9/site-packages (from fiona>=1.8->geopandas->h3pandas) (2.5.0)
Requirement already satisfied: setuptools in /home/dahn/miniconda3/lib/python3.9/site-packages (from fiona>=1.8->geopandas->h3pandas) (49.6.0.post20210108)
Requirement already satisfied: python-dateutil>=2.7.3 in /home/dahn/miniconda3/lib/python3.9/site-packages (from pandas->h3pandas) (2.8.1)
Requirement already satisfied: pytz>=2017.3 in /home/dahn/miniconda3/lib/python3.9/site-packages (from pandas->h3pandas) (2021.1)
Note: you may need to restart the kernel to use updated packages.
[2]:
import pandas as pd
import geopandas as gpd
import h3pandas
import matplotlib.pyplot as plt

Load data

[3]:
# modified data link
ct_data_link = 'https://gist.githubusercontent.com/kashuk/e6e3e3d8fde34da1212b59248a7cc5a8/raw/da3b63c1c0ef4a1c8cc8e10f61455c436a0d0ad9/CT_data.csv'
ct_shape_link = 'https://gist.githubusercontent.com/kashuk/d73342adeccbc65de7a53e19ad78b4df/raw/4300dcb80861d454ecae8f8429166e196779fc21/CT_simplified_shape.json'
noise_311_link = 'https://gist.githubusercontent.com/kashuk/670a350ea1f9fc543c3f6916ab392f62/raw/4c5ced45cc94d5b00e3699dd211ad7125ee6c4d3/NYC311_noise.csv'
nyc_dem_link = 'https://gist.githubusercontent.com/DahnJ/5b77a1d7047412e35e0c1f9ff10ea182/raw/b328c6beb369127081287ac2a17820280d35898b/dem_nyc_xyz.csv'
[4]:
METRIC_COL = 'SE_T002_002' # population density
INDEX_COL = 'BoroCT2010'

Load 311 noise complaints

[5]:
df311 = pd.read_csv(noise_311_link)

Load Census Tract data

[6]:
gdf = gpd.read_file(ct_shape_link).drop(columns='gdf').astype({INDEX_COL: str}).set_index(INDEX_COL)
[7]:
df = pd.read_csv(ct_data_link, usecols=[INDEX_COL,METRIC_COL], dtype={INDEX_COL: str}).set_index(INDEX_COL)

gdf = gdf.join(df)
gdf[METRIC_COL] = gdf[METRIC_COL].fillna(0)
gdf = gdf.loc[~gdf.geometry.isna()]

Load DEM

[8]:
df_dem = pd.read_csv(nyc_dem_link)

Point to hex

[9]:
# Visualize the 311 noise complaints points
df311.plot(x='lng',y='lat',style='.',alpha=0.03,figsize=(12,12));
plt.title('sample points: 311 noise compliants');
../_images/notebook_01-unified-data-layers_16_0.png
[10]:
APERTURE_SIZE = 9
[11]:
df311 = (df311
 .assign(count=1)
 .h3.geo_to_h3_aggregate(APERTURE_SIZE))
[12]:
df311.plot(figsize=(17, 15), column='count', cmap='viridis', edgecolor='none', legend=True)
plt.title('hex-grid: noise complaints')
[12]:
Text(0.5, 1.0, 'hex-grid: noise complaints')
../_images/notebook_01-unified-data-layers_19_1.png

Spatial smoothing

[13]:
k = 2
[14]:
df311s = df311.h3.k_ring_smoothing(k)
[15]:
df311s.plot(figsize=(17, 15), column='count', cmap='viridis', edgecolor='none', legend=True)
plt.title('noise complaints: 2-ring average');
../_images/notebook_01-unified-data-layers_23_0.png

Visualize Census Tract polygons

[16]:
gdf.plot(column=METRIC_COL, cmap='viridis', linewidth=0.05, figsize=(16, 12), legend=True)
plt.title('census tract: population density');
../_images/notebook_01-unified-data-layers_25_0.png

Census polygon to hex

[17]:
APERTURE_SIZE=10
dfh = gdf.h3.polyfill_resample(APERTURE_SIZE)
[18]:
dfh.plot(figsize=(16, 12), column=METRIC_COL, legend=True)
plt.title('hex-grid: population density');
../_images/notebook_01-unified-data-layers_28_0.png

Spatial weighted smoothing

[19]:
coef = [1, .8, .4, .15, 0.05]
[20]:
df_ct_kw = dfh.h3.k_ring_smoothing(weights=coef)
[21]:
df_ct_kw.plot(figsize=(16, 12), column=METRIC_COL, legend=True)
plt.title('hex-grid: smoothed population density')
[21]:
Text(0.5, 1.0, 'hex-grid: smoothed population density')
../_images/notebook_01-unified-data-layers_32_1.png

Hierarchical re-sampling

Resolution 9

[22]:
df_coarse = df_ct_kw.h3.h3_to_parent_aggregate(9, 'mean')
[23]:
df_coarse.plot(figsize=(16, 12), column=METRIC_COL, legend=True)
plt.title('hex-grid: population density')
[23]:
Text(0.5, 1.0, 'hex-grid: population density')
../_images/notebook_01-unified-data-layers_36_1.png

Resolution 8

[24]:
df_coarse.h3.h3_to_parent_aggregate(8, 'mean').plot(figsize=(16, 12), column=METRIC_COL, legend=True)
plt.title('hex-grid: population density')
[24]:
Text(0.5, 1.0, 'hex-grid: population density')
../_images/notebook_01-unified-data-layers_38_1.png

Raster to hex

[25]:
df_dem = df_dem.h3.geo_to_h3_aggregate(9, 'mean')
[26]:
df_dem.plot(figsize=(16, 12), column='elevation', legend=True)
plt.title('hex-grid: elevation')
[26]:
Text(0.5, 1.0, 'hex-grid: elevation')
../_images/notebook_01-unified-data-layers_41_1.png

Unifying data layers

[27]:
dfu = (df_dem
       .join(df311s.drop(columns='geometry'))
       .join(df_coarse.drop(columns='geometry'))
       .rename(columns={METRIC_COL: "population","count":"noise_complaints"}))
[28]:
pd.plotting.scatter_matrix(dfu, alpha=0.05,figsize=(15,15));
dfu.describe()
[28]:
elevation noise_complaints population
count 8047.000000 7343.000000 8047.000000
mean 48.874082 13.281937 24833.451017
std 49.145254 26.843631 25614.949764
min -2.473003 0.052632 0.000000
25% 13.088472 2.000000 5397.794072
50% 32.209505 4.315789 16142.488019
75% 67.886977 11.315789 37885.947293
max 386.767609 210.578947 170595.715931
../_images/notebook_01-unified-data-layers_44_1.png