personal-portfolio/notebooks/safety/crime_rate_choropleth.ipynb
lmiranda 69c4216cd5 fix: Update notebooks to use public_marts schema
dbt creates mart tables in public_marts schema, not public.
Updated all notebook SQL queries to use the correct schema.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 19:45:23 -05:00

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Crime Rate Choropleth Map\n",
"\n",
"Displays crime rates per 100,000 population across Toronto's 158 neighbourhoods."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Data Reference\n",
"\n",
"### Source Tables\n",
"\n",
"| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n",
"| `mart_neighbourhood_safety` | neighbourhood × year | crime_rate_per_100k, crime_index, safety_tier, geometry |\n",
"\n",
"### SQL Query"
]
},
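{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import pandas as pd\n",
"from sqlalchemy import create_engine\n",
"\n",
"# Read the connection string from DATABASE_URL; fall back to a local PostgreSQL instance.\n",
"engine = create_engine(os.environ.get('DATABASE_URL', 'postgresql://portfolio:portfolio@localhost:5432/portfolio'))\n",
"\n",
"# Keep only the most recent year, highest crime rate first.\n",
"query = \"\"\"\n",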
"SELECT\n",
" neighbourhood_id,\n",
" neighbourhood_name,\n",
" geometry,\n",
" year,\n",
" crime_rate_per_100k,\n",
" crime_index,\n",
" safety_tier,\n",
" total_incidents,\n",
" population\n",
"FROM public_marts.mart_neighbourhood_safety\n",
"WHERE year = (SELECT MAX(year) FROM public_marts.mart_neighbourhood_safety)\n",
"ORDER BY crime_rate_per_100k DESC\n",
"\"\"\"\n",
"\n",
"df = pd.read_sql(query, engine)\n",
"print(f\"Loaded {len(df)} neighbourhoods\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Transformation Steps\n",
"\n",
"1. Filter to the most recent year (handled by the `WHERE` clause in the SQL query)\n",
"2. Parse the WKB `geometry` column and convert it to GeoJSON\n",
"3. Use a reversed color scale (green = low crime, red = high crime)"
]
},
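{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"import geopandas as gpd\n",
"\n",
"# Decode the WKB geometry returned by the query and tag it as WGS 84 (EPSG:4326).\n",
"gdf = gpd.GeoDataFrame(\n",
" df,\n",
" geometry=gpd.GeoSeries.from_wkb(df['geometry']),\n",
" crs='EPSG:4326'\n",
")\n",
"\n",
"# GeoJSON FeatureCollection for the map layer; plain records (no geometry) for the figure data.\n",
"geojson = json.loads(gdf.to_json())\n",
"data = df.drop(columns=['geometry']).to_dict('records')"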
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sample Output"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df[['neighbourhood_name', 'crime_rate_per_100k', 'crime_index', 'safety_tier', 'total_incidents']].head(10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Data Visualization\n",
"\n",
"### Figure Factory\n",
"\n",
"Uses `create_choropleth_figure` from `portfolio_app.figures.choropleth`.\n",
"\n",
"**Key Parameters:**\n",
"- `color_column`: 'crime_rate_per_100k'\n",
"- `color_scale`: 'RdYlGn_r' (red=high crime, green=low crime)"
]
},
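{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"\n",
"# Make the repository root importable; assumes the notebook runs from notebooks/safety/.\n",
"sys.path.insert(0, '../..')\n",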
"\n",
"from portfolio_app.figures.choropleth import create_choropleth_figure\n",
"\n",
"fig = create_choropleth_figure(\n",
" geojson=geojson,\n",
" data=data,\n",
" location_key='neighbourhood_id',\n",
" color_column='crime_rate_per_100k',\n",
" hover_data=['neighbourhood_name', 'crime_index', 'total_incidents'],\n",
" color_scale='RdYlGn_r',\n",
" title='Toronto Crime Rate per 100,000 Population',\n",
" zoom=10,\n",
")\n",
"\n",
"fig.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Safety Tier Interpretation\n",
"\n",
"| Tier | Meaning |\n",
"|------|--------|\n",
"| 1 | Highest crime (top 20%) |\n",
"| 2-4 | Middle tiers |\n",
"| 5 | Lowest crime (bottom 20%) |"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.11.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
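
The "Safety Tier Interpretation" table in the notebook's last cell describes tier 1 as the top 20% by crime and tier 5 as the bottom 20%. The sketch below shows one way such tiers could be derived, assuming they are plain quintiles of `crime_rate_per_100k`. The actual dbt logic behind `safety_tier` is not shown in this notebook, so the function name `assign_safety_tiers` and the sample rates are hypothetical.

```python
def assign_safety_tiers(rates: dict[str, float], n_tiers: int = 5) -> dict[str, int]:
    """Map each neighbourhood to a tier, where tier 1 holds the highest rates.

    Assumption: tiers are equal-sized rank buckets (quintiles when n_tiers=5).
    This is an illustration, not the mart's real definition.
    """
    # Sort neighbourhood names from highest to lowest crime rate.
    ordered = sorted(rates, key=rates.get, reverse=True)
    n = len(ordered)
    # Position 0 is the highest rate; integer division splits ranks into n_tiers buckets.
    return {name: (pos * n_tiers) // n + 1 for pos, name in enumerate(ordered)}


# Made-up rates for ten hypothetical neighbourhoods, for illustration only.
sample_rates = {
    "N0": 9500.0, "N1": 8700.0, "N2": 7600.0, "N3": 6400.0, "N4": 5300.0,
    "N5": 4200.0, "N6": 3100.0, "N7": 2500.0, "N8": 1800.0, "N9": 900.0,
}
tiers = assign_safety_tiers(sample_rates)
print(tiers)
```

With ten neighbourhoods, each tier receives two of them: the two highest rates land in tier 1 and the two lowest in tier 5, matching the direction of the table.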