{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Crime Rate Choropleth Map\n", "\n", "Displays crime rates per 100,000 population across Toronto's 158 neighbourhoods." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Data Reference\n", "\n", "### Source Tables\n", "\n", "| Table | Grain | Key Columns |\n", "|-------|-------|-------------|\n", "| `mart_neighbourhood_safety` | neighbourhood × year | crime_rate_per_100k, crime_index, safety_tier, geometry |\n", "\n", "### SQL Query" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "import pandas as pd\n", "from dotenv import load_dotenv\n", "from sqlalchemy import create_engine\n", "\n", "# Load .env from project root\n", "load_dotenv(\"../../.env\")\n", "\n", "engine = create_engine(os.environ[\"DATABASE_URL\"])\n", "\n", "query = \"\"\"\n", "SELECT\n", " neighbourhood_id,\n", " neighbourhood_name,\n", " geometry,\n", " year,\n", " crime_rate_per_100k,\n", " crime_index,\n", " safety_tier,\n", " total_incidents,\n", " population\n", "FROM public_marts.mart_neighbourhood_safety\n", "WHERE year = (SELECT MAX(year) FROM public_marts.mart_neighbourhood_safety)\n", "ORDER BY crime_rate_per_100k DESC\n", "\"\"\"\n", "\n", "df = pd.read_sql(query, engine)\n", "print(f\"Loaded {len(df)} neighbourhoods\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Transformation Steps\n", "\n", "1. Filter to most recent year\n", "2. Convert geometry to GeoJSON\n", "3. Use reversed color scale (green=low crime, red=high crime)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import json\n", "\n", "import geopandas as gpd\n", "\n", "gdf = gpd.GeoDataFrame(\n", " df, geometry=gpd.GeoSeries.from_wkb(df[\"geometry\"]), crs=\"EPSG:4326\"\n", ")\n", "\n", "geojson = json.loads(gdf.to_json())\n", "data = df.drop(columns=[\"geometry\"]).to_dict(\"records\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Sample Output" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df[\n", " [\n", " \"neighbourhood_name\",\n", " \"crime_rate_per_100k\",\n", " \"crime_index\",\n", " \"safety_tier\",\n", " \"total_incidents\",\n", " ]\n", "].head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Data Visualization\n", "\n", "### Figure Factory\n", "\n", "Uses `create_choropleth_figure` from `portfolio_app.figures.toronto.choropleth`.\n", "\n", "**Key Parameters:**\n", "- `color_column`: 'crime_rate_per_100k'\n", "- `color_scale`: 'RdYlGn_r' (red=high crime, green=low crime)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import sys\n", "\n", "sys.path.insert(0, \"../..\")\n", "\n", "from portfolio_app.figures.toronto.choropleth import create_choropleth_figure\n", "\n", "fig = create_choropleth_figure(\n", " geojson=geojson,\n", " data=data,\n", " location_key=\"neighbourhood_id\",\n", " color_column=\"crime_rate_per_100k\",\n", " hover_data=[\"neighbourhood_name\", \"crime_index\", \"total_incidents\"],\n", " color_scale=\"RdYlGn_r\",\n", " title=\"Toronto Crime Rate per 100,000 Population\",\n", " zoom=10,\n", ")\n", "\n", "fig.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Safety Tier Interpretation\n", "\n", "| Tier | Meaning |\n", "|------|--------|\n", "| 1 | Highest crime (top 20%) |\n", "| 2-4 | Middle tiers |\n", "| 5 | Lowest crime (bottom 20%) |" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.11.0" } }, "nbformat": 4, "nbformat_minor": 4 }