Merge pull request 'Change-Toronto-Analysis' (#46) from lmiranda-change-proposal into development

Reviewed-on: lmiranda/personal-portfolio#46
This commit was merged in pull request #46.
2026-01-16 12:52:33 +00:00

@@ -0,0 +1,423 @@
# Toronto Neighbourhood Dashboard — Deliverables
**Project Type:** Interactive Data Visualization Dashboard
**Geographic Scope:** City of Toronto, 158 Official Neighbourhoods
**Author:** Leo Miranda
**Version:** 1.0 | January 2026
---
## Executive Summary
This is a multi-tab analytics dashboard built around Toronto's official neighbourhood boundaries. The core interaction is a choropleth map that lets users explore the city through thematic lenses (housing affordability, safety, demographics, amenities), with supporting visualizations that tell a cohesive story per theme.
**Primary Goals:**
1. Demonstrate interactive data visualization skills (Plotly/Dash)
2. Showcase data engineering capabilities (multi-source ETL, dimensional modeling)
3. Create a portfolio piece with genuine analytical value
---
## Part 1: Geographic Foundation (Required First)
| Dataset | Source | Format | Last Updated | Download |
|---------|--------|--------|--------------|----------|
| **Neighbourhoods Boundaries** | Toronto Open Data | GeoJSON | 2024 | [Link](https://open.toronto.ca/dataset/neighbourhoods/) |
| **Neighbourhood Profiles** | Toronto Open Data | CSV | 2021 Census | [Link](https://open.toronto.ca/dataset/neighbourhood-profiles/) |
**Critical Notes:**
- Toronto uses 158 official neighbourhoods (the current structure, adopted in 2022, replaced the earlier 140)
- GeoJSON includes `AREA_ID` for joining to tabular data
- Neighbourhood Profiles has 2,400+ indicators per neighbourhood from Census
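The `AREA_ID` join noted above might look roughly like the sketch below. It is illustrative only: the inline feature collection, the `profiles` rows, the `neighbourhood_id` key, and every value are made-up placeholders, and the real files may name or type their columns differently.

```python
import json

# Hypothetical miniature GeoJSON; the real file holds 158 features with geometry.
geojson_text = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "properties": {"AREA_ID": 1, "AREA_NAME": "Example A"}, "geometry": null},
  {"type": "Feature", "properties": {"AREA_ID": 2, "AREA_NAME": "Example B"}, "geometry": null}
]}
"""

# Hypothetical tabular rows keyed by the same identifier.
profiles = [
    {"neighbourhood_id": 1, "population": 12000},
    {"neighbourhood_id": 3, "population": 8000},
]


def join_on_area_id(geojson, rows):
    """Attach tabular rows to features by AREA_ID and report unmatched IDs."""
    by_id = {row["neighbourhood_id"]: row for row in rows}
    joined, missing = [], []
    for feature in geojson["features"]:
        area_id = feature["properties"]["AREA_ID"]
        row = by_id.get(area_id)
        if row is None:
            missing.append(area_id)  # feature with no tabular match
        else:
            joined.append({**feature["properties"], **row})
    return joined, missing


joined, missing = join_on_area_id(json.loads(geojson_text), profiles)
```

Returning the unmatched IDs alongside the joined rows makes key mismatches visible early, which matters when several sources are expected to share one geography key.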
---
## Part 2: Tier 1 — MVP Datasets
| Dataset | Source | Measures Available | Update Freq | Granularity |
|---------|--------|-------------------|-------------|-------------|
| **Neighbourhoods GeoJSON** | Toronto Open Data | Boundary polygons, area IDs | Static | Neighbourhood |
| **Neighbourhood Profiles (full)** | Toronto Open Data | 2,400+ Census indicators | Every 5 years | Neighbourhood |
| **Neighbourhood Crime Rates** | Toronto Police Portal | MCI rates per 100K by year | Annual | Neighbourhood |
| **CMHC Rental Market Survey** | CMHC Portal | Avg rent by bedroom, vacancy rate | Annual (Oct) | 15 CMHC Zones |
| **Parks** | Toronto Open Data | Park locations, area, type | Annual | Point/Polygon |
**Total API/Download Calls:** 5
**Data Volume:** ~50MB combined
### Tier 1 Measures to Extract
**From Neighbourhood Profiles:**
- Population, population density
- Median household income
- Age distribution (0-14, 15-24, 25-44, 45-64, 65+)
- % Immigrants, % Visible minorities
- Top languages spoken
- Unemployment rate
- Education attainment (% with post-secondary)
- Housing tenure (own vs rent %)
- Dwelling types distribution
- Average rent, housing costs as % of income
**From Crime Rates:**
- Total MCI rate per 100K population
- Year-over-year crime trend
**From CMHC:**
- Average monthly rent (1BR, 2BR, 3BR)
- Vacancy rates
**From Parks:**
- Park count per neighbourhood
- Park area per capita
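Because CMHC figures are published for 15 zones rather than 158 neighbourhoods, zone values have to be carried down to neighbourhoods somehow. One possible approach, sketched here under assumptions, is an area-weighted crosswalk. The crosswalk rows, zone names, weights, and rents below are all hypothetical.

```python
from collections import defaultdict

# Hypothetical crosswalk: share of each neighbourhood's area inside each CMHC zone.
crosswalk = [
    {"neighbourhood_id": 1, "zone": "Zone 1", "weight": 1.0},
    {"neighbourhood_id": 2, "zone": "Zone 1", "weight": 0.25},
    {"neighbourhood_id": 2, "zone": "Zone 2", "weight": 0.75},
]
zone_rent = {"Zone 1": 2000.0, "Zone 2": 1600.0}  # placeholder values


def rent_by_neighbourhood(crosswalk, zone_rent):
    """Weighted average of zone rents for each neighbourhood."""
    totals = defaultdict(float)
    weights = defaultdict(float)
    for row in crosswalk:
        nid = row["neighbourhood_id"]
        totals[nid] += row["weight"] * zone_rent[row["zone"]]
        weights[nid] += row["weight"]
    # Dividing by the weight sum tolerates weights that do not add up to exactly 1.
    return {nid: totals[nid] / weights[nid] for nid in totals}


estimated_rent = rent_by_neighbourhood(crosswalk, zone_rent)
```

Neighbourhoods that sit wholly inside one zone simply inherit that zone's value, so the map will show zone-level blocks rather than true neighbourhood-level variation.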
---
## Part 3: Tier 2 — Expansion Datasets
| Dataset | Source | Measures Available | Update Freq | Granularity |
|---------|--------|-------------------|-------------|-------------|
| **Major Crime Indicators (MCI)** | Toronto Police Portal | Assault, B&E, auto theft, robbery, theft over | Quarterly | Neighbourhood |
| **Shootings & Firearm Discharges** | Toronto Police Portal | Shooting incidents, injuries, fatalities | Quarterly | Neighbourhood |
| **Building Permits** | Toronto Open Data | New construction, permits by type | Monthly | Address-level |
| **Schools** | Toronto Open Data | Public/Catholic, elementary/secondary | Annual | Point |
| **TTC Routes & Stops** | Toronto Open Data | Route geometry, stop locations | Static | Route/Stop |
| **Licensed Child Care Centres** | Toronto Open Data | Capacity, ages served, locations | Annual | Point |
### Tier 2 Measures to Extract
**From MCI Details:**
- Breakdown by crime type (assault, B&E, auto theft, robbery, theft over)
**From Shootings:**
- Shooting incidents count
- Injuries/fatalities
**From Building Permits:**
- New construction permits (trailing 12 months)
- Permit types distribution
**From Schools:**
- Schools per 1000 children
- School type breakdown
**From TTC:**
- Transit stops within neighbourhood
- Transit accessibility score
**From Child Care:**
- Child care spaces per 1000 children (ages 0-4)
- Coverage by age group
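Counting point features (schools, transit stops, child care centres) per neighbourhood comes down to a point-in-polygon test. The sketch below uses a plain ray-casting check on made-up square polygons and points. In the planned stack this would more likely be a PostGIS spatial join (for example with `ST_Contains`), so treat this as an illustration of the idea rather than the pipeline's method.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices of a simple polygon."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points(polygons, points):
    """Count points per polygon; each point is assigned to at most one polygon."""
    counts = {nid: 0 for nid in polygons}
    for x, y in points:
        for nid, ring in polygons.items():
            if point_in_polygon(x, y, ring):
                counts[nid] += 1
                break
    return counts


# Hypothetical neighbourhood outlines and stop locations in arbitrary units.
polygons = {
    1: [(0, 0), (2, 0), (2, 2), (0, 2)],
    2: [(2, 0), (4, 0), (4, 2), (2, 2)],
}
stops = [(0.5, 0.5), (1.5, 1.0), (3.0, 1.0), (5.0, 5.0)]
stop_counts = count_points(polygons, stops)
```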
---
## Part 4: Data Sources by Thematic Group
### GROUP A: Housing & Affordability
| Dataset | Tier | Measures | Update Freq |
|---------|------|----------|-------------|
| Neighbourhood Profiles (Housing) | 1 | Avg rent, ownership %, dwelling types, housing costs as % of income | Every 5 years |
| CMHC Rental Market Survey | 1 | Avg rent by bedroom, vacancy rate, rental universe | Annual |
| Building Permits | 2 | New construction, permits by type | Monthly |
**Calculated Metrics:**
- Rent-to-Income Ratio (CMHC monthly rent ÷ monthly household income derived from the Census)
- Affordability Index (% of income spent on housing)
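A minimal sketch of the ratio calculation, assuming the Census income figure is annual and needs converting to a monthly value first. The function name and the sample figures are hypothetical.

```python
def rent_to_income_ratio(monthly_rent, annual_household_income):
    """Monthly rent as a share of monthly household income."""
    if annual_household_income <= 0:
        raise ValueError("income must be positive")
    monthly_income = annual_household_income / 12
    return monthly_rent / monthly_income


# Placeholder figures, not real CMHC or Census values.
ratio = rent_to_income_ratio(1800.0, 72000.0)
```

With these placeholder inputs the ratio comes out at 0.30, which sits right at the boundary the glossary uses for "affordable".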
---
### GROUP B: Safety & Crime
| Dataset | Tier | Measures | Update Freq |
|---------|------|----------|-------------|
| Neighbourhood Crime Rates | 1 | MCI rates per 100K pop by year | Annual |
| Major Crime Indicators (MCI) | 2 | Assault, B&E, auto theft, robbery, theft over | Quarterly |
| Shootings & Firearm Discharges | 2 | Shooting incidents, injuries, fatalities | Quarterly |
**Calculated Metrics:**
- Year-over-year crime change %
- Crime type distribution
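Both metrics are simple arithmetic once the rates and counts are keyed by neighbourhood. The sketch below shows one possible shape for them; the function names and all sample numbers are invented for illustration.

```python
def yoy_change_pct(rates_by_year):
    """Percent change versus the previous year; None when the base year is zero."""
    years = sorted(rates_by_year)
    changes = {}
    for prev, curr in zip(years, years[1:]):
        base = rates_by_year[prev]
        if base == 0:
            changes[curr] = None
        else:
            changes[curr] = (rates_by_year[curr] - base) / base * 100
    return changes


def type_distribution(counts_by_type):
    """Share of each crime type as a percentage of the total."""
    total = sum(counts_by_type.values())
    return {k: (v / total * 100 if total else 0.0) for k, v in counts_by_type.items()}


# Placeholder MCI rates per 100K and incident counts for one neighbourhood.
rates = {2021: 800.0, 2022: 1000.0, 2023: 900.0}
counts = {"Assault": 50, "B&E": 25, "Auto Theft": 25}

changes = yoy_change_pct(rates)
shares = type_distribution(counts)
```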
---
### GROUP C: Demographics & Community
| Dataset | Tier | Measures | Update Freq |
|---------|------|----------|-------------|
| Neighbourhood Profiles (Demographics) | 1 | Age distribution, household composition, income | Every 5 years |
| Neighbourhood Profiles (Immigration) | 1 | Immigration status, visible minorities, languages | Every 5 years |
| Neighbourhood Profiles (Education) | 1 | Education attainment, field of study | Every 5 years |
| Neighbourhood Profiles (Labour) | 1 | Employment rate, occupation, industry | Every 5 years |
---
### GROUP D: Transportation & Mobility
| Dataset | Tier | Measures | Update Freq |
|---------|------|----------|-------------|
| Commute Mode (Census) | 1 | % car, transit, walk, bike | Every 5 years |
| TTC Routes & Stops | 2 | Route geometry, stop locations | Static |
**Calculated Metrics:**
- Transit accessibility (stops within 500m of neighbourhood centroid)
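The 500 m centroid rule can be sketched with a great-circle distance check. The coordinates below are made-up points near downtown Toronto, and the haversine formula with a spherical Earth radius is an assumption of this example; a PostGIS distance query would be the more likely production route.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def stops_within(centroid, stops, radius_m=500.0):
    """Count stops no farther than radius_m from the centroid."""
    lat, lon = centroid
    return sum(
        1 for s_lat, s_lon in stops
        if haversine_m(lat, lon, s_lat, s_lon) <= radius_m
    )


# Hypothetical centroid and stop coordinates.
centroid = (43.6500, -79.3800)
stops = [(43.6510, -79.3800), (43.6500, -79.3850), (43.6600, -79.3800)]
nearby = stops_within(centroid, stops)
```

A centroid-based radius favours small neighbourhoods, so it may be worth comparing against a stops-per-area measure before settling on a single score.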
---
### GROUP E: Amenities & Services
| Dataset | Tier | Measures | Update Freq |
|---------|------|----------|-------------|
| Parks | 1 | Park locations, area, type | Annual |
| Schools | 2 | Public/Catholic, elementary/secondary | Annual |
| Licensed Child Care Centres | 2 | Capacity, ages served | Annual |
**Calculated Metrics:**
- Park area per capita
- Schools per 1000 children (ages 5-17)
- Child care spaces per 1000 children (ages 0-4)
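These three metrics share one pattern: a count or area divided by a population base. A small sketch with hypothetical helper names and placeholder numbers:

```python
def per_1000(count, population):
    """Rate per 1000 people; None when the population base is missing or zero."""
    if not population or population <= 0:
        return None
    return count / population * 1000


def per_capita(total, population):
    """Plain per-person value, for example square metres of park per resident."""
    if not population or population <= 0:
        return None
    return total / population


# Placeholder inputs for one neighbourhood.
schools_per_1000 = per_1000(4, 2000)          # schools vs children aged 5-17
child_care_per_1000 = per_1000(150, 600)      # spaces vs children aged 0-4
park_m2_per_capita = per_capita(240000.0, 12000)
```

Returning `None` for an empty population base keeps neighbourhoods with no children in a given age band from showing up as zero on the map.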
---
## Part 5: Tab Structure
### Tab Architecture
```
┌────────────────────────────────────────────────────────────────┐
│  [Overview] [Housing] [Safety] [Demographics] [Amenities]      │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  ┌─────────────────────────────────┐  ┌────────────────┐       │
│  │                                 │  │  KPI Card 1    │       │
│  │         CHOROPLETH MAP          │  ├────────────────┤       │
│  │      (158 Neighbourhoods)       │  │  KPI Card 2    │       │
│  │                                 │  ├────────────────┤       │
│  │         Click to select         │  │  KPI Card 3    │       │
│  │                                 │  └────────────────┘       │
│  └─────────────────────────────────┘                           │
│                                                                │
│  ┌─────────────────────┐  ┌─────────────────────┐              │
│  │ Supporting Chart 1  │  │ Supporting Chart 2  │              │
│  │ (Context/Trend)     │  │ (Comparison/Rank)   │              │
│  └─────────────────────┘  └─────────────────────┘              │
│                                                                │
│  [Neighbourhood: Selected Name] ─────────────────────────────  │
│  Details panel with all metrics for selected area              │
└────────────────────────────────────────────────────────────────┘
```
---
### Tab 1: Overview (Default Landing)
**Story:** "How do Toronto neighbourhoods compare across key livability metrics?"
| Element | Content | Data Source |
|---------|---------|-------------|
| Map Colour | Composite livability score | Calculated from weighted metrics |
| KPI Cards | Population, Median Income, Avg Crime Rate | Neighbourhood Profiles, Crime Rates |
| Chart 1 | Top 10 / Bottom 10 by livability score | Calculated |
| Chart 2 | Income vs Crime scatter plot | Neighbourhood Profiles, Crime Rates |
**Metric Selector:** Allow user to change map colour by any single metric.
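The document does not specify how the composite livability score is weighted, so the following is only one plausible construction: min-max scale each metric across neighbourhoods, flip metrics where lower is better, then take a weighted average. The metric names, weights, and values are all hypothetical.

```python
def min_max(values):
    """Scale a {neighbourhood_id: value} mapping to the 0-1 range."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        return {k: 0.5 for k in values}  # no spread, so use a neutral score
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}


def livability(metrics, weights, lower_is_better=()):
    """Weighted average of scaled metrics, on a 0-1 scale."""
    scaled = {}
    for name, values in metrics.items():
        s = min_max(values)
        if name in lower_is_better:
            s = {k: 1 - v for k, v in s.items()}
        scaled[name] = s
    total_weight = sum(weights[name] for name in metrics)
    ids = next(iter(metrics.values())).keys()
    return {
        nid: sum(weights[name] * scaled[name][nid] for name in metrics) / total_weight
        for nid in ids
    }


# Placeholder inputs for two neighbourhoods.
metrics = {
    "income": {1: 50000, 2: 100000},
    "crime_rate": {1: 500, 2: 1500},
}
weights = {"income": 0.6, "crime_rate": 0.4}
scores = livability(metrics, weights, lower_is_better={"crime_rate"})
```

Min-max scaling is sensitive to outliers; a rank-based or percentile scaling would be a reasonable alternative if a few neighbourhoods dominate a metric.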
---
### Tab 2: Housing & Affordability
**Story:** "Where can you afford to live, and what's being built?"
| Element | Content | Data Source |
|---------|---------|-------------|
| Map Colour | Rent-to-Income Ratio (Affordability Index) | CMHC + Census income |
| KPI Cards | Median Rent (1BR), Vacancy Rate, New Permits (12mo) | CMHC, Building Permits |
| Chart 1 | Rent trend (5-year line chart by bedroom) | CMHC historical |
| Chart 2 | Dwelling type breakdown (pie/bar) | Neighbourhood Profiles |
**Metric Selector:** Toggle between rent, ownership %, dwelling types.
---
### Tab 3: Safety
**Story:** "How safe is each neighbourhood, and what crimes are most common?"
| Element | Content | Data Source |
|---------|---------|-------------|
| Map Colour | Total MCI Rate per 100K | Crime Rates |
| KPI Cards | Total Crimes, YoY Change %, Shooting Incidents | Crime Rates, Shootings |
| Chart 1 | Crime type breakdown (stacked bar) | MCI Details |
| Chart 2 | 5-year crime trend (line chart) | Crime Rates historical |
**Metric Selector:** Toggle between total crime, specific crime types, shootings.
---
### Tab 4: Demographics
**Story:** "Who lives here? Age, income, diversity."
| Element | Content | Data Source |
|---------|---------|-------------|
| Map Colour | Median Household Income | Neighbourhood Profiles |
| KPI Cards | Population, % Immigrant, Unemployment Rate | Neighbourhood Profiles |
| Chart 1 | Age distribution (population pyramid or bar) | Neighbourhood Profiles |
| Chart 2 | Top languages spoken (horizontal bar) | Neighbourhood Profiles |
**Metric Selector:** Income, immigrant %, age groups, education.
---
### Tab 5: Amenities & Services
**Story:** "What's nearby? Parks, schools, child care, transit."
| Element | Content | Data Source |
|---------|---------|-------------|
| Map Colour | Park Area per Capita | Parks + Population |
| KPI Cards | Parks Count, Schools Count, Child Care Spaces | Multiple datasets |
| Chart 1 | Amenity density comparison (radar or bar) | Calculated |
| Chart 2 | Transit accessibility (stops within 500m) | TTC Stops |
**Metric Selector:** Parks, schools, child care, transit access.
---
## Part 6: Data Pipeline Architecture
### ETL Flow
```
┌─────────────────┐     ┌─────────────────┐     ┌───────────────────┐
│  DATA SOURCES   │     │  STAGING LAYER  │     │    MART LAYER     │
│                 │     │                 │     │                   │
│ Toronto Open    │────▶│ stg_geography   │────▶│ dim_neighbourhood │
│ Data Portal     │     │ stg_census      │     │ fact_crime        │
│                 │     │ stg_crime       │     │ fact_housing      │
│ CMHC Portal     │────▶│ stg_rental      │     │ fact_amenities    │
│                 │     │ stg_permits     │     │                   │
│ Toronto Police  │────▶│ stg_amenities   │     │ agg_dashboard     │
│ Portal          │     │ stg_childcare   │     │ (pre-computed)    │
└─────────────────┘     └─────────────────┘     └───────────────────┘
```
### Key Transformations
| Transformation | Description |
|----------------|-------------|
| **Geography Standardization** | Ensure all datasets use `neighbourhood_id` (AREA_ID from GeoJSON) |
| **Census Pivot** | Neighbourhood Profiles is wide format — pivot to metrics per neighbourhood |
| **CMHC Zone Mapping** | Create crosswalk from 15 CMHC zones to 158 neighbourhoods |
| **Amenity Aggregation** | Spatial join point data (schools, parks, child care) to neighbourhood polygons |
| **Rate Calculations** | Normalize counts to per-capita or per-100K |
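As an illustration of the Census Pivot step, the sketch below reshapes a wide table into one record per neighbourhood. It assumes a layout with one row per indicator and one column per neighbourhood, and a label column called `Characteristic`; both are assumptions about the file, and the inline CSV is made-up data.

```python
import csv
import io

# Hypothetical wide-format extract: indicators as rows, neighbourhoods as columns.
wide_csv = """Characteristic,Example A,Example B
Total population,12000,8000
Median household income,70000,90000
"""


def pivot_profiles(text, label_column="Characteristic"):
    """Return {neighbourhood: {indicator: value}} from a wide-format CSV string."""
    by_neighbourhood = {}
    for row in csv.DictReader(io.StringIO(text)):
        indicator = row[label_column]
        for name, raw in row.items():
            if name == label_column:
                continue
            by_neighbourhood.setdefault(name, {})[indicator] = float(raw)
    return by_neighbourhood


tidy = pivot_profiles(wide_csv)
```

The real file likely contains non-numeric cells and repeated indicator labels across sections, so a production version would need per-indicator parsing and disambiguation that this sketch leaves out.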
### Data Refresh Schedule
| Layer | Frequency | Trigger |
|-------|-----------|---------|
| Staging (API pulls) | Weekly | Scheduled job |
| Marts (transforms) | Weekly | Post-staging |
| Dashboard cache | On-demand | User refresh button |
---
## Part 7: Technical Stack
### Core Stack
| Component | Technology | Rationale |
|-----------|------------|-----------|
| **Frontend** | Plotly Dash | Production-ready, rapid iteration |
| **Mapping** | Plotly `choropleth_mapbox` | Native Dash integration (deprecated since Plotly 5.24 in favour of `choropleth_map`) |
| **Data Store** | PostgreSQL + PostGIS | Spatial queries, existing expertise |
| **ETL** | Python (Pandas, SQLAlchemy) | Existing stack |
| **Deployment** | Render / Railway | Free tier, easy Dash hosting |
### Alternative (Portfolio Stretch)
| Component | Technology | Why Consider |
|-----------|------------|--------------|
| **Frontend** | React + deck.gl | More "modern" for portfolio |
| **Data Store** | DuckDB | Serverless, embeddable |
| **ETL** | dbt | Aligns with skills roadmap |
---
## Appendix A: Data Source URLs
| Source | URL |
|--------|-----|
| Toronto Open Data — Neighbourhoods | https://open.toronto.ca/dataset/neighbourhoods/ |
| Toronto Open Data — Neighbourhood Profiles | https://open.toronto.ca/dataset/neighbourhood-profiles/ |
| Toronto Police — Neighbourhood Crime Rates | https://data.torontopolice.on.ca/datasets/neighbourhood-crime-rates-open-data |
| Toronto Police — MCI | https://data.torontopolice.on.ca/datasets/major-crime-indicators-open-data |
| Toronto Police — Shootings | https://data.torontopolice.on.ca/datasets/shootings-firearm-discharges-open-data |
| CMHC Rental Market Survey | https://www.cmhc-schl.gc.ca/professionals/housing-markets-data-and-research/housing-data/data-tables/rental-market |
| Toronto Open Data — Parks | https://open.toronto.ca/dataset/parks/ |
| Toronto Open Data — Schools | https://open.toronto.ca/dataset/school-locations-all-types/ |
| Toronto Open Data — Building Permits | https://open.toronto.ca/dataset/building-permits-cleared-permits/ |
| Toronto Open Data — Child Care | https://open.toronto.ca/dataset/licensed-child-care-centres/ |
| Toronto Open Data — TTC Routes | https://open.toronto.ca/dataset/ttc-routes-and-schedules/ |
---
## Appendix B: Colour Palettes
### Affordability (Diverging)
| Status | Hex | Colour |
|--------|-----|-------|
| Affordable (<30% income) | `#2ecc71` | Green |
| Stretched (30-50%) | `#f1c40f` | Yellow |
| Unaffordable (>50%) | `#e74c3c` | Red |
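The thresholds in this table translate directly into a lookup function. In the sketch below the function name is hypothetical, and treating exactly 30% and exactly 50% as "Stretched" is an assumption, since the table leaves the boundary cases open.

```python
def affordability_colour(share_of_income):
    """Map a housing-cost share (0.35 means 35%) to the palette hex code."""
    if share_of_income < 0.30:
        return "#2ecc71"  # Affordable
    if share_of_income <= 0.50:
        return "#f1c40f"  # Stretched
    return "#e74c3c"      # Unaffordable


colours = [affordability_colour(s) for s in (0.25, 0.30, 0.45, 0.60)]
```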
### Safety (Sequential)
| Status | Hex | Colour |
|--------|-----|-------|
| Safest (lowest crime) | `#27ae60` | Dark green |
| Moderate | `#f39c12` | Orange |
| Highest Crime | `#c0392b` | Dark red |
### Demographics — Income (Sequential)
| Level | Hex | Colour |
|-------|-----|-------|
| Highest Income | `#1a5276` | Dark blue |
| Mid Income | `#5dade2` | Light blue |
| Lowest Income | `#ecf0f1` | Light gray |
### General Recommendation
Use **Viridis** or **Plasma** colorscales for perceptually uniform gradients on continuous metrics.
---
## Appendix C: Glossary
| Term | Definition |
|------|------------|
| **MCI** | Major Crime Indicators — Assault, B&E, Auto Theft, Robbery, Theft Over |
| **CMHC Zone** | Canada Mortgage and Housing Corporation rental market survey zones (15 in Toronto) |
| **Rent-to-Income Ratio** | Monthly rent ÷ monthly household income; <30% is considered affordable |
| **PostGIS** | PostgreSQL extension for geographic data |
| **Choropleth** | Thematic map where areas are shaded based on a statistical variable |
---
## Appendix D: Interview Talking Points
When discussing this project in interviews, emphasize:
1. **Data Engineering:** "I built a multi-source ETL pipeline that standardizes geographic keys across Census data, police data, and CMHC rental surveys—three different granularities I had to reconcile."
2. **Dimensional Modeling:** "The data model follows star schema patterns with a central neighbourhood dimension table and fact tables for crime, housing, and amenities."
3. **dbt Patterns:** "The transformation layer uses staging → intermediate → mart patterns, which I've documented for maintainability."
4. **Business Value:** "The dashboard answers questions like 'Where can a young professional afford to live that's safe and has good transit?' — turning raw data into actionable insights."
5. **Technical Decisions:** "I chose Plotly Dash over a React frontend because it let me iterate faster while maintaining production-quality interactivity. For a portfolio piece, speed to working demo matters."
---
*Document Version: 1.0*
*Created: January 2026*
*Author: Leo Miranda / Claude*