docs: Rewrite documentation with accurate project state

- Delete obsolete change proposals and bio content source
- Rewrite README.md with correct features, data sources, structure
- Update PROJECT_REFERENCE.md with accurate status and completed work
- Update CLAUDE.md references and sprint status
- Add docs/CONTRIBUTING.md developer guide with:
  - How to add blog posts (frontmatter, markdown)
  - How to add new pages (Dash routing)
  - How to add dashboard tabs
  - How to create figure factories
  - Branch workflow and code standards

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 12:27:25 -05:00
parent 1a878313f8
commit 4818c53fd2
7 changed files with 794 additions and 1190 deletions


@@ -1,276 +0,0 @@
# Toronto Neighbourhood Dashboard — Implementation Plan
**Document Type:** Execution Guide
**Target:** Transition from TRREB-based to Neighbourhood-based Dashboard
**Version:** 2.0 | January 2026
---
## Overview
Transition from the TRREB district-based housing dashboard to a comprehensive Toronto Neighbourhood Dashboard built around the city's 158 official neighbourhoods.
**Key Changes:**
- Geographic foundation: TRREB districts (~35) → City Neighbourhoods (158)
- Data sources: PDF parsing → Open APIs (Toronto Open Data, Toronto Police, CMHC)
- Scope: Housing-only → 5 thematic tabs (Overview, Housing, Safety, Demographics, Amenities)
---
## Phase 1: Repository Cleanup
### Files to DELETE
| File | Reason |
|------|--------|
| `portfolio_app/toronto/schemas/trreb.py` | TRREB schema obsolete |
| `portfolio_app/toronto/parsers/trreb.py` | PDF parsing no longer needed |
| `portfolio_app/toronto/loaders/trreb.py` | TRREB loading logic obsolete |
| `dbt/models/staging/stg_trreb__purchases.sql` | TRREB staging obsolete |
| `dbt/models/intermediate/int_purchases__monthly.sql` | TRREB intermediate obsolete |
| `dbt/models/marts/mart_toronto_purchases.sql` | Will rebuild for neighbourhood grain |
### Files to MODIFY (Remove TRREB References)
| File | Action |
|------|--------|
| `portfolio_app/toronto/schemas/__init__.py` | Remove TRREB imports |
| `portfolio_app/toronto/parsers/__init__.py` | Remove TRREB parser imports |
| `portfolio_app/toronto/loaders/__init__.py` | Remove TRREB loader imports |
| `portfolio_app/toronto/models/facts.py` | Remove `FactPurchases` model |
| `portfolio_app/toronto/models/dimensions.py` | Remove `DimTRREBDistrict` model |
| `portfolio_app/toronto/demo_data.py` | Remove TRREB demo data |
| `dbt/models/sources.yml` | Remove TRREB source definitions |
| `dbt/models/schema.yml` | Remove TRREB model documentation |
### Files to KEEP (Reusable)
| File | Why |
|------|-----|
| `portfolio_app/toronto/schemas/cmhc.py` | CMHC data still used |
| `portfolio_app/toronto/parsers/cmhc.py` | Reusable with modifications |
| `portfolio_app/toronto/loaders/base.py` | Generic database utilities |
| `portfolio_app/toronto/loaders/dimensions.py` | Dimension loading patterns |
| `portfolio_app/toronto/models/base.py` | SQLAlchemy base class |
| `portfolio_app/figures/*.py` | All chart factories reusable |
| `portfolio_app/components/*.py` | All UI components reusable |
---
## Phase 2: Documentation Updates
| Document | Action |
|----------|--------|
| `CLAUDE.md` | Update data model section, mark transition complete |
| `docs/PROJECT_REFERENCE.md` | Update architecture, data sources |
| `docs/toronto_housing_dashboard_spec_v5.md` | Archive or delete |
| `docs/wbs_sprint_plan_v4.md` | Archive or delete |
---
## Phase 3: New Data Model
### Star Schema (Neighbourhood-Centric)
| Table | Type | Description |
|-------|------|-------------|
| `dim_neighbourhood` | Central Dimension | 158 neighbourhoods with geometry |
| `dim_time` | Dimension | Date dimension (keep existing) |
| `dim_cmhc_zone` | Bridge Dimension | 15 CMHC zones with neighbourhood mapping |
| `bridge_cmhc_neighbourhood` | Bridge | Zone-to-neighbourhood area weights |
| `fact_census` | Fact | Census indicators by neighbourhood |
| `fact_crime` | Fact | Crime stats by neighbourhood |
| `fact_rentals` | Fact | Rental data by CMHC zone (keep existing) |
| `fact_amenities` | Fact | Amenity counts by neighbourhood |
### New Schema Files
| File | Contains |
|------|----------|
| `toronto/schemas/neighbourhood.py` | NeighbourhoodRecord, CensusRecord, CrimeRecord |
| `toronto/schemas/amenities.py` | AmenityType enum, AmenityRecord |
### New Parser Files
| File | Data Source | API |
|------|-------------|-----|
| `toronto/parsers/toronto_open_data.py` | Neighbourhoods, Census, Parks, Schools, Childcare | Toronto Open Data Portal |
| `toronto/parsers/toronto_police.py` | Crime Rates, MCI, Shootings | Toronto Police Portal |
### New Loader Files
| File | Purpose |
|------|---------|
| `toronto/loaders/neighbourhoods.py` | Load GeoJSON boundaries |
| `toronto/loaders/census.py` | Load neighbourhood profiles |
| `toronto/loaders/crime.py` | Load crime statistics |
| `toronto/loaders/amenities.py` | Load parks, schools, childcare |
| `toronto/loaders/cmhc_crosswalk.py` | Build CMHC-neighbourhood bridge |
---
## Phase 4: dbt Restructuring
### Staging Layer
| Model | Source |
|-------|--------|
| `stg_toronto__neighbourhoods` | dim_neighbourhood |
| `stg_toronto__census` | fact_census |
| `stg_toronto__crime` | fact_crime |
| `stg_toronto__amenities` | fact_amenities |
| `stg_cmhc__rentals` | fact_rentals (modify existing) |
| `stg_cmhc__zone_crosswalk` | bridge_cmhc_neighbourhood |
### Intermediate Layer
| Model | Purpose |
|-------|---------|
| `int_neighbourhood__demographics` | Combined census demographics |
| `int_neighbourhood__housing` | Housing indicators |
| `int_neighbourhood__crime_summary` | Aggregated crime by type |
| `int_neighbourhood__amenity_scores` | Normalized amenity metrics |
| `int_rentals__neighbourhood_allocated` | CMHC rentals allocated to neighbourhoods |
### Mart Layer (One per Tab)
| Model | Tab | Key Metrics |
|-------|-----|-------------|
| `mart_neighbourhood_overview` | Overview | Composite livability score |
| `mart_neighbourhood_housing` | Housing | Affordability index, rent-to-income |
| `mart_neighbourhood_safety` | Safety | Crime rates, YoY change |
| `mart_neighbourhood_demographics` | Demographics | Income, age, diversity |
| `mart_neighbourhood_amenities` | Amenities | Parks, schools, transit per capita |
---
## Phase 5: Dashboard Implementation
### Tab Structure
```
pages/toronto/
├── dashboard.py           # Main layout with tab navigation
├── tabs/
│   ├── overview.py        # Composite livability
│   ├── housing.py         # Affordability
│   ├── safety.py          # Crime
│   ├── demographics.py    # Population
│   └── amenities.py       # Services
└── callbacks/
    ├── map_callbacks.py
    ├── chart_callbacks.py
    └── selection_callbacks.py
```
### Layout Pattern (All Tabs)
Each tab follows the same structure:
1. **Choropleth Map** (left) — 158 neighbourhoods, click to select
2. **KPI Cards** (right) — 3-4 contextual metrics
3. **Supporting Charts** (bottom) — Trend + comparison visualizations
4. **Details Panel** (collapsible) — All metrics for selected neighbourhood
### Graphs by Tab
| Tab | Choropleth Metric | Chart 1 | Chart 2 |
|-----|-------------------|---------|---------|
| Overview | Livability score | Top/Bottom 10 bar | Income vs Crime scatter |
| Housing | Affordability index | Rent trend (5yr line) | Dwelling types (pie/bar) |
| Safety | Crime rate per 100K | Crime breakdown (stacked bar) | Crime trend (5yr line) |
| Demographics | Median income | Age pyramid | Top languages (bar) |
| Amenities | Park area per capita | Amenity radar | Transit accessibility (bar) |
---
## Phase 6: Jupyter Notebooks
### Purpose
One notebook per graph to document:
1. **Data Reference** — How the data was built (query, transformation steps, sample output)
2. **Data Visualization** — Import figure factory, render the graph
### Directory Structure
```
notebooks/
├── README.md
├── overview/
├── housing/
├── safety/
├── demographics/
└── amenities/
```
### Notebook Template
````markdown
# [Graph Name]
## 1. Data Reference
### Source Tables
- List tables/marts used
- Grain of each table
### Query
```sql
SELECT ... FROM ...
```
### Transformation Steps
1. Step description
2. Step description
### Sample Data
```python
df = pd.read_sql(query, engine)
df.head(10)
```
## 2. Data Visualization
```python
from portfolio_app.figures.choropleth import create_choropleth_figure
fig = create_choropleth_figure(...)
fig.show()
```
````
Create one notebook per graph as each is implemented (15 total across 5 tabs).
---
## Phase 7: Final Documentation Review
After implementation is complete, audit and update:
- [ ] `CLAUDE.md` — Project status, app structure, data model, URL routes
- [ ] `README.md` — Project description, installation, quick start
- [ ] `docs/PROJECT_REFERENCE.md` — Architecture matches implementation
- [ ] Remove or archive legacy spec documents
---
## Data Source Reference
| Source | Datasets | URL |
|--------|----------|-----|
| Toronto Open Data | Neighbourhoods, Census Profiles, Parks, Schools, Childcare, TTC | open.toronto.ca |
| Toronto Police | Crime Rates, MCI, Shootings | data.torontopolice.on.ca |
| CMHC | Rental Market Survey | cmhc-schl.gc.ca |
---
## CMHC Zone Mapping Note
CMHC uses 15 zones that don't align with 158 neighbourhoods. Strategy:
- Create `bridge_cmhc_neighbourhood` with area weights
- Allocate rental metrics proportionally to overlapping neighbourhoods
- Document methodology in `/toronto/methodology` page
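
The area-weight strategy above could look roughly like the following. This is a minimal sketch, not the project's loader: the bridge rows, zone IDs, rent figures, and the `allocate_zone_metric` helper are all hypothetical, and it assumes each weight is the share of a neighbourhood's area that falls inside a given CMHC zone.

```python
from collections import defaultdict

# Hypothetical bridge rows: (cmhc_zone_id, neighbourhood_id, area_weight).
# Assumption: area_weight is the share of the neighbourhood's area in that zone.
BRIDGE = [
    ("zone_01", 101, 1.0),
    ("zone_01", 102, 0.25),
    ("zone_02", 102, 0.75),
]

# Hypothetical zone-level average rents (illustrative numbers only).
ZONE_AVG_RENT = {"zone_01": 2000.0, "zone_02": 1600.0}


def allocate_zone_metric(bridge, zone_values):
    """Area-weighted average of a zone-level metric for each neighbourhood."""
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for zone_id, neighbourhood_id, weight in bridge:
        if zone_id not in zone_values:
            continue  # no value reported for this zone; skip rather than guess
        weighted_sum[neighbourhood_id] += weight * zone_values[zone_id]
        weight_total[neighbourhood_id] += weight
    # Divide by the weights actually used so partial coverage still averages correctly.
    return {
        n: weighted_sum[n] / weight_total[n]
        for n in weighted_sum
        if weight_total[n] > 0
    }


rents = allocate_zone_metric(BRIDGE, ZONE_AVG_RENT)
print(rents)
```

A weighted average suits rate-like metrics such as average rent or vacancy rate. Count-like metrics (for example the rental universe) would need weights defined the other way round, as the share of each zone falling in a neighbourhood.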
---
*Document Version: 2.0*
*Trimmed from v1.0 for execution clarity*


@@ -1,423 +0,0 @@
# Toronto Neighbourhood Dashboard — Deliverables
**Project Type:** Interactive Data Visualization Dashboard
**Geographic Scope:** City of Toronto, 158 Official Neighbourhoods
**Author:** Leo Miranda
**Version:** 1.0 | January 2026
---
## Executive Summary
Multi-tab analytics dashboard built around Toronto's official neighbourhood boundaries. The core interaction is a choropleth map where users explore the city through different thematic lenses—housing affordability, safety, demographics, amenities—with supporting visualizations that tell a cohesive story per theme.
**Primary Goals:**
1. Demonstrate interactive data visualization skills (Plotly/Dash)
2. Showcase data engineering capabilities (multi-source ETL, dimensional modeling)
3. Create a portfolio piece with genuine analytical value
---
## Part 1: Geographic Foundation (Required First)
| Dataset | Source | Format | Last Updated | Download |
|---------|--------|--------|--------------|----------|
| **Neighbourhoods Boundaries** | Toronto Open Data | GeoJSON | 2024 | [Link](https://open.toronto.ca/dataset/neighbourhoods/) |
| **Neighbourhood Profiles** | Toronto Open Data | CSV | 2021 Census | [Link](https://open.toronto.ca/dataset/neighbourhood-profiles/) |
**Critical Notes:**
- Toronto uses 158 official neighbourhoods (updated 2024, was 140)
- GeoJSON includes `AREA_ID` for joining to tabular data
- Neighbourhood Profiles has 2,400+ indicators per neighbourhood from Census
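
As an illustration of joining on `AREA_ID`, here is a small sketch using only the standard library. The inline GeoJSON, the `AREA_NAME` property, the integer ID type, and the `neighbourhood_id` column name are assumptions made for the example; the real files may differ.

```python
import json

# Hypothetical, heavily trimmed FeatureCollection; only AREA_ID comes from the notes above.
GEOJSON_TEXT = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature",
   "properties": {"AREA_ID": 1, "AREA_NAME": "Example A"},
   "geometry": {"type": "Polygon",
                "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]}}
]}
"""

# Hypothetical tabular rows keyed by the same identifier.
PROFILE_ROWS = [
    {"neighbourhood_id": 1, "population": 12000},
    {"neighbourhood_id": 999, "population": 500},  # no matching boundary
]


def index_features_by_area_id(geojson):
    """Map AREA_ID -> feature so tabular rows can be looked up directly."""
    return {f["properties"]["AREA_ID"]: f for f in geojson["features"]}


def join_rows(features_by_id, rows, key="neighbourhood_id"):
    """Attach geometry to each row; collect rows with no matching boundary."""
    matched, unmatched = [], []
    for row in rows:
        feature = features_by_id.get(row[key])
        if feature is None:
            unmatched.append(row)
        else:
            matched.append({**row, "geometry": feature["geometry"]})
    return matched, unmatched


features_by_id = index_features_by_area_id(json.loads(GEOJSON_TEXT))
matched, unmatched = join_rows(features_by_id, PROFILE_ROWS)
print(len(matched), len(unmatched))
```

Keeping the unmatched rows separate makes key mismatches (type differences, renamed IDs) visible early instead of silently dropping neighbourhoods from the map.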
---
## Part 2: Tier 1 — MVP Datasets
| Dataset | Source | Measures Available | Update Freq | Granularity |
|---------|--------|-------------------|-------------|-------------|
| **Neighbourhoods GeoJSON** | Toronto Open Data | Boundary polygons, area IDs | Static | Neighbourhood |
| **Neighbourhood Profiles (full)** | Toronto Open Data | 2,400+ Census indicators | Every 5 years | Neighbourhood |
| **Neighbourhood Crime Rates** | Toronto Police Portal | MCI rates per 100K by year | Annual | Neighbourhood |
| **CMHC Rental Market Survey** | CMHC Portal | Avg rent by bedroom, vacancy rate | Annual (Oct) | 15 CMHC Zones |
| **Parks** | Toronto Open Data | Park locations, area, type | Annual | Point/Polygon |
**Total API/Download Calls:** 5
**Data Volume:** ~50MB combined
### Tier 1 Measures to Extract
**From Neighbourhood Profiles:**
- Population, population density
- Median household income
- Age distribution (0-14, 15-24, 25-44, 45-64, 65+)
- % Immigrants, % Visible minorities
- Top languages spoken
- Unemployment rate
- Education attainment (% with post-secondary)
- Housing tenure (own vs rent %)
- Dwelling types distribution
- Average rent, housing costs as % of income
**From Crime Rates:**
- Total MCI rate per 100K population
- Year-over-year crime trend
**From CMHC:**
- Average monthly rent (1BR, 2BR, 3BR)
- Vacancy rates
**From Parks:**
- Park count per neighbourhood
- Park area per capita
---
## Part 3: Tier 2 — Expansion Datasets
| Dataset | Source | Measures Available | Update Freq | Granularity |
|---------|--------|-------------------|-------------|-------------|
| **Major Crime Indicators (MCI)** | Toronto Police Portal | Assault, B&E, auto theft, robbery, theft over | Quarterly | Neighbourhood |
| **Shootings & Firearm Discharges** | Toronto Police Portal | Shooting incidents, injuries, fatalities | Quarterly | Neighbourhood |
| **Building Permits** | Toronto Open Data | New construction, permits by type | Monthly | Address-level |
| **Schools** | Toronto Open Data | Public/Catholic, elementary/secondary | Annual | Point |
| **TTC Routes & Stops** | Toronto Open Data | Route geometry, stop locations | Static | Route/Stop |
| **Licensed Child Care Centres** | Toronto Open Data | Capacity, ages served, locations | Annual | Point |
### Tier 2 Measures to Extract
**From MCI Details:**
- Breakdown by crime type (assault, B&E, auto theft, robbery, theft over)
**From Shootings:**
- Shooting incidents count
- Injuries/fatalities
**From Building Permits:**
- New construction permits (trailing 12 months)
- Permit types distribution
**From Schools:**
- Schools per 1000 children
- School type breakdown
**From TTC:**
- Transit stops within neighbourhood
- Transit accessibility score
**From Child Care:**
- Child care spaces per capita
- Coverage by age group
---
## Part 4: Data Sources by Thematic Group
### GROUP A: Housing & Affordability
| Dataset | Tier | Measures | Update Freq |
|---------|------|----------|-------------|
| Neighbourhood Profiles (Housing) | 1 | Avg rent, ownership %, dwelling types, housing costs as % of income | Every 5 years |
| CMHC Rental Market Survey | 1 | Avg rent by bedroom, vacancy rate, rental universe | Annual |
| Building Permits | 2 | New construction, permits by type | Monthly |
**Calculated Metrics:**
- Rent-to-Income Ratio (CMHC monthly rent ÷ monthly household income, where monthly income is Census annual income ÷ 12)
- Affordability Index (% of income spent on housing)
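
A minimal sketch of the rent-to-income calculation, assuming rent is monthly and household income is annual, so the income is divided by 12 before comparing. The function name and the sample figures are illustrative only.

```python
def rent_to_income_ratio(monthly_rent, annual_household_income):
    """Share of monthly household income spent on rent (0.30 means 30%)."""
    if annual_household_income <= 0:
        raise ValueError("annual_household_income must be positive")
    monthly_income = annual_household_income / 12
    return monthly_rent / monthly_income


# Illustrative values, not real CMHC or Census figures.
ratio = rent_to_income_ratio(monthly_rent=1800.0, annual_household_income=72000.0)
print(f"{ratio:.0%}")
```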
---
### GROUP B: Safety & Crime
| Dataset | Tier | Measures | Update Freq |
|---------|------|----------|-------------|
| Neighbourhood Crime Rates | 1 | MCI rates per 100K pop by year | Annual |
| Major Crime Indicators (MCI) | 2 | Assault, B&E, auto theft, robbery, theft over | Quarterly |
| Shootings & Firearm Discharges | 2 | Shooting incidents, injuries, fatalities | Quarterly |
**Calculated Metrics:**
- Year-over-year crime change %
- Crime type distribution
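
The year-over-year change could be computed along these lines. This is a sketch under the assumption that rates are available as a `{year: rate}` mapping per neighbourhood; the function name and numbers are hypothetical.

```python
def yoy_change_pct(rates_by_year):
    """Return {year: % change vs. the previous calendar year}.

    Years with no prior-year value, or a prior-year rate of zero, are omitted.
    """
    changes = {}
    for year in sorted(rates_by_year):
        previous = rates_by_year.get(year - 1)
        if previous is None or previous == 0:
            continue
        changes[year] = (rates_by_year[year] - previous) / previous * 100.0
    return changes


# Illustrative MCI rates per 100K, not real data.
rates = {2021: 1000.0, 2022: 1100.0, 2023: 990.0}
changes = yoy_change_pct(rates)
print(changes)
```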
---
### GROUP C: Demographics & Community
| Dataset | Tier | Measures | Update Freq |
|---------|------|----------|-------------|
| Neighbourhood Profiles (Demographics) | 1 | Age distribution, household composition, income | Every 5 years |
| Neighbourhood Profiles (Immigration) | 1 | Immigration status, visible minorities, languages | Every 5 years |
| Neighbourhood Profiles (Education) | 1 | Education attainment, field of study | Every 5 years |
| Neighbourhood Profiles (Labour) | 1 | Employment rate, occupation, industry | Every 5 years |
---
### GROUP D: Transportation & Mobility
| Dataset | Tier | Measures | Update Freq |
|---------|------|----------|-------------|
| Commute Mode (Census) | 1 | % car, transit, walk, bike | Every 5 years |
| TTC Routes & Stops | 2 | Route geometry, stop locations | Static |
**Calculated Metrics:**
- Transit accessibility (stops within 500m of neighbourhood centroid)
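
One way to count stops within 500m of a centroid is a great-circle distance check. The sketch below uses the haversine formula with a spherical Earth radius of 6,371 km; the coordinates and helper names are made up for the example, and a PostGIS distance query would be the more likely choice in the actual pipeline.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def stops_within(centroid, stops, radius_m=500.0):
    """Count stops whose distance from the centroid is at most radius_m."""
    lat, lon = centroid
    return sum(
        1 for s_lat, s_lon in stops
        if haversine_m(lat, lon, s_lat, s_lon) <= radius_m
    )


# Hypothetical centroid and stop coordinates.
centroid = (43.6532, -79.3832)
stops = [
    (43.6532, -79.3832),  # same point
    (43.6562, -79.3832),  # roughly 330 m north
    (43.6632, -79.3832),  # roughly 1.1 km north
]
nearby = stops_within(centroid, stops)
print(nearby)
```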
---
### GROUP E: Amenities & Services
| Dataset | Tier | Measures | Update Freq |
|---------|------|----------|-------------|
| Parks | 1 | Park locations, area, type | Annual |
| Schools | 2 | Public/Catholic, elementary/secondary | Annual |
| Licensed Child Care Centres | 2 | Capacity, ages served | Annual |
**Calculated Metrics:**
- Park area per capita
- Schools per 1000 children (ages 5-17)
- Child care spaces per 1000 children (ages 0-4)
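
The per-capita metrics above share one normalization step, sketched here. The helper name and the sample counts are hypothetical.

```python
def per_n(count, population, n=1000):
    """Normalize a count to a rate per n people; None when population is not positive."""
    if population <= 0:
        return None
    return count * n / population


# Illustrative inputs only.
schools_per_1000_children = per_n(count=6, population=4000)
crimes_per_100k = per_n(count=120, population=80_000, n=100_000)
print(schools_per_1000_children, crimes_per_100k)
```

Returning `None` for an empty denominator keeps such neighbourhoods out of colour scales instead of plotting them as zero.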
---
## Part 5: Tab Structure
### Tab Architecture
```
┌────────────────────────────────────────────────────────────────┐
│  [Overview] [Housing] [Safety] [Demographics] [Amenities]      │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  ┌─────────────────────────────────┐  ┌────────────────┐       │
│  │                                 │  │   KPI Card 1   │       │
│  │         CHOROPLETH MAP          │  ├────────────────┤       │
│  │      (158 Neighbourhoods)       │  │   KPI Card 2   │       │
│  │                                 │  ├────────────────┤       │
│  │         Click to select         │  │   KPI Card 3   │       │
│  │                                 │  └────────────────┘       │
│  └─────────────────────────────────┘                           │
│                                                                │
│  ┌─────────────────────┐  ┌─────────────────────┐              │
│  │ Supporting Chart 1  │  │ Supporting Chart 2  │              │
│  │   (Context/Trend)   │  │  (Comparison/Rank)  │              │
│  └─────────────────────┘  └─────────────────────┘              │
│                                                                │
│  [Neighbourhood: Selected Name] ────────────────────────       │
│  Details panel with all metrics for selected area              │
└────────────────────────────────────────────────────────────────┘
```
---
### Tab 1: Overview (Default Landing)
**Story:** "How do Toronto neighbourhoods compare across key livability metrics?"
| Element | Content | Data Source |
|---------|---------|-------------|
| Map Colour | Composite livability score | Calculated from weighted metrics |
| KPI Cards | Population, Median Income, Avg Crime Rate | Neighbourhood Profiles, Crime Rates |
| Chart 1 | Top 10 / Bottom 10 by livability score | Calculated |
| Chart 2 | Income vs Crime scatter plot | Neighbourhood Profiles, Crime Rates |
**Metric Selector:** Allow user to change map colour by any single metric.
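
The document does not specify how the composite livability score is weighted. One possible approach, shown purely as a sketch, is a weighted sum of min-max scaled metrics, with "lower is better" metrics such as crime rate inverted. The metric names, weights, and values below are assumptions.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant column maps to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def livability_scores(rows, weights, lower_is_better=()):
    """rows: {neighbourhood: {metric: value}} -> {neighbourhood: score in [0, 100]}."""
    names = list(rows)
    total_weight = sum(weights.values())
    scores = dict.fromkeys(names, 0.0)
    for metric, weight in weights.items():
        scaled = min_max([rows[name][metric] for name in names])
        for name, value in zip(names, scaled):
            if metric in lower_is_better:
                value = 1.0 - value  # invert so that a low crime rate scores high
            scores[name] += weight * value
    return {name: 100.0 * scores[name] / total_weight for name in names}


# Hypothetical neighbourhoods, metrics, and weights.
rows = {
    "A": {"median_income": 90_000, "crime_rate": 500},
    "B": {"median_income": 60_000, "crime_rate": 1500},
    "C": {"median_income": 75_000, "crime_rate": 1000},
}
weights = {"median_income": 0.5, "crime_rate": 0.5}
scores = livability_scores(rows, weights, lower_is_better={"crime_rate"})
print(scores)
```

Min-max scaling is sensitive to outliers, so a percentile rank or z-score could be substituted if a few neighbourhoods dominate the range.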
---
### Tab 2: Housing & Affordability
**Story:** "Where can you afford to live, and what's being built?"
| Element | Content | Data Source |
|---------|---------|-------------|
| Map Colour | Rent-to-Income Ratio (Affordability Index) | CMHC + Census income |
| KPI Cards | Median Rent (1BR), Vacancy Rate, New Permits (12mo) | CMHC, Building Permits |
| Chart 1 | Rent trend (5-year line chart by bedroom) | CMHC historical |
| Chart 2 | Dwelling type breakdown (pie/bar) | Neighbourhood Profiles |
**Metric Selector:** Toggle between rent, ownership %, dwelling types.
---
### Tab 3: Safety
**Story:** "How safe is each neighbourhood, and what crimes are most common?"
| Element | Content | Data Source |
|---------|---------|-------------|
| Map Colour | Total MCI Rate per 100K | Crime Rates |
| KPI Cards | Total Crimes, YoY Change %, Shooting Incidents | Crime Rates, Shootings |
| Chart 1 | Crime type breakdown (stacked bar) | MCI Details |
| Chart 2 | 5-year crime trend (line chart) | Crime Rates historical |
**Metric Selector:** Toggle between total crime, specific crime types, shootings.
---
### Tab 4: Demographics
**Story:** "Who lives here? Age, income, diversity."
| Element | Content | Data Source |
|---------|---------|-------------|
| Map Colour | Median Household Income | Neighbourhood Profiles |
| KPI Cards | Population, % Immigrant, Unemployment Rate | Neighbourhood Profiles |
| Chart 1 | Age distribution (population pyramid or bar) | Neighbourhood Profiles |
| Chart 2 | Top languages spoken (horizontal bar) | Neighbourhood Profiles |
**Metric Selector:** Income, immigrant %, age groups, education.
---
### Tab 5: Amenities & Services
**Story:** "What's nearby? Parks, schools, child care, transit."
| Element | Content | Data Source |
|---------|---------|-------------|
| Map Colour | Park Area per Capita | Parks + Population |
| KPI Cards | Parks Count, Schools Count, Child Care Spaces | Multiple datasets |
| Chart 1 | Amenity density comparison (radar or bar) | Calculated |
| Chart 2 | Transit accessibility (stops within 500m) | TTC Stops |
**Metric Selector:** Parks, schools, child care, transit access.
---
## Part 6: Data Pipeline Architecture
### ETL Flow
```
┌─────────────────┐     ┌─────────────────┐     ┌───────────────────┐
│  DATA SOURCES   │     │  STAGING LAYER  │     │    MART LAYER     │
│                 │     │                 │     │                   │
│ Toronto Open    │────▶│ stg_geography   │────▶│ dim_neighbourhood │
│ Data Portal     │     │ stg_census      │     │ fact_crime        │
│                 │     │ stg_crime       │     │ fact_housing      │
│ CMHC Portal     │────▶│ stg_rental      │     │ fact_amenities    │
│                 │     │ stg_permits     │     │                   │
│ Toronto Police  │────▶│ stg_amenities   │     │ agg_dashboard     │
│ Portal          │     │ stg_childcare   │     │ (pre-computed)    │
└─────────────────┘     └─────────────────┘     └───────────────────┘
```
### Key Transformations
| Transformation | Description |
|----------------|-------------|
| **Geography Standardization** | Ensure all datasets use `neighbourhood_id` (AREA_ID from GeoJSON) |
| **Census Pivot** | Neighbourhood Profiles is wide format — pivot to metrics per neighbourhood |
| **CMHC Zone Mapping** | Create crosswalk from 15 CMHC zones to 158 neighbourhoods |
| **Amenity Aggregation** | Spatial join point data (schools, parks, child care) to neighbourhood polygons |
| **Rate Calculations** | Normalize counts to per-capita or per-100K |
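
To make the Amenity Aggregation row concrete, the sketch below assigns points to polygons with a plain ray-casting test and counts them per neighbourhood. It assumes simple polygons without holes and uses toy coordinates. In the stack described here, a PostGIS spatial join (for example with `ST_Contains`) would likely replace this hand-rolled version.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the polygon ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_by_polygon(points, polygons):
    """Count points per polygon ID; each point is assigned to the first match."""
    counts = {polygon_id: 0 for polygon_id in polygons}
    for x, y in points:
        for polygon_id, ring in polygons.items():
            if point_in_polygon(x, y, ring):
                counts[polygon_id] += 1
                break
    return counts


# Toy neighbourhood polygons and amenity points (not real coordinates).
polygons = {
    1: [(0, 0), (2, 0), (2, 2), (0, 2)],
    2: [(2, 0), (4, 0), (4, 2), (2, 2)],
}
points = [(1, 1), (3, 1), (3, 0.5), (5, 5)]
counts = count_points_by_polygon(points, polygons)
print(counts)
```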
### Data Refresh Schedule
| Layer | Frequency | Trigger |
|-------|-----------|---------|
| Staging (API pulls) | Weekly | Scheduled job |
| Marts (transforms) | Weekly | Post-staging |
| Dashboard cache | On-demand | User refresh button |
---
## Part 7: Technical Stack
### Core Stack
| Component | Technology | Rationale |
|-----------|------------|-----------|
| **Frontend** | Plotly Dash | Production-ready, rapid iteration |
| **Mapping** | Plotly `choropleth_mapbox` | Native Dash integration |
| **Data Store** | PostgreSQL + PostGIS | Spatial queries, existing expertise |
| **ETL** | Python (Pandas, SQLAlchemy) | Existing stack |
| **Deployment** | Render / Railway | Free tier, easy Dash hosting |
### Alternative (Portfolio Stretch)
| Component | Technology | Why Consider |
|-----------|------------|--------------|
| **Frontend** | React + deck.gl | More "modern" for portfolio |
| **Data Store** | DuckDB | Serverless, embeddable |
| **ETL** | dbt | Aligns with skills roadmap |
---
## Appendix A: Data Source URLs
| Source | URL |
|--------|-----|
| Toronto Open Data — Neighbourhoods | https://open.toronto.ca/dataset/neighbourhoods/ |
| Toronto Open Data — Neighbourhood Profiles | https://open.toronto.ca/dataset/neighbourhood-profiles/ |
| Toronto Police — Neighbourhood Crime Rates | https://data.torontopolice.on.ca/datasets/neighbourhood-crime-rates-open-data |
| Toronto Police — MCI | https://data.torontopolice.on.ca/datasets/major-crime-indicators-open-data |
| Toronto Police — Shootings | https://data.torontopolice.on.ca/datasets/shootings-firearm-discharges-open-data |
| CMHC Rental Market Survey | https://www.cmhc-schl.gc.ca/professionals/housing-markets-data-and-research/housing-data/data-tables/rental-market |
| Toronto Open Data — Parks | https://open.toronto.ca/dataset/parks/ |
| Toronto Open Data — Schools | https://open.toronto.ca/dataset/school-locations-all-types/ |
| Toronto Open Data — Building Permits | https://open.toronto.ca/dataset/building-permits-cleared-permits/ |
| Toronto Open Data — Child Care | https://open.toronto.ca/dataset/licensed-child-care-centres/ |
| Toronto Open Data — TTC Routes | https://open.toronto.ca/dataset/ttc-routes-and-schedules/ |
---
## Appendix B: Colour Palettes
### Affordability (Diverging)
| Status | Hex | Usage |
|--------|-----|-------|
| Affordable (<30% income) | `#2ecc71` | Green |
| Stretched (30-50%) | `#f1c40f` | Yellow |
| Unaffordable (>50%) | `#e74c3c` | Red |
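
The thresholds in the table above could be applied as follows. This is a sketch only: the function name is hypothetical, and treating exactly 30% and exactly 50% as "Stretched" is an assumption, since the table leaves the boundaries open.

```python
# Hex values copied from the Affordability palette table.
AFFORDABILITY_COLOURS = {
    "affordable": "#2ecc71",
    "stretched": "#f1c40f",
    "unaffordable": "#e74c3c",
}


def affordability_colour(pct_of_income):
    """Map housing cost as a percentage of income to a palette colour."""
    if pct_of_income < 30:
        return AFFORDABILITY_COLOURS["affordable"]
    if pct_of_income <= 50:  # assumption: both boundaries count as "stretched"
        return AFFORDABILITY_COLOURS["stretched"]
    return AFFORDABILITY_COLOURS["unaffordable"]


for pct in (25, 30, 50, 62.5):
    print(pct, affordability_colour(pct))
```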
### Safety (Diverging)
| Status | Hex | Usage |
|--------|-----|-------|
| Safest (lowest crime) | `#27ae60` | Dark green |
| Moderate | `#f39c12` | Orange |
| Highest Crime | `#c0392b` | Dark red |
### Demographics — Income (Sequential)
| Level | Hex | Usage |
|-------|-----|-------|
| Highest Income | `#1a5276` | Dark blue |
| Mid Income | `#5dade2` | Light blue |
| Lowest Income | `#ecf0f1` | Light gray |
### General Recommendation
Use **Viridis** or **Plasma** colorscales for perceptually uniform gradients on continuous metrics.
---
## Appendix C: Glossary
| Term | Definition |
|------|------------|
| **MCI** | Major Crime Indicators — Assault, B&E, Auto Theft, Robbery, Theft Over |
| **CMHC Zone** | Canada Mortgage and Housing Corporation rental market survey zones (15 in Toronto) |
| **Rent-to-Income Ratio** | Monthly rent ÷ monthly household income; <30% is considered affordable |
| **PostGIS** | PostgreSQL extension for geographic data |
| **Choropleth** | Thematic map where areas are shaded based on a statistical variable |
---
## Appendix D: Interview Talking Points
When discussing this project in interviews, emphasize:
1. **Data Engineering:** "I built a multi-source ETL pipeline that standardizes geographic keys across Census data, police data, and CMHC rental surveys—three different granularities I had to reconcile."
2. **Dimensional Modeling:** "The data model follows star schema patterns with a central neighbourhood dimension table and fact tables for crime, housing, and amenities."
3. **dbt Patterns:** "The transformation layer uses staging → intermediate → mart patterns, which I've documented for maintainability."
4. **Business Value:** "The dashboard answers questions like 'Where can a young professional afford to live that's safe and has good transit?' — turning raw data into actionable insights."
5. **Technical Decisions:** "I chose Plotly Dash over a React frontend because it let me iterate faster while maintaining production-quality interactivity. For a portfolio piece, speed to working demo matters."
---
*Document Version: 1.0*
*Created: January 2026*
*Author: Leo Miranda / Claude*