Toronto Housing Price Dashboard
Portfolio Project — Data Specification & Architecture
Version: 5.1 | Last Updated: January 2026 | Status: Specification Complete
Document Context
| Attribute | Value |
|---|---|
| Parent Document | portfolio_project_plan_v5.md |
| Role | Detailed specification for Toronto Housing Dashboard |
| Scope | Data schemas, source URLs, geographic boundaries, V1/V2 decisions |
Rule: For overall project scope, phasing, tech stack, and deployment architecture, see portfolio_project_plan_v5.md. This document provides implementation-level detail for the Toronto Housing project specifically.
Terminology Note: This document uses Stages 1–4 to describe Toronto Housing implementation steps. These are distinct from the Phases 1–5 in portfolio_project_plan_v5.md, which describe the overall portfolio project lifecycle.
Project Overview
A dashboard analyzing housing price variations across Toronto neighbourhoods over time, with dual analysis tracks:
| Track | Data Domain | Primary Source | Geographic Unit |
|---|---|---|---|
| Purchases | Sales transactions | TRREB Monthly Reports | ~35 Districts |
| Rentals | Rental market stats | CMHC Rental Market Survey | ~20 Zones |
Core Visualization: Interactive choropleth map of Toronto with toggle between rental/purchase analysis, time-series exploration by month/year.
Enrichment Layer (V1: overlay only): Neighbourhood-level demographic and socioeconomic context including population density, education attainment, and income. Crime data deferred to Phase 4 of the portfolio project (post-Energy project).
Tech Stack & Deployment: See portfolio_project_plan_v5.md → Tech Stack, Deployment Architecture
Geographic Layers
Layer Architecture
┌─────────────────────────────────────────────────────────────────┐
│ City of Toronto Official Neighbourhoods (158) │ ← Reference overlay + Enrichment data
├─────────────────────────────────────────────────────────────────┤
│ TRREB Districts (~35) — W01, C01, E01, etc. │ ← Purchase data
├─────────────────────────────────────────────────────────────────┤
│ CMHC Survey Zones (~20) — Census Tract aligned │ ← Rental data
└─────────────────────────────────────────────────────────────────┘
Boundary Files
| Layer | Zones | Format | Source | Status |
|---|---|---|---|---|
| City Neighbourhoods | 158 | GeoJSON, Shapefile | GitHub - jasonicarter/toronto-geojson | ✅ Ready to use |
| TRREB Districts | ~35 | PDF only | TRREB Toronto Map PDF | ⚠ Requires manual digitization |
| CMHC Zones | ~20 | R package | R cmhc package via get_cmhc_geography() | ✅ Available (see note) |
Digitization Task: TRREB Districts
Input: TRREB Toronto PDF map
Output: GeoJSON with district codes (W01-W10, C01-C15, E01-E11)
Tool: QGIS
Process:
- Import PDF as raster layer in QGIS
- Georeference the raster against known control points (for example, major intersections) so traced polygons carry real-world coordinates
- Create vector layer with polygon features
- Trace district boundaries
- Add attributes: district_code, district_name, area_type (West/Central/East)
- Export as GeoJSON (WGS84 / EPSG:4326)
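Once the GeoJSON is exported, the attribute step above can be sanity-checked before the file is used anywhere else. The following is a minimal sketch, assuming the exported features carry district_code and area_type properties exactly as named in the process list; the function names and the report keys are illustrative, not part of the project.

```python
# Code ranges as listed in this spec: W01-W10, C01-C15, E01-E11.
DISTRICT_RANGES = {"W": 10, "C": 15, "E": 11}
AREA_TYPES = {"W": "West", "C": "Central", "E": "East"}


def expected_codes():
    """Return every district code implied by DISTRICT_RANGES."""
    return {
        f"{prefix}{number:02d}"
        for prefix, upper in DISTRICT_RANGES.items()
        for number in range(1, upper + 1)
    }


def check_districts(feature_collection):
    """Report missing, unexpected, and duplicated district codes,
    plus features whose area_type disagrees with the code prefix."""
    props = [f["properties"] for f in feature_collection["features"]]
    codes = [p["district_code"] for p in props]
    expected = expected_codes()
    return {
        "missing": sorted(expected - set(codes)),
        "unexpected": sorted(set(codes) - expected),
        "duplicated": sorted({c for c in codes if codes.count(c) > 1}),
        "area_type_mismatch": sorted(
            p["district_code"]
            for p in props
            if AREA_TYPES.get(p["district_code"][:1]) != p.get("area_type")
        ),
    }
```

In practice the argument would be the parsed contents of trreb_districts.geojson (for example via json.load); an empty report in all four keys suggests the digitized attributes match the code ranges above.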
CMHC Zone Boundaries
Source: The R cmhc package provides CMHC survey geography via the get_cmhc_geography() function.
Extraction Process:
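# In R
library(cmhc)
library(sf)

# Get Toronto CMA zones
# Note: argument names follow this spec; confirm them against the
# documentation of the installed cmhc version before running
toronto_zones <- get_cmhc_geography(
  geography_type = "ZONE",
  cma = "Toronto"
)

# Export to GeoJSON for Python/PostGIS
st_write(toronto_zones, "cmhc_zones.geojson", driver = "GeoJSON")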
Output: data/toronto/raw/geo/cmhc_zones.geojson
Why R?: CMHC zone boundaries are not published as standalone files. The cmhc R package is the only reliable programmatic source. One-time extraction, then use GeoJSON in Python stack.
⚠ Neighbourhood Boundary Change (140 → 158)
The City of Toronto updated from 140 to 158 social planning neighbourhoods in April 2021. This affects data alignment:
| Data Source | Pre-2021 | Post-2021 | Handling |
|---|---|---|---|
| Census (2016 and earlier) | 140 neighbourhoods | N/A | Use 140-model files |
| Census (2021+) | N/A | 158 neighbourhoods | Use 158-model files |
V1 Strategy: Use 2021 Census on 158 boundaries only. Defer historical trend analysis to portfolio Phase 4.
Data Source #1: TRREB Monthly Market Reports
Source Details
| Attribute | Value |
|---|---|
| Provider | Toronto Regional Real Estate Board |
| URL | TRREB Market Watch |
| Format | PDF (monthly reports) |
| Update Frequency | Monthly |
| Historical Availability | 2007–Present |
| Access | Public (aggregate data in PDFs) |
| Extraction Method | PDF parsing (pdfplumber or camelot-py) |
Available Tables
Table: trreb_monthly_summary
Location in PDF: Pages 3-4 (Summary by Area)
| Column | Data Type | Description |
|---|---|---|
| report_date | DATE | First of month (YYYY-MM-01) |
| area_code | VARCHAR(3) | District code (W01, C01, E01, etc.) |
| area_name | VARCHAR(100) | District name |
| area_type | VARCHAR(10) | West / Central / East / North |
| sales | INTEGER | Number of transactions |
| dollar_volume | DECIMAL | Total sales volume ($) |
| avg_price | DECIMAL | Average sale price ($) |
| median_price | DECIMAL | Median sale price ($) |
| new_listings | INTEGER | New listings count |
| active_listings | INTEGER | Active listings at month end |
| avg_sp_lp | DECIMAL | Avg sale price / list price ratio (%) |
| avg_dom | INTEGER | Average days on market |
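The table above maps naturally onto a typed record. The sketch below uses a standard-library dataclass to stay dependency-free; the project's actual schemas are planned as Pydantic models, so treat this as an illustration of the fields and a few plausible checks rather than the real schema. The code pattern (one of W, C, E followed by two digits) is an assumption inferred from the example codes, and the validation rules are illustrative.

```python
import re
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

# Assumed pattern, inferred from the W01 / C01 / E01 examples in this spec.
DISTRICT_CODE = re.compile(r"^[WCE]\d{2}$")


@dataclass(frozen=True)
class TrrebMonthlySummary:
    report_date: date
    area_code: str
    area_name: str
    area_type: str
    sales: int
    dollar_volume: Decimal
    avg_price: Decimal
    median_price: Decimal
    new_listings: int
    active_listings: int
    avg_sp_lp: Decimal
    avg_dom: int

    def __post_init__(self):
        # report_date is defined as the first of the month (YYYY-MM-01).
        if self.report_date.day != 1:
            raise ValueError("report_date must be the first of the month")
        if not DISTRICT_CODE.match(self.area_code):
            raise ValueError(f"unexpected area_code: {self.area_code!r}")
        for name in ("sales", "new_listings", "active_listings", "avg_dom"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must be non-negative")
```

Rejecting rows at construction time keeps malformed PDF extractions out of fact_purchases; the same shape carries over to a Pydantic model with field validators.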
Dimensions
| Dimension | Granularity | Values |
|---|---|---|
| Time | Monthly | 2007-01 to present |
| Geography | District | ~35 TRREB districts |
| Property Type | Aggregate | All residential (no breakdown in summary) |
Metrics Available
| Metric | Aggregation | Use Case |
|---|---|---|
| avg_price | Pre-calculated monthly avg | Primary price indicator |
| median_price | Pre-calculated monthly median | Robust price indicator |
| sales | Count | Market activity volume |
| avg_dom | Average | Market velocity |
| avg_sp_lp | Ratio | Buyer/seller market indicator |
| new_listings | Count | Supply indicator |
| active_listings | Snapshot | Inventory level |
⚠ Limitations
- No transaction-level data (aggregates only)
- Property type breakdown requires parsing additional tables
- PDF structure may vary slightly across years
- District boundaries have not changed since 2011, so pre-2011 reports may not align with the current district map
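Whichever PDF library is used, table cells typically arrive as strings with currency symbols, thousands separators, and percent signs. A minimal sketch of a cell cleaner follows, assuming cells look like "$1,234,567" or "102%"; the exact cell formats and the placeholder tokens treated as missing are assumptions to verify against real reports.

```python
from decimal import Decimal, InvalidOperation

# Assumed placeholders for empty cells; adjust after inspecting real PDFs.
MISSING_TOKENS = {"", "-", "--", "N/A"}


def parse_number(cell):
    """Convert a raw table cell such as '$1,234,567' or '102%' to Decimal.

    Returns None for missing cells and raises ValueError for anything
    that still is not numeric after stripping formatting characters.
    """
    if cell is None:
        return None
    cleaned = cell.strip()
    if cleaned in MISSING_TOKENS:
        return None
    cleaned = cleaned.replace("$", "").replace(",", "").replace("%", "")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        raise ValueError(f"not a number: {cell!r}") from None
```

Raising on unexpected content, rather than silently returning None, makes layout changes across report years visible early.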
Data Source #2: CMHC Rental Market Survey
Source Details
| Attribute | Value |
|---|---|
| Provider | Canada Mortgage and Housing Corporation |
| URL | CMHC Housing Market Information Portal |
| Format | CSV export, API |
| Update Frequency | Annual (October survey) |
| Historical Availability | 1990–Present |
| Access | Public, free registration for bulk downloads |
| Geographic Levels | CMA → Zone → Neighbourhood → Census Tract |
Available Tables
Table: cmhc_rental_summary
Portal Path: Toronto → Primary Rental Market → Summary Statistics
| Column | Data Type | Description |
|---|---|---|
| survey_year | INTEGER | Survey year (October) |
| zone_code | VARCHAR(10) | CMHC zone identifier |
| zone_name | VARCHAR(100) | Zone name |
| bedroom_type | VARCHAR(20) | Bachelor / 1-Bed / 2-Bed / 3-Bed+ / Total |
| universe | INTEGER | Total rental units in zone |
| vacancy_rate | DECIMAL | Vacancy rate (%) |
| vacancy_rate_reliability | VARCHAR(1) | Reliability code (a/b/c/d) |
| availability_rate | DECIMAL | Availability rate (%) |
| average_rent | DECIMAL | Average monthly rent ($) |
| average_rent_reliability | VARCHAR(1) | Reliability code |
| median_rent | DECIMAL | Median monthly rent ($) |
| rent_change_pct | DECIMAL | YoY rent change (%) |
| turnover_rate | DECIMAL | Unit turnover rate (%) |
Dimensions
| Dimension | Granularity | Values |
|---|---|---|
| Time | Annual | 1990 to present (October snapshot) |
| Geography | Zone | ~20 CMHC zones in Toronto CMA |
| Bedroom Type | Category | Bachelor, 1-Bed, 2-Bed, 3-Bed+, Total |
| Structure Type | Category | Row, Apartment (available in detailed tables) |
Metrics Available
| Metric | Aggregation | Use Case |
|---|---|---|
| average_rent | Pre-calculated avg | Primary rent indicator |
| median_rent | Pre-calculated median | Robust rent indicator |
| vacancy_rate | Percentage | Market tightness |
| availability_rate | Percentage | Supply accessibility |
| turnover_rate | Percentage | Tenant mobility |
| rent_change_pct | YoY % | Rent growth tracking |
| universe | Count | Market size |
Reliability Codes
| Code | Meaning | Coefficient of Variation |
|---|---|---|
| a | Excellent | CV ≤ 2.5% |
| b | Good | 2.5% < CV ≤ 5% |
| c | Fair | 5% < CV ≤ 10% |
| d | Poor (use with caution) | CV > 10% |
| ** | Data suppressed | Sample too small |
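The thresholds in the table above translate directly into a lookup. This is a small sketch for validating or re-deriving codes during processing; passing None to stand for a suppressed value is an assumption of this example (CMHC suppresses on sample size, not CV), and the helper names are illustrative.

```python
def reliability_code(cv_pct):
    """Map a coefficient of variation (in percent) to a reliability code.

    None stands in for a suppressed estimate in this sketch.
    """
    if cv_pct is None:
        return "**"
    if cv_pct < 0:
        raise ValueError("coefficient of variation cannot be negative")
    if cv_pct <= 2.5:
        return "a"
    if cv_pct <= 5:
        return "b"
    if cv_pct <= 10:
        return "c"
    return "d"


def needs_caution(code):
    """Flag estimates the dashboard may want to mark or grey out."""
    return code in {"d", "**"}
```

A tooltip could use needs_caution to add a visual warning next to rents or vacancy rates with weak reliability.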
⚠ Limitations
- Annual only (no monthly granularity)
- October snapshot (point-in-time)
- Zones are larger than TRREB districts
- Purpose-built rental only (excludes condo rentals in base survey)
Data Source #3: City of Toronto Open Data
Source Details
| Attribute | Value |
|---|---|
| Provider | City of Toronto |
| URL | Toronto Open Data Portal |
| Format | GeoJSON, Shapefile, CSV |
| Use Case | Reference layer, demographic enrichment |
Relevant Datasets
Dataset: neighbourhoods
| Column | Data Type | Description |
|---|---|---|
| area_id | INTEGER | Neighbourhood ID (1-158) |
| area_name | VARCHAR(100) | Official neighbourhood name |
| geometry | POLYGON | Boundary geometry |
Dataset: neighbourhood_profiles (Census-linked)
| Column | Data Type | Description |
|---|---|---|
| neighbourhood_id | INTEGER | Links to neighbourhoods |
| population | INTEGER | Total population |
| avg_household_income | DECIMAL | Average household income |
| dwelling_count | INTEGER | Total dwellings |
| owner_pct | DECIMAL | % owner-occupied |
| renter_pct | DECIMAL | % renter-occupied |
Enrichment Potential
Can overlay demographic context on housing data:
- Income brackets by neighbourhood
- Ownership vs rental ratios
- Population density
- Dwelling type distribution
Data Source #4: Enrichment Data (Density, Education)
Purpose
Provide socioeconomic context to housing price analysis. Enables questions like:
- Do neighbourhoods with higher education attainment have higher prices?
- How does population density correlate with price per square foot?
Geographic Alignment Reality
Critical constraint: Enrichment data is available at the 158-neighbourhood level, while core housing data sits at TRREB districts (~35) and CMHC zones (~20). These do not align cleanly.
158 Neighbourhoods (fine) → Enrichment data lives here
(no clean crosswalk)
~35 TRREB Districts (coarse) → Purchase data lives here
~20 CMHC Zones (coarse) → Rental data lives here
Available Enrichment Datasets
Dataset: Neighbourhood Profiles (Census)
| Attribute | Value |
|---|---|
| Provider | City of Toronto (via Statistics Canada Census) |
| URL | Toronto Open Data - Neighbourhood Profiles |
| Format | CSV, JSON, XML, XLSX |
| Update Frequency | Every 5 years (Census cycle) |
| Available Years | 2001, 2006, 2011, 2016, 2021 |
| Geographic Unit | 158 neighbourhoods (140 pre-2021) |
Key Variables:
| Variable | Description | Use Case |
|---|---|---|
| population | Total population | Density calculation |
| land_area_sqkm | Area in square kilometres | Density calculation |
| pop_density_per_sqkm | Population per km² | Density metric |
| pct_bachelors_or_higher | % age 25-64 with bachelor's+ | Education proxy |
| median_household_income | Median total household income | Income metric |
| avg_household_income | Average total household income | Income metric |
| pct_owner_occupied | % owner-occupied dwellings | Tenure split |
| pct_renter_occupied | % renter-occupied dwellings | Tenure split |
Download URL (2021, 158 neighbourhoods):
https://ckan0.cf.opendata.inter.prod-toronto.ca/dataset/6e19a90f-971c-46b3-852c-0c48c436d1fc/resource/19d4a806-7385-4889-acf2-256f1e079060/download/nbhd_2021_census_profile_full_158model.xlsx
Crime Data — Deferred to Portfolio Phase 4
Crime data (TPS Neighbourhood Crime Rates) is not included in V1 scope. It will be added in portfolio Phase 4 after the Energy Pricing project is complete.
Rationale:
- Crime data is socially/politically sensitive and requires careful methodology documentation
- V1 focuses on core housing metrics and policy events
- Deferral reduces scope creep risk
Future Reference (Portfolio Phase 4):
- Source: TPS Public Safety Data Portal
- Dataset: Neighbourhood Crime Rates (Major Crime Indicators)
- Geographic Unit: 158 neighbourhoods
V1 Enrichment Data Summary
| Measure | Source | Geography | Frequency | Format | Status |
|---|---|---|---|---|---|
| Population Density | Neighbourhood Profiles | 158 neighbourhoods | Census (5-year) | CSV/JSON | ✅ Ready |
| Education Attainment | Neighbourhood Profiles | 158 neighbourhoods | Census (5-year) | CSV/JSON | ✅ Ready |
| Median Income | Neighbourhood Profiles | 158 neighbourhoods | Census (5-year) | CSV/JSON | ✅ Ready |
| Crime Rates (MCI) | TPS Data Portal | 158 neighbourhoods | Annual | GeoJSON/CSV | Deferred to Portfolio Phase 4 |
Data Source #5: Policy Events
Purpose
Provide temporal context for housing price movements. Display as annotation markers on time series charts. No causation claims — correlation/context only.
Event Schema
Table: dim_policy_event
| Column | Data Type | Description |
|---|---|---|
| event_id | INTEGER (PK) | Auto-increment primary key |
| event_date | DATE | Date event was announced/occurred |
| effective_date | DATE | Date policy took effect (if different) |
| level | VARCHAR(20) | federal / provincial / municipal |
| category | VARCHAR(20) | monetary / tax / regulatory / supply / economic |
| title | VARCHAR(200) | Short event title for display |
| description | TEXT | Longer description for tooltip |
| expected_direction | VARCHAR(10) | bearish / bullish / neutral |
| source_url | VARCHAR(500) | Link to official announcement/documentation |
| confidence | VARCHAR(10) | high / medium / low |
| created_at | TIMESTAMP | Record creation timestamp |
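Because policy events are curated by hand, it may help to validate each row of the curated list against the enumerated values in the schema above before loading. The sketch below is one possible shape using only the standard library; the class name and the choice to enforce VARCHAR lengths in Python are assumptions, and event_id and created_at are left to the database.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Allowed values copied from the schema table above.
ALLOWED = {
    "level": {"federal", "provincial", "municipal"},
    "category": {"monetary", "tax", "regulatory", "supply", "economic"},
    "expected_direction": {"bearish", "bullish", "neutral"},
    "confidence": {"high", "medium", "low"},
}


@dataclass(frozen=True)
class PolicyEvent:
    event_date: date
    level: str
    category: str
    title: str
    expected_direction: str
    confidence: str
    source_url: str
    description: str = ""
    effective_date: Optional[date] = None

    def __post_init__(self):
        for field_name, allowed in ALLOWED.items():
            value = getattr(self, field_name)
            if value not in allowed:
                raise ValueError(f"{field_name}={value!r} not in {sorted(allowed)}")
        # Mirror the VARCHAR(200) and VARCHAR(500) limits from the schema.
        if len(self.title) > 200:
            raise ValueError("title exceeds 200 characters")
        if len(self.source_url) > 500:
            raise ValueError("source_url exceeds 500 characters")
```

Running every row of policy_events.csv through this constructor catches typos such as "bearsh" before they reach dim_policy_event.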
Event Tiers
| Tier | Level | Category Examples | Inclusion Criteria |
|---|---|---|---|
| 1 | Federal | BoC rate decisions, OSFI stress tests | Always include; objective, documented |
| 1 | Provincial | Fair Housing Plan, foreign buyer tax, rent control | Always include; legislative record |
| 2 | Municipal | Zoning reforms, development charges | Include if material impact expected |
| 2 | Economic | COVID measures, major employer closures | Include if Toronto-specific impact |
| 3 | Market | Major project announcements | Strict criteria; must be verifiable |
Expected Direction Values
| Value | Meaning | Example |
|---|---|---|
| bullish | Expected to increase prices | Rate cut, supply restriction |
| bearish | Expected to decrease prices | Rate hike, foreign buyer tax |
| neutral | Uncertain or mixed impact | Regulatory clarification |
⚠ Caveats
- No causation claims: Events are context, not explanation
- Lag effects: Policy impact may not be immediate
- Confounding factors: Multiple simultaneous influences
- Display only: No statistical analysis in V1
Sample Events (Tier 1)
| Date | Level | Category | Title | Direction |
|---|---|---|---|---|
| 2017-04-20 | provincial | tax | Ontario Fair Housing Plan | bearish |
| 2018-01-01 | federal | regulatory | OSFI B-20 Stress Test | bearish |
| 2020-03-27 | federal | monetary | BoC Emergency Rate Cut (0.25%) | bullish |
| 2022-03-02 | federal | monetary | BoC Rate Hike Cycle Begins | bearish |
| 2023-01-01 | federal | regulatory | Federal 2-Year Foreign Buyer Ban | bearish |
Data Integration Strategy
Temporal Alignment
| Source | Native Frequency | Alignment Strategy |
|---|---|---|
| TRREB | Monthly | Use as-is |
| CMHC | Annual (October) | Spread to monthly OR display annual overlay |
| Census/Enrichment | 5-year | Static snapshot; display as reference |
| Policy Events | Event-based | Display as vertical markers on time axis |
Recommendation: Keep separate time axes. TRREB monthly for purchases, CMHC annual for rentals. Don't force artificial monthly rental data.
Geographic Alignment
┌─────────────────────────────────────────────────────────────────┐
│                     VISUALIZATION APPROACH                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   Purchase Mode                     Rental Mode                 │
│   ─────────────────                 ──────────────              │
│   Map: TRREB Districts              Map: CMHC Zones             │
│   Time: Monthly slider              Time: Annual selector       │
│   Metrics: Price, Sales             Metrics: Rent, Vacancy      │
│                                                                 │
│   ┌───────────────────────────────────────────────────────┐     │
│   │              City Neighbourhoods Overlay              │     │
│   │          (158 boundaries as reference layer)          │     │
│   │    + Enrichment data (density, education, income)     │     │
│   └───────────────────────────────────────────────────────┘     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Enrichment Integration Strategy (Phased)
V1: Reference Overlay (Current Scope)
Approach: Display neighbourhood enrichment as a separate toggle-able layer. No joins to housing data.
UX:
- User hovers over TRREB district → tooltip shows "This district contains neighbourhoods: Annex, Casa Loma, Yorkville..."
- User toggles "Show Enrichment" → choropleth switches to neighbourhood-level density/education/income
- Enrichment and housing metrics displayed side-by-side, not merged
Pros:
- No imputation or dodgy aggregations
- Honest about geographic mismatch
- Ships faster
Cons:
- Can't do correlation analysis (price vs. enrichment) directly in dashboard
Implementation:
- dim_neighbourhood as standalone dimension (no FK to fact tables)
- Spatial lookup on hover (point-in-polygon)
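The point-in-polygon lookup mentioned above can be illustrated with the classic ray-casting test. This is a dependency-free sketch, assuming GeoJSON Polygon or MultiPolygon geometries and an area_name property as in the neighbourhoods dataset; a production version would more likely rely on a spatial library or PostGIS, and the function names here are illustrative.

```python
def point_in_ring(lon, lat, ring):
    """Ray casting: count how many ring edges a ray heading east crosses."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i][0], ring[i][1]
        xj, yj = ring[j][0], ring[j][1]
        if (yi > lat) != (yj > lat):
            # Longitude where this edge crosses the horizontal line y = lat.
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def point_in_polygon(lon, lat, polygon):
    """A GeoJSON polygon is an exterior ring followed by optional holes."""
    exterior, holes = polygon[0], polygon[1:]
    if not point_in_ring(lon, lat, exterior):
        return False
    return not any(point_in_ring(lon, lat, hole) for hole in holes)


def neighbourhood_at(lon, lat, features):
    """Return area_name of the first feature containing the point, else None."""
    for feature in features:
        geometry = feature["geometry"]
        if geometry["type"] == "Polygon":
            polygons = [geometry["coordinates"]]
        else:  # assumed to be MultiPolygon
            polygons = geometry["coordinates"]
        if any(point_in_polygon(lon, lat, p) for p in polygons):
            return feature["properties"]["area_name"]
    return None
```

Points that fall exactly on a boundary may land on either side with this test, which is acceptable for hover tooltips but worth noting in the methodology notes.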
V2/Portfolio Phase 4: Area-Weighted Aggregation (Future Scope)
Approach: Pre-compute area-weighted averages of neighbourhood metrics for each TRREB district and CMHC zone.
Process:
- Spatial join: intersect neighbourhood polygons with TRREB/CMHC polygons
- Compute overlap area for each neighbourhood-district pair
- Weight neighbourhood metrics by overlap area proportion
- User selects aggregation method in UI
Aggregation Methods to Expose:
| Method | Description | Best For |
|---|---|---|
| Area-weighted mean | Weight by % overlap area | Continuous metrics (density) |
| Population-weighted mean | Weight by population in overlap | Per-capita metrics (education) |
| Majority assignment | Assign neighbourhood to district with >50% overlap | Categorical data |
| Max overlap | Assign to single district with largest overlap | 1:1 mapping needs |
Default: Population-weighted (more defensible for per-capita metrics). Hide selector behind "Advanced" toggle.
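The difference between the area-weighted and population-weighted methods is easiest to see on a toy case. The sketch below assumes one record per neighbourhood-district overlap, with field names borrowed from the proposed bridge table; the metric_value field and the function names are illustrative additions, and the numbers in the usage note are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Overlap:
    neighbourhood_id: int
    area_overlap_pct: float    # weight for the area-weighted method
    population_overlap: float  # weight for the population-weighted method
    metric_value: float        # neighbourhood metric being aggregated


def weighted_mean(pairs):
    """Mean of (value, weight) pairs; weights need not sum to 1."""
    pairs = list(pairs)
    total = sum(weight for _, weight in pairs)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(value * weight for value, weight in pairs) / total


def aggregate(overlaps, method="population"):
    """Aggregate a neighbourhood metric up to one district or zone."""
    if method == "area":
        return weighted_mean((o.metric_value, o.area_overlap_pct) for o in overlaps)
    if method == "population":
        return weighted_mean((o.metric_value, o.population_overlap) for o in overlaps)
    raise ValueError(f"unknown method: {method!r}")
```

With two neighbourhoods covering equal area but holding 3,000 and 1,000 residents, and education shares of 40% and 20%, the area-weighted result is 30.0 while the population-weighted result is 35.0, which shows why the default matters for per-capita metrics.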
V1 Future-Proofing (Do Now)
| Action | Why |
|---|---|
| Store neighbourhood boundaries in same CRS as TRREB/CMHC (WGS84) | Avoids reprojection headaches |
| Keep dim_neighbourhood normalized (not denormalized into district tables) | Clean separation for V2 join |
| Document Census year for each metric | Ready for 2026 Census |
| Include census_year column in dim_neighbourhood | Enables SCD tracking |
V1 Defer (Don't Do Yet)
| Action | Why Not |
|---|---|
| Pre-compute area-weighted crosswalk | Don't need for V1 |
| Build aggregation method selector UI | No backend to support it |
| Crime data integration | Deferred to Portfolio Phase 4 |
| Historical neighbourhood boundary reconciliation (140→158) | Use 2021+ data only for V1 |
Proposed Data Model
Star Schema
                      ┌────────────────────┐
                      │ dim_time           │
                      ├────────────────────┤
                      │ date_key (PK)      │
                      │ year               │
                      │ month              │
                      │ quarter            │
                      │ month_name         │
                      └─────────┬──────────┘
                                │
          ┌─────────────────────┼─────────────────────┐
          │                     │                     │
          ▼                     │                     ▼
┌────────────────────┐          │           ┌────────────────────┐
│ dim_trreb_district │          │           │ dim_cmhc_zone      │
├────────────────────┤          │           ├────────────────────┤
│ district_key (PK)  │          │           │ zone_key (PK)      │
│ district_code      │          │           │ zone_code          │
│ district_name      │          │           │ zone_name          │
│ area_type          │          │           │ geometry           │
│ geometry           │          │           └─────────┬──────────┘
└─────────┬──────────┘          │                     │
          │                     │                     │
          ▼                     │                     ▼
┌────────────────────┐          │           ┌────────────────────┐
│ fact_purchases     │          │           │ fact_rentals       │
├────────────────────┤          │           ├────────────────────┤
│ date_key (FK)      │          │           │ date_key (FK)      │
│ district_key (FK)  │          │           │ zone_key (FK)      │
│ sales_count        │          │           │ bedroom_type       │
│ avg_price          │          │           │ avg_rent           │
│ median_price       │          │           │ median_rent        │
│ new_listings       │          │           │ vacancy_rate       │
│ active_listings    │          │           │ universe           │
│ avg_dom            │          │           │ turnover_rate      │
│ avg_sp_lp          │          │           │ reliability_code   │
└────────────────────┘          │           └────────────────────┘
                                │
                ┌───────────────────────────────┐
                │ dim_neighbourhood             │
                ├───────────────────────────────┤
                │ neighbourhood_id (PK)         │
                │ name                          │
                │ geometry                      │
                │ population                    │
                │ land_area_sqkm                │
                │ pop_density_per_sqkm          │
                │ pct_bachelors_or_higher       │
                │ median_household_income       │
                │ pct_owner_occupied            │
                │ pct_renter_occupied           │
                │ census_year                   │ ← For SCD tracking
                └───────────────────────────────┘
                ┌───────────────────────────────┐
                │ dim_policy_event              │
                ├───────────────────────────────┤
                │ event_id (PK)                 │
                │ event_date                    │
                │ effective_date                │
                │ level                         │ ← federal/provincial/municipal
                │ category                      │ ← monetary/tax/regulatory/supply/economic
                │ title                         │
                │ description                   │
                │ expected_direction            │ ← bearish/bullish/neutral
                │ source_url                    │
                │ confidence                    │ ← high/medium/low
                │ created_at                    │
                └───────────────────────────────┘
                ┌───────────────────────────────┐
                │ bridge_district_neighbourhood │ ← Portfolio Phase 4 ONLY
                ├───────────────────────────────┤
                │ district_key (FK)             │
                │ neighbourhood_id (FK)         │
                │ area_overlap_pct              │
                │ population_overlap            │ ← For pop-weighted agg
                └───────────────────────────────┘
Notes:
- dim_neighbourhood has no FK relationship to fact tables in V1
- dim_policy_event is standalone (no FK to facts); used for time-series annotation
- bridge_district_neighbourhood is Portfolio Phase 4 scope only
- Similar bridge table needed for CMHC zones in Portfolio Phase 4
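As an illustration of how dim_time could be populated, the sketch below generates one row per month with the columns shown in the schema. The YYYYMM integer format for date_key is an assumption of this example (the spec does not fix the key format), as is the function name.

```python
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def dim_time_rows(start_year, start_month, end_year, end_month):
    """Build one dim_time row per month, inclusive of both endpoints."""
    rows = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        rows.append({
            "date_key": year * 100 + month,  # assumed YYYYMM key format
            "year": year,
            "month": month,
            "quarter": (month - 1) // 3 + 1,
            "month_name": MONTH_NAMES[month - 1],
        })
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return rows
```

Under this key format, an annual CMHC survey for a given year would reference that year's October row (for example 202310), while TRREB rows reference every month.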
File Structure
Note: Toronto Housing data logic lives in portfolio_app/toronto/. See portfolio_project_plan_v5.md for full project structure.
Data Directory Structure
data/
└── toronto/
    ├── raw/
    │   ├── trreb/
    │   │   └── market_watch_YYYY_MM.pdf
    │   ├── cmhc/
    │   │   └── rental_survey_YYYY.csv
    │   ├── enrichment/
    │   │   └── neighbourhood_profiles_2021.xlsx
    │   └── geo/
    │       ├── toronto_neighbourhoods.geojson
    │       ├── trreb_districts.geojson ← (to be created via QGIS)
    │       └── cmhc_zones.geojson ← (from R cmhc package)
    │
    ├── processed/ ← gitignored
    │   ├── fact_purchases.parquet
    │   ├── fact_rentals.parquet
    │   ├── dim_time.parquet
    │   ├── dim_trreb_district.parquet
    │   ├── dim_cmhc_zone.parquet
    │   ├── dim_neighbourhood.parquet
    │   └── dim_policy_event.parquet
    │
    └── reference/
        ├── policy_events.csv ← Curated event list
        └── neighbourhood_boundary_changelog.md ← 140→158 notes
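The naming convention for raw TRREB files can be captured in a pair of small helpers so that download scripts and parsers agree on paths. This is a sketch assuming the layout shown above, with paths relative to the repository root; the helper names are hypothetical.

```python
import re
from pathlib import Path

RAW_DIR = Path("data/toronto/raw")
TRREB_NAME = re.compile(r"^market_watch_(\d{4})_(\d{2})\.pdf$")


def trreb_pdf_path(year, month):
    """Path for one monthly report, following market_watch_YYYY_MM.pdf."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return RAW_DIR / "trreb" / f"market_watch_{year:04d}_{month:02d}.pdf"


def report_month(path):
    """Recover (year, month) from a file that follows the convention."""
    match = TRREB_NAME.match(Path(path).name)
    if match is None:
        raise ValueError(f"unexpected file name: {Path(path).name!r}")
    return int(match.group(1)), int(match.group(2))
```

Deriving report_date from the file name gives the parser an independent check against whatever date it extracts from the PDF itself.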
Code Module Structure
portfolio_app/toronto/
├── __init__.py
├── parsers/
│   ├── __init__.py
│   ├── trreb.py          # PDF extraction
│   └── cmhc.py           # CSV processing
├── loaders/
│   ├── __init__.py
│   └── database.py       # DB operations
├── schemas/              # Pydantic models
│   ├── __init__.py
│   ├── trreb.py
│   ├── cmhc.py
│   ├── enrichment.py
│   └── policy_event.py
├── models/               # SQLAlchemy ORM
│   ├── __init__.py
│   ├── base.py           # DeclarativeBase, engine
│   ├── dimensions.py     # dim_time, dim_trreb_district, dim_policy_event, etc.
│   └── facts.py          # fact_purchases, fact_rentals
└── transforms/
    └── __init__.py
Notebooks
notebooks/
├── 01_trreb_pdf_extraction.ipynb
├── 02_cmhc_data_prep.ipynb
├── 03_geo_layer_prep.ipynb
├── 04_enrichment_data_prep.ipynb
├── 05_policy_events_curation.ipynb
└── 06_spatial_crosswalk.ipynb ← Portfolio Phase 4 only
✅ Implementation Checklist
Note: These are Stages within the Toronto Housing project (Portfolio Phase 1). They are distinct from the overall portfolio Phases defined in portfolio_project_plan_v5.md.
Stage 1: Data Acquisition
- Download TRREB monthly PDFs (2020-present as MVP)
- Register for CMHC portal and export Toronto rental data
- Extract CMHC zone boundaries via R cmhc package
- Download City of Toronto neighbourhood GeoJSON (158 boundaries)
- Digitize TRREB district boundaries in QGIS
- Download Neighbourhood Profiles (2021 Census, 158-model)
Stage 2: Data Processing
- Build TRREB PDF parser (portfolio_app/toronto/parsers/trreb.py)
- Build Pydantic schemas (portfolio_app/toronto/schemas/)
- Build SQLAlchemy models (portfolio_app/toronto/models/)
- Extract and validate TRREB monthly summaries
- Clean and structure CMHC rental data
- Process Neighbourhood Profiles into dim_neighbourhood
- Curate and load policy events into dim_policy_event
- Create dimension tables
- Build fact tables
- Validate all geospatial layers use same CRS (WGS84/EPSG:4326)
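GeoJSON files do not reliably declare their CRS, so the last check in Stage 2 can be approximated by testing whether every coordinate looks like a WGS84 longitude/latitude near Toronto. This is a heuristic sketch only: the bounding box is a deliberately generous assumption around the Toronto CMA, the function names are illustrative, and a projected layer (for example UTM metres) would fail immediately.

```python
# Assumed generous bounds around the Toronto CMA: (min_lon, min_lat, max_lon, max_lat).
TORONTO_BBOX = (-81.0, 43.0, -78.0, 45.0)


def iter_positions(coords):
    """Yield (x, y) from arbitrarily nested GeoJSON coordinate arrays."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords[0], coords[1]
    else:
        for part in coords:
            yield from iter_positions(part)


def looks_like_wgs84(feature_collection, bbox=TORONTO_BBOX):
    """True if every coordinate falls inside the expected lon/lat box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    for feature in feature_collection["features"]:
        for x, y in iter_positions(feature["geometry"]["coordinates"]):
            if not (min_lon <= x <= max_lon and min_lat <= y <= max_lat):
                return False
    return True
```

Running this over all three files in data/toronto/raw/geo/ gives a quick pass/fail signal; it does not replace checking the CRS in QGIS or with a spatial library.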
Stage 3: Visualization (V1)
- Create dashboard page (portfolio_app/pages/toronto/dashboard.py)
- Build choropleth figures (portfolio_app/figures/choropleth.py)
- Build time series figures (portfolio_app/figures/time_series.py)
- Design dashboard layout (purchase/rental toggle)
- Implement choropleth map with layer switching
- Add time slider/selector
- Build neighbourhood overlay (toggle-able)
- Add enrichment layer toggle (density/education/income choropleth)
- Add policy event markers on time series
- Add tooltips with cross-reference info ("This district contains...")
- Add tooltips showing enrichment metrics on hover
Stage 4: Polish (V1)
- Add data source citations
- Document methodology (especially geographic limitations)
- Write docs (docs/methodology.md, docs/data_sources.md)
- Deploy to portfolio
Future Enhancements (Portfolio Phase 4 — Post-Energy Project)
- Add crime data to dim_neighbourhood
- Build spatial crosswalk (neighbourhood ↔ district/zone intersections)
- Compute area-weighted and population-weighted aggregations
- Add aggregation method selector to UI
- Enable correlation analysis (price vs. enrichment metrics)
- Add historical neighbourhood boundary support (140→158)
Deployment & dbt Architecture: See portfolio_project_plan_v5.md for:
- dbt layer structure and testing strategy
- Deployment architecture
- Data quality framework
References & Links
Core Housing Data
| Resource | URL |
|---|---|
| TRREB Market Watch | https://trreb.ca/index.php/market-news/market-watch |
| CMHC Housing Portal | https://www03.cmhc-schl.gc.ca/hmip-pimh/ |
Geographic Boundaries
| Resource | URL |
|---|---|
| Toronto Neighbourhoods GeoJSON | https://github.com/jasonicarter/toronto-geojson |
| TRREB District Map (PDF) | https://webapp.proptx.ca/trrebdata/common/maps/Toronto.pdf |
| Statistics Canada Census Tracts | https://www12.statcan.gc.ca/census-recensement/2021/geo/sip-pis/boundary-limites/index-eng.cfm |
| R cmhc package (CRAN) | https://cran.r-project.org/package=cmhc |
Enrichment Data
| Resource | URL |
|---|---|
| Toronto Open Data Portal | https://open.toronto.ca/ |
| Neighbourhood Profiles (CKAN) | https://ckan0.cf.opendata.inter.prod-toronto.ca/dataset/neighbourhood-profiles |
| Neighbourhood Profiles 2021 (Direct Download) | https://ckan0.cf.opendata.inter.prod-toronto.ca/dataset/6e19a90f-971c-46b3-852c-0c48c436d1fc/resource/19d4a806-7385-4889-acf2-256f1e079060/download/nbhd_2021_census_profile_full_158model.xlsx |
Policy Events Research
| Resource | URL |
|---|---|
| Bank of Canada Interest Rates | https://www.bankofcanada.ca/rates/interest-rates/ |
| OSFI (Stress Test Rules) | https://www.osfi-bsif.gc.ca/ |
| Ontario Legislature (Bills) | https://www.ola.org/ |
Reference Documentation
| Resource | URL |
|---|---|
| Statistics Canada 2021 Census Reference | https://www12.statcan.gc.ca/census-recensement/2021/ref/index-eng.cfm |
| City of Toronto Neighbourhood Profiles Overview | https://www.toronto.ca/city-government/data-research-maps/neighbourhoods-communities/neighbourhood-profiles/ |
Related Documents
| Document | Relationship | Use For |
|---|---|---|
| portfolio_project_plan_v5.md | Parent document | Overall scope, phasing, tech stack, deployment, dbt architecture, data quality framework |
Document Version: 5.1 Updated: January 2026 Project: Toronto Housing Price Dashboard — Portfolio Piece