Compare commits
3 Commits
f69d0c15a7
...
3054441630

| Author | SHA1 | Date |
|---|---|---|
| | 3054441630 | |
| | b6d210ec6b | |
| | 053acf6436 | |
67
CLAUDE.md
@@ -261,4 +261,71 @@ All scripts in `scripts/`:

---

## Projman Plugin Workflow

**CRITICAL: Always use the projman plugin for sprint and task management.**

### When to Use Projman Skills

| Skill | Trigger | Purpose |
|-------|---------|---------|
| `/projman:sprint-plan` | New sprint or phase implementation | Architecture analysis + Gitea issue creation |
| `/projman:sprint-start` | Beginning implementation work | Load lessons learned (Wiki.js or local), start execution |
| `/projman:sprint-status` | Check progress | Review blockers and completion status |
| `/projman:sprint-close` | Sprint completion | Capture lessons learned (Wiki.js or local backup) |

### Default Behavior

When the user requests implementation work:

1. **ALWAYS start with `/projman:sprint-plan`** before writing code
2. Create Gitea issues with proper labels and acceptance criteria
3. Use `/projman:sprint-start` to begin execution with lessons learned
4. Track progress via Gitea issue comments
5. Close the sprint with `/projman:sprint-close` to document lessons

### Gitea Repository

- **Repo**: `lmiranda/personal-portfolio`
- **Host**: `gitea.hotserv.cloud`
- **Note**: `lmiranda` is a user account (not an org), so label lookup may require repo-level labels

### MCP Tools Available

**Gitea**:
- `list_issues`, `get_issue`, `create_issue`, `update_issue`, `add_comment`
- `get_labels`, `suggest_labels`

**Wiki.js**:
- `search_lessons`, `create_lesson`, `search_pages`, `get_page`

### Lessons Learned (Backup Method)

**When Wiki.js is unavailable**, use the local backup in `docs/project-lessons-learned/`:

**At Sprint Start:**
1. Review `docs/project-lessons-learned/INDEX.md` for relevant past lessons
2. Search lesson files by tags/keywords before implementation
3. Apply prevention strategies from applicable lessons

**At Sprint Close:**
1. Try Wiki.js `create_lesson` first
2. If Wiki.js fails, create the lesson in `docs/project-lessons-learned/`
3. Use the naming convention: `{phase-or-sprint}-{short-description}.md`
4. Update `INDEX.md` with the new entry
5. Follow the lesson template in `INDEX.md`
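As an illustration of steps 2-4, a lesson about a hypothetical Sprint 9 label issue might be saved as `docs/project-lessons-learned/sprint-9-gitea-label-lookup.md` (an invented name, not a real lesson), with a matching `INDEX.md` entry such as:

```markdown
- [sprint-9-gitea-label-lookup](sprint-9-gitea-label-lookup.md) (tags: gitea, labels)
```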

**Migration:** Once Wiki.js is configured, lessons will be migrated there for better searchability.

### Issue Structure

Every Gitea issue should include:
- **Overview**: Brief description
- **Files to Create/Modify**: Explicit paths
- **Acceptance Criteria**: Checkboxes
- **Technical Notes**: Implementation hints
- **Labels**: Listed in body (workaround for label API issues)
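A minimal issue body following this structure might look like the sketch below (the specific task, criteria, and labels are hypothetical examples):

```markdown
## Overview
Add year-over-year rent change to the housing mart.

## Files to Create/Modify
- `dbt/models/marts/mart_neighbourhood_housing.sql`

## Acceptance Criteria
- [ ] `rent_yoy_change_pct` column present
- [ ] dbt tests pass

## Technical Notes
Use `lag()` partitioned by neighbourhood.

**Labels**: enhancement, dbt
```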

---

*Last Updated: Sprint 9*

@@ -11,3 +11,77 @@ models:
      - name: zone_code
        tests:
          - not_null

  - name: int_neighbourhood__demographics
    description: "Combined census demographics with neighbourhood attributes"
    columns:
      - name: neighbourhood_id
        description: "Neighbourhood identifier"
        tests:
          - not_null
      - name: census_year
        description: "Census year"
        tests:
          - not_null
      - name: income_quintile
        description: "Income quintile (1-5, city-wide)"

  - name: int_neighbourhood__housing
    description: "Housing indicators combining census and rental data"
    columns:
      - name: neighbourhood_id
        description: "Neighbourhood identifier"
        tests:
          - not_null
      - name: year
        description: "Reference year"
      - name: rent_to_income_pct
        description: "Rent as percentage of median income"
      - name: is_affordable
        description: "Boolean: rent <= 30% of income"

  - name: int_neighbourhood__crime_summary
    description: "Aggregated crime with year-over-year trends"
    columns:
      - name: neighbourhood_id
        description: "Neighbourhood identifier"
        tests:
          - not_null
      - name: year
        description: "Statistics year"
        tests:
          - not_null
      - name: crime_rate_per_100k
        description: "Total crime rate per 100K population"
      - name: yoy_change_pct
        description: "Year-over-year change percentage"

  - name: int_neighbourhood__amenity_scores
    description: "Normalized amenities per capita and per area"
    columns:
      - name: neighbourhood_id
        description: "Neighbourhood identifier"
        tests:
          - not_null
      - name: year
        description: "Reference year"
      - name: total_amenities_per_1000
        description: "Total amenities per 1000 population"
      - name: amenities_per_sqkm
        description: "Total amenities per square km"

  - name: int_rentals__neighbourhood_allocated
    description: "CMHC rental data allocated to neighbourhoods via area weights"
    columns:
      - name: neighbourhood_id
        description: "Neighbourhood identifier"
        tests:
          - not_null
      - name: year
        description: "Survey year"
        tests:
          - not_null
      - name: avg_rent_2bed
        description: "Weighted average 2-bedroom rent"
      - name: vacancy_rate
        description: "Weighted average vacancy rate"
@@ -0,0 +1,79 @@
-- Intermediate: Normalized amenities per 1000 population
-- Pivots amenity types and calculates per-capita metrics
-- Grain: One row per neighbourhood per year

with neighbourhoods as (
    select * from {{ ref('stg_toronto__neighbourhoods') }}
),

amenities as (
    select * from {{ ref('stg_toronto__amenities') }}
),

-- Aggregate amenity types
amenities_by_year as (
    select
        neighbourhood_id,
        amenity_year as year,
        sum(case when amenity_type = 'Parks' then amenity_count else 0 end) as parks_count,
        sum(case when amenity_type = 'Schools' then amenity_count else 0 end) as schools_count,
        sum(case when amenity_type = 'Transit Stops' then amenity_count else 0 end) as transit_count,
        sum(case when amenity_type = 'Libraries' then amenity_count else 0 end) as libraries_count,
        sum(case when amenity_type = 'Community Centres' then amenity_count else 0 end) as community_centres_count,
        sum(case when amenity_type = 'Recreation' then amenity_count else 0 end) as recreation_count,
        sum(amenity_count) as total_amenities
    from amenities
    group by neighbourhood_id, amenity_year
),

amenity_scores as (
    select
        n.neighbourhood_id,
        n.neighbourhood_name,
        n.geometry,
        n.population,
        n.land_area_sqkm,

        a.year,

        -- Raw counts
        a.parks_count,
        a.schools_count,
        a.transit_count,
        a.libraries_count,
        a.community_centres_count,
        a.recreation_count,
        a.total_amenities,

        -- Per 1000 population
        case when n.population > 0
            then round(a.parks_count::numeric / n.population * 1000, 3)
            else null
        end as parks_per_1000,

        case when n.population > 0
            then round(a.schools_count::numeric / n.population * 1000, 3)
            else null
        end as schools_per_1000,

        case when n.population > 0
            then round(a.transit_count::numeric / n.population * 1000, 3)
            else null
        end as transit_per_1000,

        case when n.population > 0
            then round(a.total_amenities::numeric / n.population * 1000, 3)
            else null
        end as total_amenities_per_1000,

        -- Per square km
        case when n.land_area_sqkm > 0
            then round(a.total_amenities::numeric / n.land_area_sqkm, 2)
            else null
        end as amenities_per_sqkm

    from neighbourhoods n
    left join amenities_by_year a on n.neighbourhood_id = a.neighbourhood_id
)

select * from amenity_scores
81
dbt/models/intermediate/int_neighbourhood__crime_summary.sql
Normal file
@@ -0,0 +1,81 @@
-- Intermediate: Aggregated crime by neighbourhood with YoY change
-- Pivots crime types and calculates year-over-year trends
-- Grain: One row per neighbourhood per year

with neighbourhoods as (
    select * from {{ ref('stg_toronto__neighbourhoods') }}
),

crime as (
    select * from {{ ref('stg_toronto__crime') }}
),

-- Aggregate crime types
crime_by_year as (
    select
        neighbourhood_id,
        crime_year as year,
        sum(incident_count) as total_incidents,
        sum(case when crime_type = 'Assault' then incident_count else 0 end) as assault_count,
        sum(case when crime_type = 'Auto Theft' then incident_count else 0 end) as auto_theft_count,
        sum(case when crime_type = 'Break and Enter' then incident_count else 0 end) as break_enter_count,
        sum(case when crime_type = 'Robbery' then incident_count else 0 end) as robbery_count,
        sum(case when crime_type = 'Theft Over' then incident_count else 0 end) as theft_over_count,
        sum(case when crime_type = 'Homicide' then incident_count else 0 end) as homicide_count,
        avg(rate_per_100k) as avg_rate_per_100k
    from crime
    group by neighbourhood_id, crime_year
),

-- Add year-over-year changes
with_yoy as (
    select
        c.*,
        lag(c.total_incidents, 1) over (
            partition by c.neighbourhood_id
            order by c.year
        ) as prev_year_incidents,
        round(
            (c.total_incidents - lag(c.total_incidents, 1) over (
                partition by c.neighbourhood_id
                order by c.year
            ))::numeric /
            nullif(lag(c.total_incidents, 1) over (
                partition by c.neighbourhood_id
                order by c.year
            ), 0) * 100,
            2
        ) as yoy_change_pct
    from crime_by_year c
),

crime_summary as (
    select
        n.neighbourhood_id,
        n.neighbourhood_name,
        n.geometry,
        n.population,

        w.year,
        w.total_incidents,
        w.assault_count,
        w.auto_theft_count,
        w.break_enter_count,
        w.robbery_count,
        w.theft_over_count,
        w.homicide_count,
        w.avg_rate_per_100k,
        w.yoy_change_pct,

        -- Crime rate per 100K population
        case
            when n.population > 0
            then round(w.total_incidents::numeric / n.population * 100000, 2)
            else null
        end as crime_rate_per_100k

    from neighbourhoods n
    inner join with_yoy w on n.neighbourhood_id = w.neighbourhood_id
)

select * from crime_summary
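The `lag()`-based YoY calculation above reduces to a guard-and-divide. A minimal Python sketch of the same logic (toy incident counts, not project data):

```python
# Mirrors the SQL: nullif(prev, 0) -> None guard, then round(..., 2).
def yoy_change_pct(curr, prev):
    # The first year has no previous value, and a zero baseline is undefined
    if prev is None or prev == 0:
        return None
    return round((curr - prev) / prev * 100, 2)

print(yoy_change_pct(110, 100))   # 10.0
print(yoy_change_pct(95, None))   # None
```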
44
dbt/models/intermediate/int_neighbourhood__demographics.sql
Normal file
@@ -0,0 +1,44 @@
-- Intermediate: Combined census demographics by neighbourhood
-- Joins neighbourhoods with census data for demographic analysis
-- Grain: One row per neighbourhood per census year

with neighbourhoods as (
    select * from {{ ref('stg_toronto__neighbourhoods') }}
),

census as (
    select * from {{ ref('stg_toronto__census') }}
),

demographics as (
    select
        n.neighbourhood_id,
        n.neighbourhood_name,
        n.geometry,
        n.land_area_sqkm,

        c.census_year,
        c.population,
        c.population_density,
        c.median_household_income,
        c.average_household_income,
        c.median_age,
        c.unemployment_rate,
        c.pct_bachelors_or_higher as education_bachelors_pct,
        c.average_dwelling_value,

        -- Tenure mix
        c.pct_owner_occupied,
        c.pct_renter_occupied,

        -- Income quintile (city-wide comparison)
        ntile(5) over (
            partition by c.census_year
            order by c.median_household_income
        ) as income_quintile

    from neighbourhoods n
    left join census c on n.neighbourhood_id = c.neighbourhood_id
)

select * from demographics
56
dbt/models/intermediate/int_neighbourhood__housing.sql
Normal file
@@ -0,0 +1,56 @@
-- Intermediate: Housing indicators by neighbourhood
-- Combines census housing data with allocated CMHC rental data
-- Grain: One row per neighbourhood per year

with neighbourhoods as (
    select * from {{ ref('stg_toronto__neighbourhoods') }}
),

census as (
    select * from {{ ref('stg_toronto__census') }}
),

allocated_rentals as (
    select * from {{ ref('int_rentals__neighbourhood_allocated') }}
),

housing as (
    select
        n.neighbourhood_id,
        n.neighbourhood_name,
        n.geometry,

        coalesce(r.year, c.census_year) as year,

        -- Census housing metrics
        c.pct_owner_occupied,
        c.pct_renter_occupied,
        c.average_dwelling_value,
        c.median_household_income,

        -- Allocated rental metrics (weighted average from CMHC zones)
        r.avg_rent_2bed,
        r.vacancy_rate,

        -- Affordability calculations
        case
            when c.median_household_income > 0 and r.avg_rent_2bed > 0
            then round((r.avg_rent_2bed * 12 / c.median_household_income) * 100, 2)
            else null
        end as rent_to_income_pct,

        -- Affordability threshold (30% of income)
        case
            when c.median_household_income > 0 and r.avg_rent_2bed > 0
            then r.avg_rent_2bed * 12 <= c.median_household_income * 0.30
            else null
        end as is_affordable

    from neighbourhoods n
    left join census c on n.neighbourhood_id = c.neighbourhood_id
    left join allocated_rentals r
        on n.neighbourhood_id = r.neighbourhood_id
        and r.year = c.census_year
)

select * from housing
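The affordability expressions in this model are easy to sanity-check by hand. A Python sketch of the same guards and arithmetic (toy rent and income figures, not real data):

```python
# Mirrors rent_to_income_pct and is_affordable above: annualized 2-bed
# rent vs. median household income, with non-positive inputs -> None.
def rent_to_income_pct(avg_rent_2bed, median_income):
    if median_income > 0 and avg_rent_2bed > 0:
        return round(avg_rent_2bed * 12 / median_income * 100, 2)
    return None

def is_affordable(avg_rent_2bed, median_income):
    if median_income > 0 and avg_rent_2bed > 0:
        # 30% of income is the affordability threshold
        return avg_rent_2bed * 12 <= median_income * 0.30
    return None

print(rent_to_income_pct(2000, 80000))  # 30.0
print(is_affordable(2000, 80000))       # True
```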
@@ -0,0 +1,73 @@
-- Intermediate: CMHC rentals allocated to neighbourhoods via area weights
-- Disaggregates zone-level rental data to neighbourhood level
-- Grain: One row per neighbourhood per year

with crosswalk as (
    select * from {{ ref('stg_cmhc__zone_crosswalk') }}
),

rentals as (
    select * from {{ ref('int_rentals__annual') }}
),

neighbourhoods as (
    select * from {{ ref('stg_toronto__neighbourhoods') }}
),

-- Allocate rental metrics to neighbourhoods using area weights
allocated as (
    select
        c.neighbourhood_id,
        r.year,
        r.bedroom_type,

        -- Weighted average rent (using area weight)
        sum(r.avg_rent * c.area_weight) as weighted_avg_rent,
        sum(r.median_rent * c.area_weight) as weighted_median_rent,
        sum(c.area_weight) as total_weight,

        -- Weighted vacancy rate
        sum(r.vacancy_rate * c.area_weight) / nullif(sum(c.area_weight), 0) as vacancy_rate,

        -- Weighted rental universe
        sum(r.rental_universe * c.area_weight) as rental_units_estimate

    from crosswalk c
    inner join rentals r on c.cmhc_zone_code = r.zone_code
    group by c.neighbourhood_id, r.year, r.bedroom_type
),

-- Pivot to get 2-bedroom as primary metric
pivoted as (
    select
        neighbourhood_id,
        year,
        max(case when bedroom_type = 'Two Bedroom' then weighted_avg_rent / nullif(total_weight, 0) end) as avg_rent_2bed,
        max(case when bedroom_type = 'One Bedroom' then weighted_avg_rent / nullif(total_weight, 0) end) as avg_rent_1bed,
        max(case when bedroom_type = 'Bachelor' then weighted_avg_rent / nullif(total_weight, 0) end) as avg_rent_bachelor,
        max(case when bedroom_type = 'Three Bedroom +' then weighted_avg_rent / nullif(total_weight, 0) end) as avg_rent_3bed,
        avg(vacancy_rate) as vacancy_rate,
        sum(rental_units_estimate) as total_rental_units
    from allocated
    group by neighbourhood_id, year
),

final as (
    select
        n.neighbourhood_id,
        n.neighbourhood_name,
        n.geometry,

        p.year,
        round(p.avg_rent_bachelor::numeric, 2) as avg_rent_bachelor,
        round(p.avg_rent_1bed::numeric, 2) as avg_rent_1bed,
        round(p.avg_rent_2bed::numeric, 2) as avg_rent_2bed,
        round(p.avg_rent_3bed::numeric, 2) as avg_rent_3bed,
        round(p.vacancy_rate::numeric, 2) as vacancy_rate,
        round(p.total_rental_units::numeric, 0) as total_rental_units

    from neighbourhoods n
    inner join pivoted p on n.neighbourhood_id = p.neighbourhood_id
)

select * from final
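The area-weight allocation in `int_rentals__neighbourhood_allocated` can be checked by hand. A minimal Python sketch with hypothetical zones, weights, and rents (not real CMHC data):

```python
# Each (zone, weight) pair is the fraction of the neighbourhood's
# area falling in that CMHC zone (hypothetical values).
crosswalk = [("zone_a", 0.6), ("zone_b", 0.4)]
zone_rent_2bed = {"zone_a": 2000.0, "zone_b": 2500.0}

# Mirrors sum(avg_rent * area_weight) in the allocated CTE and the
# division by nullif(total_weight, 0) in the pivoted CTE
weighted_rent = sum(zone_rent_2bed[z] * w for z, w in crosswalk)
total_weight = sum(w for _, w in crosswalk)
avg_rent_2bed = weighted_rent / total_weight if total_weight else None

print(round(avg_rent_2bed, 2))  # 2200.0
```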
@@ -9,3 +9,127 @@ models:
        tests:
          - unique
          - not_null

  - name: mart_neighbourhood_overview
    description: "Neighbourhood overview with composite livability score"
    meta:
      dashboard_tab: Overview
    columns:
      - name: neighbourhood_id
        description: "Neighbourhood identifier"
        tests:
          - not_null
      - name: neighbourhood_name
        description: "Official neighbourhood name"
        tests:
          - not_null
      - name: geometry
        description: "PostGIS geometry for mapping"
      - name: livability_score
        description: "Composite score: safety (30%), affordability (40%), amenities (30%)"
      - name: safety_score
        description: "Safety component score (0-100)"
      - name: affordability_score
        description: "Affordability component score (0-100)"
      - name: amenity_score
        description: "Amenity component score (0-100)"

  - name: mart_neighbourhood_housing
    description: "Housing and affordability metrics by neighbourhood"
    meta:
      dashboard_tab: Housing
    columns:
      - name: neighbourhood_id
        description: "Neighbourhood identifier"
        tests:
          - not_null
      - name: neighbourhood_name
        description: "Official neighbourhood name"
        tests:
          - not_null
      - name: geometry
        description: "PostGIS geometry for mapping"
      - name: rent_to_income_pct
        description: "Rent as percentage of median income"
      - name: affordability_index
        description: "100 = city average affordability"
      - name: rent_yoy_change_pct
        description: "Year-over-year rent change"

  - name: mart_neighbourhood_safety
    description: "Crime rates and safety metrics by neighbourhood"
    meta:
      dashboard_tab: Safety
    columns:
      - name: neighbourhood_id
        description: "Neighbourhood identifier"
        tests:
          - not_null
      - name: neighbourhood_name
        description: "Official neighbourhood name"
        tests:
          - not_null
      - name: geometry
        description: "PostGIS geometry for mapping"
      - name: crime_rate_per_100k
        description: "Total crime rate per 100K population"
      - name: crime_index
        description: "100 = city average crime rate"
      - name: safety_tier
        description: "Safety tier (1=safest, 5=highest crime)"
        tests:
          - accepted_values:
              arguments:
                values: [1, 2, 3, 4, 5]

  - name: mart_neighbourhood_demographics
    description: "Demographics and income metrics by neighbourhood"
    meta:
      dashboard_tab: Demographics
    columns:
      - name: neighbourhood_id
        description: "Neighbourhood identifier"
        tests:
          - not_null
      - name: neighbourhood_name
        description: "Official neighbourhood name"
        tests:
          - not_null
      - name: geometry
        description: "PostGIS geometry for mapping"
      - name: median_household_income
        description: "Median household income"
      - name: income_index
        description: "100 = city average income"
      - name: income_quintile
        description: "Income quintile (1-5)"
        tests:
          - accepted_values:
              arguments:
                values: [1, 2, 3, 4, 5]

  - name: mart_neighbourhood_amenities
    description: "Amenity access metrics by neighbourhood"
    meta:
      dashboard_tab: Amenities
    columns:
      - name: neighbourhood_id
        description: "Neighbourhood identifier"
        tests:
          - not_null
      - name: neighbourhood_name
        description: "Official neighbourhood name"
        tests:
          - not_null
      - name: geometry
        description: "PostGIS geometry for mapping"
      - name: total_amenities_per_1000
        description: "Total amenities per 1000 population"
      - name: amenity_index
        description: "100 = city average amenities"
      - name: amenity_tier
        description: "Amenity tier (1=best, 5=lowest)"
        tests:
          - accepted_values:
              arguments:
                values: [1, 2, 3, 4, 5]
89
dbt/models/marts/mart_neighbourhood_amenities.sql
Normal file
@@ -0,0 +1,89 @@
-- Mart: Neighbourhood Amenities Analysis
-- Dashboard Tab: Amenities
-- Grain: One row per neighbourhood per year

with amenities as (
    select * from {{ ref('int_neighbourhood__amenity_scores') }}
),

-- City-wide averages for comparison
city_avg as (
    select
        year,
        avg(parks_per_1000) as city_avg_parks,
        avg(schools_per_1000) as city_avg_schools,
        avg(transit_per_1000) as city_avg_transit,
        avg(total_amenities_per_1000) as city_avg_total_amenities
    from amenities
    group by year
),

final as (
    select
        a.neighbourhood_id,
        a.neighbourhood_name,
        a.geometry,
        a.population,
        a.land_area_sqkm,
        a.year,

        -- Raw counts
        a.parks_count,
        a.schools_count,
        a.transit_count,
        a.libraries_count,
        a.community_centres_count,
        a.recreation_count,
        a.total_amenities,

        -- Per 1000 population
        a.parks_per_1000,
        a.schools_per_1000,
        a.transit_per_1000,
        a.total_amenities_per_1000,

        -- Per square km
        a.amenities_per_sqkm,

        -- City averages
        round(ca.city_avg_parks::numeric, 3) as city_avg_parks_per_1000,
        round(ca.city_avg_schools::numeric, 3) as city_avg_schools_per_1000,
        round(ca.city_avg_transit::numeric, 3) as city_avg_transit_per_1000,

        -- Amenity index (100 = city average)
        case
            when ca.city_avg_total_amenities > 0
            then round(a.total_amenities_per_1000 / ca.city_avg_total_amenities * 100, 1)
            else null
        end as amenity_index,

        -- Category indices
        case
            when ca.city_avg_parks > 0
            then round(a.parks_per_1000 / ca.city_avg_parks * 100, 1)
            else null
        end as parks_index,

        case
            when ca.city_avg_schools > 0
            then round(a.schools_per_1000 / ca.city_avg_schools * 100, 1)
            else null
        end as schools_index,

        case
            when ca.city_avg_transit > 0
            then round(a.transit_per_1000 / ca.city_avg_transit * 100, 1)
            else null
        end as transit_index,

        -- Amenity tier (1 = best, 5 = lowest)
        ntile(5) over (
            partition by a.year
            order by a.total_amenities_per_1000 desc
        ) as amenity_tier

    from amenities a
    left join city_avg ca on a.year = ca.year
)

select * from final
81
dbt/models/marts/mart_neighbourhood_demographics.sql
Normal file
@@ -0,0 +1,81 @@
-- Mart: Neighbourhood Demographics Analysis
-- Dashboard Tab: Demographics
-- Grain: One row per neighbourhood per census year

with demographics as (
    select * from {{ ref('int_neighbourhood__demographics') }}
),

-- City-wide averages for comparison
city_avg as (
    select
        census_year,
        avg(median_household_income) as city_avg_income,
        avg(median_age) as city_avg_age,
        avg(unemployment_rate) as city_avg_unemployment,
        avg(education_bachelors_pct) as city_avg_education,
        avg(population_density) as city_avg_density
    from demographics
    group by census_year
),

final as (
    select
        d.neighbourhood_id,
        d.neighbourhood_name,
        d.geometry,
        d.census_year as year,

        -- Population
        d.population,
        d.land_area_sqkm,
        d.population_density,

        -- Income
        d.median_household_income,
        d.average_household_income,
        d.income_quintile,

        -- Income index (100 = city average)
        case
            when ca.city_avg_income > 0
            then round(d.median_household_income / ca.city_avg_income * 100, 1)
            else null
        end as income_index,

        -- Demographics
        d.median_age,
        d.unemployment_rate,
        d.education_bachelors_pct,

        -- Age index (100 = city average)
        case
            when ca.city_avg_age > 0
            then round(d.median_age / ca.city_avg_age * 100, 1)
            else null
        end as age_index,

        -- Housing tenure
        d.pct_owner_occupied,
        d.pct_renter_occupied,
        d.average_dwelling_value,

        -- Diversity index (tenure mix as proxy: a more even owner/renter split scores higher)
        round(
            1 - (
                power(d.pct_owner_occupied / 100, 2) +
                power(d.pct_renter_occupied / 100, 2)
            ),
            3
        ) * 100 as tenure_diversity_index,

        -- City comparisons
        round(ca.city_avg_income::numeric, 2) as city_avg_income,
        round(ca.city_avg_age::numeric, 1) as city_avg_age,
        round(ca.city_avg_unemployment::numeric, 2) as city_avg_unemployment

    from demographics d
    left join city_avg ca on d.census_year = ca.census_year
)

select * from final
93
dbt/models/marts/mart_neighbourhood_housing.sql
Normal file
@@ -0,0 +1,93 @@
-- Mart: Neighbourhood Housing Analysis
-- Dashboard Tab: Housing
-- Grain: One row per neighbourhood per year

with housing as (
    select * from {{ ref('int_neighbourhood__housing') }}
),

rentals as (
    select * from {{ ref('int_rentals__neighbourhood_allocated') }}
),

demographics as (
    select * from {{ ref('int_neighbourhood__demographics') }}
),

-- Add year-over-year rent changes
with_yoy as (
    select
        h.*,
        r.avg_rent_bachelor,
        r.avg_rent_1bed,
        r.avg_rent_3bed,
        r.total_rental_units,
        d.income_quintile,

        -- Previous year rent for YoY calculation
        lag(h.avg_rent_2bed, 1) over (
            partition by h.neighbourhood_id
            order by h.year
        ) as prev_year_rent_2bed

    from housing h
    left join rentals r
        on h.neighbourhood_id = r.neighbourhood_id
        and h.year = r.year
    left join demographics d
        on h.neighbourhood_id = d.neighbourhood_id
        and h.year = d.census_year
),

final as (
    select
        neighbourhood_id,
        neighbourhood_name,
        geometry,
        year,

        -- Tenure mix
        pct_owner_occupied,
        pct_renter_occupied,

        -- Housing values
        average_dwelling_value,
        median_household_income,

        -- Rental metrics
        avg_rent_bachelor,
        avg_rent_1bed,
        avg_rent_2bed,
        avg_rent_3bed,
        vacancy_rate,
        total_rental_units,

        -- Affordability
        rent_to_income_pct,
        is_affordable,

        -- Affordability index (100 = city average)
        round(
            rent_to_income_pct / nullif(
                avg(rent_to_income_pct) over (partition by year),
                0
            ) * 100,
            1
        ) as affordability_index,

        -- Year-over-year rent change
        case
            when prev_year_rent_2bed > 0
            then round(
                (avg_rent_2bed - prev_year_rent_2bed) / prev_year_rent_2bed * 100,
                2
            )
            else null
        end as rent_yoy_change_pct,

        income_quintile

    from with_yoy
)

select * from final
110
dbt/models/marts/mart_neighbourhood_overview.sql
Normal file
@@ -0,0 +1,110 @@
-- Mart: Neighbourhood Overview with Composite Livability Score
-- Dashboard Tab: Overview
-- Grain: One row per neighbourhood per year

with demographics as (
    select * from {{ ref('int_neighbourhood__demographics') }}
),

housing as (
    select * from {{ ref('int_neighbourhood__housing') }}
),

crime as (
    select * from {{ ref('int_neighbourhood__crime_summary') }}
),

amenities as (
    select * from {{ ref('int_neighbourhood__amenity_scores') }}
),

-- Compute percentile ranks for scoring components
percentiles as (
    select
        d.neighbourhood_id,
        d.neighbourhood_name,
        d.geometry,
        d.census_year as year,
        d.population,
        d.median_household_income,

        -- Safety score: inverse of crime rate (higher = safer)
        case
            when c.crime_rate_per_100k is not null
            then 100 - percent_rank() over (
                partition by d.census_year
                order by c.crime_rate_per_100k
            ) * 100
            else null
        end as safety_score,

        -- Affordability score: inverse of rent-to-income ratio
        case
            when h.rent_to_income_pct is not null
            then 100 - percent_rank() over (
                partition by d.census_year
                order by h.rent_to_income_pct
            ) * 100
            else null
        end as affordability_score,

        -- Amenity score: based on amenities per capita
        case
            when a.total_amenities_per_1000 is not null
            then percent_rank() over (
                partition by d.census_year
                order by a.total_amenities_per_1000
            ) * 100
            else null
        end as amenity_score,

        -- Raw metrics for reference
        c.crime_rate_per_100k,
        h.rent_to_income_pct,
        h.avg_rent_2bed,
        a.total_amenities_per_1000

    from demographics d
    left join housing h
        on d.neighbourhood_id = h.neighbourhood_id
        and d.census_year = h.year
    left join crime c
        on d.neighbourhood_id = c.neighbourhood_id
        and d.census_year = c.year
    left join amenities a
        on d.neighbourhood_id = a.neighbourhood_id
        and d.census_year = a.year
),

final as (
    select
        neighbourhood_id,
        neighbourhood_name,
        geometry,
        year,
        population,
        median_household_income,

        -- Component scores (0-100)
        round(safety_score::numeric, 1) as safety_score,
        round(affordability_score::numeric, 1) as affordability_score,
        round(amenity_score::numeric, 1) as amenity_score,

        -- Composite livability score: safety (30%), affordability (40%), amenities (30%)
        round(
            (coalesce(safety_score, 50) * 0.30 +
             coalesce(affordability_score, 50) * 0.40 +
             coalesce(amenity_score, 50) * 0.30)::numeric,
            1
        ) as livability_score,

        -- Raw metrics
        crime_rate_per_100k,
        rent_to_income_pct,
        avg_rent_2bed,
        total_amenities_per_1000

    from percentiles
)

select * from final
78 dbt/models/marts/mart_neighbourhood_safety.sql Normal file
@@ -0,0 +1,78 @@
-- Mart: Neighbourhood Safety Analysis
-- Dashboard Tab: Safety
-- Grain: One row per neighbourhood per year

with crime as (
    select * from {{ ref('int_neighbourhood__crime_summary') }}
),

-- City-wide averages for comparison
city_avg as (
    select
        year,
        avg(crime_rate_per_100k) as city_avg_crime_rate,
        avg(assault_count) as city_avg_assault,
        avg(auto_theft_count) as city_avg_auto_theft,
        avg(break_enter_count) as city_avg_break_enter
    from crime
    group by year
),

final as (
    select
        c.neighbourhood_id,
        c.neighbourhood_name,
        c.geometry,
        c.population,
        c.year,

        -- Total crime
        c.total_incidents,
        c.crime_rate_per_100k,
        c.yoy_change_pct as crime_yoy_change_pct,

        -- Crime breakdown
        c.assault_count,
        c.auto_theft_count,
        c.break_enter_count,
        c.robbery_count,
        c.theft_over_count,
        c.homicide_count,

        -- Per 100K rates by type
        case when c.population > 0
            then round(c.assault_count::numeric / c.population * 100000, 2)
            else null
        end as assault_rate_per_100k,

        case when c.population > 0
            then round(c.auto_theft_count::numeric / c.population * 100000, 2)
            else null
        end as auto_theft_rate_per_100k,

        case when c.population > 0
            then round(c.break_enter_count::numeric / c.population * 100000, 2)
            else null
        end as break_enter_rate_per_100k,

        -- Comparison to city average
        round(ca.city_avg_crime_rate::numeric, 2) as city_avg_crime_rate,

        -- Crime index (100 = city average)
        case
            when ca.city_avg_crime_rate > 0
                then round(c.crime_rate_per_100k / ca.city_avg_crime_rate * 100, 1)
            else null
        end as crime_index,

        -- Safety tier based on crime rate percentile
        ntile(5) over (
            partition by c.year
            order by c.crime_rate_per_100k desc
        ) as safety_tier

    from crime c
    left join city_avg ca on c.year = ca.year
)

select * from final
@@ -41,3 +41,59 @@ sources:
        columns:
          - name: event_id
            description: "Primary key"

      - name: fact_census
        description: "Census demographics by neighbourhood and year"
        columns:
          - name: id
            description: "Primary key"
          - name: neighbourhood_id
            description: "Foreign key to dim_neighbourhood"
          - name: census_year
            description: "Census year (2016, 2021, etc.)"
          - name: population
            description: "Total population"
          - name: median_household_income
            description: "Median household income"

      - name: fact_crime
        description: "Crime statistics by neighbourhood, year, and type"
        columns:
          - name: id
            description: "Primary key"
          - name: neighbourhood_id
            description: "Foreign key to dim_neighbourhood"
          - name: year
            description: "Statistics year"
          - name: crime_type
            description: "Type of crime"
          - name: count
            description: "Number of incidents"
          - name: rate_per_100k
            description: "Rate per 100,000 population"

      - name: fact_amenities
        description: "Amenity counts by neighbourhood and type"
        columns:
          - name: id
            description: "Primary key"
          - name: neighbourhood_id
            description: "Foreign key to dim_neighbourhood"
          - name: amenity_type
            description: "Type of amenity (parks, schools, transit)"
          - name: count
            description: "Number of amenities"
          - name: year
            description: "Reference year"

      - name: bridge_cmhc_neighbourhood
        description: "CMHC zone to neighbourhood mapping with area weights"
        columns:
          - name: id
            description: "Primary key"
          - name: cmhc_zone_code
            description: "CMHC zone code"
          - name: neighbourhood_id
            description: "Neighbourhood ID"
          - name: weight
            description: "Proportional area weight (0-1)"
@@ -40,3 +40,90 @@ models:
        tests:
          - unique
          - not_null

  - name: stg_toronto__neighbourhoods
    description: "Staged Toronto neighbourhood dimension (158 official boundaries)"
    columns:
      - name: neighbourhood_id
        description: "Neighbourhood primary key"
        tests:
          - unique
          - not_null
      - name: neighbourhood_name
        description: "Official neighbourhood name"
        tests:
          - not_null
      - name: geometry
        description: "PostGIS geometry (POLYGON)"

  - name: stg_toronto__census
    description: "Staged census demographics by neighbourhood"
    columns:
      - name: census_id
        description: "Census record identifier"
        tests:
          - unique
          - not_null
      - name: neighbourhood_id
        description: "Neighbourhood foreign key"
        tests:
          - not_null
      - name: census_year
        description: "Census year (2016, 2021)"
        tests:
          - not_null

  - name: stg_toronto__crime
    description: "Staged crime statistics by neighbourhood"
    columns:
      - name: crime_id
        description: "Crime record identifier"
        tests:
          - unique
          - not_null
      - name: neighbourhood_id
        description: "Neighbourhood foreign key"
        tests:
          - not_null
      - name: crime_type
        description: "Type of crime"
        tests:
          - not_null

  - name: stg_toronto__amenities
    description: "Staged amenity counts by neighbourhood"
    columns:
      - name: amenity_id
        description: "Amenity record identifier"
        tests:
          - unique
          - not_null
      - name: neighbourhood_id
        description: "Neighbourhood foreign key"
        tests:
          - not_null
      - name: amenity_type
        description: "Type of amenity"
        tests:
          - not_null

  - name: stg_cmhc__zone_crosswalk
    description: "Staged CMHC zone to neighbourhood crosswalk with area weights"
    columns:
      - name: crosswalk_id
        description: "Crosswalk record identifier"
        tests:
          - unique
          - not_null
      - name: cmhc_zone_code
        description: "CMHC zone code"
        tests:
          - not_null
      - name: neighbourhood_id
        description: "Neighbourhood foreign key"
        tests:
          - not_null
      - name: area_weight
        description: "Proportional area weight (0-1)"
        tests:
          - not_null
18 dbt/models/staging/stg_cmhc__zone_crosswalk.sql Normal file
@@ -0,0 +1,18 @@
-- Staged CMHC zone to neighbourhood crosswalk
-- Source: bridge_cmhc_neighbourhood table
-- Grain: One row per zone-neighbourhood intersection

with source as (
    select * from {{ source('toronto_housing', 'bridge_cmhc_neighbourhood') }}
),

staged as (
    select
        id as crosswalk_id,
        cmhc_zone_code,
        neighbourhood_id,
        weight as area_weight
    from source
)

select * from staged
19 dbt/models/staging/stg_toronto__amenities.sql Normal file
@@ -0,0 +1,19 @@
-- Staged amenity counts by neighbourhood
-- Source: fact_amenities table
-- Grain: One row per neighbourhood per amenity type per year

with source as (
    select * from {{ source('toronto_housing', 'fact_amenities') }}
),

staged as (
    select
        id as amenity_id,
        neighbourhood_id,
        amenity_type,
        count as amenity_count,
        year as amenity_year
    from source
)

select * from staged
27 dbt/models/staging/stg_toronto__census.sql Normal file
@@ -0,0 +1,27 @@
-- Staged census demographics by neighbourhood
-- Source: fact_census table
-- Grain: One row per neighbourhood per census year

with source as (
    select * from {{ source('toronto_housing', 'fact_census') }}
),

staged as (
    select
        id as census_id,
        neighbourhood_id,
        census_year,
        population,
        population_density,
        median_household_income,
        average_household_income,
        unemployment_rate,
        pct_bachelors_or_higher,
        pct_owner_occupied,
        pct_renter_occupied,
        median_age,
        average_dwelling_value
    from source
)

select * from staged
20 dbt/models/staging/stg_toronto__crime.sql Normal file
@@ -0,0 +1,20 @@
-- Staged crime statistics by neighbourhood
-- Source: fact_crime table
-- Grain: One row per neighbourhood per year per crime type

with source as (
    select * from {{ source('toronto_housing', 'fact_crime') }}
),

staged as (
    select
        id as crime_id,
        neighbourhood_id,
        year as crime_year,
        crime_type,
        count as incident_count,
        rate_per_100k
    from source
)

select * from staged
25 dbt/models/staging/stg_toronto__neighbourhoods.sql Normal file
@@ -0,0 +1,25 @@
-- Staged Toronto neighbourhood dimension
-- Source: dim_neighbourhood table
-- Grain: One row per neighbourhood (158 total)

with source as (
    select * from {{ source('toronto_housing', 'dim_neighbourhood') }}
),

staged as (
    select
        neighbourhood_id,
        name as neighbourhood_name,
        geometry,
        population,
        land_area_sqkm,
        pop_density_per_sqkm,
        pct_bachelors_or_higher,
        median_household_income,
        pct_owner_occupied,
        pct_renter_occupied,
        census_year
    from source
)

select * from staged
11 dbt/package-lock.yml Normal file
@@ -0,0 +1,11 @@
packages:
  - name: dbt_utils
    package: dbt-labs/dbt_utils
    version: 1.3.3
  - name: dbt_expectations
    package: calogica/dbt_expectations
    version: 0.10.4
  - name: dbt_date
    package: calogica/dbt_date
    version: 0.10.1
sha1_hash: 51a51ab489f7b302c8745ae3c3781271816b01be
50 docs/project-lessons-learned/INDEX.md Normal file
@@ -0,0 +1,50 @@
# Project Lessons Learned

This folder contains lessons learned from sprints and development work. These lessons help prevent repeating mistakes and capture valuable insights.

**Note:** This is a temporary local backup while Wiki.js integration is being configured. Once Wiki.js is ready, lessons will be migrated there for better searchability.

---

## Lessons Index

| Date | Sprint/Phase | Title | Tags |
|------|--------------|-------|------|
| 2026-01-16 | Phase 4 | [dbt Test Syntax Deprecation](./phase-4-dbt-test-syntax.md) | dbt, testing, yaml, deprecation |

---

## How to Use

### When Starting a Sprint

1. Review relevant lessons in this folder before implementation
2. Search by tags or keywords to find applicable insights
3. Apply prevention strategies from past lessons

### When Closing a Sprint

1. Document any significant lessons learned
2. Use the template below
3. Add an entry to the index table above

---

## Lesson Template

```markdown
# [Sprint/Phase] - [Lesson Title]

## Context
[What were you trying to do?]

## Problem
[What went wrong or what insight emerged?]

## Solution
[How did you solve it?]

## Prevention
[How can this be avoided in future sprints?]

## Tags
[Comma-separated tags for search]
```
38 docs/project-lessons-learned/phase-4-dbt-test-syntax.md Normal file
@@ -0,0 +1,38 @@
# Phase 4 - dbt Test Syntax Deprecation

## Context
Implementing dbt mart models with `accepted_values` tests for tier columns (safety_tier, income_quintile, amenity_tier) that should only contain values 1-5.

## Problem
dbt 1.9+ introduced a deprecation warning for generic test arguments. The old syntax:

```yaml
tests:
  - accepted_values:
      values: [1, 2, 3, 4, 5]
```

produces deprecation warnings:

```
MissingArgumentsPropertyInGenericTestDeprecation: Arguments to generic tests should be nested under the `arguments` property.
```

## Solution
Nest test arguments under the `arguments` property:

```yaml
tests:
  - accepted_values:
      arguments:
        values: [1, 2, 3, 4, 5]
```

This applies to all generic tests with arguments, not just `accepted_values`.

## Prevention
- When writing dbt schema YAML files, always use the `arguments:` nesting for generic tests
- Run `dbt parse --no-partial-parse` to catch all deprecation warnings before they become errors
- Check the dbt changelog when upgrading versions for breaking changes to test syntax

## Tags
dbt, testing, yaml, deprecation, syntax, schema
@@ -1,7 +1,15 @@
"""Database loaders for Toronto housing data."""

from .amenities import load_amenities, load_amenity_counts
from .base import bulk_insert, get_session, upsert_by_key
from .census import load_census_data
from .cmhc import load_cmhc_record, load_cmhc_rentals
from .cmhc_crosswalk import (
    build_cmhc_neighbourhood_crosswalk,
    disaggregate_zone_value,
    get_neighbourhood_weights_for_zone,
)
from .crime import load_crime_data
from .dimensions import (
    generate_date_key,
    load_cmhc_zones,
@@ -24,4 +32,13 @@ __all__ = [
    # Fact loaders
    "load_cmhc_rentals",
    "load_cmhc_record",
    # Phase 3 loaders
    "load_census_data",
    "load_crime_data",
    "load_amenities",
    "load_amenity_counts",
    # CMHC crosswalk
    "build_cmhc_neighbourhood_crosswalk",
    "get_neighbourhood_weights_for_zone",
    "disaggregate_zone_value",
]
93 portfolio_app/toronto/loaders/amenities.py Normal file
@@ -0,0 +1,93 @@
"""Loader for amenities data to fact_amenities table."""

from collections import Counter

from sqlalchemy.orm import Session

from portfolio_app.toronto.models import FactAmenities
from portfolio_app.toronto.schemas import AmenityCount, AmenityRecord

from .base import get_session, upsert_by_key


def load_amenities(
    records: list[AmenityRecord],
    year: int,
    session: Session | None = None,
) -> int:
    """Load amenity records to fact_amenities table.

    Aggregates individual amenity records into counts by neighbourhood
    and amenity type before loading.

    Args:
        records: List of validated AmenityRecord schemas.
        year: Year to associate with the amenity counts.
        session: Optional existing session.

    Returns:
        Number of records loaded (inserted + updated).
    """
    # Aggregate records by neighbourhood and amenity type
    counts: Counter[tuple[int, str]] = Counter()
    for r in records:
        key = (r.neighbourhood_id, r.amenity_type.value)
        counts[key] += 1

    # Convert to AmenityCount schemas then to models
    def _load(sess: Session) -> int:
        models = []
        for (neighbourhood_id, amenity_type), count in counts.items():
            model = FactAmenities(
                neighbourhood_id=neighbourhood_id,
                amenity_type=amenity_type,
                count=count,
                year=year,
            )
            models.append(model)

        inserted, updated = upsert_by_key(
            sess, FactAmenities, models, ["neighbourhood_id", "amenity_type", "year"]
        )
        return inserted + updated

    if session:
        return _load(session)
    with get_session() as sess:
        return _load(sess)


def load_amenity_counts(
    records: list[AmenityCount],
    session: Session | None = None,
) -> int:
    """Load pre-aggregated amenity counts to fact_amenities table.

    Args:
        records: List of validated AmenityCount schemas.
        session: Optional existing session.

    Returns:
        Number of records loaded (inserted + updated).
    """

    def _load(sess: Session) -> int:
        models = []
        for r in records:
            model = FactAmenities(
                neighbourhood_id=r.neighbourhood_id,
                amenity_type=r.amenity_type.value,
                count=r.count,
                year=r.year,
            )
            models.append(model)

        inserted, updated = upsert_by_key(
            sess, FactAmenities, models, ["neighbourhood_id", "amenity_type", "year"]
        )
        return inserted + updated

    if session:
        return _load(session)
    with get_session() as sess:
        return _load(sess)
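The aggregation step in `load_amenities` can be exercised standalone; a sketch with plain tuples standing in for `AmenityRecord` objects (the tuples are illustrative, not the real schema):

```python
from collections import Counter

# (neighbourhood_id, amenity_type) pairs standing in for AmenityRecord rows
records = [(1, "park"), (1, "park"), (1, "school"), (2, "park")]

counts: Counter[tuple[int, str]] = Counter()
for neighbourhood_id, amenity_type in records:
    counts[(neighbourhood_id, amenity_type)] += 1
```

Each `(neighbourhood, type)` key ends up with its occurrence count, which then maps one-to-one onto `fact_amenities` rows.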
68 portfolio_app/toronto/loaders/census.py Normal file
@@ -0,0 +1,68 @@
"""Loader for census data to fact_census table."""

from sqlalchemy.orm import Session

from portfolio_app.toronto.models import FactCensus
from portfolio_app.toronto.schemas import CensusRecord

from .base import get_session, upsert_by_key


def load_census_data(
    records: list[CensusRecord],
    session: Session | None = None,
) -> int:
    """Load census records to fact_census table.

    Args:
        records: List of validated CensusRecord schemas.
        session: Optional existing session.

    Returns:
        Number of records loaded (inserted + updated).
    """

    def _load(sess: Session) -> int:
        models = []
        for r in records:
            model = FactCensus(
                neighbourhood_id=r.neighbourhood_id,
                census_year=r.census_year,
                population=r.population,
                population_density=float(r.population_density)
                if r.population_density
                else None,
                median_household_income=float(r.median_household_income)
                if r.median_household_income
                else None,
                average_household_income=float(r.average_household_income)
                if r.average_household_income
                else None,
                unemployment_rate=float(r.unemployment_rate)
                if r.unemployment_rate
                else None,
                pct_bachelors_or_higher=float(r.pct_bachelors_or_higher)
                if r.pct_bachelors_or_higher
                else None,
                pct_owner_occupied=float(r.pct_owner_occupied)
                if r.pct_owner_occupied
                else None,
                pct_renter_occupied=float(r.pct_renter_occupied)
                if r.pct_renter_occupied
                else None,
                median_age=float(r.median_age) if r.median_age else None,
                average_dwelling_value=float(r.average_dwelling_value)
                if r.average_dwelling_value
                else None,
            )
            models.append(model)

        inserted, updated = upsert_by_key(
            sess, FactCensus, models, ["neighbourhood_id", "census_year"]
        )
        return inserted + updated

    if session:
        return _load(session)
    with get_session() as sess:
        return _load(sess)
131 portfolio_app/toronto/loaders/cmhc_crosswalk.py Normal file
@@ -0,0 +1,131 @@
"""Loader for CMHC zone to neighbourhood crosswalk with area weights."""

from sqlalchemy import text
from sqlalchemy.orm import Session

from .base import get_session


def build_cmhc_neighbourhood_crosswalk(
    session: Session | None = None,
) -> int:
    """Calculate area overlap weights between CMHC zones and neighbourhoods.

    Uses PostGIS ST_Intersection and ST_Area functions to compute the
    proportion of each CMHC zone that overlaps with each neighbourhood.
    This enables disaggregation of CMHC zone-level data to neighbourhood level.

    The function is idempotent - it clears existing crosswalk data before
    rebuilding.

    Args:
        session: Optional existing session.

    Returns:
        Number of bridge records created.

    Note:
        Requires both dim_cmhc_zone and dim_neighbourhood tables to have
        geometry columns populated with valid PostGIS geometries.
    """

    def _build(sess: Session) -> int:
        # Clear existing crosswalk data
        sess.execute(text("DELETE FROM bridge_cmhc_neighbourhood"))

        # Calculate overlap weights using PostGIS
        # Weight = area of intersection / total area of CMHC zone
        crosswalk_query = text(
            """
            INSERT INTO bridge_cmhc_neighbourhood (cmhc_zone_code, neighbourhood_id, weight)
            SELECT
                z.zone_code,
                n.neighbourhood_id,
                CASE
                    WHEN ST_Area(z.geometry::geography) > 0 THEN
                        ST_Area(ST_Intersection(z.geometry, n.geometry)::geography) /
                        ST_Area(z.geometry::geography)
                    ELSE 0
                END as weight
            FROM dim_cmhc_zone z
            JOIN dim_neighbourhood n
                ON ST_Intersects(z.geometry, n.geometry)
            WHERE
                z.geometry IS NOT NULL
                AND n.geometry IS NOT NULL
                AND ST_Area(ST_Intersection(z.geometry, n.geometry)::geography) > 0
            """
        )

        sess.execute(crosswalk_query)

        # Count records created
        count_result = sess.execute(
            text("SELECT COUNT(*) FROM bridge_cmhc_neighbourhood")
        )
        count = count_result.scalar() or 0

        return int(count)

    if session:
        return _build(session)
    with get_session() as sess:
        return _build(sess)


def get_neighbourhood_weights_for_zone(
    zone_code: str,
    session: Session | None = None,
) -> list[tuple[int, float]]:
    """Get neighbourhood weights for a specific CMHC zone.

    Args:
        zone_code: CMHC zone code.
        session: Optional existing session.

    Returns:
        List of (neighbourhood_id, weight) tuples.
    """

    def _get(sess: Session) -> list[tuple[int, float]]:
        result = sess.execute(
            text(
                """
                SELECT neighbourhood_id, weight
                FROM bridge_cmhc_neighbourhood
                WHERE cmhc_zone_code = :zone_code
                ORDER BY weight DESC
                """
            ),
            {"zone_code": zone_code},
        )
        return [(int(row[0]), float(row[1])) for row in result]

    if session:
        return _get(session)
    with get_session() as sess:
        return _get(sess)


def disaggregate_zone_value(
    zone_code: str,
    value: float,
    session: Session | None = None,
) -> dict[int, float]:
    """Disaggregate a CMHC zone value to neighbourhoods using weights.

    Args:
        zone_code: CMHC zone code.
        value: Value to disaggregate (e.g., average rent).
        session: Optional existing session.

    Returns:
        Dictionary mapping neighbourhood_id to weighted value.

    Note:
        For averages (like rent), the weighted value represents the
        contribution from this zone. To get a neighbourhood's total,
        sum contributions from all overlapping zones.
    """
    weights = get_neighbourhood_weights_for_zone(zone_code, session)
    return {neighbourhood_id: value * weight for neighbourhood_id, weight in weights}
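`disaggregate_zone_value` is pure arithmetic once the weights are known, so its behaviour can be checked without a database. A sketch with made-up weights (the real weights come from the PostGIS crosswalk, and sum to 1 per zone when the zone lies fully inside the city):

```python
def disaggregate(value: float, weights: dict[int, float]) -> dict[int, float]:
    """Split a zone-level value across neighbourhoods by area weight."""
    return {nid: value * w for nid, w in weights.items()}


# Hypothetical zone average rent of $2000 split across three neighbourhoods
shares = disaggregate(2000.0, {101: 0.6, 102: 0.3, 103: 0.1})
```

Because the weights sum to 1, the shares sum back to the original zone value, which is the invariant that makes the disaggregation lossless.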
45 portfolio_app/toronto/loaders/crime.py Normal file
@@ -0,0 +1,45 @@
"""Loader for crime data to fact_crime table."""

from sqlalchemy.orm import Session

from portfolio_app.toronto.models import FactCrime
from portfolio_app.toronto.schemas import CrimeRecord

from .base import get_session, upsert_by_key


def load_crime_data(
    records: list[CrimeRecord],
    session: Session | None = None,
) -> int:
    """Load crime records to fact_crime table.

    Args:
        records: List of validated CrimeRecord schemas.
        session: Optional existing session.

    Returns:
        Number of records loaded (inserted + updated).
    """

    def _load(sess: Session) -> int:
        models = []
        for r in records:
            model = FactCrime(
                neighbourhood_id=r.neighbourhood_id,
                year=r.year,
                crime_type=r.crime_type.value,
                count=r.count,
                rate_per_100k=float(r.rate_per_100k) if r.rate_per_100k else None,
            )
            models.append(model)

        inserted, updated = upsert_by_key(
            sess, FactCrime, models, ["neighbourhood_id", "year", "crime_type"]
        )
        return inserted + updated

    if session:
        return _load(session)
    with get_session() as sess:
        return _load(sess)
@@ -7,7 +7,13 @@ from .dimensions import (
    DimPolicyEvent,
    DimTime,
)
from .facts import FactRentals
from .facts import (
    BridgeCMHCNeighbourhood,
    FactAmenities,
    FactCensus,
    FactCrime,
    FactRentals,
)

__all__ = [
    # Base
@@ -22,4 +28,9 @@ __all__ = [
    "DimPolicyEvent",
    # Facts
    "FactRentals",
    "FactCensus",
    "FactCrime",
    "FactAmenities",
    # Bridge tables
    "BridgeCMHCNeighbourhood",
]
@@ -1,11 +1,117 @@
"""SQLAlchemy models for fact tables."""

from sqlalchemy import ForeignKey, Integer, Numeric, String
from sqlalchemy import ForeignKey, Index, Integer, Numeric, String
from sqlalchemy.orm import Mapped, mapped_column, relationship

from .base import Base


class BridgeCMHCNeighbourhood(Base):
    """Bridge table for CMHC zone to neighbourhood mapping with area weights.

    Enables disaggregation of CMHC zone-level rental data to neighbourhood level
    using area-based proportional weights computed via PostGIS.
    """

    __tablename__ = "bridge_cmhc_neighbourhood"

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    cmhc_zone_code: Mapped[str] = mapped_column(String(10), nullable=False)
    neighbourhood_id: Mapped[int] = mapped_column(Integer, nullable=False)
    weight: Mapped[float] = mapped_column(
        Numeric(5, 4), nullable=False
    )  # 0.0000 to 1.0000

    __table_args__ = (
        Index("ix_bridge_cmhc_zone", "cmhc_zone_code"),
        Index("ix_bridge_neighbourhood", "neighbourhood_id"),
    )


class FactCensus(Base):
    """Census statistics by neighbourhood and year.

    Grain: One row per neighbourhood per census year.
    """

    __tablename__ = "fact_census"

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    neighbourhood_id: Mapped[int] = mapped_column(Integer, nullable=False)
    census_year: Mapped[int] = mapped_column(Integer, nullable=False)
    population: Mapped[int | None] = mapped_column(Integer, nullable=True)
    population_density: Mapped[float | None] = mapped_column(
        Numeric(10, 2), nullable=True
    )
    median_household_income: Mapped[float | None] = mapped_column(
        Numeric(12, 2), nullable=True
    )
    average_household_income: Mapped[float | None] = mapped_column(
        Numeric(12, 2), nullable=True
    )
    unemployment_rate: Mapped[float | None] = mapped_column(
        Numeric(5, 2), nullable=True
    )
    pct_bachelors_or_higher: Mapped[float | None] = mapped_column(
        Numeric(5, 2), nullable=True
    )
    pct_owner_occupied: Mapped[float | None] = mapped_column(
        Numeric(5, 2), nullable=True
    )
    pct_renter_occupied: Mapped[float | None] = mapped_column(
        Numeric(5, 2), nullable=True
    )
    median_age: Mapped[float | None] = mapped_column(Numeric(5, 2), nullable=True)
    average_dwelling_value: Mapped[float | None] = mapped_column(
        Numeric(12, 2), nullable=True
    )

    __table_args__ = (
        Index("ix_fact_census_neighbourhood_year", "neighbourhood_id", "census_year"),
    )


class FactCrime(Base):
    """Crime statistics by neighbourhood and year.

    Grain: One row per neighbourhood per year per crime type.
    """

    __tablename__ = "fact_crime"

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    neighbourhood_id: Mapped[int] = mapped_column(Integer, nullable=False)
    year: Mapped[int] = mapped_column(Integer, nullable=False)
    crime_type: Mapped[str] = mapped_column(String(50), nullable=False)
    count: Mapped[int] = mapped_column(Integer, nullable=False)
    rate_per_100k: Mapped[float | None] = mapped_column(Numeric(10, 2), nullable=True)

    __table_args__ = (
        Index("ix_fact_crime_neighbourhood_year", "neighbourhood_id", "year"),
        Index("ix_fact_crime_type", "crime_type"),
    )


class FactAmenities(Base):
    """Amenity counts by neighbourhood.

    Grain: One row per neighbourhood per amenity type per year.
    """

    __tablename__ = "fact_amenities"

    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
    neighbourhood_id: Mapped[int] = mapped_column(Integer, nullable=False)
    amenity_type: Mapped[str] = mapped_column(String(50), nullable=False)
    count: Mapped[int] = mapped_column(Integer, nullable=False)
    year: Mapped[int] = mapped_column(Integer, nullable=False)

    __table_args__ = (
        Index("ix_fact_amenities_neighbourhood_year", "neighbourhood_id", "year"),
        Index("ix_fact_amenities_type", "amenity_type"),
    )


class FactRentals(Base):
    """Fact table for CMHC rental market data.
@@ -6,6 +6,8 @@ from .geo import (
    NeighbourhoodParser,
    load_geojson,
)
from .toronto_open_data import TorontoOpenDataParser
from .toronto_police import TorontoPoliceParser

__all__ = [
    "CMHCParser",
@@ -13,4 +15,7 @@ __all__ = [
|
||||
"CMHCZoneParser",
|
||||
"NeighbourhoodParser",
|
||||
"load_geojson",
|
||||
# API parsers (Phase 3)
|
||||
"TorontoOpenDataParser",
|
||||
"TorontoPoliceParser",
|
||||
]
|
||||
|
||||
391 portfolio_app/toronto/parsers/toronto_open_data.py Normal file
@@ -0,0 +1,391 @@
"""Parser for Toronto Open Data CKAN API.

Fetches neighbourhood boundaries, census profiles, and amenities data
from the City of Toronto's Open Data Portal.

API Documentation: https://open.toronto.ca/dataset/
"""

import json
import logging
from decimal import Decimal
from pathlib import Path
from typing import Any

import httpx

from portfolio_app.toronto.schemas import (
    AmenityRecord,
    AmenityType,
    CensusRecord,
    NeighbourhoodRecord,
)

logger = logging.getLogger(__name__)


class TorontoOpenDataParser:
    """Parser for Toronto Open Data CKAN API.

    Provides methods to fetch and parse neighbourhood boundaries, census profiles,
    and amenities (parks, schools, childcare) from the Toronto Open Data portal.
    """

    BASE_URL = "https://ckan0.cf.opendata.inter.prod-toronto.ca"
    API_PATH = "/api/3/action"

    # Dataset package IDs
    DATASETS = {
        "neighbourhoods": "neighbourhoods",
        "neighbourhood_profiles": "neighbourhood-profiles",
        "parks": "parks",
        "schools": "school-locations-all-types",
        "childcare": "licensed-child-care-centres",
    }

    def __init__(
        self,
        cache_dir: Path | None = None,
        timeout: float = 30.0,
    ) -> None:
        """Initialize parser.

        Args:
            cache_dir: Optional directory for caching API responses.
            timeout: HTTP request timeout in seconds.
        """
        self._cache_dir = cache_dir
        self._timeout = timeout
        self._client: httpx.Client | None = None

    @property
    def client(self) -> httpx.Client:
        """Lazy-initialize HTTP client."""
        if self._client is None:
            self._client = httpx.Client(
                base_url=self.BASE_URL,
                timeout=self._timeout,
                headers={"Accept": "application/json"},
            )
        return self._client

    def close(self) -> None:
        """Close HTTP client."""
        if self._client is not None:
            self._client.close()
            self._client = None

    def __enter__(self) -> "TorontoOpenDataParser":
        return self

    def __exit__(self, *args: Any) -> None:
        self.close()

    def _get_package(self, package_id: str) -> dict[str, Any]:
        """Fetch package metadata from CKAN.

        Args:
            package_id: The package/dataset ID.

        Returns:
            Package metadata dictionary.
        """
        response = self.client.get(
            f"{self.API_PATH}/package_show",
            params={"id": package_id},
        )
        response.raise_for_status()
        result = response.json()

        if not result.get("success"):
            raise ValueError(f"CKAN API error: {result.get('error', 'Unknown error')}")

        return dict(result["result"])

    def _get_resource_url(
        self,
        package_id: str,
        format_filter: str = "geojson",
    ) -> str:
        """Get the download URL for a resource in a package.

        Args:
            package_id: The package/dataset ID.
            format_filter: Resource format to filter by (e.g., 'geojson', 'csv').

        Returns:
            Resource download URL.

        Raises:
            ValueError: If no matching resource is found.
        """
        package = self._get_package(package_id)
        resources = package.get("resources", [])

        for resource in resources:
            resource_format = resource.get("format", "").lower()
            if format_filter.lower() in resource_format:
                return str(resource["url"])

        available = [r.get("format") for r in resources]
        raise ValueError(
            f"No {format_filter} resource in {package_id}. Available: {available}"
        )

    def _fetch_geojson(self, package_id: str) -> dict[str, Any]:
        """Fetch GeoJSON data from a package.

        Args:
            package_id: The package/dataset ID.

        Returns:
            GeoJSON FeatureCollection.
        """
        # Check cache first
        if self._cache_dir:
            cache_file = self._cache_dir / f"{package_id}.geojson"
            if cache_file.exists():
                logger.debug(f"Loading {package_id} from cache")
                with open(cache_file, encoding="utf-8") as f:
                    return dict(json.load(f))

        url = self._get_resource_url(package_id, format_filter="geojson")
        logger.info(f"Fetching GeoJSON from {url}")

        response = self.client.get(url)
        response.raise_for_status()
        data = response.json()

        # Cache the response
        if self._cache_dir:
            self._cache_dir.mkdir(parents=True, exist_ok=True)
            cache_file = self._cache_dir / f"{package_id}.geojson"
            with open(cache_file, "w", encoding="utf-8") as f:
                json.dump(data, f)

        return dict(data)

    def _fetch_csv_as_json(self, package_id: str) -> list[dict[str, Any]]:
        """Fetch CSV data as JSON records via CKAN datastore.

        Args:
            package_id: The package/dataset ID.

        Returns:
            List of records as dictionaries.
        """
        package = self._get_package(package_id)
        resources = package.get("resources", [])

        # Find a datastore-enabled resource
        for resource in resources:
            if resource.get("datastore_active"):
                resource_id = resource["id"]
                break
        else:
            raise ValueError(f"No datastore resource in {package_id}")

        # Fetch all records via datastore_search
        records: list[dict[str, Any]] = []
        offset = 0
        limit = 1000

        while True:
            response = self.client.get(
                f"{self.API_PATH}/datastore_search",
                params={"id": resource_id, "limit": limit, "offset": offset},
            )
            response.raise_for_status()
            result = response.json()

            if not result.get("success"):
                raise ValueError(f"Datastore error: {result.get('error')}")

            batch = result["result"]["records"]
            records.extend(batch)

            if len(batch) < limit:
                break
            offset += limit

        return records

    def get_neighbourhoods(self) -> list[NeighbourhoodRecord]:
        """Fetch the 158 Toronto neighbourhood boundaries.

        Returns:
            List of validated NeighbourhoodRecord objects.
        """
        geojson = self._fetch_geojson(self.DATASETS["neighbourhoods"])
        features = geojson.get("features", [])

        records = []
        for feature in features:
            props = feature.get("properties", {})
            geometry = feature.get("geometry")

            # Extract area_id from various possible property names
            area_id = props.get("AREA_ID") or props.get("area_id")
            if area_id is None:
                # Try AREA_SHORT_CODE as fallback
                short_code = props.get("AREA_SHORT_CODE", "")
                if short_code:
                    # Extract numeric part
                    area_id = int("".join(c for c in short_code if c.isdigit()) or "0")

            area_name = (
                props.get("AREA_NAME")
                or props.get("area_name")
                or f"Neighbourhood {area_id}"
            )
            area_short_code = props.get("AREA_SHORT_CODE") or props.get(
                "area_short_code"
            )

            records.append(
                NeighbourhoodRecord(
                    area_id=int(area_id),
                    area_name=str(area_name),
                    area_short_code=area_short_code,
                    geometry=geometry,
                )
            )

        logger.info(f"Parsed {len(records)} neighbourhoods")
        return records

    def get_census_profiles(self, year: int = 2021) -> list[CensusRecord]:
        """Fetch neighbourhood census profiles.

        Note: Census profile data structure varies by year. This method
        extracts key demographic indicators where available.

        Args:
            year: Census year (2016 or 2021).

        Returns:
            List of validated CensusRecord objects.
        """
        # Census profiles are typically in CSV/datastore format
        try:
            raw_records = self._fetch_csv_as_json(
                self.DATASETS["neighbourhood_profiles"]
            )
        except ValueError as e:
            logger.warning(f"Could not fetch census profiles: {e}")
            return []

        # Census profiles are pivoted - rows are indicators, columns are neighbourhoods
        # This requires special handling based on the actual data structure
        logger.info(f"Fetched {len(raw_records)} census profile rows")

        # For now, return an empty list - actual implementation depends on data structure
        # TODO: Implement census profile parsing based on actual data format
        return []

    def get_parks(self) -> list[AmenityRecord]:
        """Fetch park locations.

        Returns:
            List of validated AmenityRecord objects.
        """
        return self._fetch_amenities(
            self.DATASETS["parks"],
            AmenityType.PARK,
            name_field="ASSET_NAME",
            address_field="ADDRESS_FULL",
        )

    def get_schools(self) -> list[AmenityRecord]:
        """Fetch school locations.

        Returns:
            List of validated AmenityRecord objects.
        """
        return self._fetch_amenities(
            self.DATASETS["schools"],
            AmenityType.SCHOOL,
            name_field="NAME",
            address_field="ADDRESS_FULL",
        )

    def get_childcare_centres(self) -> list[AmenityRecord]:
        """Fetch licensed childcare centre locations.

        Returns:
            List of validated AmenityRecord objects.
        """
        return self._fetch_amenities(
            self.DATASETS["childcare"],
            AmenityType.CHILDCARE,
            name_field="LOC_NAME",
            address_field="ADDRESS",
        )

    def _fetch_amenities(
        self,
        package_id: str,
        amenity_type: AmenityType,
        name_field: str,
        address_field: str,
    ) -> list[AmenityRecord]:
        """Fetch and parse amenity data from GeoJSON.

        Args:
            package_id: CKAN package ID.
            amenity_type: Type of amenity.
            name_field: Property name containing the amenity name.
            address_field: Property name containing the address.

        Returns:
            List of AmenityRecord objects.
        """
        try:
            geojson = self._fetch_geojson(package_id)
        except (httpx.HTTPError, ValueError) as e:
            logger.warning(f"Could not fetch {package_id}: {e}")
            return []

        features = geojson.get("features", [])
        records = []

        for feature in features:
            props = feature.get("properties", {})
            geometry = feature.get("geometry")

            # Get coordinates from geometry
            lat, lon = None, None
            if geometry and geometry.get("type") == "Point":
                coords = geometry.get("coordinates", [])
                if len(coords) >= 2:
                    lon, lat = coords[0], coords[1]

            # Try to determine neighbourhood_id
            # Many datasets include AREA_ID or similar
            neighbourhood_id = (
                props.get("AREA_ID")
                or props.get("area_id")
                or props.get("NEIGHBOURHOOD_ID")
                or 0  # Will need a spatial join if not available
            )

            name = props.get(name_field) or props.get(name_field.lower()) or "Unknown"
            address = props.get(address_field) or props.get(address_field.lower())

            # Skip if we don't have a neighbourhood assignment
            if neighbourhood_id == 0:
                continue

            records.append(
                AmenityRecord(
                    neighbourhood_id=int(neighbourhood_id),
                    amenity_type=amenity_type,
                    amenity_name=str(name)[:200],
                    address=str(address)[:300] if address else None,
                    latitude=Decimal(str(lat)) if lat else None,
                    longitude=Decimal(str(lon)) if lon else None,
                )
            )

        logger.info(f"Parsed {len(records)} {amenity_type.value} records")
        return records
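The `_fetch_csv_as_json` loop above pages through the CKAN datastore until a short batch signals the last page. That logic can be exercised in isolation; this is a minimal sketch in which `fetch_all` and its `fetch_page` callable are hypothetical stand-ins for the real `datastore_search` request, not part of the parser:

```python
from typing import Any, Callable


def fetch_all(
    fetch_page: Callable[[int, int], list[dict[str, Any]]],
    limit: int = 1000,
) -> list[dict[str, Any]]:
    """Page through a datastore-style endpoint until a short batch is returned."""
    records: list[dict[str, Any]] = []
    offset = 0
    while True:
        batch = fetch_page(offset, limit)
        records.extend(batch)
        if len(batch) < limit:  # a short (or empty) batch means we hit the last page
            break
        offset += limit
    return records


# Simulate a datastore holding 2500 rows, served at most 1000 at a time.
data = [{"_id": i} for i in range(2500)]
result = fetch_all(lambda off, lim: data[off : off + lim])
print(len(result))  # 2500, fetched in three pages (1000 + 1000 + 500)
```

Note the termination condition: a final page of exactly `limit` rows triggers one extra request that returns an empty batch, which is harmless but worth knowing when counting API calls.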
371 portfolio_app/toronto/parsers/toronto_police.py Normal file
@@ -0,0 +1,371 @@
"""Parser for Toronto Police crime data via CKAN API.

Fetches neighbourhood crime rates and major crime indicators from the
Toronto Police Service data hosted on the Toronto Open Data Portal.

Data Sources:
    - Neighbourhood Crime Rates: Annual crime rates by neighbourhood
    - Major Crime Indicators (MCI): Detailed incident-level data
"""

import contextlib
import json
import logging
from decimal import Decimal, InvalidOperation
from typing import Any

import httpx

from portfolio_app.toronto.schemas import CrimeRecord, CrimeType

logger = logging.getLogger(__name__)


# Mapping from Toronto Police crime categories to the CrimeType enum.
# Note: _normalize_crime_type() converts "-" and "_" to spaces before lookup,
# so the space-separated keys are the ones actually matched at runtime.
CRIME_TYPE_MAPPING: dict[str, CrimeType] = {
    "assault": CrimeType.ASSAULT,
    "assaults": CrimeType.ASSAULT,
    "auto theft": CrimeType.AUTO_THEFT,
    "autotheft": CrimeType.AUTO_THEFT,
    "auto_theft": CrimeType.AUTO_THEFT,
    "break and enter": CrimeType.BREAK_AND_ENTER,
    "breakenter": CrimeType.BREAK_AND_ENTER,
    "break_and_enter": CrimeType.BREAK_AND_ENTER,
    "homicide": CrimeType.HOMICIDE,
    "homicides": CrimeType.HOMICIDE,
    "robbery": CrimeType.ROBBERY,
    "robberies": CrimeType.ROBBERY,
    "shooting": CrimeType.SHOOTING,
    "shootings": CrimeType.SHOOTING,
    "theft over": CrimeType.THEFT_OVER,
    "theftover": CrimeType.THEFT_OVER,
    "theft_over": CrimeType.THEFT_OVER,
    "theft from motor vehicle": CrimeType.THEFT_FROM_MOTOR_VEHICLE,
    "theft from mv": CrimeType.THEFT_FROM_MOTOR_VEHICLE,  # normalized form of THEFT_FROM_MV columns
    "theftfrommv": CrimeType.THEFT_FROM_MOTOR_VEHICLE,
    "theft_from_mv": CrimeType.THEFT_FROM_MOTOR_VEHICLE,
}


def _normalize_crime_type(crime_str: str) -> CrimeType:
    """Normalize a crime type string to a CrimeType enum.

    Args:
        crime_str: Raw crime type string from the data source.

    Returns:
        Matched CrimeType enum value, or CrimeType.OTHER if no match.
    """
    normalized = crime_str.lower().strip().replace("-", " ").replace("_", " ")
    return CRIME_TYPE_MAPPING.get(normalized, CrimeType.OTHER)


class TorontoPoliceParser:
    """Parser for Toronto Police crime data via CKAN API.

    Crime data is hosted on the Toronto Open Data Portal but sourced from
    the Toronto Police Service.
    """

    BASE_URL = "https://ckan0.cf.opendata.inter.prod-toronto.ca"
    API_PATH = "/api/3/action"

    # Dataset package IDs
    DATASETS = {
        "crime_rates": "neighbourhood-crime-rates",
        "mci": "major-crime-indicators",
        "shootings": "shootings-firearm-discharges",
    }

    def __init__(self, timeout: float = 30.0) -> None:
        """Initialize parser.

        Args:
            timeout: HTTP request timeout in seconds.
        """
        self._timeout = timeout
        self._client: httpx.Client | None = None

    @property
    def client(self) -> httpx.Client:
        """Lazy-initialize HTTP client."""
        if self._client is None:
            self._client = httpx.Client(
                base_url=self.BASE_URL,
                timeout=self._timeout,
                headers={"Accept": "application/json"},
            )
        return self._client

    def close(self) -> None:
        """Close HTTP client."""
        if self._client is not None:
            self._client.close()
            self._client = None

    def __enter__(self) -> "TorontoPoliceParser":
        return self

    def __exit__(self, *args: Any) -> None:
        self.close()

    def _get_package(self, package_id: str) -> dict[str, Any]:
        """Fetch package metadata from CKAN."""
        response = self.client.get(
            f"{self.API_PATH}/package_show",
            params={"id": package_id},
        )
        response.raise_for_status()
        result = response.json()

        if not result.get("success"):
            raise ValueError(f"CKAN API error: {result.get('error', 'Unknown error')}")

        return dict(result["result"])

    def _fetch_datastore_records(
        self,
        package_id: str,
        filters: dict[str, Any] | None = None,
    ) -> list[dict[str, Any]]:
        """Fetch records from the CKAN datastore.

        Args:
            package_id: CKAN package ID.
            filters: Optional filters to apply.

        Returns:
            List of records as dictionaries.
        """
        package = self._get_package(package_id)
        resources = package.get("resources", [])

        # Find a datastore-enabled resource
        resource_id = None
        for resource in resources:
            if resource.get("datastore_active"):
                resource_id = resource["id"]
                break

        if not resource_id:
            raise ValueError(f"No datastore resource in {package_id}")

        # Fetch all records
        records: list[dict[str, Any]] = []
        offset = 0
        limit = 1000

        while True:
            params: dict[str, Any] = {
                "id": resource_id,
                "limit": limit,
                "offset": offset,
            }
            if filters:
                # CKAN expects JSON-encoded filters; str() would produce a
                # Python repr with single quotes, which the API rejects.
                params["filters"] = json.dumps(filters)

            response = self.client.get(
                f"{self.API_PATH}/datastore_search",
                params=params,
            )
            response.raise_for_status()
            result = response.json()

            if not result.get("success"):
                raise ValueError(f"Datastore error: {result.get('error')}")

            batch = result["result"]["records"]
            records.extend(batch)

            if len(batch) < limit:
                break
            offset += limit

        return records

    def get_crime_rates(
        self,
        years: list[int] | None = None,
    ) -> list[CrimeRecord]:
        """Fetch neighbourhood crime rates.

        The crime rates dataset contains annual counts and rates per 100k
        population for each neighbourhood.

        Args:
            years: Optional list of years to filter. If None, fetches all.

        Returns:
            List of validated CrimeRecord objects.
        """
        try:
            raw_records = self._fetch_datastore_records(self.DATASETS["crime_rates"])
        except (httpx.HTTPError, ValueError) as e:
            logger.warning(f"Could not fetch crime rates: {e}")
            return []

        records = []

        for row in raw_records:
            # Extract neighbourhood ID (Hood_ID maps to AREA_ID)
            hood_id = row.get("HOOD_ID") or row.get("Hood_ID") or row.get("hood_id")
            if not hood_id:
                continue

            try:
                neighbourhood_id = int(hood_id)
            except (ValueError, TypeError):
                continue

            # Crime rate data typically has columns like:
            # ASSAULT_2019, ASSAULT_RATE_2019, AUTOTHEFT_2020, etc.
            # We need to parse column names to extract crime type and year

            for col_name, value in row.items():
                if value is None or col_name in (
                    "_id",
                    "HOOD_ID",
                    "Hood_ID",
                    "hood_id",
                    "AREA_NAME",
                    "NEIGHBOURHOOD",
                ):
                    continue

                # Try to parse the column name for crime type and year
                # Pattern: CRIMETYPE_YEAR or CRIMETYPE_RATE_YEAR
                parts = col_name.upper().split("_")
                if len(parts) < 2:
                    continue

                # Check if the last part is a year
                try:
                    year = int(parts[-1])
                    if year < 2014 or year > 2030:
                        continue
                except ValueError:
                    continue

                # Filter by years if specified
                if years and year not in years:
                    continue

                # Check if this is a rate column
                is_rate = "RATE" in parts

                # Extract crime type (everything before RATE/year)
                if is_rate:
                    rate_idx = parts.index("RATE")
                    crime_type_str = "_".join(parts[:rate_idx])
                else:
                    crime_type_str = "_".join(parts[:-1])

                crime_type = _normalize_crime_type(crime_type_str)

                try:
                    numeric_value = Decimal(str(value))
                except (InvalidOperation, ValueError, TypeError):
                    # Decimal raises InvalidOperation (not ValueError) on bad input
                    continue

                if is_rate:
                    # This is a rate column - skip it; records are created from
                    # count columns, which look up the matching rate below
                    continue

                # Find the corresponding rate if available
                rate_col = f"{crime_type_str}_RATE_{year}"
                rate_value = row.get(rate_col)
                rate_per_100k = None
                if rate_value is not None:
                    with contextlib.suppress(InvalidOperation, ValueError, TypeError):
                        rate_per_100k = Decimal(str(rate_value))

                records.append(
                    CrimeRecord(
                        neighbourhood_id=neighbourhood_id,
                        year=year,
                        crime_type=crime_type,
                        count=int(numeric_value),
                        rate_per_100k=rate_per_100k,
                    )
                )

        logger.info(f"Parsed {len(records)} crime rate records")
        return records

    def get_major_crime_indicators(
        self,
        years: list[int] | None = None,
    ) -> list[CrimeRecord]:
        """Fetch major crime indicators (detailed MCI data).

        MCI data contains incident-level records that need to be aggregated
        by neighbourhood and year.

        Args:
            years: Optional list of years to filter.

        Returns:
            List of aggregated CrimeRecord objects.
        """
        try:
            raw_records = self._fetch_datastore_records(self.DATASETS["mci"])
        except (httpx.HTTPError, ValueError) as e:
            logger.warning(f"Could not fetch MCI data: {e}")
            return []

        # Aggregate counts by neighbourhood, year, and crime type
        aggregates: dict[tuple[int, int, CrimeType], int] = {}

        for row in raw_records:
            # Extract neighbourhood ID
            hood_id = (
                row.get("HOOD_158")
                or row.get("HOOD_140")
                or row.get("HOOD_ID")
                or row.get("Hood_ID")
            )
            if not hood_id:
                continue

            try:
                neighbourhood_id = int(hood_id)
            except (ValueError, TypeError):
                continue

            # Extract year from occurrence date
            occ_year = row.get("OCC_YEAR") or row.get("REPORT_YEAR")
            if not occ_year:
                continue

            try:
                year = int(occ_year)
                if year < 2014 or year > 2030:
                    continue
            except (ValueError, TypeError):
                continue

            # Filter by years if specified
            if years and year not in years:
                continue

            # Extract crime type
            mci_category = row.get("MCI_CATEGORY") or row.get("OFFENCE") or ""
            crime_type = _normalize_crime_type(str(mci_category))

            # Aggregate count
            key = (neighbourhood_id, year, crime_type)
            aggregates[key] = aggregates.get(key, 0) + 1

        # Convert aggregates to CrimeRecord objects
        records = [
            CrimeRecord(
                neighbourhood_id=neighbourhood_id,
                year=year,
                crime_type=crime_type,
                count=count,
                rate_per_100k=None,  # Would need population data to calculate
            )
            for (neighbourhood_id, year, crime_type), count in aggregates.items()
        ]

        logger.info(f"Parsed {len(records)} MCI records (aggregated)")
        return records
@@ -1,5 +1,6 @@
"""Pydantic schemas for Toronto housing data validation."""

from .amenities import AmenityCount, AmenityRecord, AmenityType
from .cmhc import BedroomType, CMHCAnnualSurvey, CMHCRentalRecord, ReliabilityCode
from .dimensions import (
    CMHCZone,
@@ -11,6 +12,7 @@ from .dimensions import (
    PolicyLevel,
    TimeDimension,
)
from .neighbourhood import CensusRecord, CrimeRecord, CrimeType, NeighbourhoodRecord

__all__ = [
    # CMHC
@@ -28,4 +30,13 @@ __all__ = [
    "PolicyCategory",
    "ExpectedDirection",
    "Confidence",
    # Neighbourhood data (Phase 3)
    "NeighbourhoodRecord",
    "CensusRecord",
    "CrimeRecord",
    "CrimeType",
    # Amenities (Phase 3)
    "AmenityType",
    "AmenityRecord",
    "AmenityCount",
]
60 portfolio_app/toronto/schemas/amenities.py Normal file
@@ -0,0 +1,60 @@
"""Pydantic schemas for Toronto amenities data.

Includes schemas for parks, schools, childcare centres, and transit stops.
"""

from decimal import Decimal
from enum import Enum

from pydantic import BaseModel, Field


class AmenityType(str, Enum):
    """Types of amenities tracked in the neighbourhood dashboard."""

    PARK = "park"
    SCHOOL = "school"
    CHILDCARE = "childcare"
    TRANSIT_STOP = "transit_stop"
    LIBRARY = "library"
    COMMUNITY_CENTRE = "community_centre"
    HOSPITAL = "hospital"


class AmenityRecord(BaseModel):
    """Amenity location record for a neighbourhood.

    Represents a single amenity (park, school, etc.) with its location
    and associated neighbourhood.
    """

    neighbourhood_id: int = Field(
        ge=1, le=200, description="Neighbourhood ID containing this amenity"
    )
    amenity_type: AmenityType = Field(description="Type of amenity")
    amenity_name: str = Field(max_length=200, description="Name of the amenity")
    address: str | None = Field(
        default=None, max_length=300, description="Street address"
    )
    latitude: Decimal | None = Field(
        default=None, ge=-90, le=90, description="Latitude (WGS84)"
    )
    longitude: Decimal | None = Field(
        default=None, ge=-180, le=180, description="Longitude (WGS84)"
    )

    model_config = {"str_strip_whitespace": True}


class AmenityCount(BaseModel):
    """Aggregated amenity count for a neighbourhood.

    Used for dashboard metrics showing amenity density per neighbourhood.
    """

    neighbourhood_id: int = Field(ge=1, le=200, description="Neighbourhood ID")
    amenity_type: AmenityType = Field(description="Type of amenity")
    count: int = Field(ge=0, description="Number of amenities of this type")
    year: int = Field(ge=2020, le=2030, description="Year of data snapshot")

    model_config = {"str_strip_whitespace": True}
106 portfolio_app/toronto/schemas/neighbourhood.py Normal file
@@ -0,0 +1,106 @@
"""Pydantic schemas for Toronto neighbourhood data.

Includes schemas for neighbourhood boundaries, census profiles, and crime statistics.
"""

from decimal import Decimal
from enum import Enum
from typing import Any

from pydantic import BaseModel, Field


class CrimeType(str, Enum):
    """Major crime indicator types from Toronto Police data."""

    ASSAULT = "assault"
    AUTO_THEFT = "auto_theft"
    BREAK_AND_ENTER = "break_and_enter"
    HOMICIDE = "homicide"
    ROBBERY = "robbery"
    SHOOTING = "shooting"
    THEFT_OVER = "theft_over"
    THEFT_FROM_MOTOR_VEHICLE = "theft_from_motor_vehicle"
    OTHER = "other"


class NeighbourhoodRecord(BaseModel):
    """Schema for Toronto neighbourhood boundary data.

    Based on the City of Toronto's 158 neighbourhoods dataset.
    AREA_ID maps to neighbourhood_id for consistency with police data (Hood_ID).
    """

    area_id: int = Field(description="AREA_ID from Toronto Open Data (1-158)")
    area_name: str = Field(max_length=100, description="Official neighbourhood name")
    area_short_code: str | None = Field(
        default=None, max_length=10, description="Short code (e.g., 'E01')"
    )
    geometry: dict[str, Any] | None = Field(
        default=None, description="GeoJSON geometry object"
    )

    model_config = {"str_strip_whitespace": True}


class CensusRecord(BaseModel):
    """Census profile data for a neighbourhood.

    Contains demographic and socioeconomic indicators from Statistics Canada
    census data, aggregated to the neighbourhood level.
    """

    neighbourhood_id: int = Field(
        ge=1, le=200, description="Neighbourhood ID (AREA_ID)"
    )
    census_year: int = Field(ge=2016, le=2030, description="Census year")
    population: int | None = Field(default=None, ge=0, description="Total population")
    population_density: Decimal | None = Field(
        default=None, ge=0, description="Population per square kilometre"
    )
    median_household_income: Decimal | None = Field(
        default=None, ge=0, description="Median household income (CAD)"
    )
    average_household_income: Decimal | None = Field(
        default=None, ge=0, description="Average household income (CAD)"
    )
    unemployment_rate: Decimal | None = Field(
        default=None, ge=0, le=100, description="Unemployment rate percentage"
    )
    pct_bachelors_or_higher: Decimal | None = Field(
        default=None, ge=0, le=100, description="Percentage with bachelor's degree+"
    )
    pct_owner_occupied: Decimal | None = Field(
        default=None, ge=0, le=100, description="Percentage owner-occupied dwellings"
    )
    pct_renter_occupied: Decimal | None = Field(
        default=None, ge=0, le=100, description="Percentage renter-occupied dwellings"
    )
    median_age: Decimal | None = Field(
        default=None, ge=0, le=120, description="Median age of residents"
    )
    average_dwelling_value: Decimal | None = Field(
        default=None, ge=0, description="Average dwelling value (CAD)"
    )

    model_config = {"str_strip_whitespace": True}


class CrimeRecord(BaseModel):
    """Crime statistics for a neighbourhood.

    Based on Toronto Police neighbourhood crime rates data.
    Hood_ID in the source data maps to neighbourhood_id (AREA_ID).
    """

    neighbourhood_id: int = Field(
        ge=1, le=200, description="Neighbourhood ID (Hood_ID -> AREA_ID)"
    )
    year: int = Field(ge=2014, le=2030, description="Year of crime statistics")
    crime_type: CrimeType = Field(description="Type of crime (MCI category)")
    count: int = Field(ge=0, description="Number of incidents")
    rate_per_100k: Decimal | None = Field(
        default=None, ge=0, description="Rate per 100,000 population"
    )

    model_config = {"str_strip_whitespace": True}
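The `rate_per_100k` field carries the semantics "incidents per 100,000 residents". When only counts and population are available (as in the aggregated MCI data), the rate can be derived directly. A minimal sketch — `rate_per_100k` here is a hypothetical standalone helper, not part of the schemas:

```python
from decimal import ROUND_HALF_UP, Decimal


def rate_per_100k(count: int, population: int) -> Decimal:
    """Incidents per 100,000 residents, rounded to two decimal places."""
    rate = Decimal(count) * Decimal(100_000) / Decimal(population)
    return rate.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(rate_per_100k(42, 16_500))  # 254.55
```

Using `Decimal` end to end matches the `Numeric(10, 2)` column and the schema's `Decimal` field, avoiding binary-float rounding drift between ingestion and storage.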