From 81993b23a7223509d898309b471561c10b646e12 Mon Sep 17 00:00:00 2001
From: lmiranda
Date: Fri, 16 Jan 2026 10:17:22 -0500
Subject: [PATCH] docs: Update CLAUDE.md and PROJECT_REFERENCE.md for neighbourhood transition

CLAUDE.md:
- Update project status to Sprint 9
- Remove TRREB references from data model section
- Update star schema to reflect current tables
- Simplify deferred features section
- Update reference documents

PROJECT_REFERENCE.md:
- Update import examples to neighbourhood-based
- Update data sources for neighbourhood dashboard
- Update geographic reality diagram
- Update star schema
- Modernize sprint overview
- Update scope boundaries
- Update success criteria with completed milestones
- Update reference documents

Closes #52

Co-Authored-By: Claude Opus 4.5
---
 CLAUDE.md                 |  61 ++++------------------
 docs/PROJECT_REFERENCE.md | 104 ++++++++++++++------------------------
 2 files changed, 48 insertions(+), 117 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 49be4ae..1a8f644 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -6,9 +6,8 @@ Working context for Claude Code on the Analytics Portfolio project.
 
 ## Project Status
 
-**Current Sprint**: 8 (Portfolio Website Expansion - Complete)
-**Next Sprint**: 9 (Neighbourhood Dashboard Transition)
-**Phase**: Transitioning to Toronto Neighbourhood Dashboard
+**Current Sprint**: 9 (Neighbourhood Dashboard Transition)
+**Phase**: Toronto Neighbourhood Dashboard
 **Branch**: `development` (feature branches merge here)
 
 ---
@@ -189,27 +188,20 @@ portfolio_app/
 ### Geographic Reality (Toronto Housing)
 
 ```
-TRREB Districts (~35) - Purchase data (W01, C01, E01...)
+City Neighbourhoods (158) - Primary geographic unit for analysis
 CMHC Zones (~20) - Rental data (Census Tract aligned)
-City Neighbourhoods (158) - Enrichment/overlay only
 ```
 
-**Critical**: These geographies do NOT align. Display as separate layers—do not force crosswalks.
-
 ### Star Schema
 
 | Table | Type | Keys |
 |-------|------|------|
-| `fact_purchases` | Fact | -> dim_time, dim_trreb_district |
 | `fact_rentals` | Fact | -> dim_time, dim_cmhc_zone |
 | `dim_time` | Dimension | date_key (PK) |
-| `dim_trreb_district` | Dimension | district_key (PK), geometry |
 | `dim_cmhc_zone` | Dimension | zone_key (PK), geometry |
 | `dim_neighbourhood` | Dimension | neighbourhood_id (PK), geometry |
 | `dim_policy_event` | Dimension | event_id (PK) |
 
-**V1 Rule**: `dim_neighbourhood` has NO FK to fact tables—reference overlay only.
-
 ### dbt Layers
 
 | Layer | Naming | Purpose |
@@ -220,37 +212,15 @@ City Neighbourhoods (158) - Enrichment/overlay only
 
 ---
 
-## DO NOT BUILD (Phase 1)
+## Deferred Features
 
 **Stop and flag if a task seems to require these**:
 
 | Feature | Reason |
 |---------|--------|
-| `bridge_district_neighbourhood` table | Area-weighted aggregation is Phase 4 |
-| Crime data integration | Deferred to Phase 4 |
 | Historical boundary reconciliation (140->158) | 2021+ data only for V1 |
-| ML prediction models | Energy project scope (Phase 3) |
-| Multi-project shared infrastructure | Build first, abstract second (Phase 2) |
-
----
-
-## Sprint 1 Deliverables
-
-| Category | Tasks |
-|----------|-------|
-| **Bootstrap** | Git init, pyproject.toml, .env.example, Makefile, CLAUDE.md |
-| **Infrastructure** | Docker Compose (PostgreSQL + PostGIS), scripts/ directory |
-| **App Foundation** | portfolio_app/ structure, config.py, error handling |
-| **Tests** | tests/ directory, conftest.py, pytest config |
-| **Data Acquisition** | Download TRREB PDFs, START boundary digitization (HUMAN task) |
-
-### Human Tasks (Cannot Automate)
-
-| Task | Tool | Effort |
-|------|------|--------|
-| Digitize TRREB district boundaries | QGIS | 3-4 hours |
-| Research policy events (10-20) | Manual | 2-3 hours |
-| Replace social link placeholders | Manual | 5 minutes |
+| ML prediction models | Energy project scope (future phase) |
+| Multi-project shared infrastructure | Build first, abstract second |
 
 ---
 
@@ -285,21 +255,10 @@ All scripts in `scripts/`:
 
 | Document | Location | Use When |
 |----------|----------|----------|
-| Full specification | `docs/PROJECT_REFERENCE.md` | Architecture decisions |
-| Data schemas (legacy) | `docs/toronto_housing_dashboard_spec_v5.md` | Reference only - being replaced |
-| WBS details (legacy) | `docs/wbs_sprint_plan_v4.md` | Reference only - being replaced |
-| **Neighbourhood Dashboard Vision** | `docs/changes/Change-Toronto-Analysis.md` | New dashboard specification |
-| **Implementation Plan** | `docs/changes/Change-Toronto-Analysis-Reviewed.md` | Sprint planning, cleanup tasks |
+| Project reference | `docs/PROJECT_REFERENCE.md` | Architecture decisions |
+| Dashboard vision | `docs/changes/Change-Toronto-Analysis.md` | Dashboard specification |
+| Implementation plan | `docs/changes/Change-Toronto-Analysis-Reviewed.md` | Sprint planning |
 
 ---
 
-## Pending Transition
-
-**Note**: This project is transitioning from a TRREB district-based housing dashboard to a comprehensive Toronto Neighbourhood Dashboard (158 neighbourhoods). See the Implementation Plan for details on:
-- Files being deprecated (TRREB parsers, schemas, loaders)
-- New data sources (Toronto Open Data, Toronto Police, CMHC APIs)
-- New dashboard tabs (Overview, Housing, Safety, Demographics, Amenities)
-
----
-
-*Last Updated: Sprint 8*
+*Last Updated: Sprint 9*
diff --git a/docs/PROJECT_REFERENCE.md b/docs/PROJECT_REFERENCE.md
index 50b29f5..43d2735 100644
--- a/docs/PROJECT_REFERENCE.md
+++ b/docs/PROJECT_REFERENCE.md
@@ -65,8 +65,8 @@ Two-project analytics portfolio demonstrating end-to-end data engineering, visua
 
 | Context | Style | Example |
 |---------|-------|---------|
-| Same directory | Single dot | `from .trreb import TRREBParser` |
-| Sibling directory | Double dot | `from ..schemas.trreb import TRREBRecord` |
+| Same directory | Single dot | `from .neighbourhood import NeighbourhoodParser` |
+| Sibling directory | Double dot | `from ..schemas.neighbourhood import CensusRecord` |
 | External packages | Absolute | `import pandas as pd` |
 
 ### Module Separation
@@ -75,7 +75,7 @@ Two-project analytics portfolio demonstrating end-to-end data engineering, visua
 |-----------|----------|---------|
 | `schemas/` | Pydantic models | Data validation |
 | `models/` | SQLAlchemy ORM | Database persistence |
-| `parsers/` | PDF/CSV extraction | Raw data ingestion |
+| `parsers/` | API/CSV extraction | Raw data ingestion |
 | `loaders/` | Database operations | Data loading |
 | `figures/` | Chart factories | Plotly figure generation |
 | `callbacks/` | Dash callbacks | Per-dashboard, in `pages/{dashboard}/callbacks/` |
@@ -145,45 +145,36 @@ portfolio_app/
 
 ---
 
-## Phase 1: Toronto Housing Dashboard
+## Phase 1: Toronto Neighbourhood Dashboard
 
 ### Data Sources
 
 | Track | Source | Format | Geography | Frequency |
 |-------|--------|--------|-----------|-----------|
-| Purchases | TRREB Monthly Reports | PDF | ~35 Districts | Monthly |
-| Rentals | CMHC Rental Market Survey | CSV | ~20 Zones | Annual |
-| Enrichment | City of Toronto Open Data | GeoJSON/CSV | 158 Neighbourhoods | Census |
+| Rentals | CMHC Rental Market Survey | API/CSV | ~20 Zones | Annual |
+| Neighbourhoods | City of Toronto Open Data | GeoJSON/CSV | 158 Neighbourhoods | Census |
 | Policy Events | Curated list | CSV | N/A | Event-based |
 
 ### Geographic Reality
 
 ```
 ┌─────────────────────────────────────────────────────────────────┐
-│ City of Toronto Neighbourhoods (158) │ ← Enrichment only
-├─────────────────────────────────────────────────────────────────┤
-│ TRREB Districts (~35) — W01, C01, E01, etc. │ ← Purchase data
+│ City of Toronto Neighbourhoods (158) │ ← Primary analysis unit
 ├─────────────────────────────────────────────────────────────────┤
 │ CMHC Zones (~20) — Census Tract aligned │ ← Rental data
 └─────────────────────────────────────────────────────────────────┘
 ```
 
-**Critical**: These geographies do NOT align. Display as separate layers with toggle—do not force crosswalks.
-
 ### Data Model (Star Schema)
 
 | Table | Type | Keys |
 |-------|------|------|
-| `fact_purchases` | Fact | → dim_time, dim_trreb_district |
 | `fact_rentals` | Fact | → dim_time, dim_cmhc_zone |
 | `dim_time` | Dimension | date_key (PK) |
-| `dim_trreb_district` | Dimension | district_key (PK), geometry |
 | `dim_cmhc_zone` | Dimension | zone_key (PK), geometry |
 | `dim_neighbourhood` | Dimension | neighbourhood_id (PK), geometry |
 | `dim_policy_event` | Dimension | event_id (PK) |
 
-**V1 Rule**: `dim_neighbourhood` has NO FK to fact tables—reference overlay only.
-
 ### dbt Layer Structure
 
 | Layer | Naming | Purpose |
@@ -198,31 +189,11 @@ portfolio_app/
 
 | Sprint | Focus | Milestone |
 |--------|-------|-----------|
-| 1 | Project bootstrap, start TRREB digitization | — |
-| 2 | Bio page, data acquisition | **Launch 1: Bio Live** |
-| 3 | Parsers, schemas, models | — |
-| 4 | Loaders, dbt | — |
-| 5 | Visualization | — |
-| 6 | Polish, deploy dashboard | **Launch 2: Dashboard Live** |
-| 7 | Buffer | — |
-
-### Sprint 1 Deliverables
-
-| Category | Tasks |
-|----------|-------|
-| **Bootstrap** | Git init, pyproject.toml, .env.example, Makefile, CLAUDE.md |
-| **Infrastructure** | Docker Compose (PostgreSQL + PostGIS), scripts/ directory |
-| **App Foundation** | portfolio_app/ structure, config.py, error handling |
-| **Tests** | tests/ directory, conftest.py, pytest config |
-| **Data Acquisition** | Download TRREB PDFs, START boundary digitization (HUMAN task) |
-
-### Human Tasks (Cannot Automate)
-
-| Task | Tool | Effort |
-|------|------|--------|
-| Digitize TRREB district boundaries | QGIS | 3-4 hours |
-| Research policy events (10-20) | Manual research | 2-3 hours |
-| Replace social link placeholders | Manual | 5 minutes |
+| 1-6 | Foundation and initial dashboard | **Launch 1: Bio Live** |
+| 7 | Navigation & theme modernization | — |
+| 8 | Portfolio website expansion | **Launch 2: Website Live** |
+| 9 | Neighbourhood dashboard transition | Cleanup complete |
+| 10+ | Dashboard implementation | **Launch 3: Dashboard Live** |
 
 ---
 
@@ -230,27 +201,24 @@ portfolio_app/
 
 ### Phase 1 — Build These
 
-- Bio landing page with content from bio_content_v2.md
-- TRREB PDF parser
-- CMHC CSV processor
+- Bio landing page and portfolio website
+- CMHC rental data processor
+- Toronto neighbourhood data integration
 - PostgreSQL + PostGIS database layer
 - Star schema (facts + dimensions)
 - dbt models with tests
 - Choropleth visualization (Dash)
 - Policy event annotation layer
-- Neighbourhood overlay (toggle-able)
 
-### Phase 1 — Do NOT Build
+### Deferred Features
 
 | Feature | Reason | When |
 |---------|--------|------|
-| `bridge_district_neighbourhood` table | Area-weighted aggregation is Phase 4 | After Energy project |
-| Crime data integration | Deferred scope | Phase 4 |
-| Historical boundary reconciliation (140→158) | 2021+ data only for V1 | Phase 4 |
+| Historical boundary reconciliation (140→158) | 2021+ data only for V1 | Future phase |
 | ML prediction models | Energy project scope | Phase 3 |
-| Multi-project shared infrastructure | Build first, abstract second | Phase 2 |
+| Multi-project shared infrastructure | Build first, abstract second | Future |
 
-If a task seems to require Phase 3/4 features, **stop and flag it**.
+If a task seems to require deferred features, **stop and flag it**.
 
 ---
 
@@ -362,19 +330,24 @@ LOG_LEVEL=INFO
 
 ## Success Criteria
 
-### Launch 1 (Sprint 2)
-- [ ] Bio page accessible via HTTPS
-- [ ] All bio content rendered (from bio_content_v2.md)
-- [ ] No placeholder text visible
-- [ ] Mobile responsive
-- [ ] Social links functional
+### Launch 1 (Bio Live)
+- [x] Bio page accessible via HTTPS
+- [x] All bio content rendered
+- [x] No placeholder text visible
+- [x] Mobile responsive
+- [x] Social links functional
 
-### Launch 2 (Sprint 6)
-- [ ] Choropleth renders TRREB districts and CMHC zones
-- [ ] Purchase/rental mode toggle works
+### Launch 2 (Website Live)
+- [x] Full portfolio website with navigation
+- [x] About, Contact, Projects, Resume, Blog pages
+- [x] Dark mode theme support
+- [x] Sidebar navigation
+
+### Launch 3 (Dashboard Live)
+- [ ] Choropleth renders neighbourhoods and CMHC zones
+- [ ] Rental data visualization works
 - [ ] Time navigation works
 - [ ] Policy event markers visible
-- [ ] Neighbourhood overlay toggleable
 - [ ] Methodology documentation published
 - [ ] Data sources cited
 
@@ -386,11 +359,10 @@ For detailed specifications, see:
 
 | Document | Location | Use When |
 |----------|----------|----------|
-| Data schemas | `docs/toronto_housing_spec.md` | Parser/model tasks |
-| WBS details | `docs/wbs.md` | Sprint planning |
-| Bio content | `docs/bio_content.md` | Building home.py |
+| Dashboard vision | `docs/changes/Change-Toronto-Analysis.md` | Dashboard specification |
+| Implementation plan | `docs/changes/Change-Toronto-Analysis-Reviewed.md` | Sprint planning |
 
 ---
 
-*Reference Version: 1.0*
-*Created: January 2026*
+*Reference Version: 2.0*
+*Updated: Sprint 9*