docs: Update CLAUDE.md and PROJECT_REFERENCE.md for neighbourhood transition
CLAUDE.md:
- Update project status to Sprint 9
- Remove TRREB references from data model section
- Update star schema to reflect current tables
- Simplify deferred features section
- Update reference documents

PROJECT_REFERENCE.md:
- Update import examples to neighbourhood-based
- Update data sources for neighbourhood dashboard
- Update geographic reality diagram
- Update star schema
- Modernize sprint overview
- Update scope boundaries
- Update success criteria with completed milestones
- Update reference documents

Closes #52

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
61
CLAUDE.md
@@ -6,9 +6,8 @@ Working context for Claude Code on the Analytics Portfolio project.
 
 ## Project Status
 
-**Current Sprint**: 8 (Portfolio Website Expansion - Complete)
-**Next Sprint**: 9 (Neighbourhood Dashboard Transition)
-**Phase**: Transitioning to Toronto Neighbourhood Dashboard
+**Current Sprint**: 9 (Neighbourhood Dashboard Transition)
+**Phase**: Toronto Neighbourhood Dashboard
 **Branch**: `development` (feature branches merge here)
 
 ---
@@ -189,27 +188,20 @@ portfolio_app/
 ### Geographic Reality (Toronto Housing)
 
 ```
-TRREB Districts (~35) - Purchase data (W01, C01, E01...)
+City Neighbourhoods (158) - Primary geographic unit for analysis
 CMHC Zones (~20) - Rental data (Census Tract aligned)
-City Neighbourhoods (158) - Enrichment/overlay only
 ```
 
-**Critical**: These geographies do NOT align. Display as separate layers—do not force crosswalks.
-
 ### Star Schema
 
 | Table | Type | Keys |
 |-------|------|------|
-| `fact_purchases` | Fact | -> dim_time, dim_trreb_district |
 | `fact_rentals` | Fact | -> dim_time, dim_cmhc_zone |
 | `dim_time` | Dimension | date_key (PK) |
-| `dim_trreb_district` | Dimension | district_key (PK), geometry |
 | `dim_cmhc_zone` | Dimension | zone_key (PK), geometry |
 | `dim_neighbourhood` | Dimension | neighbourhood_id (PK), geometry |
 | `dim_policy_event` | Dimension | event_id (PK) |
 
-**V1 Rule**: `dim_neighbourhood` has NO FK to fact tables—reference overlay only.
-
 ### dbt Layers
 
 | Layer | Naming | Purpose |
@@ -220,37 +212,15 @@ City Neighbourhoods (158) - Enrichment/overlay only
 
 ---
 
-## DO NOT BUILD (Phase 1)
+## Deferred Features
 
 **Stop and flag if a task seems to require these**:
 
 | Feature | Reason |
 |---------|--------|
-| `bridge_district_neighbourhood` table | Area-weighted aggregation is Phase 4 |
-| Crime data integration | Deferred to Phase 4 |
 | Historical boundary reconciliation (140->158) | 2021+ data only for V1 |
-| ML prediction models | Energy project scope (Phase 3) |
-| Multi-project shared infrastructure | Build first, abstract second (Phase 2) |
+| ML prediction models | Energy project scope (future phase) |
+| Multi-project shared infrastructure | Build first, abstract second |
 
----
-
-## Sprint 1 Deliverables
-
-| Category | Tasks |
-|----------|-------|
-| **Bootstrap** | Git init, pyproject.toml, .env.example, Makefile, CLAUDE.md |
-| **Infrastructure** | Docker Compose (PostgreSQL + PostGIS), scripts/ directory |
-| **App Foundation** | portfolio_app/ structure, config.py, error handling |
-| **Tests** | tests/ directory, conftest.py, pytest config |
-| **Data Acquisition** | Download TRREB PDFs, START boundary digitization (HUMAN task) |
-
-### Human Tasks (Cannot Automate)
-
-| Task | Tool | Effort |
-|------|------|--------|
-| Digitize TRREB district boundaries | QGIS | 3-4 hours |
-| Research policy events (10-20) | Manual | 2-3 hours |
-| Replace social link placeholders | Manual | 5 minutes |
-
 ---
 
@@ -285,21 +255,10 @@ All scripts in `scripts/`:
 
 | Document | Location | Use When |
 |----------|----------|----------|
-| Full specification | `docs/PROJECT_REFERENCE.md` | Architecture decisions |
-| Data schemas (legacy) | `docs/toronto_housing_dashboard_spec_v5.md` | Reference only - being replaced |
-| WBS details (legacy) | `docs/wbs_sprint_plan_v4.md` | Reference only - being replaced |
-| **Neighbourhood Dashboard Vision** | `docs/changes/Change-Toronto-Analysis.md` | New dashboard specification |
-| **Implementation Plan** | `docs/changes/Change-Toronto-Analysis-Reviewed.md` | Sprint planning, cleanup tasks |
+| Project reference | `docs/PROJECT_REFERENCE.md` | Architecture decisions |
+| Dashboard vision | `docs/changes/Change-Toronto-Analysis.md` | Dashboard specification |
+| Implementation plan | `docs/changes/Change-Toronto-Analysis-Reviewed.md` | Sprint planning |
 
 ---
 
-## Pending Transition
-
-**Note**: This project is transitioning from a TRREB district-based housing dashboard to a comprehensive Toronto Neighbourhood Dashboard (158 neighbourhoods). See the Implementation Plan for details on:
-- Files being deprecated (TRREB parsers, schemas, loaders)
-- New data sources (Toronto Open Data, Toronto Police, CMHC APIs)
-- New dashboard tabs (Overview, Housing, Safety, Demographics, Amenities)
-
----
-
-*Last Updated: Sprint 8*
+*Last Updated: Sprint 9*
docs/PROJECT_REFERENCE.md
@@ -65,8 +65,8 @@ Two-project analytics portfolio demonstrating end-to-end data engineering, visua
 
 | Context | Style | Example |
 |---------|-------|---------|
-| Same directory | Single dot | `from .trreb import TRREBParser` |
-| Sibling directory | Double dot | `from ..schemas.trreb import TRREBRecord` |
+| Same directory | Single dot | `from .neighbourhood import NeighbourhoodParser` |
+| Sibling directory | Double dot | `from ..schemas.neighbourhood import CensusRecord` |
 | External packages | Absolute | `import pandas as pd` |
 
 ### Module Separation
@@ -75,7 +75,7 @@ Two-project analytics portfolio demonstrating end-to-end data engineering, visua
 |-----------|----------|---------|
 | `schemas/` | Pydantic models | Data validation |
 | `models/` | SQLAlchemy ORM | Database persistence |
-| `parsers/` | PDF/CSV extraction | Raw data ingestion |
+| `parsers/` | API/CSV extraction | Raw data ingestion |
 | `loaders/` | Database operations | Data loading |
 | `figures/` | Chart factories | Plotly figure generation |
 | `callbacks/` | Dash callbacks | Per-dashboard, in `pages/{dashboard}/callbacks/` |
@@ -145,45 +145,36 @@ portfolio_app/
 
 ---
 
-## Phase 1: Toronto Housing Dashboard
+## Phase 1: Toronto Neighbourhood Dashboard
 
 ### Data Sources
 
 | Track | Source | Format | Geography | Frequency |
 |-------|--------|--------|-----------|-----------|
-| Purchases | TRREB Monthly Reports | PDF | ~35 Districts | Monthly |
-| Rentals | CMHC Rental Market Survey | CSV | ~20 Zones | Annual |
-| Enrichment | City of Toronto Open Data | GeoJSON/CSV | 158 Neighbourhoods | Census |
+| Rentals | CMHC Rental Market Survey | API/CSV | ~20 Zones | Annual |
+| Neighbourhoods | City of Toronto Open Data | GeoJSON/CSV | 158 Neighbourhoods | Census |
 | Policy Events | Curated list | CSV | N/A | Event-based |
 
 ### Geographic Reality
 
 ```
 ┌─────────────────────────────────────────────────────────────────┐
-│ City of Toronto Neighbourhoods (158) │ ← Enrichment only
-├─────────────────────────────────────────────────────────────────┤
-│ TRREB Districts (~35) — W01, C01, E01, etc. │ ← Purchase data
+│ City of Toronto Neighbourhoods (158) │ ← Primary analysis unit
 ├─────────────────────────────────────────────────────────────────┤
 │ CMHC Zones (~20) — Census Tract aligned │ ← Rental data
 └─────────────────────────────────────────────────────────────────┘
 ```
 
-**Critical**: These geographies do NOT align. Display as separate layers with toggle—do not force crosswalks.
-
 ### Data Model (Star Schema)
 
 | Table | Type | Keys |
 |-------|------|------|
-| `fact_purchases` | Fact | → dim_time, dim_trreb_district |
 | `fact_rentals` | Fact | → dim_time, dim_cmhc_zone |
 | `dim_time` | Dimension | date_key (PK) |
-| `dim_trreb_district` | Dimension | district_key (PK), geometry |
 | `dim_cmhc_zone` | Dimension | zone_key (PK), geometry |
 | `dim_neighbourhood` | Dimension | neighbourhood_id (PK), geometry |
 | `dim_policy_event` | Dimension | event_id (PK) |
 
-**V1 Rule**: `dim_neighbourhood` has NO FK to fact tables—reference overlay only.
-
 ### dbt Layer Structure
 
 | Layer | Naming | Purpose |
@@ -198,31 +189,11 @@ portfolio_app/
 
 | Sprint | Focus | Milestone |
 |--------|-------|-----------|
-| 1 | Project bootstrap, start TRREB digitization | — |
-| 2 | Bio page, data acquisition | **Launch 1: Bio Live** |
-| 3 | Parsers, schemas, models | — |
-| 4 | Loaders, dbt | — |
-| 5 | Visualization | — |
-| 6 | Polish, deploy dashboard | **Launch 2: Dashboard Live** |
-| 7 | Buffer | — |
+| 1-6 | Foundation and initial dashboard | **Launch 1: Bio Live** |
+| 7 | Navigation & theme modernization | — |
+| 8 | Portfolio website expansion | **Launch 2: Website Live** |
+| 9 | Neighbourhood dashboard transition | Cleanup complete |
+| 10+ | Dashboard implementation | **Launch 3: Dashboard Live** |
 
-### Sprint 1 Deliverables
-
-| Category | Tasks |
-|----------|-------|
-| **Bootstrap** | Git init, pyproject.toml, .env.example, Makefile, CLAUDE.md |
-| **Infrastructure** | Docker Compose (PostgreSQL + PostGIS), scripts/ directory |
-| **App Foundation** | portfolio_app/ structure, config.py, error handling |
-| **Tests** | tests/ directory, conftest.py, pytest config |
-| **Data Acquisition** | Download TRREB PDFs, START boundary digitization (HUMAN task) |
-
-### Human Tasks (Cannot Automate)
-
-| Task | Tool | Effort |
-|------|------|--------|
-| Digitize TRREB district boundaries | QGIS | 3-4 hours |
-| Research policy events (10-20) | Manual research | 2-3 hours |
-| Replace social link placeholders | Manual | 5 minutes |
-
 ---
 
@@ -230,27 +201,24 @@ portfolio_app/
 
 ### Phase 1 — Build These
 
-- Bio landing page with content from bio_content_v2.md
-- TRREB PDF parser
-- CMHC CSV processor
+- Bio landing page and portfolio website
+- CMHC rental data processor
+- Toronto neighbourhood data integration
 - PostgreSQL + PostGIS database layer
 - Star schema (facts + dimensions)
 - dbt models with tests
 - Choropleth visualization (Dash)
 - Policy event annotation layer
-- Neighbourhood overlay (toggle-able)
 
-### Phase 1 — Do NOT Build
+### Deferred Features
 
 | Feature | Reason | When |
 |---------|--------|------|
-| `bridge_district_neighbourhood` table | Area-weighted aggregation is Phase 4 | After Energy project |
-| Crime data integration | Deferred scope | Phase 4 |
-| Historical boundary reconciliation (140→158) | 2021+ data only for V1 | Phase 4 |
+| Historical boundary reconciliation (140→158) | 2021+ data only for V1 | Future phase |
 | ML prediction models | Energy project scope | Phase 3 |
-| Multi-project shared infrastructure | Build first, abstract second | Phase 2 |
+| Multi-project shared infrastructure | Build first, abstract second | Future |
 
-If a task seems to require Phase 3/4 features, **stop and flag it**.
+If a task seems to require deferred features, **stop and flag it**.
 
 ---
 
@@ -362,19 +330,24 @@ LOG_LEVEL=INFO
 
 ## Success Criteria
 
-### Launch 1 (Sprint 2)
-- [ ] Bio page accessible via HTTPS
-- [ ] All bio content rendered (from bio_content_v2.md)
-- [ ] No placeholder text visible
-- [ ] Mobile responsive
-- [ ] Social links functional
+### Launch 1 (Bio Live)
+- [x] Bio page accessible via HTTPS
+- [x] All bio content rendered
+- [x] No placeholder text visible
+- [x] Mobile responsive
+- [x] Social links functional
 
-### Launch 2 (Sprint 6)
-- [ ] Choropleth renders TRREB districts and CMHC zones
-- [ ] Purchase/rental mode toggle works
+### Launch 2 (Website Live)
+- [x] Full portfolio website with navigation
+- [x] About, Contact, Projects, Resume, Blog pages
+- [x] Dark mode theme support
+- [x] Sidebar navigation
+
+### Launch 3 (Dashboard Live)
+- [ ] Choropleth renders neighbourhoods and CMHC zones
+- [ ] Rental data visualization works
 - [ ] Time navigation works
 - [ ] Policy event markers visible
-- [ ] Neighbourhood overlay toggleable
 - [ ] Methodology documentation published
 - [ ] Data sources cited
 
@@ -386,11 +359,10 @@ For detailed specifications, see:
 
 | Document | Location | Use When |
 |----------|----------|----------|
-| Data schemas | `docs/toronto_housing_spec.md` | Parser/model tasks |
-| WBS details | `docs/wbs.md` | Sprint planning |
-| Bio content | `docs/bio_content.md` | Building home.py |
+| Dashboard vision | `docs/changes/Change-Toronto-Analysis.md` | Dashboard specification |
+| Implementation plan | `docs/changes/Change-Toronto-Analysis-Reviewed.md` | Sprint planning |
 
 ---
 
-*Reference Version: 1.0*
-*Created: January 2026*
+*Reference Version: 2.0*
+*Updated: Sprint 9*
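
As a rough illustration of the star schema this commit leaves in place (`fact_rentals` keyed to `dim_time` and `dim_cmhc_zone`, with `dim_neighbourhood` and `dim_policy_event` as standalone dimensions), the sketch below builds the tables in an in-memory SQLite database. SQLite is a stand-in for the project's PostgreSQL + PostGIS layer, not what the project uses. Every column other than the keys named in the diff (`rental_id`, `avg_rent`, `year`, `zone_name`, `name`, `title`) and both helper functions are assumptions made for illustration, and `geometry` is a plain text placeholder.

```python
import sqlite3

# Hypothetical DDL mirroring the post-change star schema table.
# Only date_key, zone_key, neighbourhood_id, and event_id come from the diff;
# all other columns are illustrative assumptions.
DDL = """
CREATE TABLE dim_time (
    date_key INTEGER PRIMARY KEY,
    year INTEGER NOT NULL
);
CREATE TABLE dim_cmhc_zone (
    zone_key INTEGER PRIMARY KEY,
    zone_name TEXT NOT NULL,
    geometry TEXT  -- placeholder; the real project stores PostGIS geometry
);
CREATE TABLE dim_neighbourhood (
    neighbourhood_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    geometry TEXT
);
CREATE TABLE dim_policy_event (
    event_id INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE fact_rentals (
    rental_id INTEGER PRIMARY KEY,
    date_key INTEGER NOT NULL REFERENCES dim_time(date_key),
    zone_key INTEGER NOT NULL REFERENCES dim_cmhc_zone(zone_key),
    avg_rent REAL
);
"""


def build_schema() -> sqlite3.Connection:
    """Create the sketch schema in memory with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(DDL)
    return conn


def table_names(conn: sqlite3.Connection) -> list[str]:
    """Return the user tables in alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_schema()
    print(table_names(conn))
```

With foreign keys switched on, inserting a `fact_rentals` row whose `zone_key` has no match in `dim_cmhc_zone` raises `sqlite3.IntegrityError`, which is the behaviour the `Keys` column of the table implies.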