staging #96
76
CLAUDE.md
@@ -6,8 +6,9 @@ Working context for Claude Code on the Analytics Portfolio project.
|
||||
|
||||
## Project Status
|
||||
|
||||
**Current Sprint**: 7 (Navigation & Theme Modernization)
|
||||
**Phase**: 1 - Toronto Housing Dashboard
|
||||
**Current Sprint**: 8 (Portfolio Website Expansion - Complete)
|
||||
**Next Sprint**: 9 (Neighbourhood Dashboard Transition)
|
||||
**Phase**: Transitioning to Toronto Neighbourhood Dashboard
|
||||
**Branch**: `development` (feature branches merge here)
|
||||
|
||||
---
|
||||
@@ -33,7 +34,10 @@ make ci # Run all checks
|
||||
1. Create feature branch FROM `development`: `git checkout -b feature/{sprint}-{description}`
|
||||
2. Work and commit on feature branch
|
||||
3. Merge INTO `development` when complete
|
||||
4. `development` -> `staging` -> `main` for releases
|
||||
4. Delete the feature branch after merge (keep branches clean)
|
||||
5. `development` -> `staging` -> `main` for releases
|
||||
|
||||
**CRITICAL: NEVER DELETE the `development` branch. It is the main integration branch.**
|
||||
|
||||
---
|
||||
|
||||
@@ -43,8 +47,8 @@ make ci # Run all checks
|
||||
|
||||
| Context | Style | Example |
|
||||
|---------|-------|---------|
|
||||
| Same directory | Single dot | `from .trreb import TRREBParser` |
|
||||
| Sibling directory | Double dot | `from ..schemas.trreb import TRREBRecord` |
|
||||
| Same directory | Single dot | `from .neighbourhood import NeighbourhoodRecord` |
|
||||
| Sibling directory | Double dot | `from ..schemas.neighbourhood import CensusRecord` |
|
||||
| External packages | Absolute | `import pandas as pd` |
|
||||
|
||||
### Module Responsibilities
|
||||
@@ -53,7 +57,7 @@ make ci # Run all checks
|
||||
|-----------|----------|---------|
|
||||
| `schemas/` | Pydantic models | Data validation |
|
||||
| `models/` | SQLAlchemy ORM | Database persistence |
|
||||
| `parsers/` | PDF/CSV extraction | Raw data ingestion |
|
||||
| `parsers/` | API/CSV extraction | Raw data ingestion |
|
||||
| `loaders/` | Database operations | Data loading |
|
||||
| `figures/` | Chart factories | Plotly figure generation |
|
||||
| `callbacks/` | Dash callbacks | In `pages/{dashboard}/callbacks/` |
|
||||
@@ -101,18 +105,43 @@ portfolio_app/
|
||||
├── app.py # Dash app factory with Pages routing
|
||||
├── config.py # Pydantic BaseSettings
|
||||
├── assets/ # CSS, images (auto-served)
|
||||
│ └── sidebar.css # Navigation styling
|
||||
├── callbacks/ # Global callbacks
|
||||
│ ├── sidebar.py # Sidebar toggle
|
||||
│ └── theme.py # Dark/light theme
|
||||
├── pages/
|
||||
│ ├── home.py # Bio landing page -> /
|
||||
│ ├── about.py # About page -> /about
|
||||
│ ├── contact.py # Contact form -> /contact
|
||||
│ ├── health.py # Health endpoint -> /health
|
||||
│ ├── projects.py # Project showcase -> /projects
|
||||
│ ├── resume.py # Resume/CV -> /resume
|
||||
│ ├── blog/
|
||||
│ │ ├── index.py # Blog listing -> /blog
|
||||
│ │ └── article.py # Blog article -> /blog/{slug}
|
||||
│ └── toronto/
|
||||
│ ├── dashboard.py # Layout only -> /toronto
|
||||
│ └── callbacks/ # Interaction logic
|
||||
├── components/ # Shared UI (navbar, footer, cards)
|
||||
│ ├── dashboard.py # Dashboard -> /toronto
|
||||
│ ├── methodology.py # Methodology -> /toronto/methodology
|
||||
│ └── callbacks/ # Dashboard interactions
|
||||
├── components/ # Shared UI (sidebar, cards, controls)
|
||||
│ ├── metric_card.py # KPI card component
|
||||
│ ├── map_controls.py # Map control panel
|
||||
│ ├── sidebar.py # Navigation sidebar
|
||||
│ └── time_slider.py # Time range selector
|
||||
├── figures/ # Shared chart factories
|
||||
│ ├── choropleth.py # Map visualizations
|
||||
│ ├── summary_cards.py # KPI figures
|
||||
│ └── time_series.py # Trend charts
|
||||
├── content/ # Markdown content
|
||||
│ └── blog/ # Blog articles
|
||||
├── toronto/ # Toronto data logic
|
||||
│ ├── parsers/
|
||||
│ ├── loaders/
|
||||
│ ├── schemas/ # Pydantic
|
||||
│ └── models/ # SQLAlchemy
|
||||
│ ├── models/ # SQLAlchemy
|
||||
│ └── demo_data.py # Sample data
|
||||
├── utils/ # Utilities
|
||||
│ └── markdown_loader.py # Markdown processing
|
||||
└── errors/
|
||||
```
|
||||
|
||||
@@ -121,7 +150,15 @@ portfolio_app/
|
||||
| URL | Page | Sprint |
|
||||
|-----|------|--------|
|
||||
| `/` | Bio landing page | 2 |
|
||||
| `/toronto` | Toronto Housing Dashboard | 6 |
|
||||
| `/about` | About page | 8 |
|
||||
| `/contact` | Contact form | 8 |
|
||||
| `/health` | Health endpoint | 8 |
|
||||
| `/projects` | Project showcase | 8 |
|
||||
| `/resume` | Resume/CV | 8 |
|
||||
| `/blog` | Blog listing | 8 |
|
||||
| `/blog/{slug}` | Blog article | 8 |
|
||||
| `/toronto` | Toronto Dashboard | 6 |
|
||||
| `/toronto/methodology` | Dashboard methodology | 6 |
|
||||
|
||||
---
|
||||
|
||||
@@ -249,9 +286,20 @@ All scripts in `scripts/`:
|
||||
| Document | Location | Use When |
|
||||
|----------|----------|----------|
|
||||
| Full specification | `docs/PROJECT_REFERENCE.md` | Architecture decisions |
|
||||
| Data schemas | `docs/toronto_housing_dashboard_spec_v5.md` | Parser/model tasks |
|
||||
| WBS details | `docs/wbs_sprint_plan_v4.md` | Sprint planning |
|
||||
| Data schemas (legacy) | `docs/toronto_housing_dashboard_spec_v5.md` | Reference only - being replaced |
|
||||
| WBS details (legacy) | `docs/wbs_sprint_plan_v4.md` | Reference only - being replaced |
|
||||
| **Neighbourhood Dashboard Vision** | `docs/changes/Change-Toronto-Analysis.md` | New dashboard specification |
|
||||
| **Implementation Plan** | `docs/changes/Change-Toronto-Analysis-Reviewed.md` | Sprint planning, cleanup tasks |
|
||||
|
||||
---
|
||||
|
||||
*Last Updated: Sprint 7*
|
||||
## Pending Transition

**Note**: This project is transitioning from a TRREB district-based housing dashboard to a comprehensive Toronto Neighbourhood Dashboard (158 neighbourhoods). See the Implementation Plan for details on:
- Files being deprecated (TRREB parsers, schemas, loaders)
- New data sources (Toronto Open Data, Toronto Police, CMHC APIs)
- New dashboard tabs (Overview, Housing, Safety, Demographics, Amenities)

---

*Last Updated: Sprint 8*
|
||||
|
||||
759
docs/changes/Change-Toronto-Analysis-Reviewed.md
Normal file
@@ -0,0 +1,759 @@
|
||||
# Toronto Neighbourhood Dashboard — Implementation Plan
**Document Type:** Change Implementation Plan
**Target:** Transition from TRREB-based to Neighbourhood-based Dashboard
**Author:** Claude Code
**Version:** 1.0 | January 2026
**Status:** Awaiting Approval

---

## Executive Summary

This plan details the transition from the current TRREB district-based housing dashboard to a comprehensive Toronto Neighbourhood Dashboard built around the city's 158 official neighbourhoods. The change simplifies geographic alignment, improves data availability through open APIs, and expands analytical scope to include housing, safety, demographics, and amenities.

**Key Changes:**
- Geographic foundation shifts from TRREB districts (~35) to City Neighbourhoods (158)
- Data sources transition from PDF parsing to open APIs (Toronto Open Data, CMHC, Toronto Police)
- Dashboard expands from housing-only to 5 thematic tabs
- Star schema redesigned around neighbourhood as the central dimension

---

## Table of Contents

1. [Phase 1: Repository Cleanup](#phase-1-repository-cleanup)
2. [Phase 2: Documentation Updates](#phase-2-documentation-updates)
3. [Phase 3: Data Pipeline Implementation](#phase-3-data-pipeline-implementation)
4. [Phase 4: dbt Model Restructuring](#phase-4-dbt-model-restructuring)
5. [Phase 5: Dashboard Implementation](#phase-5-dashboard-implementation)
6. [Phase 6: Jupyter Notebooks](#phase-6-jupyter-notebooks)
7. [Phase 7: Final Documentation Review](#phase-7-final-documentation-review)
8. [Phase 8: Commit and Merge Strategy](#phase-8-commit-and-merge-strategy)

---

## Phase 1: Repository Cleanup

### 1.1 Files to DELETE (TRREB-Specific)

These files are specific to the old TRREB district-based approach and will be completely removed:

| File | Reason for Deletion |
|------|---------------------|
| `portfolio_app/toronto/schemas/trreb.py` | TRREB schema obsolete - replacing with neighbourhood-based |
| `portfolio_app/toronto/parsers/trreb.py` | PDF parsing no longer needed - using APIs |
| `portfolio_app/toronto/loaders/trreb.py` | TRREB loading logic obsolete |
| `dbt/models/staging/stg_trreb__purchases.sql` | TRREB staging model obsolete |
| `dbt/models/intermediate/int_purchases__monthly.sql` | TRREB-based intermediate obsolete |
| `dbt/models/marts/mart_toronto_purchases.sql` | Will be rebuilt for neighbourhood grain |

### 1.2 Files to MODIFY (Remove TRREB References)

| File | Changes Required |
|------|------------------|
| `portfolio_app/toronto/schemas/__init__.py` | Remove TRREB imports |
| `portfolio_app/toronto/parsers/__init__.py` | Remove TRREB parser imports |
| `portfolio_app/toronto/loaders/__init__.py` | Remove TRREB loader imports |
| `portfolio_app/toronto/models/facts.py` | Remove `FactPurchases` model (rebuild later) |
| `portfolio_app/toronto/models/dimensions.py` | Remove `DimTRREBDistrict` model |
| `portfolio_app/toronto/demo_data.py` | Remove TRREB demo districts, rebuild for neighbourhoods |
| `dbt/models/sources.yml` | Remove TRREB source definitions |
| `dbt/models/schema.yml` | Remove TRREB model documentation |

### 1.3 Files to KEEP (Reusable Infrastructure)

| File | Why Keep |
|------|----------|
| `portfolio_app/toronto/schemas/cmhc.py` | CMHC data still used (requires zone-to-neighbourhood mapping) |
| `portfolio_app/toronto/parsers/cmhc.py` | CMHC parser reusable with modifications |
| `portfolio_app/toronto/loaders/cmhc.py` | Loader patterns reusable |
| `portfolio_app/toronto/loaders/base.py` | Generic database utilities |
| `portfolio_app/toronto/loaders/dimensions.py` | Dimension loading patterns reusable |
| `portfolio_app/toronto/models/base.py` | SQLAlchemy base class |
| `portfolio_app/toronto/models/facts.py` | Keep `FactRentals`, refactor |
| `portfolio_app/toronto/models/dimensions.py` | Keep `DimTime`, `DimNeighbourhood`, refactor others |
| `portfolio_app/figures/*.py` | All chart factories reusable |
| `portfolio_app/components/*.py` | All UI components reusable |

### 1.4 Cleanup Commands

```bash
# Files to delete
rm portfolio_app/toronto/schemas/trreb.py
rm portfolio_app/toronto/parsers/trreb.py
rm portfolio_app/toronto/loaders/trreb.py
rm dbt/models/staging/stg_trreb__purchases.sql
rm dbt/models/intermediate/int_purchases__monthly.sql
rm dbt/models/marts/mart_toronto_purchases.sql
```

---

## Phase 2: Documentation Updates

### 2.1 Primary Documentation Files

| Document | Current State | Required Updates |
|----------|---------------|------------------|
| `CLAUDE.md` (project) | References TRREB, old sprint structure | Complete rewrite of Data Model section |
| `docs/PROJECT_REFERENCE.md` | Full spec references TRREB | Update architecture, data sources |
| `docs/toronto_housing_dashboard_spec_v5.md` | TRREB/CMHC spec | Replace with neighbourhood spec |
| `docs/wbs_sprint_plan_v4.md` | Old sprint plan | New sprint plan for neighbourhood implementation |

### 2.2 CLAUDE.md Updates Required

**Section: Data Model Overview**
- Remove: TRREB Districts (~35) reference
- Remove: "These geographies do NOT align" note (now unified)
- Update: Star schema to neighbourhood-centric model
- Update: dbt layers description

**Section: Star Schema**
Replace with:

| Table | Type | Keys |
|-------|------|------|
| `dim_neighbourhood` | Central Dimension | neighbourhood_id (PK), geometry |
| `dim_time` | Dimension | date_key (PK) |
| `dim_cmhc_zone` | Dimension | zone_key (PK) |
| `bridge_cmhc_neighbourhood` | Bridge | zone_key + neighbourhood_id (PK), area_weight |
| `fact_census` | Fact | -> dim_neighbourhood, dim_time |
| `fact_crime` | Fact | -> dim_neighbourhood, dim_time |
| `fact_rentals` | Fact | -> dim_cmhc_zone, dim_time |
| `fact_amenities` | Fact | -> dim_neighbourhood |

**Section: DO NOT BUILD**
Update to reflect new scope constraints.

**Section: Module Responsibilities**
Update parsers description: "API/CSV extraction" instead of "PDF/CSV extraction"

### 2.3 New Reference Documents to Create

| Document | Purpose |
|----------|---------|
| `docs/neighbourhood_dashboard_spec_v1.md` | New dashboard specification (from Change-Toronto-Analysis.md) |
| `docs/data_source_inventory.md` | API endpoints, data dictionaries, refresh schedules |
| `docs/cmhc_neighbourhood_crosswalk.md` | CMHC zone to neighbourhood mapping methodology |

---

## Phase 3: Data Pipeline Implementation

### 3.1 New Schema Files

#### `portfolio_app/toronto/schemas/neighbourhood.py`
```python
"""Pydantic schemas for neighbourhood-level data."""
from typing import Optional

from pydantic import BaseModel, Field


class NeighbourhoodRecord(BaseModel):
    """Core neighbourhood dimension record."""

    area_id: int = Field(..., description="Toronto Open Data AREA_ID")
    name: str
    population: Optional[int] = None
    land_area_sqkm: Optional[float] = None


class CensusRecord(BaseModel):
    """Census indicator for a neighbourhood."""

    neighbourhood_id: int
    census_year: int
    indicator_name: str
    indicator_value: float


class CrimeRecord(BaseModel):
    """Crime statistics for a neighbourhood."""

    neighbourhood_id: int
    year: int
    mci_category: str  # Assault, B&E, Auto Theft, Robbery, Theft Over
    count: int
    rate_per_100k: float
```
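
The `rate_per_100k` field on `CrimeRecord` is presumably derived from an incident count and a population figure. The following is a minimal sketch of that arithmetic, not the plan's implementation: the helper name and the choice to return `None` when the population is missing or zero are illustrative assumptions.

```python
from typing import Optional


def rate_per_100k(count: int, population: Optional[int]) -> Optional[float]:
    """Return incidents per 100,000 residents, or None if population is unusable."""
    # Assumption: a missing or non-positive population yields no rate at all,
    # rather than a zero that could be mistaken for "no crime".
    if population is None or population <= 0:
        return None
    return count * 100_000 / population


# 42 incidents among 21,000 residents
print(rate_per_100k(42, 21_000))
```

Returning `None` keeps unusable rows distinguishable downstream; note that `CrimeRecord.rate_per_100k` is declared as a required `float`, so such rows would have to be filtered out or the field made optional.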
#### `portfolio_app/toronto/schemas/amenities.py`
```python
"""Pydantic schemas for amenity data."""
from enum import Enum
from typing import Optional

from pydantic import BaseModel


class AmenityType(str, Enum):
    PARK = "park"
    SCHOOL = "school"
    CHILDCARE = "childcare"
    TTC_STOP = "ttc_stop"


class AmenityRecord(BaseModel):
    """Point amenity within a neighbourhood."""

    neighbourhood_id: int
    amenity_type: AmenityType
    name: str
    latitude: float
    longitude: float
    attributes: Optional[dict] = None  # Type-specific attributes
```

### 3.2 New Parser Files

#### `portfolio_app/toronto/parsers/toronto_open_data.py`
```python
"""Parser for Toronto Open Data Portal APIs."""
# Endpoints:
# - Neighbourhoods GeoJSON
# - Neighbourhood Profiles CSV
# - Parks
# - Schools
# - Child Care Centres

# Sibling-directory imports use the double-dot style from CLAUDE.md
from ..schemas.amenities import AmenityRecord
from ..schemas.neighbourhood import CensusRecord


class TorontoOpenDataParser:
    BASE_URL = "https://ckan0.cf.opendata.inter.prod-toronto.ca"

    def fetch_neighbourhoods_geojson(self) -> dict: ...
    def fetch_neighbourhood_profiles(self) -> list[CensusRecord]: ...
    def fetch_parks(self) -> list[AmenityRecord]: ...
    def fetch_schools(self) -> list[AmenityRecord]: ...
    def fetch_childcare(self) -> list[AmenityRecord]: ...
```
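
The `BASE_URL` above points at a CKAN instance, and CKAN exposes dataset metadata through its Action API under `/api/3/action/`. As a hedged sketch of how a `fetch_*` method might locate its resource, the helper below only builds the `package_show` URL and makes no network call. The function name and the dataset id `neighbourhoods` are assumptions for illustration; the real dataset ids belong in the planned data source inventory.

```python
from urllib.parse import urlencode

BASE_URL = "https://ckan0.cf.opendata.inter.prod-toronto.ca"


def package_show_url(dataset_id: str, base_url: str = BASE_URL) -> str:
    """Build the CKAN Action API URL that returns a dataset's metadata."""
    query = urlencode({"id": dataset_id})
    return f"{base_url}/api/3/action/package_show?{query}"


# "neighbourhoods" is an assumed dataset id, used only for illustration.
print(package_show_url("neighbourhoods"))
```

The metadata returned by `package_show` lists the dataset's resources (GeoJSON, CSV, and so on), so a parser would typically pick the resource it needs from that response before downloading any data.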
#### `portfolio_app/toronto/parsers/toronto_police.py`
```python
"""Parser for Toronto Police Service Open Data Portal."""
# Endpoints:
# - Neighbourhood Crime Rates
# - Major Crime Indicators
# - Shootings & Firearm Discharges

from ..schemas.neighbourhood import CrimeRecord


class TorontoPoliceParser:
    BASE_URL = "https://data.torontopolice.on.ca"

    def fetch_neighbourhood_crime_rates(self, year: int) -> list[CrimeRecord]: ...
    def fetch_mci_details(self, year: int) -> list[CrimeRecord]: ...
    def fetch_shootings(self, year: int) -> list[dict]: ...
```

### 3.3 New Model Files

#### `portfolio_app/toronto/models/dimensions.py` (Updated)
```python
"""Dimension models - neighbourhood as central dimension."""
from geoalchemy2 import Geometry
from sqlalchemy import Column, Float, ForeignKey, Integer, String

from .base import Base  # SQLAlchemy base class from models/base.py


class DimNeighbourhood(Base):
    """158 Toronto neighbourhoods - CENTRAL DIMENSION."""

    __tablename__ = "dim_neighbourhood"

    neighbourhood_id = Column(Integer, primary_key=True)  # AREA_ID from GeoJSON
    name = Column(String(100), nullable=False)
    geometry = Column(Geometry("POLYGON", srid=4326))
    population = Column(Integer)
    land_area_sqkm = Column(Float)
    pop_density_per_sqkm = Column(Float)


class DimCMHCZone(Base):
    """15 CMHC zones with neighbourhood mapping."""

    __tablename__ = "dim_cmhc_zone"

    zone_key = Column(Integer, primary_key=True, autoincrement=True)
    zone_code = Column(String(10), unique=True, nullable=False)
    zone_name = Column(String(100), nullable=False)
    geometry = Column(Geometry("POLYGON", srid=4326))


class BridgeCMHCNeighbourhood(Base):
    """Many-to-many: CMHC zones to neighbourhoods with area weights."""

    __tablename__ = "bridge_cmhc_neighbourhood"

    zone_key = Column(Integer, ForeignKey("dim_cmhc_zone.zone_key"), primary_key=True)
    neighbourhood_id = Column(Integer, ForeignKey("dim_neighbourhood.neighbourhood_id"), primary_key=True)
    area_weight = Column(Float)  # Proportion of neighbourhood in zone
```
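
The `area_weight` column records what proportion of a neighbourhood falls inside each CMHC zone, which suggests zone-level values can be brought down to neighbourhood level with an area-weighted average. The plan does not spell out the allocation rule, so the sketch below is one plausible reading; the function name and the renormalisation of weights when a zone has no value are assumptions.

```python
from typing import Optional


def allocate_to_neighbourhood(
    zone_values: dict[int, float],
    area_weights: dict[int, float],
) -> Optional[float]:
    """Area-weighted average of zone-level values for one neighbourhood.

    zone_values maps zone_key -> metric (for example, average rent).
    area_weights maps zone_key -> share of the neighbourhood inside that zone.
    Zones without a value are skipped and the remaining weights renormalised.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for zone_key, weight in area_weights.items():
        value = zone_values.get(zone_key)
        if value is None:
            continue
        weighted_sum += weight * value
        total_weight += weight
    if total_weight == 0:
        return None
    return weighted_sum / total_weight


# A neighbourhood lying 75% in zone 1 and 25% in zone 2 (made-up rents)
print(allocate_to_neighbourhood({1: 2000.0, 2: 2400.0}, {1: 0.75, 2: 0.25}))
```

In the pipeline itself this logic would more likely live in SQL inside a dbt model that joins the rentals fact to the bridge table; the Python version only illustrates the arithmetic.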
#### `portfolio_app/toronto/models/facts.py` (Updated)
```python
"""Fact tables - all keyed to neighbourhood or CMHC zone."""
from sqlalchemy import Column, Date, Float, ForeignKey, Integer, String

from .base import Base  # SQLAlchemy base class from models/base.py


class FactCensus(Base):
    """Census indicators by neighbourhood."""

    __tablename__ = "fact_census"

    id = Column(Integer, primary_key=True, autoincrement=True)
    neighbourhood_id = Column(Integer, ForeignKey("dim_neighbourhood.neighbourhood_id"))
    date_key = Column(Integer, ForeignKey("dim_time.date_key"))
    indicator_name = Column(String(100))
    indicator_value = Column(Float)


class FactCrime(Base):
    """Crime statistics by neighbourhood."""

    __tablename__ = "fact_crime"

    id = Column(Integer, primary_key=True, autoincrement=True)
    neighbourhood_id = Column(Integer, ForeignKey("dim_neighbourhood.neighbourhood_id"))
    date_key = Column(Integer, ForeignKey("dim_time.date_key"))
    mci_category = Column(String(50))
    incident_count = Column(Integer)
    rate_per_100k = Column(Float)


class FactAmenities(Base):
    """Amenity counts by neighbourhood (snapshot)."""

    __tablename__ = "fact_amenities"

    neighbourhood_id = Column(Integer, ForeignKey("dim_neighbourhood.neighbourhood_id"), primary_key=True)
    parks_count = Column(Integer)
    parks_area_sqm = Column(Float)
    schools_count = Column(Integer)
    childcare_spaces = Column(Integer)
    ttc_stops_count = Column(Integer)
    snapshot_date = Column(Date)
```

### 3.4 New Loader Files

| File | Purpose |
|------|---------|
| `portfolio_app/toronto/loaders/neighbourhoods.py` | Load GeoJSON boundaries |
| `portfolio_app/toronto/loaders/census.py` | Load neighbourhood profiles |
| `portfolio_app/toronto/loaders/crime.py` | Load crime statistics |
| `portfolio_app/toronto/loaders/amenities.py` | Load parks, schools, childcare |
| `portfolio_app/toronto/loaders/cmhc_crosswalk.py` | Build CMHC-neighbourhood bridge |

---

## Phase 4: dbt Model Restructuring

### 4.1 New Staging Models

| Model | Source | Purpose |
|-------|--------|---------|
| `stg_toronto__neighbourhoods` | dim_neighbourhood | Clean neighbourhood dimension |
| `stg_toronto__census` | fact_census | Pivoted census indicators |
| `stg_toronto__crime` | fact_crime | Cleaned crime data |
| `stg_toronto__amenities` | fact_amenities | Amenity counts |
| `stg_cmhc__rentals` | fact_rentals | (Keep existing, modify) |
| `stg_cmhc__zone_crosswalk` | bridge_cmhc_neighbourhood | Zone-neighbourhood mapping |

### 4.2 New Intermediate Models

| Model | Purpose |
|-------|---------|
| `int_neighbourhood__demographics` | Combined census demographics |
| `int_neighbourhood__housing` | Housing indicators from census |
| `int_neighbourhood__crime_summary` | Aggregated crime by type |
| `int_neighbourhood__amenity_scores` | Normalized amenity metrics |
| `int_rentals__neighbourhood_allocated` | CMHC rentals allocated to neighbourhoods |
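
`int_neighbourhood__amenity_scores` is described as holding normalized amenity metrics, but the plan does not say which normalization is intended. Min-max scaling is one common choice; the sketch below shows it in Python purely to illustrate the arithmetic (the actual model would be written in dbt SQL). The function name and the handling of the all-equal case are assumptions.

```python
def min_max_normalise(values: dict[int, float]) -> dict[int, float]:
    """Scale values to the range [0, 1], keyed by neighbourhood_id.

    If every value is identical there is no spread to scale, so each
    neighbourhood gets 0.0 (an assumed convention).
    """
    if not values:
        return {}
    low, high = min(values.values()), max(values.values())
    span = high - low
    if span == 0:
        return {key: 0.0 for key in values}
    return {key: (value - low) / span for key, value in values.items()}


parks_per_neighbourhood = {1: 2.0, 2: 6.0, 3: 10.0}  # made-up counts
print(min_max_normalise(parks_per_neighbourhood))
```

Min-max scores are sensitive to outliers, so a percentile rank or z-score may suit skewed metrics better; that choice is left open here.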
### 4.3 New Mart Models

| Model | Purpose | Tab |
|-------|---------|-----|
| `mart_neighbourhood_overview` | Composite livability scores | Overview |
| `mart_neighbourhood_housing` | Affordability metrics | Housing |
| `mart_neighbourhood_safety` | Crime rates and trends | Safety |
| `mart_neighbourhood_demographics` | Population, income, diversity | Demographics |
| `mart_neighbourhood_amenities` | Parks, schools, transit access | Amenities |
| `mart_dashboard_kpis` | Pre-computed KPI values | All tabs |

### 4.4 dbt Sources Configuration

```yaml
# dbt/models/sources.yml
version: 2

sources:
  - name: toronto
    schema: public
    tables:
      - name: dim_neighbourhood
        identifier: dim_neighbourhood
      - name: dim_time
        identifier: dim_time
      - name: dim_cmhc_zone
        identifier: dim_cmhc_zone
      - name: bridge_cmhc_neighbourhood
        identifier: bridge_cmhc_neighbourhood
      - name: fact_census
        identifier: fact_census
      - name: fact_crime
        identifier: fact_crime
      - name: fact_rentals
        identifier: fact_rentals
      - name: fact_amenities
        identifier: fact_amenities
```

---

## Phase 5: Dashboard Implementation

### 5.1 Tab Structure

```
pages/toronto/
├── dashboard.py               # Main layout with tab navigation
├── methodology.py             # Keep existing
├── tabs/
│   ├── __init__.py
│   ├── overview.py            # Tab 1: Composite livability
│   ├── housing.py             # Tab 2: Affordability
│   ├── safety.py              # Tab 3: Crime
│   ├── demographics.py        # Tab 4: Population
│   └── amenities.py           # Tab 5: Services
└── callbacks/
    ├── __init__.py
    ├── map_callbacks.py       # Choropleth interactions
    ├── chart_callbacks.py     # Supporting charts
    └── selection_callbacks.py # Neighbourhood selection
```

### 5.2 Shared Components per Tab

Each tab follows the same layout pattern:

1. **Choropleth Map** (left, 60% width)
   - 158 neighbourhoods
   - Color by selected metric
   - Click to select neighbourhood

2. **KPI Cards** (right, 40% width)
   - 3-4 contextual KPIs
   - Update on neighbourhood selection

3. **Supporting Charts** (bottom row)
   - Chart 1: Context/trend visualization
   - Chart 2: Comparison/ranking visualization

4. **Details Panel** (collapsible)
   - All metrics for selected neighbourhood

### 5.3 Graphs by Tab

#### Tab 1: Overview
| Graph ID | Type | Data Source |
|----------|------|-------------|
| `overview-choropleth` | Choropleth | mart_neighbourhood_overview |
| `overview-top-bottom` | Horizontal Bar | mart_neighbourhood_overview |
| `overview-income-crime-scatter` | Scatter | mart_neighbourhood_overview |

#### Tab 2: Housing & Affordability
| Graph ID | Type | Data Source |
|----------|------|-------------|
| `housing-choropleth` | Choropleth | mart_neighbourhood_housing |
| `housing-rent-trend` | Line | mart_neighbourhood_housing (historical) |
| `housing-dwelling-types` | Pie/Bar | mart_neighbourhood_housing |

#### Tab 3: Safety
| Graph ID | Type | Data Source |
|----------|------|-------------|
| `safety-choropleth` | Choropleth | mart_neighbourhood_safety |
| `safety-crime-breakdown` | Stacked Bar | mart_neighbourhood_safety |
| `safety-trend` | Line | mart_neighbourhood_safety (5-year) |

#### Tab 4: Demographics
| Graph ID | Type | Data Source |
|----------|------|-------------|
| `demographics-choropleth` | Choropleth | mart_neighbourhood_demographics |
| `demographics-age-pyramid` | Population Pyramid | mart_neighbourhood_demographics |
| `demographics-languages` | Horizontal Bar | mart_neighbourhood_demographics |

#### Tab 5: Amenities & Services
| Graph ID | Type | Data Source |
|----------|------|-------------|
| `amenities-choropleth` | Choropleth | mart_neighbourhood_amenities |
| `amenities-radar` | Radar | mart_neighbourhood_amenities |
| `amenities-transit` | Bar | mart_neighbourhood_amenities |
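
The five tables above amount to a lookup from graph ID to mart. One way to keep callbacks and documentation in step is a single registry; the sketch below copies the IDs and marts from the tables, while the names `GRAPH_SOURCES` and `graphs_for_tab` are hypothetical and not part of the plan.

```python
# Graph ID -> mart, copied from the "Graphs by Tab" tables.
GRAPH_SOURCES = {
    "overview-choropleth": "mart_neighbourhood_overview",
    "overview-top-bottom": "mart_neighbourhood_overview",
    "overview-income-crime-scatter": "mart_neighbourhood_overview",
    "housing-choropleth": "mart_neighbourhood_housing",
    "housing-rent-trend": "mart_neighbourhood_housing",
    "housing-dwelling-types": "mart_neighbourhood_housing",
    "safety-choropleth": "mart_neighbourhood_safety",
    "safety-crime-breakdown": "mart_neighbourhood_safety",
    "safety-trend": "mart_neighbourhood_safety",
    "demographics-choropleth": "mart_neighbourhood_demographics",
    "demographics-age-pyramid": "mart_neighbourhood_demographics",
    "demographics-languages": "mart_neighbourhood_demographics",
    "amenities-choropleth": "mart_neighbourhood_amenities",
    "amenities-radar": "mart_neighbourhood_amenities",
    "amenities-transit": "mart_neighbourhood_amenities",
}


def graphs_for_tab(tab: str) -> list[str]:
    """Return the graph IDs whose prefix (text before the first hyphen) is the tab name."""
    return [graph_id for graph_id in GRAPH_SOURCES if graph_id.split("-", 1)[0] == tab]


print(graphs_for_tab("safety"))
```

Because every graph ID is prefixed with its tab name, the tab can be recovered from the ID alone, and the registry's 15 entries line up with the 15 notebooks planned in Phase 6.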
---

## Phase 6: Jupyter Notebooks

### 6.1 Notebook Structure

Create one notebook per graph, organised by tab in the following directory structure:

```
notebooks/
├── README.md                                  # Notebook index and conventions
├── overview/
│   ├── 01_overview_choropleth.ipynb
│   ├── 02_overview_top_bottom.ipynb
│   └── 03_overview_income_crime_scatter.ipynb
├── housing/
│   ├── 01_housing_choropleth.ipynb
│   ├── 02_housing_rent_trend.ipynb
│   └── 03_housing_dwelling_types.ipynb
├── safety/
│   ├── 01_safety_choropleth.ipynb
│   ├── 02_safety_crime_breakdown.ipynb
│   └── 03_safety_trend.ipynb
├── demographics/
│   ├── 01_demographics_choropleth.ipynb
│   ├── 02_demographics_age_pyramid.ipynb
│   └── 03_demographics_languages.ipynb
└── amenities/
    ├── 01_amenities_choropleth.ipynb
    ├── 02_amenities_radar.ipynb
    └── 03_amenities_transit.ipynb
```

### 6.2 Notebook Template

Each notebook follows this structure:

````markdown
# [Graph Name]: Data Reference & Visualization

## 1. Data Reference

### 1.1 Source Tables
- List all source tables/marts used
- Explain the grain of each table

### 1.2 Query
```sql
-- The exact query that feeds this visualization
SELECT ...
FROM ...
```

### 1.3 Data Pipeline Steps
1. Step 1: Description of transformation
2. Step 2: Description of aggregation
3. Step 3: Description of final shaping

### 1.4 Sample Data
```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine(DATABASE_URL)
df = pd.read_sql(query, engine)
df.head(10)
```

## 2. Data Visualization

### 2.1 Import Graph Factory
```python
from portfolio_app.figures.choropleth import create_choropleth_figure
# or appropriate figure factory
```

### 2.2 Create Visualization
```python
fig = create_choropleth_figure(
    geojson=geojson_data,
    data=df.to_dict('records'),
    # ... remaining arguments for the chosen figure factory
)
fig.show()
```

### 2.3 Interpretation Notes
- What this visualization shows
- Key insights from the data
- Caveats or limitations
````

### 6.3 Notebooks to Create (15 Total)

| # | Notebook | Graph | Tab |
|---|----------|-------|-----|
| 1 | `overview/01_overview_choropleth.ipynb` | Livability score map | Overview |
| 2 | `overview/02_overview_top_bottom.ipynb` | Top/Bottom 10 bar chart | Overview |
| 3 | `overview/03_overview_income_crime_scatter.ipynb` | Income vs Crime scatter | Overview |
| 4 | `housing/01_housing_choropleth.ipynb` | Affordability index map | Housing |
| 5 | `housing/02_housing_rent_trend.ipynb` | 5-year rent trend line | Housing |
| 6 | `housing/03_housing_dwelling_types.ipynb` | Dwelling type breakdown | Housing |
| 7 | `safety/01_safety_choropleth.ipynb` | Crime rate map | Safety |
| 8 | `safety/02_safety_crime_breakdown.ipynb` | Crime type stacked bar | Safety |
| 9 | `safety/03_safety_trend.ipynb` | 5-year crime trend line | Safety |
| 10 | `demographics/01_demographics_choropleth.ipynb` | Income distribution map | Demographics |
| 11 | `demographics/02_demographics_age_pyramid.ipynb` | Age distribution pyramid | Demographics |
| 12 | `demographics/03_demographics_languages.ipynb` | Top languages bar chart | Demographics |
| 13 | `amenities/01_amenities_choropleth.ipynb` | Park area per capita map | Amenities |
| 14 | `amenities/02_amenities_radar.ipynb` | Amenity density radar | Amenities |
| 15 | `amenities/03_amenities_transit.ipynb` | Transit accessibility bar | Amenities |
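
The notebook paths in this table follow a regular pattern: the tab folder, a two-digit position, then the graph ID with hyphens replaced by underscores. A small sketch of that naming rule, assuming the graph IDs from Phase 5 are the source of truth; the helper name `notebook_path` is hypothetical.

```python
def notebook_path(graph_id: str, position: int) -> str:
    """Map a graph ID such as 'housing-rent-trend' to its notebook path."""
    tab = graph_id.split("-", 1)[0]       # folder is the tab prefix of the ID
    stem = graph_id.replace("-", "_")     # file names use underscores
    return f"{tab}/{position:02d}_{stem}.ipynb"


print(notebook_path("housing-rent-trend", 2))
```

Generating paths this way makes it easy to check that every graph ID has a matching notebook and that no notebook is orphaned.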
---

## Phase 7: Final Documentation Review

### 7.1 Documentation Audit Checklist

After all implementation is complete, perform a comprehensive audit:

#### Project CLAUDE.md
- [ ] Project Status reflects current sprint
- [ ] Run Commands are accurate
- [ ] Code Conventions match actual code
- [ ] Application Structure matches filesystem
- [ ] URL Routing matches registered pages
- [ ] Tech Stack versions are accurate
- [ ] Data Model matches SQLAlchemy models
- [ ] dbt Layers match actual models
- [ ] Reference Documents section is current

#### docs/PROJECT_REFERENCE.md
- [ ] Architecture diagrams are current
- [ ] Data source inventory is complete
- [ ] API endpoints documented
- [ ] Refresh schedules documented

#### docs/neighbourhood_dashboard_spec_v1.md
- [ ] All tabs documented
- [ ] All graphs documented
- [ ] Data sources for each graph documented
- [ ] Colour palettes specified

#### README.md
- [ ] Project description accurate
- [ ] Installation instructions work
- [ ] Quick start guide is functional

### 7.2 App Structure Verification

Verify that the documentation matches the actual app by running:

```bash
# List registered page routes
grep -r "dash.register_page" portfolio_app/pages/ --include="*.py"

# List SQLAlchemy model classes
grep -r "class.*Base" portfolio_app/toronto/models/ --include="*.py"

# List dbt models (find recurses without relying on bash globstar)
find dbt/models -name "*.sql"
```
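
The `grep` for `dash.register_page` returns whole source lines, which still have to be compared by eye against the URL routing table. A hedged sketch of pulling out just the `path=` values follows. It is regex-based, so it only sees literal `path="..."` arguments written on a single line; pages that rely on Dash's inferred path or on `path_template` would need separate handling. The function name and the sample source text are illustrative assumptions.

```python
import re

# Matches a single-line dash.register_page(...) call and captures its arguments.
REGISTER_PAGE = re.compile(r"dash\.register_page\((?P<args>[^)]*)\)")
# Matches a literal path="..." or path='...' keyword argument.
PATH_ARG = re.compile(r"""path\s*=\s*["']([^"']+)["']""")


def extract_routes(source: str) -> list[str]:
    """Return the literal path= values from register_page calls in source."""
    routes = []
    for call in REGISTER_PAGE.finditer(source):
        match = PATH_ARG.search(call.group("args"))
        if match:
            routes.append(match.group(1))
    return routes


# Illustrative sample, not copied from the repository.
sample = '''
import dash
dash.register_page(__name__, path="/about", name="About")
dash.register_page(__name__, path='/toronto/methodology')
'''
print(extract_routes(sample))
```

Running this over each file under `portfolio_app/pages/` and diffing the result against the route table in `CLAUDE.md` would automate the "URL Routing matches registered pages" checklist item.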
### 7.3 Final Documentation Updates

Based on audit findings, update:
1. All file path references
2. All URL route tables
3. All model/schema references
4. All dbt model references
5. Sprint number and status

---

## Phase 8: Commit and Merge Strategy

### 8.1 Branch Strategy

```
development (base)
└── feature/neighbourhood-dashboard-transition
    ├── Commit 1: Cleanup - Remove TRREB files
    ├── Commit 2: Schemas - New neighbourhood schemas
    ├── Commit 3: Models - Updated SQLAlchemy models
    ├── Commit 4: Parsers - API parsers implementation
    ├── Commit 5: Loaders - Data loading functions
    ├── Commit 6: dbt - New staging/intermediate/mart models
    ├── Commit 7: Dashboard - Tab implementations
    ├── Commit 8: Callbacks - Dashboard interactivity
    ├── Commit 9: Notebooks - All 15 Jupyter notebooks
    ├── Commit 10: Documentation - Updated docs
    └── Commit 11: Final review - Documentation audit fixes
```

### 8.2 Commit Messages

Follow conventional commits format:

```
feat: Add neighbourhood-based schemas and models
fix: Remove obsolete TRREB pipeline
docs: Update CLAUDE.md for neighbourhood dashboard
refactor: Restructure dbt models for neighbourhood grain
test: Add tests for new parsers and loaders
```

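The header format above (`type: description`, optionally with a scope and a breaking-change marker) can be checked mechanically, for example in a commit hook. This is a minimal sketch under stated assumptions: `parse_commit_header` and `ALLOWED_TYPES` are hypothetical names, and the type list is the set used in this plan plus a few other common types, not a list taken from the project.

```python
import re

# Assumption: the types used in this plan plus other commonly used ones.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "refactor", "test",
    "chore", "ci", "build", "perf", "style",
)

# type, optional (scope), optional "!" for breaking changes, then ": description"
_HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def parse_commit_header(header: str) -> dict:
    """Split a commit header into its parts, or raise ValueError if it is malformed."""
    match = _HEADER_RE.match(header)
    if match is None:
        raise ValueError(f"not in 'type(scope): description' form: {header!r}")
    if match["type"] not in ALLOWED_TYPES:
        raise ValueError(f"unknown commit type: {match['type']!r}")
    return match.groupdict()
```

A header such as `feat: Add neighbourhood-based schemas and models` parses with `type` set to `feat` and `scope` set to `None`; a header without the `type: ` prefix raises `ValueError`.
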
### 8.3 Merge Process

1. Create feature branch from development
2. Implement all phases with atomic commits
3. Run full CI checks: `make ci`
4. Create PR to development
5. Squash merge with comprehensive message
6. Delete feature branch
7. Tag release: `v2.0.0-neighbourhood-dashboard`

---

## Implementation Timeline

### Sprint 9: Cleanup & Foundation
- [ ] Phase 1: Repository cleanup (delete TRREB files)
- [ ] Phase 2: Documentation updates (CLAUDE.md, specs)
- [ ] Phase 3.1: New schemas created

### Sprint 10: Data Pipeline
- [ ] Phase 3.2: Parsers implementation (API integrations)
- [ ] Phase 3.3: Models implementation (SQLAlchemy)
- [ ] Phase 3.4: Loaders implementation

### Sprint 11: dbt & Dashboard
- [ ] Phase 4: dbt model restructuring
- [ ] Phase 5.1: Dashboard layout with tabs
- [ ] Phase 5.2: Choropleth maps per tab

### Sprint 12: Interactivity & Charts
- [ ] Phase 5.3: Supporting charts implementation
- [ ] Phase 5.4: Callbacks and interactivity
- [ ] Phase 6.1-6.5: Jupyter notebooks (3 per sprint day)

### Sprint 13: Documentation & Release
- [ ] Phase 6.6-6.15: Remaining notebooks
- [ ] Phase 7: Final documentation review
- [ ] Phase 8: Commit, merge, and tag release

---

## Appendix A: Files Inventory Summary

### Files to DELETE: 6
### Files to MODIFY: 8
### Files to CREATE: ~45
- Schemas: 2
- Parsers: 2
- Models: 2 (modifications)
- Loaders: 5
- dbt models: 15
- Dashboard tabs: 5
- Callbacks: 3
- Notebooks: 15
- Documentation: 3

---

## Appendix B: Risk Mitigation

| Risk | Mitigation |
|------|------------|
| CMHC zone-neighbourhood mapping inaccuracy | Document methodology, use area-weighted allocation |
| API rate limits | Implement caching, respect rate limits, store locally |
| Census data staleness (5-year cycle) | Document data vintage, display last update prominently |
| Geographic boundary changes | Lock to 2024 158-neighbourhood boundaries |

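The area-weighted allocation named in the first row can be illustrated with a small sketch. It assumes the zone and neighbourhood overlap areas have already been computed elsewhere (for example by a spatial intersection) and that the measure is an average-type value such as average rent, so each neighbourhood receives the mean of its overlapping zones weighted by overlap area. `allocate_area_weighted`, the zone IDs, and the figures are all hypothetical.

```python
from collections import defaultdict


def allocate_area_weighted(zone_values, overlaps):
    """Estimate a value per neighbourhood from zone-level values.

    zone_values: {zone_id: value} for an average-type measure (e.g. average rent)
    overlaps:    iterable of (zone_id, neighbourhood_id, overlap_area)
    """
    weighted_sum = defaultdict(float)
    total_area = defaultdict(float)
    for zone_id, neighbourhood_id, area in overlaps:
        # Skip empty overlaps and zones with no reported value
        if area <= 0 or zone_id not in zone_values:
            continue
        weighted_sum[neighbourhood_id] += zone_values[zone_id] * area
        total_area[neighbourhood_id] += area
    return {n: weighted_sum[n] / total_area[n] for n in total_area}


# Illustrative numbers only: neighbourhood 1 is 3/4 in zone Z1 and 1/4 in zone Z2
zone_rent = {"Z1": 2000.0, "Z2": 1000.0}
overlap_areas = [("Z1", 1, 3.0), ("Z2", 1, 1.0), ("Z2", 2, 5.0)]
print(allocate_area_weighted(zone_rent, overlap_areas))  # {1: 1750.0, 2: 1000.0}
```
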
---

## Appendix C: Testing Strategy

### Unit Tests
- Schema validation tests
- Parser output format tests
- Loader idempotency tests

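A loader idempotency test checks that loading the same records twice leaves the table in the same state as loading them once. The sketch below uses an in-memory SQLite database as a stand-in for the project database; `load_neighbourhoods`, the `dim_neighbourhood` table, and its columns are hypothetical names, and the upsert statement is an assumption about how a loader might achieve idempotency.

```python
import sqlite3


def load_neighbourhoods(conn, records):
    """Hypothetical loader: upsert by primary key so reloading does not duplicate rows."""
    conn.executemany(
        """
        INSERT INTO dim_neighbourhood (neighbourhood_id, name)
        VALUES (:neighbourhood_id, :name)
        ON CONFLICT(neighbourhood_id) DO UPDATE SET name = excluded.name
        """,
        records,
    )
    conn.commit()


def test_loader_is_idempotent():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dim_neighbourhood ("
        "neighbourhood_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    records = [
        {"neighbourhood_id": 1, "name": "Example A"},
        {"neighbourhood_id": 2, "name": "Example B"},
    ]
    query = "SELECT neighbourhood_id, name FROM dim_neighbourhood ORDER BY neighbourhood_id"

    load_neighbourhoods(conn, records)
    first = conn.execute(query).fetchall()

    load_neighbourhoods(conn, records)  # second load with identical input
    second = conn.execute(query).fetchall()

    assert first == second               # same contents after reloading
    assert len(second) == len(records)   # no duplicate rows
    conn.close()
```

The same pattern would apply to the real loaders, with the project's own session and models in place of the SQLite connection.
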
### Integration Tests
- End-to-end: API -> Parse -> Load -> dbt -> Dashboard
- Database constraint tests
- dbt model tests (unique, not_null, relationships)

### Visual Regression Tests
- Screenshot comparison for each tab
- Choropleth rendering tests

---

*Document Version: 1.0*
*Created: January 2026*
*Author: Claude Code*