# CLAUDE.md Working context for Claude Code on the Analytics Portfolio project. --- ## Project Status **Current Sprint**: 7 (Navigation & Theme Modernization) **Phase**: 1 - Toronto Housing Dashboard **Branch**: `development` (feature branches merge here) --- ## Quick Reference ### Run Commands ```bash make setup # Install deps, create .env, init pre-commit make docker-up # Start PostgreSQL + PostGIS make docker-down # Stop containers make db-init # Initialize database schema make run # Start Dash dev server make test # Run pytest make lint # Run ruff linter make format # Run ruff formatter make ci # Run all checks ``` ### Branch Workflow 1. Create feature branch FROM `development`: `git checkout -b feature/{sprint}-{description}` 2. Work and commit on feature branch 3. Merge INTO `development` when complete 4. `development` -> `staging` -> `main` for releases --- ## Code Conventions ### Import Style | Context | Style | Example | |---------|-------|---------| | Same directory | Single dot | `from .trreb import TRREBParser` | | Sibling directory | Double dot | `from ..schemas.trreb import TRREBRecord` | | External packages | Absolute | `import pandas as pd` | ### Module Responsibilities | Directory | Contains | Purpose | |-----------|----------|---------| | `schemas/` | Pydantic models | Data validation | | `models/` | SQLAlchemy ORM | Database persistence | | `parsers/` | PDF/CSV extraction | Raw data ingestion | | `loaders/` | Database operations | Data loading | | `figures/` | Chart factories | Plotly figure generation | | `callbacks/` | Dash callbacks | In `pages/{dashboard}/callbacks/` | | `errors/` | Exceptions + handlers | Error handling | ### Type Hints Use Python 3.10+ style: ```python def process(items: list[str], config: dict[str, int] | None = None) -> bool: ... ``` ### Error Handling ```python # errors/exceptions.py class PortfolioError(Exception): """Base exception.""" class ParseError(PortfolioError): """PDF/CSV parsing failed.""" class ValidationError(PortfolioError): """Pydantic or business rule validation failed.""" class LoadError(PortfolioError): """Database load operation failed.""" ``` ### Code Standards - Single responsibility functions with verb naming - Early returns over deep nesting - Google-style docstrings only for non-obvious behavior - Module-level constants for magic values - Pydantic BaseSettings for runtime config --- ## Application Structure ``` portfolio_app/ ├── app.py # Dash app factory with Pages routing ├── config.py # Pydantic BaseSettings ├── assets/ # CSS, images (auto-served) ├── pages/ │ ├── home.py # Bio landing page -> / │ └── toronto/ │ ├── dashboard.py # Layout only -> /toronto │ └── callbacks/ # Interaction logic ├── components/ # Shared UI (navbar, footer, cards) ├── figures/ # Shared chart factories ├── toronto/ # Toronto data logic │ ├── parsers/ │ ├── loaders/ │ ├── schemas/ # Pydantic │ └── models/ # SQLAlchemy └── errors/ ``` ### URL Routing | URL | Page | Sprint | |-----|------|--------| | `/` | Bio landing page | 2 | | `/toronto` | Toronto Housing Dashboard | 6 | --- ## Tech Stack (Locked) | Layer | Technology | Version | |-------|------------|---------| | Database | PostgreSQL + PostGIS | 16.x | | Validation | Pydantic | >=2.0 | | ORM | SQLAlchemy | >=2.0 (2.0-style API only) | | Transformation | dbt-postgres | >=1.7 | | Data Processing | Pandas | >=2.1 | | Geospatial | GeoPandas + Shapely | >=0.14 | | Visualization | Dash + Plotly | >=2.14 | | UI Components | dash-mantine-components | Latest stable | | Testing | pytest | >=7.0 | | Python | 3.11+ | Via pyenv | **Notes**: - SQLAlchemy 2.0 + Pydantic 2.0 only (never mix 1.x APIs) - PostGIS extension required in database - Docker Compose V2 format (no `version` field) --- ## Data Model Overview ### Geographic Reality (Toronto Housing) ``` TRREB Districts (~35) - Purchase data (W01, C01, E01...) CMHC Zones (~20) - Rental data (Census Tract aligned) City Neighbourhoods (158) - Enrichment/overlay only ``` **Critical**: These geographies do NOT align. Display as separate layers—do not force crosswalks. ### Star Schema | Table | Type | Keys | |-------|------|------| | `fact_purchases` | Fact | -> dim_time, dim_trreb_district | | `fact_rentals` | Fact | -> dim_time, dim_cmhc_zone | | `dim_time` | Dimension | date_key (PK) | | `dim_trreb_district` | Dimension | district_key (PK), geometry | | `dim_cmhc_zone` | Dimension | zone_key (PK), geometry | | `dim_neighbourhood` | Dimension | neighbourhood_id (PK), geometry | | `dim_policy_event` | Dimension | event_id (PK) | **V1 Rule**: `dim_neighbourhood` has NO FK to fact tables—reference overlay only. ### dbt Layers | Layer | Naming | Purpose | |-------|--------|---------| | Staging | `stg_{source}__{entity}` | 1:1 source, cleaned, typed | | Intermediate | `int_{domain}__{transform}` | Business logic | | Marts | `mart_{domain}` | Final analytical tables | --- ## DO NOT BUILD (Phase 1) **Stop and flag if a task seems to require these**: | Feature | Reason | |---------|--------| | `bridge_district_neighbourhood` table | Area-weighted aggregation is Phase 4 | | Crime data integration | Deferred to Phase 4 | | Historical boundary reconciliation (140->158) | 2021+ data only for V1 | | ML prediction models | Energy project scope (Phase 3) | | Multi-project shared infrastructure | Build first, abstract second (Phase 2) | --- ## Sprint 1 Deliverables | Category | Tasks | |----------|-------| | **Bootstrap** | Git init, pyproject.toml, .env.example, Makefile, CLAUDE.md | | **Infrastructure** | Docker Compose (PostgreSQL + PostGIS), scripts/ directory | | **App Foundation** | portfolio_app/ structure, config.py, error handling | | **Tests** | tests/ directory, conftest.py, pytest config | | **Data Acquisition** | Download TRREB PDFs, START boundary digitization (HUMAN task) | ### Human Tasks (Cannot Automate) | Task | Tool | Effort | |------|------|--------| | Digitize TRREB district boundaries | QGIS | 3-4 hours | | Research policy events (10-20) | Manual | 2-3 hours | | Replace social link placeholders | Manual | 5 minutes | --- ## Environment Variables Required in `.env`: ```bash DATABASE_URL=postgresql://user:pass@localhost:5432/portfolio POSTGRES_USER=portfolio POSTGRES_PASSWORD= POSTGRES_DB=portfolio DASH_DEBUG=true SECRET_KEY= LOG_LEVEL=INFO ``` --- ## Script Standards All scripts in `scripts/`: - Include usage comments at top - Idempotent where possible - Exit codes: 0 = success, 1 = error - Use `set -euo pipefail` for bash - Log to stdout, errors to stderr --- ## Reference Documents | Document | Location | Use When | |----------|----------|----------| | Full specification | `docs/PROJECT_REFERENCE.md` | Architecture decisions | | Data schemas | `docs/toronto_housing_dashboard_spec_v5.md` | Parser/model tasks | | WBS details | `docs/wbs_sprint_plan_v4.md` | Sprint planning | --- *Last Updated: Sprint 7*