CLAUDE.md
Working context for Claude Code on the Analytics Portfolio project.
Project Status
Current Sprint: 1 (Project Bootstrap)
Phase: 1 - Toronto Housing Dashboard
Branch: development (feature branches merge here)
Quick Reference
Run Commands
make setup # Install deps, create .env, init pre-commit
make docker-up # Start PostgreSQL + PostGIS
make docker-down # Stop containers
make db-init # Initialize database schema
make run # Start Dash dev server
make test # Run pytest
make lint # Run ruff linter
make format # Run ruff formatter
make ci # Run all checks
Branch Workflow
- Create feature branch FROM development: git checkout -b feature/{sprint}-{description}
- Work and commit on feature branch
- Merge INTO development when complete
- development -> staging -> main for releases
Code Conventions
Import Style
| Context | Style | Example |
|---|---|---|
| Same directory | Single dot | from .trreb import TRREBParser |
| Sibling directory | Double dot | from ..schemas.trreb import TRREBRecord |
| External packages | Absolute | import pandas as pd |
Module Responsibilities
| Directory | Contains | Purpose |
|---|---|---|
| schemas/ | Pydantic models | Data validation |
| models/ | SQLAlchemy ORM | Database persistence |
| parsers/ | PDF/CSV extraction | Raw data ingestion |
| loaders/ | Database operations | Data loading |
| figures/ | Chart factories | Plotly figure generation |
| callbacks/ | Dash callbacks | In pages/{dashboard}/callbacks/ |
| errors/ | Exceptions + handlers | Error handling |
Type Hints
Use Python 3.10+ style:
def process(items: list[str], config: dict[str, int] | None = None) -> bool:
    ...
Error Handling
# errors/exceptions.py
class PortfolioError(Exception):
    """Base exception."""

class ParseError(PortfolioError):
    """PDF/CSV parsing failed."""

class ValidationError(PortfolioError):
    """Pydantic or business rule validation failed."""

class LoadError(PortfolioError):
    """Database load operation failed."""
Code Standards
- Single responsibility functions with verb naming
- Early returns over deep nesting
- Google-style docstrings only for non-obvious behavior
- Module-level constants for magic values
- Pydantic BaseSettings for runtime config
Application Structure
portfolio_app/
├── app.py                  # Dash app factory with Pages routing
├── config.py               # Pydantic BaseSettings
├── assets/                 # CSS, images (auto-served)
├── pages/
│   ├── home.py             # Bio landing page -> /
│   └── toronto/
│       ├── dashboard.py    # Layout only -> /toronto
│       └── callbacks/      # Interaction logic
├── components/             # Shared UI (navbar, footer, cards)
├── figures/                # Shared chart factories
├── toronto/                # Toronto data logic
│   ├── parsers/
│   ├── loaders/
│   ├── schemas/            # Pydantic
│   └── models/             # SQLAlchemy
└── errors/
URL Routing
| URL | Page | Sprint |
|---|---|---|
| / | Bio landing page | 2 |
| /toronto | Toronto Housing Dashboard | 6 |
Tech Stack (Locked)
| Layer | Technology | Version |
|---|---|---|
| Database | PostgreSQL + PostGIS | 16.x |
| Validation | Pydantic | >=2.0 |
| ORM | SQLAlchemy | >=2.0 (2.0-style API only) |
| Transformation | dbt-postgres | >=1.7 |
| Data Processing | Pandas | >=2.1 |
| Geospatial | GeoPandas + Shapely | >=0.14 |
| Visualization | Dash + Plotly | >=2.14 |
| UI Components | dash-mantine-components | Latest stable |
| Testing | pytest | >=7.0 |
| Python | 3.11+ | Via pyenv |
Notes:
- SQLAlchemy 2.0 + Pydantic 2.0 only (never mix 1.x APIs)
- PostGIS extension required in database
- Docker Compose V2 format (no version field)
Data Model Overview
Geographic Reality (Toronto Housing)
TRREB Districts (~35) - Purchase data (W01, C01, E01...)
CMHC Zones (~20) - Rental data (Census Tract aligned)
City Neighbourhoods (158) - Enrichment/overlay only
Critical: These geographies do NOT align. Display as separate layers—do not force crosswalks.
Star Schema
| Table | Type | Keys |
|---|---|---|
| fact_purchases | Fact | -> dim_time, dim_trreb_district |
| fact_rentals | Fact | -> dim_time, dim_cmhc_zone |
| dim_time | Dimension | date_key (PK) |
| dim_trreb_district | Dimension | district_key (PK), geometry |
| dim_cmhc_zone | Dimension | zone_key (PK), geometry |
| dim_neighbourhood | Dimension | neighbourhood_id (PK), geometry |
| dim_policy_event | Dimension | event_id (PK) |
V1 Rule: dim_neighbourhood has NO FK to fact tables—reference overlay only.
dbt Layers
| Layer | Naming | Purpose |
|---|---|---|
| Staging | stg_{source}__{entity} | 1:1 source, cleaned, typed |
| Intermediate | int_{domain}__{transform} | Business logic |
| Marts | mart_{domain} | Final analytical tables |
DO NOT BUILD (Phase 1)
Stop and flag if a task seems to require these:
| Feature | Reason |
|---|---|
| bridge_district_neighbourhood table | Area-weighted aggregation is Phase 4 |
| Crime data integration | Deferred to Phase 4 |
| Historical boundary reconciliation (140->158) | 2021+ data only for V1 |
| ML prediction models | Energy project scope (Phase 3) |
| Multi-project shared infrastructure | Build first, abstract second (Phase 2) |
Sprint 1 Deliverables
| Category | Tasks |
|---|---|
| Bootstrap | Git init, pyproject.toml, .env.example, Makefile, CLAUDE.md |
| Infrastructure | Docker Compose (PostgreSQL + PostGIS), scripts/ directory |
| App Foundation | portfolio_app/ structure, config.py, error handling |
| Tests | tests/ directory, conftest.py, pytest config |
| Data Acquisition | Download TRREB PDFs, START boundary digitization (HUMAN task) |
Human Tasks (Cannot Automate)
| Task | Tool | Effort |
|---|---|---|
| Digitize TRREB district boundaries | QGIS | 3-4 hours |
| Research policy events (10-20) | Manual | 2-3 hours |
| Replace social link placeholders | Manual | 5 minutes |
Environment Variables
Required in .env:
DATABASE_URL=postgresql://user:pass@localhost:5432/portfolio
POSTGRES_USER=portfolio
POSTGRES_PASSWORD=<secure>
POSTGRES_DB=portfolio
DASH_DEBUG=true
SECRET_KEY=<random>
LOG_LEVEL=INFO
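The project standard for runtime config is Pydantic BaseSettings. For scripts that run before the app config is loaded, a standard-library check along these lines might be enough to fail fast on an incomplete .env. This is a sketch, not the project's config code; `find_missing_vars` is a hypothetical helper.

```python
import os
from collections.abc import Mapping

# Variable names copied from the list above.
REQUIRED_ENV_VARS = (
    "DATABASE_URL",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DB",
    "DASH_DEBUG",
    "SECRET_KEY",
    "LOG_LEVEL",
)


def find_missing_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```

Passing the environment as a parameter keeps the function easy to test with a plain dict instead of the real process environment.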
Script Standards
All scripts in scripts/:
- Include usage comments at top
- Idempotent where possible
- Exit codes: 0 = success, 1 = error
- Use set -euo pipefail for bash
- Log to stdout, errors to stderr
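For Python scripts, the same standards might look like the skeleton below. It is a sketch under assumptions: the script path in the usage comment and the `run_task` body are placeholders, not existing project files.

```python
# Usage: python scripts/example_task.py   (hypothetical script name)
import sys
from collections.abc import Callable


def run_task() -> None:
    """Placeholder for the real work; should be safe to run more than once."""
    print("task complete")  # normal log output goes to stdout


def main(task: Callable[[], None] = run_task) -> int:
    """Run the task and translate the outcome into an exit code."""
    try:
        task()
    except Exception as exc:
        print(f"error: {exc}", file=sys.stderr)  # errors go to stderr
        return 1  # exit code 1 = error
    return 0  # exit code 0 = success
```

In a real script the file would end with `sys.exit(main())` under an `if __name__ == "__main__":` guard, so the return value becomes the process exit code.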
Reference Documents
| Document | Location | Use When |
|---|---|---|
| Full specification | docs/PROJECT_REFERENCE.md | Architecture decisions |
| Data schemas | docs/toronto_housing_dashboard_spec_v5.md | Parser/model tasks |
| WBS details | docs/wbs_sprint_plan_v4.md | Sprint planning |
Last Updated: Sprint 1