# CLAUDE.md

Working context for Claude Code on the Analytics Portfolio project.

---

## Project Status

**Current Sprint**: 8 (Portfolio Website Expansion - Complete)
**Next Sprint**: 9 (Neighbourhood Dashboard Transition)
**Phase**: Transitioning to Toronto Neighbourhood Dashboard
**Branch**: `development` (feature branches merge here)

---

## Quick Reference

### Run Commands

```bash
make setup          # Install deps, create .env, init pre-commit
make docker-up      # Start PostgreSQL + PostGIS
make docker-down    # Stop containers
make db-init        # Initialize database schema
make run            # Start Dash dev server
make test           # Run pytest
make lint           # Run ruff linter
make format         # Run ruff formatter
make ci             # Run all checks
```

### Branch Workflow

1. Create feature branch FROM `development`: `git checkout -b feature/{sprint}-{description}`
2. Work and commit on feature branch
3. Merge INTO `development` when complete
4. Delete the feature branch after merge (keep branches clean)
5. `development` -> `staging` -> `main` for releases

**CRITICAL: NEVER DELETE the `development` branch. It is the main integration branch.**
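
The naming pattern and the deletion rule above can be sketched as a small helper. This is illustrative only: the function names, the lowercase hyphenated slug format, and the choice to allow deletion solely for `feature/` branches are assumptions, not part of the project tooling.

```python
import re

# Only `development` is named as undeletable in this document.
PROTECTED_BRANCHES = frozenset({"development"})

# Assumption: descriptions are lowercase words joined by hyphens.
FEATURE_BRANCH_PATTERN = re.compile(r"^feature/(\d+)-([a-z0-9]+(?:-[a-z0-9]+)*)$")


def build_feature_branch(sprint: int, description: str) -> str:
    """Build a branch name in the feature/{sprint}-{description} format."""
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    if not slug:
        raise ValueError("description must contain at least one letter or digit")
    return f"feature/{sprint}-{slug}"


def can_delete_branch(name: str) -> bool:
    """Allow deletion only for feature branches, never for protected ones."""
    if name in PROTECTED_BRANCHES:
        return False
    return FEATURE_BRANCH_PATTERN.match(name) is not None
```

A cleanup script could call `can_delete_branch` before running `git branch -d`, so that a typo never targets `development`.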

---

## Code Conventions

### Import Style

| Context | Style | Example |
|---------|-------|---------|
| Same directory | Single dot | `from .neighbourhood import NeighbourhoodRecord` |
| Sibling directory | Double dot | `from ..schemas.neighbourhood import CensusRecord` |
| External packages | Absolute | `import pandas as pd` |

### Module Responsibilities

| Directory | Contains | Purpose |
|-----------|----------|---------|
| `schemas/` | Pydantic models | Data validation |
| `models/` | SQLAlchemy ORM | Database persistence |
| `parsers/` | API/CSV extraction | Raw data ingestion |
| `loaders/` | Database operations | Data loading |
| `figures/` | Chart factories | Plotly figure generation |
| `callbacks/` | Dash callbacks | In `pages/{dashboard}/callbacks/` |
| `errors/` | Exceptions + handlers | Error handling |

### Type Hints

Use Python 3.10+ style:

```python
def process(items: list[str], config: dict[str, int] | None = None) -> bool:
    ...
```

### Error Handling

```python
# errors/exceptions.py
class PortfolioError(Exception):
    """Base exception."""


class ParseError(PortfolioError):
    """PDF/CSV parsing failed."""


class ValidationError(PortfolioError):
    """Pydantic or business rule validation failed."""


class LoadError(PortfolioError):
    """Database load operation failed."""
```
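
A short sketch of how this hierarchy might be used: low-level code raises the specific subclass, and callers catch the `PortfolioError` base. The two exception classes are repeated so the block runs on its own; `parse_row`, `load_rows`, and the sample rows are hypothetical and not part of the codebase.

```python
import sys


class PortfolioError(Exception):
    """Base exception."""


class ParseError(PortfolioError):
    """PDF/CSV parsing failed."""


def parse_row(line: str) -> tuple[str, int]:
    """Parse a hypothetical 'name,count' CSV row."""
    parts = line.split(",")
    if len(parts) != 2:
        raise ParseError(f"expected 2 fields, got {len(parts)}: {line!r}")
    name, raw_count = parts
    try:
        return name.strip(), int(raw_count)
    except ValueError as exc:
        # Chain the original error so the traceback keeps the root cause.
        raise ParseError(f"count is not an integer: {raw_count!r}") from exc


def load_rows(lines: list[str]) -> list[tuple[str, int]]:
    """Keep valid rows; report bad ones on stderr and continue."""
    rows = []
    for line in lines:
        try:
            rows.append(parse_row(line))
        except PortfolioError as exc:
            print(f"skipped row: {exc}", file=sys.stderr)
    return rows
```

Catching the base class at the boundary keeps callers independent of which subclass a parser or loader raises.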

### Code Standards

- Single responsibility functions with verb naming
- Early returns over deep nesting
- Google-style docstrings only for non-obvious behavior
- Module-level constants for magic values
- Pydantic BaseSettings for runtime config

---

## Application Structure

```
portfolio_app/
├── app.py                  # Dash app factory with Pages routing
├── config.py               # Pydantic BaseSettings
├── assets/                 # CSS, images (auto-served)
│   └── sidebar.css         # Navigation styling
├── callbacks/              # Global callbacks
│   ├── sidebar.py          # Sidebar toggle
│   └── theme.py            # Dark/light theme
├── pages/
│   ├── home.py             # Bio landing page -> /
│   ├── about.py            # About page -> /about
│   ├── contact.py          # Contact form -> /contact
│   ├── health.py           # Health endpoint -> /health
│   ├── projects.py         # Project showcase -> /projects
│   ├── resume.py           # Resume/CV -> /resume
│   ├── blog/
│   │   ├── index.py        # Blog listing -> /blog
│   │   └── article.py      # Blog article -> /blog/{slug}
│   └── toronto/
│       ├── dashboard.py    # Dashboard -> /toronto
│       ├── methodology.py  # Methodology -> /toronto/methodology
│       └── callbacks/      # Dashboard interactions
├── components/             # Shared UI (sidebar, cards, controls)
│   ├── metric_card.py      # KPI card component
│   ├── map_controls.py     # Map control panel
│   ├── sidebar.py          # Navigation sidebar
│   └── time_slider.py      # Time range selector
├── figures/                # Shared chart factories
│   ├── choropleth.py       # Map visualizations
│   ├── summary_cards.py    # KPI figures
│   └── time_series.py      # Trend charts
├── content/                # Markdown content
│   └── blog/               # Blog articles
├── toronto/                # Toronto data logic
│   ├── parsers/
│   ├── loaders/
│   ├── schemas/            # Pydantic
│   ├── models/             # SQLAlchemy
│   └── demo_data.py        # Sample data
├── utils/                  # Utilities
│   └── markdown_loader.py  # Markdown processing
└── errors/
```

### URL Routing

| URL | Page | Sprint |
|-----|------|--------|
| `/` | Bio landing page | 2 |
| `/about` | About page | 8 |
| `/contact` | Contact form | 8 |
| `/health` | Health endpoint | 8 |
| `/projects` | Project showcase | 8 |
| `/resume` | Resume/CV | 8 |
| `/blog` | Blog listing | 8 |
| `/blog/{slug}` | Blog article | 8 |
| `/toronto` | Toronto Dashboard | 6 |
| `/toronto/methodology` | Dashboard methodology | 6 |

---

## Tech Stack (Locked)

| Layer | Technology | Version |
|-------|------------|---------|
| Database | PostgreSQL + PostGIS | 16.x |
| Validation | Pydantic | >=2.0 |
| ORM | SQLAlchemy | >=2.0 (2.0-style API only) |
| Transformation | dbt-postgres | >=1.7 |
| Data Processing | Pandas | >=2.1 |
| Geospatial | GeoPandas + Shapely | >=0.14 |
| Visualization | Dash + Plotly | >=2.14 |
| UI Components | dash-mantine-components | Latest stable |
| Testing | pytest | >=7.0 |
| Python | 3.11+ | Via pyenv |

**Notes**:
- SQLAlchemy 2.0 + Pydantic 2.0 only (never mix 1.x APIs)
- PostGIS extension required in database
- Docker Compose V2 format (no `version` field)

---

## Data Model Overview

### Geographic Reality (Toronto Housing)

```
TRREB Districts (~35)       - Purchase data (W01, C01, E01...)
CMHC Zones (~20)            - Rental data (Census Tract aligned)
City Neighbourhoods (158)   - Enrichment/overlay only
```

**Critical**: These geographies do NOT align. Display them as separate layers; do not force crosswalks.

### Star Schema

| Table | Type | Keys |
|-------|------|------|
| `fact_purchases` | Fact | -> dim_time, dim_trreb_district |
| `fact_rentals` | Fact | -> dim_time, dim_cmhc_zone |
| `dim_time` | Dimension | date_key (PK) |
| `dim_trreb_district` | Dimension | district_key (PK), geometry |
| `dim_cmhc_zone` | Dimension | zone_key (PK), geometry |
| `dim_neighbourhood` | Dimension | neighbourhood_id (PK), geometry |
| `dim_policy_event` | Dimension | event_id (PK) |

**V1 Rule**: `dim_neighbourhood` has NO FK to fact tables. It is a reference overlay only.
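
A minimal sketch of the V1 rule, using in-memory SQLite as a stand-in for PostgreSQL + PostGIS (geometry columns are omitted). Only the table names and primary-key names come from the table above; the foreign-key column names on the fact tables and the helper functions are assumptions for illustration.

```python
import sqlite3

# Assumption: fact tables reuse the dimension key names as FK columns.
SCHEMA = """
CREATE TABLE dim_time (date_key INTEGER PRIMARY KEY);
CREATE TABLE dim_trreb_district (district_key INTEGER PRIMARY KEY);
CREATE TABLE dim_cmhc_zone (zone_key INTEGER PRIMARY KEY);
CREATE TABLE dim_neighbourhood (neighbourhood_id INTEGER PRIMARY KEY);
CREATE TABLE fact_purchases (
    date_key INTEGER NOT NULL REFERENCES dim_time (date_key),
    district_key INTEGER NOT NULL REFERENCES dim_trreb_district (district_key)
);
CREATE TABLE fact_rentals (
    date_key INTEGER NOT NULL REFERENCES dim_time (date_key),
    zone_key INTEGER NOT NULL REFERENCES dim_cmhc_zone (zone_key)
);
"""


def create_schema() -> sqlite3.Connection:
    """Create the sketch schema in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def referenced_tables(conn: sqlite3.Connection, table: str) -> set[str]:
    """Return the tables that `table` points at through foreign keys."""
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    return {row[2] for row in rows}  # column 2 holds the referenced table name
```

A test could assert that `dim_neighbourhood` never appears in `referenced_tables` for either fact table, which is the V1 rule expressed as a check.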

### dbt Layers

| Layer | Naming | Purpose |
|-------|--------|---------|
| Staging | `stg_{source}__{entity}` | 1:1 source, cleaned, typed |
| Intermediate | `int_{domain}__{transform}` | Business logic |
| Marts | `mart_{domain}` | Final analytical tables |

---

## DO NOT BUILD (Phase 1)

**Stop and flag if a task seems to require these**:

| Feature | Reason |
|---------|--------|
| `bridge_district_neighbourhood` table | Area-weighted aggregation is Phase 4 |
| Crime data integration | Deferred to Phase 4 |
| Historical boundary reconciliation (140->158) | 2021+ data only for V1 |
| ML prediction models | Energy project scope (Phase 3) |
| Multi-project shared infrastructure | Build first, abstract second (Phase 2) |

---

## Sprint 1 Deliverables

| Category | Tasks |
|----------|-------|
| **Bootstrap** | Git init, pyproject.toml, .env.example, Makefile, CLAUDE.md |
| **Infrastructure** | Docker Compose (PostgreSQL + PostGIS), scripts/ directory |
| **App Foundation** | portfolio_app/ structure, config.py, error handling |
| **Tests** | tests/ directory, conftest.py, pytest config |
| **Data Acquisition** | Download TRREB PDFs, START boundary digitization (HUMAN task) |

### Human Tasks (Cannot Automate)

| Task | Tool | Effort |
|------|------|--------|
| Digitize TRREB district boundaries | QGIS | 3-4 hours |
| Research policy events (10-20) | Manual | 2-3 hours |
| Replace social link placeholders | Manual | 5 minutes |

---

## Environment Variables

Required in `.env`:

```bash
DATABASE_URL=postgresql://user:pass@localhost:5432/portfolio
POSTGRES_USER=portfolio
POSTGRES_PASSWORD=<secure>
POSTGRES_DB=portfolio
DASH_DEBUG=true
SECRET_KEY=<random>
LOG_LEVEL=INFO
```
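
The project reads these values through Pydantic BaseSettings in `config.py`. As a dependency-free stand-in, the sketch below shows the same idea with the standard library: fail fast when a variable is missing and convert `DASH_DEBUG` to a boolean. The `Settings` fields, `load_settings`, and the set of accepted truthy strings are assumptions, not the actual contents of `config.py`.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass

REQUIRED_KEYS = (
    "DATABASE_URL",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DB",
    "DASH_DEBUG",
    "SECRET_KEY",
    "LOG_LEVEL",
)

# Assumption: these strings count as "true" for DASH_DEBUG.
TRUTHY_VALUES = frozenset({"1", "true", "yes", "on"})


@dataclass(frozen=True)
class Settings:
    database_url: str
    postgres_user: str
    postgres_password: str
    postgres_db: str
    dash_debug: bool
    secret_key: str
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, listing every missing key at once."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise ValueError(f"missing environment variables: {', '.join(missing)}")
    return Settings(
        database_url=env["DATABASE_URL"],
        postgres_user=env["POSTGRES_USER"],
        postgres_password=env["POSTGRES_PASSWORD"],
        postgres_db=env["POSTGRES_DB"],
        dash_debug=env["DASH_DEBUG"].strip().lower() in TRUTHY_VALUES,
        secret_key=env["SECRET_KEY"],
        log_level=env["LOG_LEVEL"],
    )
```

Passing the mapping as a parameter keeps the loader testable without touching the real process environment.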

---

## Script Standards

All scripts in `scripts/`:
- Include usage comments at top
- Idempotent where possible
- Exit codes: 0 = success, 1 = error
- Use `set -euo pipefail` for bash
- Log to stdout, errors to stderr
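
For scripts written in Python rather than bash, the same standards might look like the skeleton below. The script name in the usage text and the `run` task are hypothetical placeholders.

```python
"""Usage: python scripts/example_task.py ITEM [ITEM ...]

Hypothetical skeleton: processes each ITEM and reports progress on stdout.
"""
import sys

EXIT_SUCCESS = 0
EXIT_ERROR = 1


def run(items: list[str]) -> None:
    """Placeholder task; a real script would do its work here."""
    if not items:
        raise ValueError("at least one ITEM is required")
    print(f"processing {len(items)} item(s)")  # normal logging goes to stdout


def main(argv: list[str]) -> int:
    """Run the task and translate any failure into exit code 1."""
    try:
        run(argv)
    except Exception as exc:
        print(f"error: {exc}", file=sys.stderr)  # errors go to stderr
        return EXIT_ERROR
    return EXIT_SUCCESS
```

A real script would end with `sys.exit(main(sys.argv[1:]))` under an `if __name__ == "__main__":` guard, so the exit code reaches the shell and `make` targets can rely on it.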

---

## Reference Documents

| Document | Location | Use When |
|----------|----------|----------|
| Full specification | `docs/PROJECT_REFERENCE.md` | Architecture decisions |
| Data schemas (legacy) | `docs/toronto_housing_dashboard_spec_v5.md` | Reference only - being replaced |
| WBS details (legacy) | `docs/wbs_sprint_plan_v4.md` | Reference only - being replaced |
| **Neighbourhood Dashboard Vision** | `docs/changes/Change-Toronto-Analysis.md` | New dashboard specification |
| **Implementation Plan** | `docs/changes/Change-Toronto-Analysis-Reviewed.md` | Sprint planning, cleanup tasks |

---

## Pending Transition

**Note**: This project is transitioning from a TRREB district-based housing dashboard to a comprehensive Toronto Neighbourhood Dashboard (158 neighbourhoods). See the Implementation Plan for details on:
- Files being deprecated (TRREB parsers, schemas, loaders)
- New data sources (Toronto Open Data, Toronto Police, CMHC APIs)
- New dashboard tabs (Overview, Housing, Safety, Demographics, Amenities)

---

*Last Updated: Sprint 8*