personal-portfolio/CLAUDE.md
lmiranda 053acf6436 feat: Implement Phase 3 neighbourhood data model
Add schemas, parsers, loaders, and models for Toronto neighbourhood-centric
data including census profiles, crime statistics, and amenities.

Schemas:
- NeighbourhoodRecord, CensusRecord, CrimeRecord, CrimeType
- AmenityType, AmenityRecord, AmenityCount

Models:
- BridgeCMHCNeighbourhood (zone-to-neighbourhood mapping with weights)
- FactCensus, FactCrime, FactAmenities

Parsers:
- TorontoOpenDataParser (CKAN API for neighbourhoods, census, amenities)
- TorontoPoliceParser (crime rates, MCI data)

Loaders:
- load_census_data, load_crime_data, load_amenities
- build_cmhc_neighbourhood_crosswalk (PostGIS area weights)

Also updates CLAUDE.md with projman plugin workflow documentation.

Closes #53, #54, #55, #56, #57, #58, #59

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-16 11:07:13 -05:00

CLAUDE.md

Working context for Claude Code on the Analytics Portfolio project.


Project Status

Current Sprint: 9 (Neighbourhood Dashboard Transition)
Phase: Toronto Neighbourhood Dashboard
Branch: development (feature branches merge here)


Quick Reference

Run Commands

make setup          # Install deps, create .env, init pre-commit
make docker-up      # Start PostgreSQL + PostGIS
make docker-down    # Stop containers
make db-init        # Initialize database schema
make run            # Start Dash dev server
make test           # Run pytest
make lint           # Run ruff linter
make format         # Run ruff formatter
make ci             # Run all checks

Branch Workflow

  1. Create feature branch FROM development: git checkout -b feature/{sprint}-{description}
  2. Work and commit on feature branch
  3. Merge INTO development when complete
  4. Delete the feature branch after merge (keep branches clean)
  5. development -> staging -> main for releases

CRITICAL: NEVER DELETE the development branch. It is the main integration branch.
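
The naming pattern in step 1 can be generated rather than typed by hand. The following is a minimal sketch, not part of the project: the helper names are hypothetical, and the lowercase, hyphen-separated slug is an assumption, since only the feature/{sprint}-{description} shape comes from the workflow above.

```python
import re

# The integration branch named in the workflow above.
INTEGRATION_BRANCH = "development"


def build_feature_branch(sprint: int, description: str) -> str:
    """Return a branch name of the form feature/{sprint}-{description}."""
    # Assumption: lowercase the description and collapse every run of
    # non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    if not slug:
        raise ValueError("description must contain at least one letter or digit")
    return f"feature/{sprint}-{slug}"


def build_checkout_commands(sprint: int, description: str) -> list[str]:
    """Build the git commands that branch off the integration branch."""
    branch = build_feature_branch(sprint, description)
    return [f"git checkout {INTEGRATION_BRANCH}", f"git checkout -b {branch}"]
```

Checking out development first keeps the new branch rooted on the integration branch, as step 1 requires.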


Code Conventions

Import Style

| Context | Style | Example |
|---|---|---|
| Same directory | Single dot | from .neighbourhood import NeighbourhoodRecord |
| Sibling directory | Double dot | from ..schemas.neighbourhood import CensusRecord |
| External packages | Absolute | import pandas as pd |

Module Responsibilities

| Directory | Contains | Purpose |
|---|---|---|
| schemas/ | Pydantic models | Data validation |
| models/ | SQLAlchemy ORM | Database persistence |
| parsers/ | API/CSV extraction | Raw data ingestion |
| loaders/ | Database operations | Data loading |
| figures/ | Chart factories | Plotly figure generation |
| callbacks/ | Dash callbacks | In pages/{dashboard}/callbacks/ |
| errors/ | Exceptions + handlers | Error handling |

Type Hints

Use Python 3.10+ style:

def process(items: list[str], config: dict[str, int] | None = None) -> bool:
    ...

Error Handling

# errors/exceptions.py
class PortfolioError(Exception):
    """Base exception."""

class ParseError(PortfolioError):
    """PDF/CSV parsing failed."""

class ValidationError(PortfolioError):
    """Pydantic or business rule validation failed."""

class LoadError(PortfolioError):
    """Database load operation failed."""

Code Standards

  • Single-responsibility functions with verb-based names
  • Early returns over deep nesting
  • Google-style docstrings only for non-obvious behavior
  • Module-level constants for magic values
  • Pydantic BaseSettings for runtime config

Application Structure

portfolio_app/
├── app.py                    # Dash app factory with Pages routing
├── config.py                 # Pydantic BaseSettings
├── assets/                   # CSS, images (auto-served)
│   └── sidebar.css          # Navigation styling
├── callbacks/               # Global callbacks
│   ├── sidebar.py           # Sidebar toggle
│   └── theme.py             # Dark/light theme
├── pages/
│   ├── home.py              # Bio landing page -> /
│   ├── about.py             # About page -> /about
│   ├── contact.py           # Contact form -> /contact
│   ├── health.py            # Health endpoint -> /health
│   ├── projects.py          # Project showcase -> /projects
│   ├── resume.py            # Resume/CV -> /resume
│   ├── blog/
│   │   ├── index.py         # Blog listing -> /blog
│   │   └── article.py       # Blog article -> /blog/{slug}
│   └── toronto/
│       ├── dashboard.py     # Dashboard -> /toronto
│       ├── methodology.py   # Methodology -> /toronto/methodology
│       └── callbacks/       # Dashboard interactions
├── components/              # Shared UI (sidebar, cards, controls)
│   ├── metric_card.py       # KPI card component
│   ├── map_controls.py      # Map control panel
│   ├── sidebar.py           # Navigation sidebar
│   └── time_slider.py       # Time range selector
├── figures/                 # Shared chart factories
│   ├── choropleth.py        # Map visualizations
│   ├── summary_cards.py     # KPI figures
│   └── time_series.py       # Trend charts
├── content/                 # Markdown content
│   └── blog/                # Blog articles
├── toronto/                 # Toronto data logic
│   ├── parsers/
│   ├── loaders/
│   ├── schemas/             # Pydantic
│   ├── models/              # SQLAlchemy
│   └── demo_data.py         # Sample data
├── utils/                   # Utilities
│   └── markdown_loader.py   # Markdown processing
└── errors/

URL Routing

| URL | Page | Sprint |
|---|---|---|
| / | Bio landing page | 2 |
| /about | About page | 8 |
| /contact | Contact form | 8 |
| /health | Health endpoint | 8 |
| /projects | Project showcase | 8 |
| /resume | Resume/CV | 8 |
| /blog | Blog listing | 8 |
| /blog/{slug} | Blog article | 8 |
| /toronto | Toronto Dashboard | 6 |
| /toronto/methodology | Dashboard methodology | 6 |

Tech Stack (Locked)

| Layer | Technology | Version |
|---|---|---|
| Database | PostgreSQL + PostGIS | 16.x |
| Validation | Pydantic | >=2.0 |
| ORM | SQLAlchemy | >=2.0 (2.0-style API only) |
| Transformation | dbt-postgres | >=1.7 |
| Data Processing | Pandas | >=2.1 |
| Geospatial | GeoPandas + Shapely | >=0.14 |
| Visualization | Dash + Plotly | >=2.14 |
| UI Components | dash-mantine-components | Latest stable |
| Testing | pytest | >=7.0 |
| Python | 3.11+ | Via pyenv |

Notes:

  • SQLAlchemy 2.0 + Pydantic 2.0 only (never mix 1.x APIs)
  • PostGIS extension required in database
  • Docker Compose V2 format (no version field)

Data Model Overview

Geographic Reality (Toronto Housing)

City Neighbourhoods (158) - Primary geographic unit for analysis
CMHC Zones (~20)          - Rental data (Census Tract aligned)
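
Because rental data is reported per CMHC zone while analysis happens per neighbourhood, values have to be moved between the two geographies. The sketch below shows one plausible area-weighted approach in plain Python. It is an illustration under assumptions, not the project's build_cmhc_neighbourhood_crosswalk: the function name, the weight definition (share of a neighbourhood's area inside a zone), and all numbers are made up.

```python
from collections import defaultdict

# (zone, neighbourhood, weight) rows. Assumed meaning of weight: the share of
# the neighbourhood's area that falls inside the zone. Values are invented.
CROSSWALK = [
    ("zone_a", "nbhd_1", 1.0),
    ("zone_a", "nbhd_2", 0.25),
    ("zone_b", "nbhd_2", 0.75),
]


def estimate_neighbourhood_values(
    zone_values: dict[str, float],
    crosswalk: list[tuple[str, str, float]],
) -> dict[str, float]:
    """Compute the area-weighted average of zone values per neighbourhood."""
    weighted_sum: dict[str, float] = defaultdict(float)
    weight_total: dict[str, float] = defaultdict(float)
    for zone, neighbourhood, weight in crosswalk:
        if zone not in zone_values:
            continue  # skip zones with no reported value
        weighted_sum[neighbourhood] += zone_values[zone] * weight
        weight_total[neighbourhood] += weight
    # Divide by the weight actually used so missing zones do not drag
    # the estimate toward zero.
    return {
        name: weighted_sum[name] / weight_total[name]
        for name in weighted_sum
        if weight_total[name] > 0
    }
```

In the real pipeline the weights would presumably come from PostGIS intersection areas rather than a hand-written list.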

Star Schema

| Table | Type | Keys |
|---|---|---|
| fact_rentals | Fact | -> dim_time, dim_cmhc_zone |
| dim_time | Dimension | date_key (PK) |
| dim_cmhc_zone | Dimension | zone_key (PK), geometry |
| dim_neighbourhood | Dimension | neighbourhood_id (PK), geometry |
| dim_policy_event | Dimension | event_id (PK) |

dbt Layers

| Layer | Naming | Purpose |
|---|---|---|
| Staging | stg_{source}__{entity} | 1:1 source, cleaned, typed |
| Intermediate | int_{domain}__{transform} | Business logic |
| Marts | mart_{domain} | Final analytical tables |

Deferred Features

Stop and flag if a task seems to require these:

| Feature | Reason |
|---|---|
| Historical boundary reconciliation (140->158) | 2021+ data only for V1 |
| ML prediction models | Energy project scope (future phase) |
| Multi-project shared infrastructure | Build first, abstract second |

Environment Variables

Required in .env:

DATABASE_URL=postgresql://user:pass@localhost:5432/portfolio
POSTGRES_USER=portfolio
POSTGRES_PASSWORD=<secure>
POSTGRES_DB=portfolio
DASH_DEBUG=true
SECRET_KEY=<random>
LOG_LEVEL=INFO
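
A startup check for these variables might look like the sketch below. It uses only the standard library for illustration; the project itself reads configuration through Pydantic BaseSettings in config.py. Both helper names are hypothetical, and the set of strings treated as "true" is an assumption.

```python
import os
from collections.abc import Mapping

# The variable names listed above.
REQUIRED_VARS = (
    "DATABASE_URL",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DB",
    "DASH_DEBUG",
    "SECRET_KEY",
    "LOG_LEVEL",
)


def find_missing_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required names that are unset or blank, in listed order."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def parse_debug_flag(env: Mapping[str, str] = os.environ) -> bool:
    """Interpret DASH_DEBUG; the accepted truthy strings are an assumption."""
    return env.get("DASH_DEBUG", "false").strip().lower() in {"1", "true", "yes"}
```

Passing the environment as a parameter keeps both functions easy to test without touching the real process environment.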

Script Standards

All scripts in scripts/:

  • Include usage comments at top
  • Idempotent where possible
  • Exit codes: 0 = success, 1 = error
  • Use set -euo pipefail for bash
  • Log to stdout, errors to stderr
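
For scripts written in Python rather than bash, the same standards could be met roughly as follows. This is a sketch with a hypothetical script name and argument; set -euo pipefail has no direct Python counterpart, so failures are reported through the return code instead.

```python
"""Usage: python scripts/example_task.py <name>

Hypothetical script illustrating the standards above.
"""
import sys

EXIT_SUCCESS = 0
EXIT_ERROR = 1


def main(argv: list[str]) -> int:
    """Run the task and return the process exit code."""
    if len(argv) != 1:
        # Errors go to stderr so stdout stays clean for normal output.
        print("usage: example_task.py <name>", file=sys.stderr)
        return EXIT_ERROR
    # Printing is repeatable and has no side effects, so reruns are safe.
    print(f"processed {argv[0]}")
    return EXIT_SUCCESS


# A real script would end with: raise SystemExit(main(sys.argv[1:]))
```

Returning the exit code from main, instead of calling sys.exit inside it, lets tests call main directly.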

Reference Documents

| Document | Location | Use When |
|---|---|---|
| Project reference | docs/PROJECT_REFERENCE.md | Architecture decisions |
| Dashboard vision | docs/changes/Change-Toronto-Analysis.md | Dashboard specification |
| Implementation plan | docs/changes/Change-Toronto-Analysis-Reviewed.md | Sprint planning |

Projman Plugin Workflow

CRITICAL: Always use the projman plugin for sprint and task management.

When to Use Projman Skills

| Skill | Trigger | Purpose |
|---|---|---|
| /projman:sprint-plan | New sprint or phase implementation | Architecture analysis + Gitea issue creation |
| /projman:sprint-start | Beginning implementation work | Load lessons learned, start execution |
| /projman:sprint-status | Check progress | Review blockers and completion status |
| /projman:sprint-close | Sprint completion | Capture lessons learned to Wiki.js |

Default Behavior

When the user requests implementation work:

  1. ALWAYS start with /projman:sprint-plan before writing code
  2. Create Gitea issues with proper labels and acceptance criteria
  3. Use /projman:sprint-start to begin execution with lessons learned
  4. Track progress via Gitea issue comments
  5. Close sprint with /projman:sprint-close to document lessons

Gitea Repository

  • Repo: lmiranda/personal-portfolio
  • Host: gitea.hotserv.cloud
  • Note: lmiranda is a user account (not org), so label lookup may require repo-level labels

MCP Tools Available

Gitea:

  • list_issues, get_issue, create_issue, update_issue, add_comment
  • get_labels, suggest_labels

Wiki.js:

  • search_lessons, create_lesson, search_pages, get_page

Issue Structure

Every Gitea issue should include:

  • Overview: Brief description
  • Files to Create/Modify: Explicit paths
  • Acceptance Criteria: Checkboxes
  • Technical Notes: Implementation hints
  • Labels: Listed in body (workaround for label API issues)
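
The five sections above could be assembled into an issue body with a small helper. This is a sketch only: render_issue_body is a hypothetical function, and the Markdown heading and checkbox layout is an assumption about how the body is formatted. The resulting string would then be passed to the create_issue tool.

```python
def render_issue_body(
    overview: str,
    files: list[str],
    criteria: list[str],
    notes: str,
    labels: list[str],
) -> str:
    """Render an issue body containing the five sections listed above."""
    sections = [
        ("Overview", overview),
        ("Files to Create/Modify", "\n".join(f"- {path}" for path in files)),
        # Acceptance criteria are rendered as unchecked checkboxes.
        ("Acceptance Criteria", "\n".join(f"- [ ] {item}" for item in criteria)),
        ("Technical Notes", notes),
        # Labels go in the body as a workaround for label API issues.
        ("Labels", ", ".join(labels)),
    ]
    return "\n\n".join(f"## {title}\n{content}" for title, content in sections)
```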

Last Updated: Sprint 9