lmiranda 053acf6436 feat: Implement Phase 3 neighbourhood data model

Add schemas, parsers, loaders, and models for Toronto neighbourhood-centric
data including census profiles, crime statistics, and amenities.

Schemas:
- NeighbourhoodRecord, CensusRecord, CrimeRecord, CrimeType
- AmenityType, AmenityRecord, AmenityCount

Models:
- BridgeCMHCNeighbourhood (zone-to-neighbourhood mapping with weights)
- FactCensus, FactCrime, FactAmenities

Parsers:
- TorontoOpenDataParser (CKAN API for neighbourhoods, census, amenities)
- TorontoPoliceParser (crime rates, MCI data)

Loaders:
- load_census_data, load_crime_data, load_amenities
- build_cmhc_neighbourhood_crosswalk (PostGIS area weights)

Also updates CLAUDE.md with projman plugin workflow documentation.

Closes #53, #54, #55, #56, #57, #58, #59

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-16 11:07:13 -05:00

CLAUDE.md

Working context for Claude Code on the Analytics Portfolio project.

Project Status

Current Sprint: 9 (Neighbourhood Dashboard Transition)
Phase: Toronto Neighbourhood Dashboard
Branch: development (feature branches merge here)

Quick Reference

Run Commands

make setup          # Install deps, create .env, init pre-commit
make docker-up      # Start PostgreSQL + PostGIS
make docker-down    # Stop containers
make db-init        # Initialize database schema
make run            # Start Dash dev server
make test           # Run pytest
make lint           # Run ruff linter
make format         # Run ruff formatter
make ci             # Run all checks

Branch Workflow

Create feature branch FROM development: git checkout -b feature/{sprint}-{description}
Work and commit on feature branch
Merge INTO development when complete
Delete the feature branch after merge (keep branches clean)
development -> staging -> main for releases

CRITICAL: NEVER DELETE the development branch. It is the main integration branch.
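
The feature/{sprint}-{description} pattern can be generated consistently with a small helper. This is a minimal sketch, assuming descriptions are slugified to lowercase hyphenated words (the workflow above does not specify this). `build_branch_name` and `is_deletable` are hypothetical names, not part of the project.

```python
import re


def build_branch_name(sprint: int, description: str) -> str:
    """Return a branch name following feature/{sprint}-{description}."""
    # Assumption: lowercase the description and collapse anything that is
    # not a letter or digit into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    if not slug:
        raise ValueError("description must contain at least one letter or digit")
    return f"feature/{sprint}-{slug}"


def is_deletable(branch: str) -> bool:
    """Only feature branches are candidates for deletion after merge."""
    # development (and anything else outside feature/) is never deletable here.
    return branch.startswith("feature/")
```

For example, `build_branch_name(9, "Neighbourhood Data Model")` gives a name that can be passed straight to `git checkout -b`.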

Code Conventions

Import Style

Context	Style	Example
Same directory	Single dot	`from .neighbourhood import NeighbourhoodRecord`
Sibling directory	Double dot	`from ..schemas.neighbourhood import CensusRecord`
External packages	Absolute	`import pandas as pd`

Module Responsibilities

Directory	Contains	Purpose
`schemas/`	Pydantic models	Data validation
`models/`	SQLAlchemy ORM	Database persistence
`parsers/`	API/CSV extraction	Raw data ingestion
`loaders/`	Database operations	Data loading
`figures/`	Chart factories	Plotly figure generation
`callbacks/`	Dash callbacks	In `pages/{dashboard}/callbacks/`
`errors/`	Exceptions + handlers	Error handling

Type Hints

Use Python 3.10+ style:

def process(items: list[str], config: dict[str, int] | None = None) -> bool:
    ...

Error Handling

# errors/exceptions.py
class PortfolioError(Exception):
    """Base exception."""

class ParseError(PortfolioError):
    """PDF/CSV parsing failed."""

class ValidationError(PortfolioError):
    """Pydantic or business rule validation failed."""

class LoadError(PortfolioError):
    """Database load operation failed."""

Code Standards

Single responsibility functions with verb naming
Early returns over deep nesting
Google-style docstrings only for non-obvious behavior
Module-level constants for magic values
Pydantic BaseSettings for runtime config

Application Structure

portfolio_app/
├── app.py                    # Dash app factory with Pages routing
├── config.py                 # Pydantic BaseSettings
├── assets/                   # CSS, images (auto-served)
│   └── sidebar.css          # Navigation styling
├── callbacks/               # Global callbacks
│   ├── sidebar.py           # Sidebar toggle
│   └── theme.py             # Dark/light theme
├── pages/
│   ├── home.py              # Bio landing page -> /
│   ├── about.py             # About page -> /about
│   ├── contact.py           # Contact form -> /contact
│   ├── health.py            # Health endpoint -> /health
│   ├── projects.py          # Project showcase -> /projects
│   ├── resume.py            # Resume/CV -> /resume
│   ├── blog/
│   │   ├── index.py         # Blog listing -> /blog
│   │   └── article.py       # Blog article -> /blog/{slug}
│   └── toronto/
│       ├── dashboard.py     # Dashboard -> /toronto
│       ├── methodology.py   # Methodology -> /toronto/methodology
│       └── callbacks/       # Dashboard interactions
├── components/              # Shared UI (sidebar, cards, controls)
│   ├── metric_card.py       # KPI card component
│   ├── map_controls.py      # Map control panel
│   ├── sidebar.py           # Navigation sidebar
│   └── time_slider.py       # Time range selector
├── figures/                 # Shared chart factories
│   ├── choropleth.py        # Map visualizations
│   ├── summary_cards.py     # KPI figures
│   └── time_series.py       # Trend charts
├── content/                 # Markdown content
│   └── blog/                # Blog articles
├── toronto/                 # Toronto data logic
│   ├── parsers/
│   ├── loaders/
│   ├── schemas/             # Pydantic
│   ├── models/              # SQLAlchemy
│   └── demo_data.py         # Sample data
├── utils/                   # Utilities
│   └── markdown_loader.py   # Markdown processing
└── errors/

URL Routing

URL	Page	Sprint
`/`	Bio landing page	2
`/about`	About page	8
`/contact`	Contact form	8
`/health`	Health endpoint	8
`/projects`	Project showcase	8
`/resume`	Resume/CV	8
`/blog`	Blog listing	8
`/blog/{slug}`	Blog article	8
`/toronto`	Toronto Dashboard	6
`/toronto/methodology`	Dashboard methodology	6

Tech Stack (Locked)

Layer	Technology	Version
Database	PostgreSQL + PostGIS	16.x
Validation	Pydantic	>=2.0
ORM	SQLAlchemy	>=2.0 (2.0-style API only)
Transformation	dbt-postgres	>=1.7
Data Processing	Pandas	>=2.1
Geospatial	GeoPandas + Shapely	>=0.14
Visualization	Dash + Plotly	>=2.14
UI Components	dash-mantine-components	Latest stable
Testing	pytest	>=7.0
Python	3.11+	Via pyenv

Notes:

SQLAlchemy 2.0 + Pydantic 2.0 only (never mix 1.x APIs)
PostGIS extension required in database
Docker Compose V2 format (no version field)

Data Model Overview

Geographic Reality (Toronto Housing)

City Neighbourhoods (158) - Primary geographic unit for analysis
CMHC Zones (~20)          - Rental data (Census Tract aligned)
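
Because rental data arrives per CMHC zone while analysis is per neighbourhood, zone values have to be apportioned across neighbourhoods. Below is a minimal sketch of area-weighted averaging in plain Python, assuming each weight is the share of a neighbourhood's area that falls inside a zone. The function name, tuple layout, and sample identifiers are hypothetical; the project builds its crosswalk with PostGIS area weights and may compute this differently.

```python
from collections import defaultdict


def estimate_neighbourhood_values(
    zone_values: dict[str, float],
    weights: list[tuple[str, int, float]],
) -> dict[int, float]:
    """Weighted average of zone values per neighbourhood.

    weights holds (zone_code, neighbourhood_id, weight) rows, where weight
    is assumed to be the fraction of the neighbourhood covered by the zone.
    """
    totals: dict[int, float] = defaultdict(float)
    weight_sums: dict[int, float] = defaultdict(float)
    for zone_code, neighbourhood_id, weight in weights:
        if zone_code not in zone_values:
            continue  # no rental figure for this zone; skip its contribution
        totals[neighbourhood_id] += zone_values[zone_code] * weight
        weight_sums[neighbourhood_id] += weight
    # Normalise by the weights actually used, so partial coverage still
    # yields an average rather than an understated sum.
    return {
        neighbourhood_id: totals[neighbourhood_id] / weight_sums[neighbourhood_id]
        for neighbourhood_id in totals
        if weight_sums[neighbourhood_id] > 0
    }
```

A neighbourhood split 75/25 between two zones with values 2000 and 1000 comes out at 1750; a neighbourhood fully inside one zone simply inherits that zone's value.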

Star Schema

Table	Type	Keys
`fact_rentals`	Fact	-> dim_time, dim_cmhc_zone
`dim_time`	Dimension	date_key (PK)
`dim_cmhc_zone`	Dimension	zone_key (PK), geometry
`dim_neighbourhood`	Dimension	neighbourhood_id (PK), geometry
`dim_policy_event`	Dimension	event_id (PK)

dbt Layers

Layer	Naming	Purpose
Staging	`stg_{source}__{entity}`	1:1 source, cleaned, typed
Intermediate	`int_{domain}__{transform}`	Business logic
Marts	`mart_{domain}`	Final analytical tables

Deferred Features

Stop and flag if a task seems to require these:

Feature	Reason
Historical boundary reconciliation (140->158)	2021+ data only for V1
ML prediction models	Energy project scope (future phase)
Multi-project shared infrastructure	Build first, abstract second

Environment Variables

Required in .env:

DATABASE_URL=postgresql://user:pass@localhost:5432/portfolio
POSTGRES_USER=portfolio
POSTGRES_PASSWORD=<secure>
POSTGRES_DB=portfolio
DASH_DEBUG=true
SECRET_KEY=<random>
LOG_LEVEL=INFO
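
Runtime config in this project goes through Pydantic BaseSettings (see Code Standards). As a dependency-free stand-in, the sketch below only checks that the variables listed above are present. `find_missing_vars` is a hypothetical helper, and treating empty values as missing is an assumption.

```python
import os
from collections.abc import Mapping

# Names copied from the list above, in the same order.
REQUIRED_VARS = (
    "DATABASE_URL",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DB",
    "DASH_DEBUG",
    "SECRET_KEY",
    "LOG_LEVEL",
)


def find_missing_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return required variable names that are absent or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `find_missing_vars()` with no argument inspects the current process environment; passing a plain dict keeps it testable.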

Script Standards

All scripts in scripts/:

Include usage comments at top
Idempotent where possible
Exit codes: 0 = success, 1 = error
Use set -euo pipefail for bash
Log to stdout, errors to stderr
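
For Python scripts, the same standards might look like the skeleton below: a usage comment at the top, an idempotent operation, exit codes 0 and 1, and errors routed to stderr. The script name `ensure_dir.py` and its task are hypothetical, chosen only to illustrate the shape.

```python
"""Usage: python scripts/ensure_dir.py <path>

Create <path> if it does not already exist (hypothetical example script).
"""
import sys
from pathlib import Path

EXIT_SUCCESS = 0
EXIT_ERROR = 1


def main(argv: list[str]) -> int:
    """Run the script and return its exit code."""
    if len(argv) != 1:
        print("usage: ensure_dir.py <path>", file=sys.stderr)
        return EXIT_ERROR
    target = Path(argv[0])
    try:
        # exist_ok makes repeated runs a no-op, so the script is idempotent.
        target.mkdir(parents=True, exist_ok=True)
    except OSError as exc:
        print(f"error: cannot create {target}: {exc}", file=sys.stderr)
        return EXIT_ERROR
    print(f"ready: {target}")
    return EXIT_SUCCESS


# In a real script, finish with:
#     raise SystemExit(main(sys.argv[1:]))
```

Returning the exit code from `main` instead of calling `sys.exit` inside it keeps the function easy to test.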

Reference Documents

Document	Location	Use When
Project reference	`docs/PROJECT_REFERENCE.md`	Architecture decisions
Dashboard vision	`docs/changes/Change-Toronto-Analysis.md`	Dashboard specification
Implementation plan	`docs/changes/Change-Toronto-Analysis-Reviewed.md`	Sprint planning

Projman Plugin Workflow

CRITICAL: Always use the projman plugin for sprint and task management.

When to Use Projman Skills

Skill	Trigger	Purpose
`/projman:sprint-plan`	New sprint or phase implementation	Architecture analysis + Gitea issue creation
`/projman:sprint-start`	Beginning implementation work	Load lessons learned, start execution
`/projman:sprint-status`	Check progress	Review blockers and completion status
`/projman:sprint-close`	Sprint completion	Capture lessons learned to Wiki.js

Default Behavior

When user requests implementation work:

ALWAYS start with /projman:sprint-plan before writing code
Create Gitea issues with proper labels and acceptance criteria
Use /projman:sprint-start to begin execution with lessons learned
Track progress via Gitea issue comments
Close sprint with /projman:sprint-close to document lessons

Gitea Repository

Repo: lmiranda/personal-portfolio
Host: gitea.hotserv.cloud
Note: lmiranda is a user account (not org), so label lookup may require repo-level labels

MCP Tools Available

Gitea:

list_issues, get_issue, create_issue, update_issue, add_comment
get_labels, suggest_labels

Wiki.js:

search_lessons, create_lesson, search_pages, get_page

Issue Structure

Every Gitea issue should include:

Overview: Brief description
Files to Create/Modify: Explicit paths
Acceptance Criteria: Checkboxes
Technical Notes: Implementation hints
Labels: Listed in body (workaround for label API issues)
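
The five sections can be assembled into an issue body programmatically before passing it to `create_issue`. This is a sketch under assumptions: `build_issue_body` is a hypothetical helper, the Markdown heading level is a guess, and the sample path and label in the usage note are made up for illustration.

```python
def build_issue_body(
    overview: str,
    files: list[str],
    acceptance_criteria: list[str],
    technical_notes: str,
    labels: list[str],
) -> str:
    """Render the standard issue sections as a Markdown body."""
    sections = [
        ("Overview", overview),
        ("Files to Create/Modify", "\n".join(f"- `{path}`" for path in files)),
        # Unchecked task-list items render as checkboxes.
        ("Acceptance Criteria", "\n".join(f"- [ ] {item}" for item in acceptance_criteria)),
        ("Technical Notes", technical_notes),
        # Labels go in the body as a workaround for label API issues.
        ("Labels", ", ".join(labels)),
    ]
    return "\n\n".join(f"## {title}\n{content}" for title, content in sections)
```

For instance, `build_issue_body("Add census loader", ["portfolio_app/toronto/loaders/census.py"], ["Loader is idempotent"], "Reuse the session helper", ["Type/Feature"])` yields a body with all five sections in the order listed above.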

Last Updated: Sprint 9
