CLAUDE.md
Working context for Claude Code on the Analytics Portfolio project.
Project Status
Last Completed Sprint: 9 (Neighbourhood Dashboard Transition)
Current State: Ready for deployment sprint or new features
Branch: development (feature branches merge here)
Quick Reference
Run Commands
make setup # Install deps, create .env, init pre-commit
make docker-up # Start PostgreSQL + PostGIS (auto-detects x86/ARM)
make docker-down # Stop containers
make db-init # Initialize database schema
make run # Start Dash dev server
make test # Run pytest
make lint # Run ruff linter
make format # Run ruff formatter
make ci # Run all checks
Branch Workflow
- Create feature branch FROM development: git checkout -b feature/{sprint}-{description}
- Work and commit on feature branch
- Merge INTO development when complete
- Delete the feature branch after merge (keep branches clean)
- development -> staging -> main for releases
CRITICAL: NEVER DELETE the development branch. It is the main integration branch.
Code Conventions
Import Style
| Context | Style | Example |
|---|---|---|
| Same directory | Single dot | from .neighbourhood import NeighbourhoodRecord |
| Sibling directory | Double dot | from ..schemas.neighbourhood import CensusRecord |
| External packages | Absolute | import pandas as pd |
Module Responsibilities
| Directory | Contains | Purpose |
|---|---|---|
| schemas/ | Pydantic models | Data validation |
| models/ | SQLAlchemy ORM | Database persistence |
| parsers/ | API/CSV extraction | Raw data ingestion |
| loaders/ | Database operations | Data loading |
| figures/ | Chart factories | Plotly figure generation |
| callbacks/ | Dash callbacks | In pages/{dashboard}/callbacks/ |
| errors/ | Exceptions + handlers | Error handling |
Type Hints
Use Python 3.10+ style:
def process(items: list[str], config: dict[str, int] | None = None) -> bool:
    ...
Error Handling
# errors/exceptions.py
class PortfolioError(Exception):
    """Base exception."""

class ParseError(PortfolioError):
    """PDF/CSV parsing failed."""

class ValidationError(PortfolioError):
    """Pydantic or business rule validation failed."""

class LoadError(PortfolioError):
    """Database load operation failed."""
Code Standards
- Single responsibility functions with verb naming
- Early returns over deep nesting
- Google-style docstrings only for non-obvious behavior
- Module-level constants for magic values
- Pydantic BaseSettings for runtime config
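As a minimal sketch of how the exception hierarchy and the standards above might combine in a parser helper: the function name parse_rent_value, the constant MISSING_MARKERS, and its values are hypothetical, and the two exception classes are redeclared only so the snippet runs on its own.

```python
# Module-level constant instead of a magic value (hypothetical name and values).
MISSING_MARKERS = frozenset({"", "n/a"})


class PortfolioError(Exception):
    """Base exception."""


class ParseError(PortfolioError):
    """PDF/CSV parsing failed."""


def parse_rent_value(raw: str) -> float:
    """Convert a raw cell into a rent amount; missing markers become NaN."""
    cleaned = raw.replace(",", "").strip()
    if cleaned.lower() in MISSING_MARKERS:
        return float("nan")  # early return keeps the main path flat
    try:
        return float(cleaned)
    except ValueError as exc:
        # Wrap the low-level error so callers can catch PortfolioError.
        raise ParseError(f"Invalid rent value: {raw!r}") from exc
```

Callers that only care whether ingestion failed can catch PortfolioError once at the boundary; the original ValueError stays available through the exception's __cause__.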
Application Structure
portfolio_app/
├── app.py # Dash app factory with Pages routing
├── config.py # Pydantic BaseSettings
├── assets/ # CSS, images (auto-served)
│ └── sidebar.css # Navigation styling
├── callbacks/ # Global callbacks
│ ├── sidebar.py # Sidebar toggle
│ └── theme.py # Dark/light theme
├── pages/
│ ├── home.py # Bio landing page -> /
│ ├── about.py # About page -> /about
│ ├── contact.py # Contact form -> /contact
│ ├── health.py # Health endpoint -> /health
│ ├── projects.py # Project showcase -> /projects
│ ├── resume.py # Resume/CV -> /resume
│ ├── blog/
│ │ ├── index.py # Blog listing -> /blog
│ │ └── article.py # Blog article -> /blog/{slug}
│ └── toronto/
│ ├── dashboard.py # Dashboard -> /toronto
│ ├── methodology.py # Methodology -> /toronto/methodology
│ ├── tabs/ # 5 tab layouts (overview, housing, safety, demographics, amenities)
│ └── callbacks/ # Dashboard interactions
├── components/ # Shared UI (sidebar, cards, controls)
│ ├── metric_card.py # KPI card component
│ ├── map_controls.py # Map control panel
│ ├── sidebar.py # Navigation sidebar
│ └── time_slider.py # Time range selector
├── figures/ # Shared chart factories
│ ├── choropleth.py # Map visualizations
│ ├── bar_charts.py # Ranking, stacked, horizontal bars
│ ├── scatter.py # Scatter and bubble plots
│ ├── radar.py # Radar/spider charts
│ ├── demographics.py # Age pyramids, donut charts
│ ├── time_series.py # Trend lines
│ └── summary_cards.py # KPI figures
├── content/ # Markdown content
│ └── blog/ # Blog articles
├── toronto/ # Toronto data logic
│ ├── parsers/
│ ├── loaders/
│ ├── schemas/ # Pydantic
│ ├── models/ # SQLAlchemy
│ └── demo_data.py # Sample data
├── utils/ # Utilities
│ └── markdown_loader.py # Markdown processing
└── errors/
notebooks/ # Data documentation (Phase 6)
├── README.md # Template and usage guide
├── overview/ # Overview tab notebooks (3)
├── housing/ # Housing tab notebooks (3)
├── safety/ # Safety tab notebooks (3)
├── demographics/ # Demographics tab notebooks (3)
└── amenities/ # Amenities tab notebooks (3)
URL Routing
| URL | Page | Sprint |
|---|---|---|
| / | Bio landing page | 2 |
| /about | About page | 8 |
| /contact | Contact form | 8 |
| /health | Health endpoint | 8 |
| /projects | Project showcase | 8 |
| /resume | Resume/CV | 8 |
| /blog | Blog listing | 8 |
| /blog/{slug} | Blog article | 8 |
| /toronto | Toronto Dashboard | 6 |
| /toronto/methodology | Dashboard methodology | 6 |
Tech Stack (Locked)
| Layer | Technology | Version |
|---|---|---|
| Database | PostgreSQL + PostGIS | 16.x |
| Validation | Pydantic | >=2.0 |
| ORM | SQLAlchemy | >=2.0 (2.0-style API only) |
| Transformation | dbt-postgres | >=1.7 |
| Data Processing | Pandas | >=2.1 |
| Geospatial | GeoPandas + Shapely | >=0.14 |
| Visualization | Dash + Plotly | >=2.14 |
| UI Components | dash-mantine-components | Latest stable |
| Testing | pytest | >=7.0 |
| Python | 3.11+ | Via pyenv |
Notes:
- SQLAlchemy 2.0 + Pydantic 2.0 only (never mix 1.x APIs)
- PostGIS extension required in database
- Docker Compose V2 format (no version field)
- Multi-architecture support: make docker-up auto-detects CPU architecture and uses the appropriate PostGIS image (x86_64: postgis/postgis, ARM64: imresamu/postgis)
Data Model Overview
Geographic Reality (Toronto Housing)
City Neighbourhoods (158) - Primary geographic unit for analysis
CMHC Zones (~20) - Rental data (Census Tract aligned)
Star Schema
| Table | Type | Keys |
|---|---|---|
| fact_rentals | Fact | -> dim_time, dim_cmhc_zone |
| dim_time | Dimension | date_key (PK) |
| dim_cmhc_zone | Dimension | zone_key (PK), geometry |
| dim_neighbourhood | Dimension | neighbourhood_id (PK), geometry |
| dim_policy_event | Dimension | event_id (PK) |
dbt Layers
| Layer | Naming | Purpose |
|---|---|---|
| Staging | stg_{source}__{entity} | 1:1 source, cleaned, typed |
| Intermediate | int_{domain}__{transform} | Business logic |
| Marts | mart_{domain} | Final analytical tables |
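The naming column above can be checked mechanically. The sketch below is one way to do that with regular expressions; classify_model is a hypothetical helper, not part of the project's tooling, and the assumed character set (lowercase letters, digits, single underscores inside a segment) is a guess at the convention. All model names shown are illustrative.

```python
import re

# One name segment: lowercase alphanumerics joined by single underscores.
# This character set is an assumption, not a documented project rule.
_SEGMENT = r"[a-z0-9]+(?:_[a-z0-9]+)*"

LAYER_PATTERNS: dict[str, re.Pattern[str]] = {
    "staging": re.compile(rf"^stg_{_SEGMENT}__{_SEGMENT}$"),
    "intermediate": re.compile(rf"^int_{_SEGMENT}__{_SEGMENT}$"),
    "marts": re.compile(rf"^mart_{_SEGMENT}$"),
}


def classify_model(name: str) -> str:
    """Return the dbt layer a model name belongs to, or 'unknown'."""
    for layer, pattern in LAYER_PATTERNS.items():
        if pattern.match(name):
            return layer
    return "unknown"
```

For example, classify_model("stg_cmhc__rentals") returns "staging", while "stg_cmhc_rentals" (single underscore between source and entity) returns "unknown".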
Deferred Features
Stop and flag if a task seems to require these:
| Feature | Reason |
|---|---|
| Historical boundary reconciliation (140->158) | 2021+ data only for V1 |
| ML prediction models | Energy project scope (future phase) |
| Multi-project shared infrastructure | Build first, abstract second |
Environment Variables
Required in .env:
DATABASE_URL=postgresql://user:pass@localhost:5432/portfolio
POSTGRES_USER=portfolio
POSTGRES_PASSWORD=<secure>
POSTGRES_DB=portfolio
DASH_DEBUG=true
SECRET_KEY=<random>
LOG_LEVEL=INFO
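A quick pre-flight check for these variables might look like the sketch below. It uses only the standard library as a stand-in; the project itself loads runtime config through Pydantic BaseSettings, and find_missing_env_vars is a hypothetical helper name.

```python
import os
from collections.abc import Mapping

# Names taken from the .env listing above.
REQUIRED_ENV_VARS: tuple[str, ...] = (
    "DATABASE_URL",
    "POSTGRES_USER",
    "POSTGRES_PASSWORD",
    "POSTGRES_DB",
    "DASH_DEBUG",
    "SECRET_KEY",
    "LOG_LEVEL",
)


def find_missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return required variable names that are unset or blank."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name, "").strip()]
```

Passing the mapping explicitly keeps the check easy to test without touching the real environment.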
Script Standards
All scripts in scripts/:
- Include usage comments at top
- Idempotent where possible
- Exit codes: 0 = success, 1 = error
- Use set -euo pipefail for bash
- Log to stdout, errors to stderr
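For a Python script, the exit-code and stream conventions above might be sketched as follows. The script name in the usage comment and the run_task placeholder are hypothetical.

```python
"""Usage: python scripts/example_task.py  (hypothetical script name)."""
import sys
from collections.abc import Callable


def run_task() -> None:
    """Placeholder for the script's real work; should be safe to re-run."""
    print("task complete")  # normal log output goes to stdout


def main(task: Callable[[], None] = run_task) -> int:
    """Run the task and map the outcome to an exit code (0 or 1)."""
    try:
        task()
    except Exception as exc:  # broad catch only at the top level
        print(f"error: {exc}", file=sys.stderr)
        return 1
    return 0
```

A real script would end by passing the result of main() to sys.exit under the usual __main__ guard, so the shell sees 0 on success and 1 on error.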
Reference Documents
| Document | Location | Use When |
|---|---|---|
| Project reference | docs/PROJECT_REFERENCE.md | Architecture decisions, completed work |
| Developer guide | docs/CONTRIBUTING.md | How to add pages, blog posts, tabs |
| Lessons learned | docs/project-lessons-learned/INDEX.md | Past issues and solutions |
Projman Plugin Workflow
CRITICAL: Always use the projman plugin for sprint and task management.
When to Use Projman Skills
| Skill | Trigger | Purpose |
|---|---|---|
| /projman:sprint-plan | New sprint or phase implementation | Architecture analysis + Gitea issue creation |
| /projman:sprint-start | Beginning implementation work | Load lessons learned (Wiki.js or local), start execution |
| /projman:sprint-status | Check progress | Review blockers and completion status |
| /projman:sprint-close | Sprint completion | Capture lessons learned (Wiki.js or local backup) |
Default Behavior
When user requests implementation work:
- ALWAYS start with /projman:sprint-plan before writing code
- Create Gitea issues with proper labels and acceptance criteria
- Use /projman:sprint-start to begin execution with lessons learned
- Track progress via Gitea issue comments
- Close sprint with /projman:sprint-close to document lessons
Gitea Repository
- Repo: lmiranda/personal-portfolio
- Host: gitea.hotserv.cloud
- Note: lmiranda is a user account (not org), so label lookup may require repo-level labels
MCP Tools Available
Gitea:
- list_issues, get_issue, create_issue, update_issue, add_comment
- get_labels, suggest_labels
Wiki.js:
- search_lessons, create_lesson, search_pages, get_page
Lessons Learned (Backup Method)
When Wiki.js is unavailable, use the local backup in docs/project-lessons-learned/:
At Sprint Start:
- Review docs/project-lessons-learned/INDEX.md for relevant past lessons
- Search lesson files by tags/keywords before implementation
- Apply prevention strategies from applicable lessons
At Sprint Close:
- Try Wiki.js create_lesson first
- If Wiki.js fails, create lesson in docs/project-lessons-learned/
- Use naming convention: {phase-or-sprint}-{short-description}.md
- Update INDEX.md with new entry
- Follow the lesson template in INDEX.md
Migration: Once Wiki.js is configured, lessons will be migrated there for better searchability.
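The lesson file naming convention can be applied consistently with a small helper. This is a sketch only: lesson_filename is a hypothetical function, and the slug rules (lowercase, hyphens in place of spaces and punctuation) are an assumption about how the placeholders should be filled.

```python
import re


def slugify(text: str) -> str:
    """Lowercase the text and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def lesson_filename(phase_or_sprint: str, short_description: str) -> str:
    """Build '{phase-or-sprint}-{short-description}.md' from free text."""
    return f"{slugify(phase_or_sprint)}-{slugify(short_description)}.md"
```

For example, lesson_filename("Sprint 9", "Label lookup fails") yields "sprint-9-label-lookup-fails.md" (the lesson title here is made up for illustration).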
Issue Structure
Every Gitea issue should include:
- Overview: Brief description
- Files to Create/Modify: Explicit paths
- Acceptance Criteria: Checkboxes
- Technical Notes: Implementation hints
- Labels: Listed in body (workaround for label API issues)
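One way to keep issue bodies uniform is to render them from the five sections listed above. The sketch below only builds a string and does not call the Gitea API; build_issue_body is a hypothetical helper, and the heading level and checkbox syntax are assumptions.

```python
def build_issue_body(
    overview: str,
    files: list[str],
    acceptance_criteria: list[str],
    technical_notes: str,
    labels: list[str],
) -> str:
    """Render the five required sections as a Markdown issue body."""
    sections = [
        ("Overview", overview),
        ("Files to Create/Modify", "\n".join(f"- {path}" for path in files)),
        (
            "Acceptance Criteria",
            "\n".join(f"- [ ] {item}" for item in acceptance_criteria),
        ),
        ("Technical Notes", technical_notes),
        # Labels are written into the body as the label API workaround.
        ("Labels", ", ".join(labels)),
    ]
    return "\n\n".join(f"## {title}\n{content}" for title, content in sections)
```

The returned string could then be passed as the body when creating the issue through whichever tool is in use.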
Last Updated: January 2026 (Post-Sprint 9)