# CLAUDE.md
## ⛔ MANDATORY BEHAVIOR RULES - READ FIRST
**These rules are NON-NEGOTIABLE. Violating them wastes the user's time and money.**
### 1. WHEN USER ASKS YOU TO CHECK SOMETHING - CHECK EVERYTHING
- Search ALL locations, not just where you think it is
- Check cache directories: `~/.claude/plugins/cache/`
- Check installed: `~/.claude/plugins/marketplaces/`
- Check source directories
- **NEVER say "no" or "that's not the issue" without exhaustive verification**
### 2. WHEN USER SAYS SOMETHING IS WRONG - BELIEVE THEM
- The user knows their system better than you
- Investigate thoroughly before disagreeing
- **Your confidence is often wrong. User's instincts are often right.**
### 3. NEVER SAY "DONE" WITHOUT VERIFICATION
- Run the actual command/script to verify
- Show the output to the user
- **"Done" means VERIFIED WORKING, not "I made changes"**
### 4. SHOW EXACTLY WHAT USER ASKS FOR
- If user asks for messages, show the MESSAGES
- If user asks for code, show the CODE
- **Do not interpret or summarize unless asked**
**FAILURE TO FOLLOW THESE RULES = WASTED USER TIME = UNACCEPTABLE**
---
Working context for Claude Code on the Analytics Portfolio project.
---
## Project Status
**Last Completed Sprint**: 9 (Neighbourhood Dashboard Transition)
**Current State**: Ready for deployment sprint or new features
**Branch**: `development` (feature branches merge here)
---
## Quick Reference
### Run Commands
```bash
# Setup & Database
make setup             # Install deps, create .env, init pre-commit
make docker-up         # Start PostgreSQL + PostGIS (auto-detects x86/ARM)
make docker-down       # Stop containers
make docker-logs       # View container logs
make db-init           # Initialize database schema
make db-reset          # Drop and recreate database (DESTRUCTIVE)

# Data Loading
make load-data         # Load all project data (currently: Toronto)
make load-toronto      # Load Toronto data from APIs
make load-toronto-only # Load Toronto data without dbt or seeding
make seed-data         # Seed sample development data

# Application
make run               # Start Dash dev server

# Testing & Quality
make test              # Run pytest
make test-cov          # Run pytest with coverage
make lint              # Run ruff linter
make format            # Run ruff formatter
make typecheck         # Run mypy type checker
make ci                # Run all checks (lint, typecheck, test)

# dbt
make dbt-run           # Run dbt models
make dbt-test          # Run dbt tests
make dbt-docs          # Generate and serve dbt documentation

# Maintenance
make clean             # Remove build artifacts and caches
```
### Branch Workflow
1. Create feature branch FROM `development`: `git checkout -b feature/{sprint}-{description}`
2. Work and commit on feature branch
3. Merge INTO `development` when complete
4. `development` -> `staging` -> `main` for releases
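The workflow above can be sketched end to end in a throwaway repo (the sprint number and branch description here are illustrative, not from the project):

```bash
# Demo in a temporary repo; branch names are hypothetical examples.
cd "$(mktemp -d)"
git init -q -b development
git config user.email dev@example.com
git config user.name dev
git commit -q --allow-empty -m "init"

# 1. Create the feature branch FROM development
git checkout -q -b feature/9-neighbourhood-dashboard

# 2. Work and commit on the feature branch
git commit -q --allow-empty -m "feat(dashboard): add neighbourhood view"

# 3. Merge INTO development when complete
git checkout -q development
git merge -q --no-ff -m "merge feature/9-neighbourhood-dashboard" \
    feature/9-neighbourhood-dashboard
git log --oneline --graph
```

`--no-ff` keeps an explicit merge commit, so the feature branch's history stays visible in `development`.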
---
## Code Conventions
### Import Style
| Context | Style | Example |
|---------|-------|---------|
| Same directory | Single dot | `from .neighbourhood import NeighbourhoodRecord` |
| Sibling directory | Double dot | `from ..schemas.neighbourhood import CensusRecord` |
| External packages | Absolute | `import pandas as pd` |
### Module Responsibilities
| Directory | Contains | Purpose |
|-----------|----------|---------|
| `schemas/` | Pydantic models | Data validation |
| `models/` | SQLAlchemy ORM | Database persistence |
| `parsers/` | API/CSV extraction | Raw data ingestion |
| `loaders/` | Database operations | Data loading |
| `services/` | Query functions | dbt mart queries, business logic |
| `figures/` | Chart factories | Plotly figure generation |
| `callbacks/` | Dash callbacks | In `pages/{dashboard}/callbacks/` |
| `errors/` | Exception classes | Custom exceptions |
| `utils/` | Helper modules | Markdown loading, shared utilities |
### Type Hints
Use Python 3.10+ style:
```python
def process(items: list[str], config: dict[str, int] | None = None) -> bool:
    ...
```
### Error Handling
```python
# errors/exceptions.py
class PortfolioError(Exception):
    """Base exception."""

class ParseError(PortfolioError):
    """PDF/CSV parsing failed."""

class ValidationError(PortfolioError):
    """Pydantic or business rule validation failed."""

class LoadError(PortfolioError):
    """Database load operation failed."""
```
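A sketch of how callers can use this hierarchy — catching the base class handles any pipeline failure uniformly. The `load_rentals` helper is hypothetical, invented only to show the pattern:

```python
class PortfolioError(Exception):
    """Base exception."""

class ParseError(PortfolioError):
    """PDF/CSV parsing failed."""

class LoadError(PortfolioError):
    """Database load operation failed."""

def load_rentals(path: str) -> int:
    """Hypothetical loader: always fails, to demonstrate the hierarchy."""
    if not path.endswith(".csv"):
        raise ParseError(f"expected a CSV file, got: {path}")
    raise LoadError(f"could not load {path}")

# One except clause covers both failure modes.
try:
    load_rentals("rentals.csv")
except PortfolioError as exc:
    print(f"pipeline failed: {exc}")
```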
### Code Standards
- Single responsibility functions with verb naming
- Early returns over deep nesting
- Google-style docstrings only for non-obvious behavior
- Module-level constants for magic values
- Pydantic BaseSettings for runtime config
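A minimal illustration of the first four standards together (the function, rules, and data are invented for the example):

```python
# Module-level constant instead of a magic value
MIN_POPULATION = 1

def filter_neighbourhoods(rows: list[dict]) -> list[dict]:
    """Keep rows with a name and a plausible population (illustrative rules)."""
    kept = []
    for row in rows:
        # Early continues instead of nested conditionals
        if not row.get("name"):
            continue
        if row.get("population", 0) < MIN_POPULATION:
            continue
        kept.append(row)
    return kept

rows = [
    {"name": "Annex", "population": 30000},
    {"name": "", "population": 500},
    {"name": "Ghost", "population": 0},
]
print(filter_neighbourhoods(rows))
```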
---
## Application Structure
**Entry Point:** `portfolio_app/app.py` (Dash app factory with Pages routing)
| Directory | Purpose | Notes |
|-----------|---------|-------|
| `pages/` | Dash Pages (file-based routing) | URLs match file paths |
| `pages/toronto/` | Toronto Dashboard | `tabs/` for layouts, `callbacks/` for interactions |
| `components/` | Shared UI components | metric_card, sidebar, map_controls, time_slider |
| `figures/toronto/` | Toronto chart factories | choropleth, bar_charts, scatter, radar, time_series |
| `toronto/` | Toronto data logic | parsers/, loaders/, schemas/, models/ |
| `content/blog/` | Markdown blog articles | Processed by `utils/markdown_loader.py` |
| `notebooks/toronto/` | Toronto documentation | 5 domains: overview, housing, safety, demographics, amenities |
**Key URLs:** `/` (home), `/toronto` (dashboard), `/blog` (listing), `/blog/{slug}` (articles)
### Multi-Dashboard Architecture
The codebase is structured to support multiple dashboard projects:
- **figures/**: Domain-namespaced figure factories (`figures/toronto/`, future: `figures/football/`)
- **notebooks/**: Domain-namespaced documentation (`notebooks/toronto/`, future: `notebooks/football/`)
- **dbt models**: Domain subdirectories (`staging/toronto/`, `marts/toronto/`)
- **Database schemas**: Domain-specific raw data (`raw_toronto`, future: `raw_football`)
---
## Tech Stack (Locked)
| Layer | Technology | Version |
|-------|------------|---------|
| Database | PostgreSQL + PostGIS | 16.x |
| Validation | Pydantic | >=2.0 |
| ORM | SQLAlchemy | >=2.0 (2.0-style API only) |
| Transformation | dbt-postgres | >=1.7 |
| Data Processing | Pandas | >=2.1 |
| Geospatial | GeoPandas + Shapely | >=0.14 |
| Visualization | Dash + Plotly | >=2.14 |
| UI Components | dash-mantine-components | Latest stable |
| Testing | pytest | >=7.0 |
| Python | 3.11+ | Via pyenv |
**Notes**:
- SQLAlchemy 2.0 + Pydantic 2.0 only (never mix 1.x APIs)
- PostGIS extension required in database
- Docker Compose V2 format (no `version` field)
- **Multi-architecture support**: `make docker-up` auto-detects CPU architecture and uses the appropriate PostGIS image (x86_64: `postgis/postgis`, ARM64: `imresamu/postgis`)
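The architecture detection could look roughly like this — a sketch of what `make docker-up` does, not the actual Makefile logic:

```bash
# Pick a PostGIS image based on CPU architecture
arch=$(uname -m)
case "$arch" in
  arm64|aarch64) POSTGIS_IMAGE="imresamu/postgis" ;;
  *)             POSTGIS_IMAGE="postgis/postgis" ;;
esac
echo "using image: $POSTGIS_IMAGE"
```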
---
## Data Model Overview
### Database Schemas
| Schema | Purpose |
|--------|---------|
| `public` | Shared dimensions (dim_time) |
| `raw_toronto` | Toronto-specific raw/dimension tables |
| `stg_toronto` | Toronto dbt staging views |
| `int_toronto` | Toronto dbt intermediate views |
| `mart_toronto` | Toronto dbt mart tables |
### Geographic Reality (Toronto Housing)
```
City Neighbourhoods (158) - Primary geographic unit for analysis
CMHC Zones (~20) - Rental data (Census Tract aligned)
```
### Star Schema (raw_toronto)
| Table | Type | Keys |
|-------|------|------|
| `fact_rentals` | Fact | -> dim_time, dim_cmhc_zone |
| `dim_time` | Dimension (public) | date_key (PK) - shared |
| `dim_cmhc_zone` | Dimension | zone_key (PK), geometry |
| `dim_neighbourhood` | Dimension | neighbourhood_id (PK), geometry |
| `dim_policy_event` | Dimension | event_id (PK) |
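A typical star-schema query joins the fact table to its dimensions on the surrogate keys. The sketch below uses in-memory SQLite standing in for PostgreSQL, with heavily simplified columns and made-up sample values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time (date_key INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE dim_cmhc_zone (zone_key INTEGER PRIMARY KEY, zone_name TEXT);
CREATE TABLE fact_rentals (
    date_key INTEGER REFERENCES dim_time(date_key),
    zone_key INTEGER REFERENCES dim_cmhc_zone(zone_key),
    avg_rent REAL
);
INSERT INTO dim_time VALUES (20211001, 2021);
INSERT INTO dim_cmhc_zone VALUES (1, 'Downtown');
INSERT INTO fact_rentals VALUES (20211001, 1, 2100.0);
""")

# Fact joined to both dimensions via the shared key columns
row = conn.execute("""
    SELECT t.year, z.zone_name, f.avg_rent
    FROM fact_rentals f
    JOIN dim_time t USING (date_key)
    JOIN dim_cmhc_zone z USING (zone_key)
""").fetchone()
print(row)
```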
### dbt Project: `portfolio`
**Model Structure:**
```
dbt/models/
├── shared/                 # Cross-domain dimensions
│   └── stg_dimensions__time.sql
├── staging/toronto/        # Toronto staging models
├── intermediate/toronto/   # Toronto intermediate models
└── marts/toronto/          # Toronto mart tables
```
| Layer | Naming | Purpose |
|-------|--------|---------|
| Shared | `stg_dimensions__*` | Cross-domain dimensions |
| Staging | `stg_{source}__{entity}` | 1:1 source, cleaned, typed |
| Intermediate | `int_{domain}__{transform}` | Business logic |
| Marts | `mart_{domain}` | Final analytical tables |
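Routing models into `stg_toronto`/`int_toronto`/`mart_toronto` rather than dbt's default `{target_schema}_{custom_schema}` concatenation relies on overriding `generate_schema_name`. The standard dbt pattern for using custom schema names verbatim looks like this — a sketch, the project's actual macro may differ:

```sql
-- dbt/macros/generate_schema_name.sql (illustrative)
{% macro generate_schema_name(custom_schema_name, node) -%}
    {%- if custom_schema_name is none -%}
        {{ target.schema }}
    {%- else -%}
        {{ custom_schema_name | trim }}
    {%- endif -%}
{%- endmacro %}
```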
---
## Deferred Features
**Stop and flag if a task seems to require these**:
| Feature | Reason |
|---------|--------|
| Historical boundary reconciliation (140->158) | 2021+ data only for V1 |
| ML prediction models | Energy project scope (future phase) |
---
## Environment Variables
Required in `.env`:
```bash
DATABASE_URL=postgresql://user:pass@localhost:5432/portfolio
POSTGRES_USER=portfolio
POSTGRES_PASSWORD=<secure>
POSTGRES_DB=portfolio
DASH_DEBUG=true
SECRET_KEY=<random>
LOG_LEVEL=INFO
```
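Code Conventions above says runtime config goes through Pydantic `BaseSettings`; as a dependency-free sketch of the same idea, the variables can be read with sensible fallbacks like this (the `get_settings` helper is hypothetical):

```python
import os

def get_settings() -> dict[str, str]:
    """Simplified stand-in for a BaseSettings class; names match .env above."""
    return {
        "database_url": os.environ.get(
            "DATABASE_URL", "postgresql://user:pass@localhost:5432/portfolio"
        ),
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
        "dash_debug": os.environ.get("DASH_DEBUG", "false"),
    }

os.environ["LOG_LEVEL"] = "DEBUG"
print(get_settings()["log_level"])
```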
---
## Script Standards
All scripts in `scripts/`:
- Include usage comments at top
- Idempotent where possible
- Exit codes: 0 = success, 1 = error
- Use `set -euo pipefail` for bash
- Log to stdout, errors to stderr
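A hypothetical template tying these standards together (not an actual script from `scripts/`):

```bash
#!/usr/bin/env bash
# Usage: process.sh <input-file>
# Illustrative template: usage comment, strict mode, exit codes, stream split.
set -euo pipefail

log() { echo "$*"; }        # normal output to stdout
err() { echo "$*" >&2; }    # errors to stderr

process() {
  if [ "$#" -lt 1 ]; then
    err "usage: process.sh <input-file>"
    return 1                # non-zero on error
  fi
  log "processing: $1"      # zero (success) falls out implicitly
}

process "data.csv"
```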
---
## Reference Documents
| Document | Location | Use When |
|----------|----------|----------|
| Project reference | `docs/PROJECT_REFERENCE.md` | Architecture decisions, completed work |
| Developer guide | `docs/CONTRIBUTING.md` | How to add pages, blog posts, tabs |
| Lessons learned | `docs/project-lessons-learned/INDEX.md` | Past issues and solutions |
| Deployment runbook | `docs/runbooks/deployment.md` | Deploying to staging/production |
| Dashboard runbook | `docs/runbooks/adding-dashboard.md` | Adding new data dashboards |
---
## Projman Plugin Workflow
**CRITICAL: Always use the projman plugin for sprint and task management.**
### When to Use Projman Skills
| Skill | Trigger | Purpose |
|-------|---------|---------|
| `/projman:sprint-plan` | New sprint or phase implementation | Architecture analysis + Gitea issue creation |
| `/projman:sprint-start` | Beginning implementation work | Load lessons learned (Wiki.js or local), start execution |
| `/projman:sprint-status` | Check progress | Review blockers and completion status |
| `/projman:sprint-close` | Sprint completion | Capture lessons learned (Wiki.js or local backup) |
### Default Behavior
When user requests implementation work:
1. **ALWAYS start with `/projman:sprint-plan`** before writing code
2. Create Gitea issues with proper labels and acceptance criteria
3. Use `/projman:sprint-start` to begin execution with lessons learned
4. Track progress via Gitea issue comments
5. Close sprint with `/projman:sprint-close` to document lessons
### Gitea Repository
- **Repo**: `personal-projects/personal-portfolio`
- **Host**: `gitea.hotserv.cloud`
- **SSH**: `ssh://git@hotserv.tailc9b278.ts.net:2222/personal-projects/personal-portfolio.git`
- **Labels**: 18 repository-level labels configured (Type, Priority, Complexity, Effort)
### MCP Tools Available
**Gitea**:
- `list_issues`, `get_issue`, `create_issue`, `update_issue`, `add_comment`
- `get_labels`, `suggest_labels`
**Wiki.js**:
- `search_lessons`, `create_lesson`, `search_pages`, `get_page`
### Lessons Learned (Backup Method)
**When Wiki.js is unavailable**, use the local backup in `docs/project-lessons-learned/`:
**At Sprint Start:**
1. Review `docs/project-lessons-learned/INDEX.md` for relevant past lessons
2. Search lesson files by tags/keywords before implementation
3. Apply prevention strategies from applicable lessons
**At Sprint Close:**
1. Try Wiki.js `create_lesson` first
2. If Wiki.js fails, create lesson in `docs/project-lessons-learned/`
3. Use naming convention: `{phase-or-sprint}-{short-description}.md`
4. Update `INDEX.md` with new entry
5. Follow the lesson template in INDEX.md
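The naming convention in step 3 can be mechanized with a tiny slug helper (hypothetical, not part of the repo):

```python
import re

def lesson_filename(phase: str, description: str) -> str:
    """Build '{phase-or-sprint}-{short-description}.md' from free text."""
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{phase}-{slug}.md"

print(lesson_filename("sprint-9", "Schema Migration Drift"))
```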
**Migration:** Once Wiki.js is configured, lessons will be migrated there for better searchability.
### Issue Structure
Every Gitea issue should include:
- **Overview**: Brief description
- **Files to Create/Modify**: Explicit paths
- **Acceptance Criteria**: Checkboxes
- **Technical Notes**: Implementation hints
- **Labels**: Listed in body (workaround for label API issues)
---
## Other Available Plugins
### Code Quality: code-sentinel
Use for security scanning and refactoring analysis.
| Command | Purpose |
|---------|---------|
| `/code-sentinel:security-scan` | Full security audit of codebase |
| `/code-sentinel:refactor` | Apply refactoring patterns |
| `/code-sentinel:refactor-dry` | Preview refactoring without applying |
**When to use:** Before major releases, after adding authentication/data handling code, periodic audits.
### Documentation: doc-guardian
Use for documentation drift detection and synchronization.
| Command | Purpose |
|---------|---------|
| `/doc-guardian:doc-audit` | Scan project for documentation drift |
| `/doc-guardian:doc-sync` | Synchronize pending documentation updates |
**When to use:** After significant code changes, before releases, when docs feel stale.
### Pull Requests: pr-review
Use for comprehensive PR review with multiple analysis perspectives.
| Command | Purpose |
|---------|---------|
| `/pr-review:initial-setup` | Configure PR review for this project |
| `/pr-review:project-init` | Quick project-level setup |
**When to use:** Before merging significant PRs to `development` or `main`.
### Git Workflow: git-flow
Use for git operations assistance.
**When to use:** Complex merge scenarios, branch management questions.
---
*Last Updated: February 2026 (Multi-Dashboard Architecture)*