17 Commits

Author SHA1 Message Date
058d058975 Merge branch 'staging' into development
Some checks failed
CI / lint-and-test (push) Has been cancelled
2026-02-02 22:02:57 +00:00
0455ec69a0 Merge branch 'main' into development
Some checks failed
CI / lint-and-test (push) Has been cancelled
2026-02-02 22:02:26 +00:00
9e216962b1 Merge pull request 'refactor: domain-scoped schema migration for application code' (#104) from feature/domain-scoped-schema-migration into development
Some checks failed
CI / lint-and-test (push) Has been cancelled
Reviewed-on: #104
2026-02-02 22:01:48 +00:00
dfa5f92d8a refactor: update app code for domain-scoped schema migration
Some checks failed
CI / lint-and-test (pull_request) Has been cancelled
- Update dbt model references to use new schema naming (stg_toronto, int_toronto, mart_toronto)
- Refactor figure factories to use consistent column naming from new schema
- Update callbacks to work with refactored data structures
- Add centralized design tokens module for consistent styling
- Streamline CLAUDE.md documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 17:00:30 -05:00
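For context, the domain-scoped migration means application code addresses relations as `stg_toronto.*`, `int_toronto.*`, and `mart_toronto.*`. A minimal sketch of a helper that builds such qualified names (hypothetical illustration, not actual repo code):

```python
# Hypothetical helper illustrating the domain-scoped schema convention
# (stg_toronto, int_toronto, mart_toronto); not code from this repository.

LAYER_PREFIXES = {"staging": "stg", "intermediate": "int", "marts": "mart"}


def schema_for(layer: str, domain: str = "toronto") -> str:
    """Return the domain-scoped schema name for a dbt layer."""
    try:
        prefix = LAYER_PREFIXES[layer]
    except KeyError:
        raise ValueError(f"unknown dbt layer: {layer}") from None
    return f"{prefix}_{domain}"


def qualified(layer: str, table: str, domain: str = "toronto") -> str:
    """Return a schema-qualified relation name, e.g. mart_toronto.fact_rentals."""
    return f"{schema_for(layer, domain)}.{table}"


print(qualified("marts", "fact_rentals"))  # mart_toronto.fact_rentals
```

The same two functions would cover a future domain (e.g. `qualified("staging", "stg_matches__results", domain="football")`) without further changes.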
3bd2005c9d Merge pull request 'Merge pull request 'development' (#99) from development into main' (#102) from main into staging
Some checks failed
CI / lint-and-test (push) Has been cancelled
Deploy to Staging / deploy (push) Has been cancelled
Reviewed-on: #102
2026-02-02 17:35:06 +00:00
0c9769fd27 Merge branch 'staging' into main
Some checks failed
CI / lint-and-test (push) Has been cancelled
Deploy to Production / deploy (push) Has been cancelled
CI / lint-and-test (pull_request) Has been cancelled
2026-02-02 17:34:58 +00:00
cb908a18c3 Merge pull request 'Merge pull request 'development' (#98) from development into staging' (#101) from staging into development
Some checks failed
CI / lint-and-test (push) Has been cancelled
Reviewed-on: #101
2026-02-02 17:34:27 +00:00
558022f26e Merge branch 'development' into staging
Some checks failed
CI / lint-and-test (push) Has been cancelled
Deploy to Staging / deploy (push) Has been cancelled
CI / lint-and-test (pull_request) Has been cancelled
2026-02-02 17:34:14 +00:00
9e27fb8011 Merge pull request 'refactor(dbt): migrate to domain-scoped schema names' (#100) from feature/domain-scoped-schema-migration into development
Some checks failed
CI / lint-and-test (push) Has been cancelled
Reviewed-on: #100
2026-02-02 17:33:40 +00:00
cda2a078d9 refactor(dbt): migrate to domain-scoped schema names
Some checks failed
CI / lint-and-test (pull_request) Has been cancelled
- Create generate_schema_name macro to use custom schema names directly
- Update dbt_project.yml schemas: staging→stg_toronto, intermediate→int_toronto, marts→mart_toronto
- Add dbt/macros/toronto/ directory for future domain-specific macros
- Fix documentation drift in PROJECT_REFERENCE.md (load-data-only→load-toronto-only)
- Update DATABASE_SCHEMA.md with new schema names
- Update CLAUDE.md database schemas table
- Update adding-dashboard.md runbook with domain-scoped pattern

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 12:32:39 -05:00
dd8de9810d Merge pull request 'development' (#99) from development into main
Some checks failed
CI / lint-and-test (push) Has been cancelled
Deploy to Production / deploy (push) Has been cancelled
Reviewed-on: #99
2026-02-02 00:39:19 +00:00
56bcc1bb1d Merge branch 'main' into development
Some checks failed
CI / lint-and-test (push) Has been cancelled
2026-02-02 00:39:13 +00:00
ee0a7ef7ad Merge pull request 'development' (#98) from development into staging
Some checks failed
CI / lint-and-test (push) Has been cancelled
Deploy to Staging / deploy (push) Has been cancelled
CI / lint-and-test (pull_request) Has been cancelled
Reviewed-on: #98
2026-02-02 00:19:29 +00:00
fd9850778e Merge branch 'staging' into development
Some checks failed
CI / lint-and-test (push) Has been cancelled
2026-02-02 00:19:24 +00:00
01e98103c7 Merge pull request 'refactor: multi-dashboard structural migration' (#97) from feature/multi-dashboard-structure into development
Some checks failed
CI / lint-and-test (push) Has been cancelled
Reviewed-on: #97
2026-02-02 00:18:45 +00:00
62d1a52eed refactor: multi-dashboard structural migration
Some checks failed
CI / lint-and-test (pull_request) Has been cancelled
- Rename dbt project from toronto_housing to portfolio
- Restructure dbt models into domain subdirectories:
  - shared/ for cross-domain dimensions (dim_time)
  - staging/toronto/, intermediate/toronto/, marts/toronto/
- Update SQLAlchemy models for raw_toronto schema
- Add explicit cross-schema FK relationships for FactRentals
- Namespace figure factories under figures/toronto/
- Namespace notebooks under notebooks/toronto/
- Update Makefile with domain-specific targets and env loading
- Update all documentation for multi-dashboard structure

This enables adding new dashboard projects (e.g., /football, /energy)
without structural conflicts or naming collisions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 19:08:20 -05:00
e37611673f Merge pull request 'staging' (#96) from staging into main
Some checks failed
CI / lint-and-test (push) Has been cancelled
Deploy to Production / deploy (push) Has been cancelled
Reviewed-on: #96
2026-02-01 21:33:12 +00:00
80 changed files with 1748 additions and 1027 deletions

CLAUDE.md

@@ -1,5 +1,48 @@
 # CLAUDE.md
+
+## ⛔ MANDATORY BEHAVIOR RULES - READ FIRST
+
+**These rules are NON-NEGOTIABLE. Violating them wastes the user's time and money.**
+
+### 1. WHEN USER ASKS YOU TO CHECK SOMETHING - CHECK EVERYTHING
+- Search ALL locations, not just where you think it is
+- Check cache directories: `~/.claude/plugins/cache/`
+- Check installed: `~/.claude/plugins/marketplaces/`
+- Check source directories
+- **NEVER say "no" or "that's not the issue" without exhaustive verification**
+
+### 2. WHEN USER SAYS SOMETHING IS WRONG - BELIEVE THEM
+- The user knows their system better than you
+- Investigate thoroughly before disagreeing
+- **Your confidence is often wrong. User's instincts are often right.**
+
+### 3. NEVER SAY "DONE" WITHOUT VERIFICATION
+- Run the actual command/script to verify
+- Show the output to the user
+- **"Done" means VERIFIED WORKING, not "I made changes"**
+
+### 4. SHOW EXACTLY WHAT USER ASKS FOR
+- If user asks for messages, show the MESSAGES
+- If user asks for code, show the CODE
+- **Do not interpret or summarize unless asked**
+
+**FAILURE TO FOLLOW THESE RULES = WASTED USER TIME = UNACCEPTABLE**
+
+---
+
+## Mandatory Behavior Rules
+
+**These rules are NON-NEGOTIABLE. Violating them wastes the user's time and money.**
+
+1. **CHECK EVERYTHING** - Search ALL locations before saying "no" (cache, installed, source directories)
+2. **BELIEVE THE USER** - Investigate thoroughly before disagreeing; user instincts are often right
+3. **VERIFY BEFORE "DONE"** - Run commands, show output; "done" means verified working
+4. **SHOW EXACTLY WHAT'S ASKED** - Do not interpret or summarize unless requested
+
+---
+
 Working context for Claude Code on the Analytics Portfolio project.
 ---
@@ -21,21 +64,18 @@ Working context for Claude Code on the Analytics Portfolio project.
 make setup           # Install deps, create .env, init pre-commit
 make docker-up       # Start PostgreSQL + PostGIS (auto-detects x86/ARM)
 make docker-down     # Stop containers
-make docker-logs     # View container logs
 make db-init         # Initialize database schema
 make db-reset        # Drop and recreate database (DESTRUCTIVE)

 # Data Loading
-make load-data       # Load Toronto data from APIs, seed dev data
-make load-data-only  # Load Toronto data without dbt or seeding
-make seed-data       # Seed sample development data
+make load-data       # Load all project data (currently: Toronto)
+make load-toronto    # Load Toronto data from APIs

 # Application
 make run             # Start Dash dev server

 # Testing & Quality
 make test            # Run pytest
-make test-cov        # Run pytest with coverage
 make lint            # Run ruff linter
 make format          # Run ruff formatter
 make typecheck       # Run mypy type checker
@@ -46,8 +86,7 @@ make dbt-run         # Run dbt models
 make dbt-test        # Run dbt tests
 make dbt-docs        # Generate and serve dbt documentation

-# Maintenance
-make clean           # Remove build artifacts and caches
+# Run `make help` for full target list
 ```

 ### Branch Workflow
@@ -71,50 +110,22 @@ make clean           # Remove build artifacts and caches
 ### Module Responsibilities
-| Directory | Contains | Purpose |
-|-----------|----------|---------|
-| `schemas/` | Pydantic models | Data validation |
-| `models/` | SQLAlchemy ORM | Database persistence |
-| `parsers/` | API/CSV extraction | Raw data ingestion |
-| `loaders/` | Database operations | Data loading |
-| `services/` | Query functions | dbt mart queries, business logic |
-| `figures/` | Chart factories | Plotly figure generation |
-| `callbacks/` | Dash callbacks | In `pages/{dashboard}/callbacks/` |
-| `errors/` | Exception classes | Custom exceptions |
-| `utils/` | Helper modules | Markdown loading, shared utilities |
+| Directory | Purpose |
+|-----------|---------|
+| `schemas/` | Pydantic models for data validation |
+| `models/` | SQLAlchemy ORM for database persistence |
+| `parsers/` | API/CSV extraction for raw data ingestion |
+| `loaders/` | Database operations for data loading |
+| `services/` | Query functions for dbt mart queries |
+| `figures/` | Chart factories for Plotly figure generation |
+| `errors/` | Custom exception classes (see `errors/exceptions.py`) |

-### Type Hints
-Use Python 3.10+ style:
-```python
-def process(items: list[str], config: dict[str, int] | None = None) -> bool:
-    ...
-```
-
-### Error Handling
-```python
-# errors/exceptions.py
-class PortfolioError(Exception):
-    """Base exception."""
-
-class ParseError(PortfolioError):
-    """PDF/CSV parsing failed."""
-
-class ValidationError(PortfolioError):
-    """Pydantic or business rule validation failed."""
-
-class LoadError(PortfolioError):
-    """Database load operation failed."""
-```
-
 ### Code Standards
+- Python 3.10+ type hints: `list[str]`, `dict[str, int] | None`
 - Single responsibility functions with verb naming
 - Early returns over deep nesting
 - Google-style docstrings only for non-obvious behavior
-- Module-level constants for magic values
-- Pydantic BaseSettings for runtime config
 ---
@@ -122,17 +133,21 @@ class LoadError(PortfolioError):
 **Entry Point:** `portfolio_app/app.py` (Dash app factory with Pages routing)
-| Directory | Purpose | Notes |
-|-----------|---------|-------|
-| `pages/` | Dash Pages (file-based routing) | URLs match file paths |
-| `pages/toronto/` | Toronto Dashboard | `tabs/` for layouts, `callbacks/` for interactions |
-| `components/` | Shared UI components | metric_card, sidebar, map_controls, time_slider |
-| `figures/` | Plotly chart factories | choropleth, bar_charts, scatter, radar, time_series |
-| `toronto/` | Toronto data logic | parsers/, loaders/, schemas/, models/ |
-| `content/blog/` | Markdown blog articles | Processed by `utils/markdown_loader.py` |
-| `notebooks/` | Data documentation | 5 domains: overview, housing, safety, demographics, amenities |
+| Directory | Purpose |
+|-----------|---------|
+| `pages/` | Dash Pages (file-based routing) |
+| `pages/toronto/` | Toronto Dashboard (`tabs/` for layouts, `callbacks/` for interactions) |
+| `components/` | Shared UI components |
+| `figures/toronto/` | Toronto chart factories |
+| `toronto/` | Toronto data logic (parsers, loaders, schemas, models) |

-**Key URLs:** `/` (home), `/toronto` (dashboard), `/blog` (listing), `/blog/{slug}` (articles)
+**Key URLs:** `/` (home), `/toronto` (dashboard), `/blog` (listing), `/blog/{slug}` (articles), `/health` (status)
+
+### Multi-Dashboard Architecture
+- **figures/**: Domain-namespaced (`figures/toronto/`, future: `figures/football/`)
+- **dbt models**: Domain subdirectories (`staging/toronto/`, `marts/toronto/`)
+- **Database schemas**: Domain-specific raw data (`raw_toronto`, future: `raw_football`)
 ---
@@ -144,44 +159,31 @@ class LoadError(PortfolioError):
 | Validation | Pydantic | >=2.0 |
 | ORM | SQLAlchemy | >=2.0 (2.0-style API only) |
 | Transformation | dbt-postgres | >=1.7 |
-| Data Processing | Pandas | >=2.1 |
+| Visualization | Dash + Plotly + dash-mantine-components | >=2.14 |
 | Geospatial | GeoPandas + Shapely | >=0.14 |
-| Visualization | Dash + Plotly | >=2.14 |
-| UI Components | dash-mantine-components | Latest stable |
-| Testing | pytest | >=7.0 |
 | Python | 3.11+ | Via pyenv |

-**Notes**:
-- SQLAlchemy 2.0 + Pydantic 2.0 only (never mix 1.x APIs)
-- PostGIS extension required in database
-- Docker Compose V2 format (no `version` field)
-- **Multi-architecture support**: `make docker-up` auto-detects CPU architecture and uses the appropriate PostGIS image (x86_64: `postgis/postgis`, ARM64: `imresamu/postgis`)
+**Notes**: SQLAlchemy 2.0 + Pydantic 2.0 only. Docker Compose V2 format (no `version` field).
 ---

 ## Data Model Overview
-### Geographic Reality (Toronto Housing)
-```
-City Neighbourhoods (158)  - Primary geographic unit for analysis
-CMHC Zones (~20)           - Rental data (Census Tract aligned)
-```
+### Database Schemas
+| Schema | Purpose |
+|--------|---------|
+| `public` | Shared dimensions (dim_time) |
+| `raw_toronto` | Toronto-specific raw/dimension tables |
+| `stg_toronto` | Toronto dbt staging views |
+| `int_toronto` | Toronto dbt intermediate views |
+| `mart_toronto` | Toronto dbt mart tables |

-### Star Schema
-| Table | Type | Keys |
-|-------|------|------|
-| `fact_rentals` | Fact | -> dim_time, dim_cmhc_zone |
-| `dim_time` | Dimension | date_key (PK) |
-| `dim_cmhc_zone` | Dimension | zone_key (PK), geometry |
-| `dim_neighbourhood` | Dimension | neighbourhood_id (PK), geometry |
-| `dim_policy_event` | Dimension | event_id (PK) |
+### dbt Project: `portfolio`

-### dbt Layers
 | Layer | Naming | Purpose |
 |-------|--------|---------|
+| Shared | `stg_dimensions__*` | Cross-domain dimensions |
 | Staging | `stg_{source}__{entity}` | 1:1 source, cleaned, typed |
 | Intermediate | `int_{domain}__{transform}` | Business logic |
 | Marts | `mart_{domain}` | Final analytical tables |
@@ -190,13 +192,12 @@ CMHC Zones (~20)           - Rental data (Census Tract aligned)
 ## Deferred Features
-**Stop and flag if a task seems to require these**:
+**Stop and flag if a task requires these**:
 | Feature | Reason |
 |---------|--------|
 | Historical boundary reconciliation (140->158) | 2021+ data only for V1 |
 | ML prediction models | Energy project scope (future phase) |
-| Multi-project shared infrastructure | Build first, abstract second |
 ---
@@ -216,139 +217,123 @@ LOG_LEVEL=INFO
 ---

-## Script Standards
-All scripts in `scripts/`:
-- Include usage comments at top
-- Idempotent where possible
-- Exit codes: 0 = success, 1 = error
-- Use `set -euo pipefail` for bash
-- Log to stdout, errors to stderr
-
----

 ## Reference Documents
 | Document | Location | Use When |
 |----------|----------|----------|
-| Project reference | `docs/PROJECT_REFERENCE.md` | Architecture decisions, completed work |
-| Developer guide | `docs/CONTRIBUTING.md` | How to add pages, blog posts, tabs |
+| Project reference | `docs/PROJECT_REFERENCE.md` | Architecture decisions |
+| Developer guide | `docs/CONTRIBUTING.md` | How to add pages, tabs |
 | Lessons learned | `docs/project-lessons-learned/INDEX.md` | Past issues and solutions |
-| Deployment runbook | `docs/runbooks/deployment.md` | Deploying to staging/production |
+| Deployment runbook | `docs/runbooks/deployment.md` | Deploying to environments |
+| Dashboard runbook | `docs/runbooks/adding-dashboard.md` | Adding new data dashboards |
 ---

-## Projman Plugin Workflow
-**CRITICAL: Always use the projman plugin for sprint and task management.**
-### When to Use Projman Skills
+## Plugin Reference
+### Sprint Management: projman
+**CRITICAL: Always use projman for sprint and task management.**
 | Skill | Trigger | Purpose |
 |-------|---------|---------|
-| `/projman:sprint-plan` | New sprint or phase implementation | Architecture analysis + Gitea issue creation |
-| `/projman:sprint-start` | Beginning implementation work | Load lessons learned (Wiki.js or local), start execution |
-| `/projman:sprint-status` | Check progress | Review blockers and completion status |
-| `/projman:sprint-close` | Sprint completion | Capture lessons learned (Wiki.js or local backup) |
+| `/projman:sprint-plan` | New sprint/feature | Architecture analysis + Gitea issue creation |
+| `/projman:sprint-start` | Begin implementation | Load lessons learned, start execution |
+| `/projman:sprint-status` | Check progress | Review blockers and completion |
+| `/projman:sprint-close` | Sprint completion | Capture lessons learned |

-### Default Behavior
-When user requests implementation work:
-1. **ALWAYS start with `/projman:sprint-plan`** before writing code
-2. Create Gitea issues with proper labels and acceptance criteria
-3. Use `/projman:sprint-start` to begin execution with lessons learned
-4. Track progress via Gitea issue comments
-5. Close sprint with `/projman:sprint-close` to document lessons
+**Default workflow**: `/projman:sprint-plan` before code -> create issues -> `/projman:sprint-start` -> track via Gitea -> `/projman:sprint-close`
+**Gitea**: `personal-projects/personal-portfolio` at `gitea.hotserv.cloud`

-### Gitea Repository
-- **Repo**: `personal-projects/personal-portfolio`
-- **Host**: `gitea.hotserv.cloud`
-- **SSH**: `ssh://git@hotserv.tailc9b278.ts.net:2222/personal-projects/personal-portfolio.git`
-- **Labels**: 18 repository-level labels configured (Type, Priority, Complexity, Effort)
-### MCP Tools Available
-**Gitea**:
-- `list_issues`, `get_issue`, `create_issue`, `update_issue`, `add_comment`
-- `get_labels`, `suggest_labels`
-**Wiki.js**:
-- `search_lessons`, `create_lesson`, `search_pages`, `get_page`
-### Lessons Learned (Backup Method)
-**When Wiki.js is unavailable**, use the local backup in `docs/project-lessons-learned/`:
-**At Sprint Start:**
-1. Review `docs/project-lessons-learned/INDEX.md` for relevant past lessons
-2. Search lesson files by tags/keywords before implementation
-3. Apply prevention strategies from applicable lessons
-**At Sprint Close:**
-1. Try Wiki.js `create_lesson` first
-2. If Wiki.js fails, create lesson in `docs/project-lessons-learned/`
-3. Use naming convention: `{phase-or-sprint}-{short-description}.md`
-4. Update `INDEX.md` with new entry
-5. Follow the lesson template in INDEX.md
-**Migration:** Once Wiki.js is configured, lessons will be migrated there for better searchability.
-### Issue Structure
-Every Gitea issue should include:
-- **Overview**: Brief description
-- **Files to Create/Modify**: Explicit paths
-- **Acceptance Criteria**: Checkboxes
-- **Technical Notes**: Implementation hints
-- **Labels**: Listed in body (workaround for label API issues)
-
----
-
-## Other Available Plugins
+### Data Platform: data-platform
+Use for dbt, PostgreSQL, and PostGIS operations.
+| Skill | Purpose |
+|-------|---------|
+| `/data-platform:data-review` | Audit data integrity, schema validity, dbt compliance |
+| `/data-platform:data-gate` | CI/CD data quality gate (pass/fail) |
+**When to use:** Schema changes, dbt model development, data loading, before merging data PRs.
+**MCP tools available:** `pg_connect`, `pg_query`, `pg_tables`, `pg_columns`, `pg_schemas`, `st_*` (PostGIS), `dbt_*` operations.
+
+### Visualization: viz-platform
+Use for Dash/Mantine component validation and chart creation.
+| Skill | Purpose |
+|-------|---------|
+| `/viz-platform:component` | Inspect DMC component props and validation |
+| `/viz-platform:chart` | Create themed Plotly charts |
+| `/viz-platform:theme` | Apply/validate themes |
+| `/viz-platform:dashboard` | Create dashboard layouts |
+**When to use:** Dashboard development, new visualizations, component prop lookup.

 ### Code Quality: code-sentinel
 Use for security scanning and refactoring analysis.
-| Command | Purpose |
-|---------|---------|
+| Skill | Purpose |
+|-------|---------|
 | `/code-sentinel:security-scan` | Full security audit of codebase |
 | `/code-sentinel:refactor` | Apply refactoring patterns |
 | `/code-sentinel:refactor-dry` | Preview refactoring without applying |
-**When to use:** Before major releases, after adding authentication/data handling code, periodic audits.
+**When to use:** Before major releases, after adding auth/data handling code, periodic audits.

 ### Documentation: doc-guardian
 Use for documentation drift detection and synchronization.
-| Command | Purpose |
-|---------|---------|
+| Skill | Purpose |
+|-------|---------|
 | `/doc-guardian:doc-audit` | Scan project for documentation drift |
 | `/doc-guardian:doc-sync` | Synchronize pending documentation updates |
-**When to use:** After significant code changes, before releases, when docs feel stale.
+**When to use:** After significant code changes, before releases.

 ### Pull Requests: pr-review
 Use for comprehensive PR review with multiple analysis perspectives.
-| Command | Purpose |
-|---------|---------|
-| `/pr-review:initial-setup` | Configure PR review for this project |
-| `/pr-review:project-init` | Quick project-level setup |
+| Skill | Purpose |
+|-------|---------|
+| `/pr-review:initial-setup` | Configure PR review for project |
+| Triggered automatically | Security, performance, maintainability, test analysis |
 **When to use:** Before merging significant PRs to `development` or `main`.

+### Requirement Clarification: clarity-assist
+Use when requirements are ambiguous or need decomposition.
+**When to use:** Unclear specifications, complex feature requests, conflicting requirements.
+
+### Contract Validation: contract-validator
+Use for plugin interface validation.
+| Skill | Purpose |
+|-------|---------|
+| `/contract-validator:agent-check` | Quick agent definition validation |
+| `/contract-validator:full-validation` | Full plugin contract validation |
+**When to use:** When modifying plugin integrations or agent definitions.

 ### Git Workflow: git-flow
-Use for git operations assistance.
-**When to use:** Complex merge scenarios, branch management questions.
+Use for standardized git operations.
+| Skill | Purpose |
+|-------|---------|
+| `/git-flow:commit` | Auto-generated conventional commit |
+| `/git-flow:branch-start` | Create feature/fix/chore branch |
+| `/git-flow:git-status` | Comprehensive status with recommendations |
+**When to use:** Complex merge scenarios, branch management, standardized commits.
 ---
-*Last Updated: January 2026 (Post-Sprint 9)*
+*Last Updated: February 2026*

Makefile

@@ -1,11 +1,12 @@
.PHONY: setup docker-up docker-down db-init load-data seed-data run test dbt-run dbt-test lint format ci deploy clean help logs run-detached etl-toronto .PHONY: setup docker-up docker-down db-init load-data load-all load-toronto load-toronto-only seed-data run test dbt-run dbt-test lint format ci deploy clean help logs run-detached etl-toronto
# Default target # Default target
.DEFAULT_GOAL := help .DEFAULT_GOAL := help
# Environment # Environment
PYTHON := python3 VENV := .venv
PIP := pip PYTHON := $(VENV)/bin/python3
PIP := $(VENV)/bin/pip
DOCKER_COMPOSE := docker compose DOCKER_COMPOSE := docker compose
# Architecture detection for Docker images # Architecture detection for Docker images
@@ -79,16 +80,23 @@ db-reset: ## Drop and recreate database (DESTRUCTIVE)
@sleep 3 @sleep 3
$(MAKE) db-init $(MAKE) db-init
load-data: ## Load Toronto data from APIs, seed dev data, run dbt # Domain-specific data loading
load-toronto: ## Load Toronto data from APIs
@echo "$(GREEN)Loading Toronto neighbourhood data...$(NC)" @echo "$(GREEN)Loading Toronto neighbourhood data...$(NC)"
$(PYTHON) scripts/data/load_toronto_data.py $(PYTHON) scripts/data/load_toronto_data.py
@echo "$(GREEN)Seeding development data...$(NC)" @echo "$(GREEN)Seeding Toronto development data...$(NC)"
$(PYTHON) scripts/data/seed_amenity_data.py $(PYTHON) scripts/data/seed_amenity_data.py
load-data-only: ## Load Toronto data without running dbt or seeding load-toronto-only: ## Load Toronto data without running dbt or seeding
@echo "$(GREEN)Loading Toronto data (skip dbt)...$(NC)" @echo "$(GREEN)Loading Toronto data (skip dbt)...$(NC)"
$(PYTHON) scripts/data/load_toronto_data.py --skip-dbt $(PYTHON) scripts/data/load_toronto_data.py --skip-dbt
# Aggregate data loading
load-data: load-toronto ## Load all project data (currently: Toronto)
@echo "$(GREEN)All data loaded!$(NC)"
load-all: load-data ## Alias for load-data
seed-data: ## Seed sample development data (amenities, median_age) seed-data: ## Seed sample development data (amenities, median_age)
@echo "$(GREEN)Seeding development data...$(NC)" @echo "$(GREEN)Seeding development data...$(NC)"
$(PYTHON) scripts/data/seed_amenity_data.py $(PYTHON) scripts/data/seed_amenity_data.py
@@ -119,15 +127,15 @@ test-cov: ## Run pytest with coverage
dbt-run: ## Run dbt models dbt-run: ## Run dbt models
@echo "$(GREEN)Running dbt models...$(NC)" @echo "$(GREEN)Running dbt models...$(NC)"
cd dbt && dbt run --profiles-dir . @set -a && . ./.env && set +a && cd dbt && dbt run --profiles-dir .
dbt-test: ## Run dbt tests dbt-test: ## Run dbt tests
@echo "$(GREEN)Running dbt tests...$(NC)" @echo "$(GREEN)Running dbt tests...$(NC)"
cd dbt && dbt test --profiles-dir . @set -a && . ./.env && set +a && cd dbt && dbt test --profiles-dir .
dbt-docs: ## Generate dbt documentation dbt-docs: ## Generate dbt documentation
@echo "$(GREEN)Generating dbt docs...$(NC)" @echo "$(GREEN)Generating dbt docs...$(NC)"
cd dbt && dbt docs generate --profiles-dir . && dbt docs serve --profiles-dir . @set -a && . ./.env && set +a && cd dbt && dbt docs generate --profiles-dir . && dbt docs serve --profiles-dir .
# ============================================================================= # =============================================================================
# Code Quality # Code Quality

docs/PROJECT_REFERENCE.md

@@ -115,28 +115,31 @@ portfolio_app/
 │   ├── tabs/            # Tab layouts (5)
 │   └── callbacks/       # Interaction logic
 ├── components/          # Shared UI components
-├── figures/             # Plotly figure factories
+├── figures/
+│   └── toronto/         # Toronto figure factories
 ├── content/
 │   └── blog/            # Markdown blog articles
 ├── toronto/             # Toronto data logic
 │   ├── parsers/         # API data extraction
 │   ├── loaders/         # Database operations
 │   ├── schemas/         # Pydantic models
-│   └── models/          # SQLAlchemy ORM
+│   └── models/          # SQLAlchemy ORM (raw_toronto schema)
 └── errors/              # Exception handling

-dbt/
+dbt/                     # dbt project: portfolio
 ├── models/
-│   ├── staging/         # 1:1 source tables
-│   ├── intermediate/    # Business logic
-│   └── marts/           # Analytical tables
+│   ├── shared/                # Cross-domain dimensions
+│   ├── staging/toronto/       # Toronto staging models
+│   ├── intermediate/toronto/  # Toronto intermediate models
+│   └── marts/toronto/         # Toronto analytical tables

-notebooks/               # Data documentation (15 notebooks)
-├── overview/            # Overview tab visualizations
-├── housing/             # Housing tab visualizations
-├── safety/              # Safety tab visualizations
-├── demographics/        # Demographics tab visualizations
-└── amenities/           # Amenities tab visualizations
+notebooks/
+└── toronto/             # Toronto documentation (15 notebooks)
+    ├── overview/        # Overview tab visualizations
+    ├── housing/         # Housing tab visualizations
+    ├── safety/          # Safety tab visualizations
+    ├── demographics/    # Demographics tab visualizations
+    └── amenities/       # Amenities tab visualizations

 docs/
 ├── PROJECT_REFERENCE.md # Architecture reference

dbt_project.yml

@@ -1,8 +1,7 @@
-name: 'toronto_housing'
-version: '1.0.0'
+name: 'portfolio'
 config-version: 2
-profile: 'toronto_housing'
+profile: 'portfolio'

 model-paths: ["models"]
 analysis-paths: ["analyses"]
@@ -16,13 +15,19 @@ clean-targets:
   - "dbt_packages"

 models:
-  toronto_housing:
+  portfolio:
+    shared:
+      +materialized: view
+      +schema: shared
     staging:
-      +materialized: view
-      +schema: staging
+      toronto:
+        +materialized: view
+        +schema: stg_toronto
     intermediate:
-      +materialized: view
-      +schema: intermediate
+      toronto:
+        +materialized: view
+        +schema: int_toronto
     marts:
-      +materialized: table
-      +schema: marts
+      toronto:
+        +materialized: table
+        +schema: mart_toronto

generate_schema_name macro (new file under dbt/macros/)

@@ -0,0 +1,11 @@
-- Override dbt default schema name generation.
-- Use the custom schema name directly instead of
-- concatenating with the target schema.
-- See: https://docs.getdbt.com/docs/build/custom-schemas
{% macro generate_schema_name(custom_schema_name, node) %}
{%- if custom_schema_name is none -%}
{{ target.schema }}
{%- else -%}
{{ custom_schema_name | trim }}
{%- endif -%}
{% endmacro %}

dbt model schema tests (schema.yml)

@@ -5,11 +5,11 @@ models:
     description: "Rental data enriched with time and zone dimensions"
     columns:
       - name: rental_id
-        tests:
+        data_tests:
           - unique
           - not_null
       - name: zone_code
-        tests:
+        data_tests:
           - not_null

   - name: int_neighbourhood__demographics
@@ -17,11 +17,11 @@ models:
     columns:
       - name: neighbourhood_id
         description: "Neighbourhood identifier"
-        tests:
+        data_tests:
           - not_null
       - name: census_year
         description: "Census year"
-        tests:
+        data_tests:
           - not_null
       - name: income_quintile
         description: "Income quintile (1-5, city-wide)"
@@ -31,7 +31,7 @@ models:
     columns:
       - name: neighbourhood_id
         description: "Neighbourhood identifier"
-        tests:
+        data_tests:
           - not_null
       - name: year
         description: "Reference year"
@@ -45,11 +45,11 @@ models:
     columns:
       - name: neighbourhood_id
         description: "Neighbourhood identifier"
-        tests:
+        data_tests:
           - not_null
       - name: year
         description: "Statistics year"
-        tests:
+        data_tests:
           - not_null
       - name: crime_rate_per_100k
         description: "Total crime rate per 100K population"
@@ -61,7 +61,7 @@ models:
     columns:
       - name: neighbourhood_id
         description: "Neighbourhood identifier"
-        tests:
+        data_tests:
           - not_null
       - name: year
         description: "Reference year"
@@ -75,11 +75,11 @@ models:
     columns:
       - name: neighbourhood_id
         description: "Neighbourhood identifier"
-        tests:
+        data_tests:
           - not_null
       - name: year
         description: "Survey year"
-        tests:
+        data_tests:
           - not_null
       - name: avg_rent_2bed
         description: "Weighted average 2-bedroom rent"

View File

@@ -16,12 +16,12 @@ crime_by_year as (
     neighbourhood_id,
     crime_year as year,
     sum(incident_count) as total_incidents,
-    sum(case when crime_type = 'Assault' then incident_count else 0 end) as assault_count,
-    sum(case when crime_type = 'Auto Theft' then incident_count else 0 end) as auto_theft_count,
-    sum(case when crime_type = 'Break and Enter' then incident_count else 0 end) as break_enter_count,
-    sum(case when crime_type = 'Robbery' then incident_count else 0 end) as robbery_count,
-    sum(case when crime_type = 'Theft Over' then incident_count else 0 end) as theft_over_count,
-    sum(case when crime_type = 'Homicide' then incident_count else 0 end) as homicide_count,
+    sum(case when crime_type = 'assault' then incident_count else 0 end) as assault_count,
+    sum(case when crime_type = 'auto_theft' then incident_count else 0 end) as auto_theft_count,
+    sum(case when crime_type = 'break_and_enter' then incident_count else 0 end) as break_enter_count,
+    sum(case when crime_type = 'robbery' then incident_count else 0 end) as robbery_count,
+    sum(case when crime_type = 'theft_over' then incident_count else 0 end) as theft_over_count,
+    sum(case when crime_type = 'homicide' then incident_count else 0 end) as homicide_count,
     avg(rate_per_100k) as avg_rate_per_100k
   from crime
   group by neighbourhood_id, crime_year

View File

@@ -42,10 +42,10 @@ pivoted as (
   select
     neighbourhood_id,
     year,
-    max(case when bedroom_type = 'Two Bedroom' then weighted_avg_rent / nullif(total_weight, 0) end) as avg_rent_2bed,
-    max(case when bedroom_type = 'One Bedroom' then weighted_avg_rent / nullif(total_weight, 0) end) as avg_rent_1bed,
-    max(case when bedroom_type = 'Bachelor' then weighted_avg_rent / nullif(total_weight, 0) end) as avg_rent_bachelor,
-    max(case when bedroom_type = 'Three Bedroom +' then weighted_avg_rent / nullif(total_weight, 0) end) as avg_rent_3bed,
+    max(case when bedroom_type = '2bed' then weighted_avg_rent / nullif(total_weight, 0) end) as avg_rent_2bed,
+    max(case when bedroom_type = '1bed' then weighted_avg_rent / nullif(total_weight, 0) end) as avg_rent_1bed,
+    max(case when bedroom_type = 'bachelor' then weighted_avg_rent / nullif(total_weight, 0) end) as avg_rent_bachelor,
+    max(case when bedroom_type = '3bed' then weighted_avg_rent / nullif(total_weight, 0) end) as avg_rent_3bed,
     avg(vacancy_rate) as vacancy_rate,
     sum(rental_units_estimate) as total_rental_units
   from allocated
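The `nullif(total_weight, 0)` guard above avoids division by zero when a (neighbourhood, year, bedroom type) group has no weight. An illustrative Python sketch of the same pivot logic — the row shape and function name are hypothetical, not project code:

```python
from collections import defaultdict

def pivot_rents(rows):
    """Pivot allocated rent rows into one record per (neighbourhood_id, year).
    Each bedroom type becomes an avg_rent_{type} key, computed as the weighted
    rent sum divided by the total weight; zero weight yields None, mirroring
    SQL's weighted_avg_rent / nullif(total_weight, 0)."""
    out = defaultdict(dict)
    for r in rows:
        key = (r["neighbourhood_id"], r["year"])
        w = r["total_weight"]
        rent = r["weighted_avg_rent"] / w if w else None
        out[key]["avg_rent_" + r["bedroom_type"]] = rent
    return dict(out)
```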

View File

@@ -6,7 +6,7 @@ models:
     columns:
       - name: rental_id
         description: "Unique rental record identifier"
-        tests:
+        data_tests:
           - unique
           - not_null
@@ -17,11 +17,11 @@ models:
     columns:
       - name: neighbourhood_id
         description: "Neighbourhood identifier"
-        tests:
+        data_tests:
           - not_null
       - name: neighbourhood_name
         description: "Official neighbourhood name"
-        tests:
+        data_tests:
           - not_null
       - name: geometry
         description: "PostGIS geometry for mapping"
@@ -41,11 +41,11 @@ models:
     columns:
       - name: neighbourhood_id
         description: "Neighbourhood identifier"
-        tests:
+        data_tests:
           - not_null
       - name: neighbourhood_name
         description: "Official neighbourhood name"
-        tests:
+        data_tests:
           - not_null
       - name: geometry
         description: "PostGIS geometry for mapping"
@@ -63,11 +63,11 @@ models:
     columns:
       - name: neighbourhood_id
         description: "Neighbourhood identifier"
-        tests:
+        data_tests:
           - not_null
       - name: neighbourhood_name
         description: "Official neighbourhood name"
-        tests:
+        data_tests:
           - not_null
       - name: geometry
         description: "PostGIS geometry for mapping"
@@ -77,7 +77,7 @@ models:
         description: "100 = city average crime rate"
       - name: safety_tier
         description: "Safety tier (1=safest, 5=highest crime)"
-        tests:
+        data_tests:
           - accepted_values:
               arguments:
                 values: [1, 2, 3, 4, 5]
@@ -89,11 +89,11 @@ models:
     columns:
       - name: neighbourhood_id
         description: "Neighbourhood identifier"
-        tests:
+        data_tests:
           - not_null
       - name: neighbourhood_name
         description: "Official neighbourhood name"
-        tests:
+        data_tests:
           - not_null
       - name: geometry
         description: "PostGIS geometry for mapping"
@@ -103,7 +103,7 @@ models:
         description: "100 = city average income"
       - name: income_quintile
         description: "Income quintile (1-5)"
-        tests:
+        data_tests:
           - accepted_values:
               arguments:
                 values: [1, 2, 3, 4, 5]
@@ -115,11 +115,11 @@ models:
     columns:
       - name: neighbourhood_id
         description: "Neighbourhood identifier"
-        tests:
+        data_tests:
           - not_null
       - name: neighbourhood_name
         description: "Official neighbourhood name"
-        tests:
+        data_tests:
           - not_null
       - name: geometry
         description: "PostGIS geometry for mapping"
@@ -129,7 +129,7 @@ models:
         description: "100 = city average amenities"
       - name: amenity_tier
         description: "Amenity tier (1=best, 5=lowest)"
-        tests:
+        data_tests:
           - accepted_values:
               arguments:
                 values: [1, 2, 3, 4, 5]

View File

@@ -128,7 +128,8 @@ final as (
     -- Component scores (0-100)
     round(safety_score::numeric, 1) as safety_score,
     round(affordability_score::numeric, 1) as affordability_score,
-    -- Amenity score not available at this level, use placeholder
+    -- TODO: Replace with actual amenity score when fact_amenities is populated
+    -- Currently uses neutral placeholder (50.0) which affects livability_score accuracy
     50.0 as amenity_score,
     -- Composite livability score: safety (40%), affordability (40%), amenities (20%)
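The composite weighting named in the final comment can be sketched as follows — an illustrative helper, not the model's SQL:

```python
def livability_score(safety_score, affordability_score, amenity_score=50.0):
    """Composite livability: safety 40% + affordability 40% + amenities 20%.
    amenity_score defaults to the neutral 50.0 placeholder noted above,
    rounded to one decimal as in the mart model."""
    return round(0.4 * safety_score + 0.4 * affordability_score + 0.2 * amenity_score, 1)
```

Because the amenity term is pinned at 50.0 for now, it contributes a constant 10.0 points to every neighbourhood's composite score.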

View File

@@ -0,0 +1,33 @@
version: 2
models:
- name: stg_dimensions__time
description: "Staged time dimension - shared across all projects"
columns:
- name: date_key
description: "Primary key (YYYYMM format)"
data_tests:
- unique
- not_null
- name: full_date
description: "First day of month"
data_tests:
- not_null
- name: year
description: "Calendar year"
data_tests:
- not_null
- name: month
description: "Month number (1-12)"
data_tests:
- not_null
- name: quarter
description: "Quarter (1-4)"
data_tests:
- not_null
- name: month_name
description: "Month name"
data_tests:
- not_null
- name: is_month_start
description: "Always true (monthly grain)"
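The `date_key` convention documented above (YYYYMM integer, monthly grain, `full_date` = first day of month) can be illustrated with a small sketch; the helper names here are hypothetical, not part of the codebase:

```python
from datetime import date

def to_date_key(d):
    """Encode a date as a YYYYMM integer key, matching the dim_time
    primary-key format described above (monthly grain)."""
    return d.year * 100 + d.month

def month_start(d):
    """full_date for a given date: the first day of its month."""
    return d.replace(day=1)
```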

View File

@@ -0,0 +1,25 @@
version: 2
sources:
- name: shared
description: "Shared dimension tables used across all dashboards"
database: portfolio
schema: public
tables:
- name: dim_time
description: "Time dimension (monthly grain) - shared across all projects"
columns:
- name: date_key
description: "Primary key (YYYYMM format)"
- name: full_date
description: "First day of month"
- name: year
description: "Calendar year"
- name: month
description: "Month number (1-12)"
- name: quarter
description: "Quarter (1-4)"
- name: month_name
description: "Month name"
- name: is_month_start
description: "Always true (monthly grain)"

View File

@@ -1,9 +1,10 @@
 -- Staged time dimension
--- Source: dim_time table
+-- Source: shared.dim_time table
 -- Grain: One row per month
+-- Note: Shared dimension used across all dashboard projects
 with source as (
-    select * from {{ source('toronto_housing', 'dim_time') }}
+    select * from {{ source('shared', 'dim_time') }}
 ),
 staged as (

View File

@@ -1,18 +0,0 @@
-- Staged CMHC zone dimension
-- Source: dim_cmhc_zone table
-- Grain: One row per zone
with source as (
select * from {{ source('toronto_housing', 'dim_cmhc_zone') }}
),
staged as (
select
zone_key,
zone_code,
zone_name,
geometry
from source
)
select * from staged

View File

@@ -1,10 +1,10 @@
 version: 2
 sources:
-  - name: toronto_housing
-    description: "Toronto housing data loaded from CMHC and City of Toronto sources"
+  - name: toronto
+    description: "Toronto data loaded from CMHC and City of Toronto sources"
     database: portfolio
-    schema: public
+    schema: raw_toronto
     tables:
       - name: fact_rentals
         description: "CMHC annual rental survey data by zone and bedroom type"
@@ -16,12 +16,6 @@ sources:
           - name: zone_key
             description: "Foreign key to dim_cmhc_zone"
-      - name: dim_time
-        description: "Time dimension (monthly grain)"
-        columns:
-          - name: date_key
-            description: "Primary key (YYYYMMDD format)"
       - name: dim_cmhc_zone
         description: "CMHC zone dimension with geometry"
         columns:

View File

@@ -6,25 +6,16 @@ models:
     columns:
       - name: rental_id
         description: "Unique identifier for rental record"
-        tests:
+        data_tests:
           - unique
           - not_null
       - name: date_key
         description: "Date dimension key (YYYYMMDD)"
-        tests:
+        data_tests:
           - not_null
       - name: zone_key
         description: "CMHC zone dimension key"
-        tests:
-          - not_null
-  - name: stg_dimensions__time
-    description: "Staged time dimension"
-    columns:
-      - name: date_key
-        description: "Date dimension key (YYYYMMDD)"
-        tests:
-          - unique
+        data_tests:
           - not_null
   - name: stg_dimensions__cmhc_zones
@@ -32,12 +23,12 @@ models:
     columns:
       - name: zone_key
         description: "Zone dimension key"
-        tests:
+        data_tests:
           - unique
           - not_null
       - name: zone_code
         description: "CMHC zone code"
-        tests:
+        data_tests:
           - unique
           - not_null
@@ -46,12 +37,12 @@ models:
     columns:
       - name: neighbourhood_id
         description: "Neighbourhood primary key"
-        tests:
+        data_tests:
           - unique
           - not_null
       - name: neighbourhood_name
         description: "Official neighbourhood name"
-        tests:
+        data_tests:
           - not_null
       - name: geometry
         description: "PostGIS geometry (POLYGON)"
@@ -61,16 +52,16 @@ models:
     columns:
       - name: census_id
         description: "Census record identifier"
-        tests:
+        data_tests:
           - unique
           - not_null
       - name: neighbourhood_id
         description: "Neighbourhood foreign key"
-        tests:
+        data_tests:
           - not_null
       - name: census_year
         description: "Census year (2016, 2021)"
-        tests:
+        data_tests:
           - not_null
   - name: stg_toronto__crime
@@ -78,16 +69,16 @@ models:
     columns:
       - name: crime_id
         description: "Crime record identifier"
-        tests:
+        data_tests:
           - unique
           - not_null
       - name: neighbourhood_id
         description: "Neighbourhood foreign key"
-        tests:
+        data_tests:
           - not_null
       - name: crime_type
         description: "Type of crime"
-        tests:
+        data_tests:
           - not_null
   - name: stg_toronto__amenities
@@ -95,16 +86,16 @@ models:
     columns:
       - name: amenity_id
         description: "Amenity record identifier"
-        tests:
+        data_tests:
           - unique
           - not_null
       - name: neighbourhood_id
         description: "Neighbourhood foreign key"
-        tests:
+        data_tests:
           - not_null
       - name: amenity_type
         description: "Type of amenity"
-        tests:
+        data_tests:
           - not_null
   - name: stg_cmhc__zone_crosswalk
@@ -112,18 +103,18 @@ models:
     columns:
       - name: crosswalk_id
         description: "Crosswalk record identifier"
-        tests:
+        data_tests:
           - unique
           - not_null
       - name: cmhc_zone_code
         description: "CMHC zone code"
-        tests:
+        data_tests:
           - not_null
       - name: neighbourhood_id
         description: "Neighbourhood foreign key"
-        tests:
+        data_tests:
           - not_null
       - name: area_weight
         description: "Proportional area weight (0-1)"
-        tests:
+        data_tests:
           - not_null
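The `area_weight` column above is a 0-1 proportional weight used to allocate CMHC zone-level values across overlapping neighbourhoods. A minimal sketch of such an allocation — a hypothetical helper, assuming weights for a zone sum to roughly 1:

```python
def allocate(zone_value, weights):
    """Split a zone-level measure across neighbourhoods by area weight.
    weights maps neighbourhood_id -> area_weight (0-1)."""
    return {nid: zone_value * w for nid, w in weights.items()}
```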

View File

@@ -6,8 +6,8 @@ with source as (
   select
     f.*,
     t.year as survey_year
-  from {{ source('toronto_housing', 'fact_rentals') }} f
-  join {{ source('toronto_housing', 'dim_time') }} t on f.date_key = t.date_key
+  from {{ source('toronto', 'fact_rentals') }} f
+  join {{ source('shared', 'dim_time') }} t on f.date_key = t.date_key
 ),
 staged as (

View File

@@ -3,7 +3,7 @@
 -- Grain: One row per zone-neighbourhood intersection
 with source as (
-    select * from {{ source('toronto_housing', 'bridge_cmhc_neighbourhood') }}
+    select * from {{ source('toronto', 'bridge_cmhc_neighbourhood') }}
 ),
 staged as (

View File

@@ -0,0 +1,19 @@
-- Staged CMHC zone dimension
-- Source: dim_cmhc_zone table
-- Grain: One row per zone
with source as (
select * from {{ source('toronto', 'dim_cmhc_zone') }}
),
staged as (
select
zone_key,
zone_code,
zone_name
-- geometry column excluded: CMHC does not provide zone boundaries
-- Spatial analysis uses dim_neighbourhood geometry instead
from source
)
select * from staged

View File

@@ -3,7 +3,7 @@
 -- Grain: One row per neighbourhood per amenity type per year
 with source as (
-    select * from {{ source('toronto_housing', 'fact_amenities') }}
+    select * from {{ source('toronto', 'fact_amenities') }}
 ),
 staged as (

View File

@@ -3,7 +3,7 @@
 -- Grain: One row per neighbourhood per census year
 with source as (
-    select * from {{ source('toronto_housing', 'fact_census') }}
+    select * from {{ source('toronto', 'fact_census') }}
 ),
 staged as (

View File

@@ -3,7 +3,7 @@
 -- Grain: One row per neighbourhood per year per crime type
 with source as (
-    select * from {{ source('toronto_housing', 'fact_crime') }}
+    select * from {{ source('toronto', 'fact_crime') }}
 ),
 staged as (

View File

@@ -3,7 +3,7 @@
 -- Grain: One row per neighbourhood (158 total)
 with source as (
-    select * from {{ source('toronto_housing', 'dim_neighbourhood') }}
+    select * from {{ source('toronto', 'dim_neighbourhood') }}
 ),
 staged as (

View File

@@ -1,4 +1,4 @@
-toronto_housing:
+portfolio:
   target: dev
   outputs:
     dev:

View File

@@ -290,7 +290,7 @@ Dashboard tabs are in `portfolio_app/pages/toronto/tabs/`.
 import dash_mantine_components as dmc
-from portfolio_app.figures.choropleth import create_choropleth
+from portfolio_app.figures.toronto.choropleth import create_choropleth
 from portfolio_app.toronto.demo_data import get_demo_data
@@ -339,13 +339,13 @@ dmc.TabsPanel(create_your_tab_layout(), value="your-tab"),
 ## Creating Figure Factories
-Figure factories are in `portfolio_app/figures/`. They create reusable Plotly figures.
+Figure factories are organized by dashboard domain under `portfolio_app/figures/{domain}/`.
 ### Pattern
 ```python
-# figures/your_chart.py
-"""Your chart type factory."""
+# figures/toronto/your_chart.py
+"""Your chart type factory for Toronto dashboard."""
 import plotly.express as px
 import plotly.graph_objects as go
@@ -382,7 +382,7 @@ def create_your_chart(
 ### Export from `__init__.py`
 ```python
-# figures/__init__.py
+# figures/toronto/__init__.py
 from .your_chart import create_your_chart
 __all__ = [
@@ -391,6 +391,14 @@ __all__ = [
 ]
 ```
+### Importing Figure Factories
+```python
+# In callbacks or tabs
+from portfolio_app.figures.toronto import create_choropleth_figure
+from portfolio_app.figures.toronto.bar_charts import create_ranking_bar
+```
 ---
 ## Branch Workflow

View File

@@ -116,18 +116,40 @@ erDiagram
 ## Schema Layers
-### Raw Schema
-Raw data is loaded directly from external sources without transformation:
+### Database Schemas
+| Schema | Purpose | Managed By |
+|--------|---------|------------|
+| `public` | Shared dimensions (dim_time) | SQLAlchemy |
+| `raw_toronto` | Toronto dimension and fact tables | SQLAlchemy |
+| `stg_toronto` | Toronto staging models | dbt |
+| `int_toronto` | Toronto intermediate models | dbt |
+| `mart_toronto` | Toronto analytical tables | dbt |
+### Raw Toronto Schema (raw_toronto)
+Toronto-specific tables loaded by SQLAlchemy:
 | Table | Source | Description |
 |-------|--------|-------------|
-| `raw.neighbourhoods` | City of Toronto API | GeoJSON neighbourhood boundaries |
-| `raw.census_profiles` | City of Toronto API | Census profile data |
-| `raw.crime_data` | Toronto Police API | Crime statistics by neighbourhood |
-| `raw.cmhc_rentals` | CMHC Data Files | Rental market survey data |
+| `dim_neighbourhood` | City of Toronto API | 158 neighbourhood boundaries |
+| `dim_cmhc_zone` | CMHC | ~20 rental market zones |
+| `dim_policy_event` | Manual | Policy events for annotation |
+| `fact_census` | City of Toronto API | Census profile data |
+| `fact_crime` | Toronto Police API | Crime statistics |
+| `fact_amenities` | City of Toronto API | Amenity counts |
+| `fact_rentals` | CMHC Data Files | Rental market survey data |
+| `bridge_cmhc_neighbourhood` | Computed | Zone-neighbourhood mapping |
-### Staging Schema (dbt)
+### Public Schema
+Shared dimensions used across all projects:
+| Table | Description |
+|-------|-------------|
+| `dim_time` | Time dimension (monthly grain) |
+### Staging Schema - stg_toronto (dbt)
 Staging models provide 1:1 cleaned representations of source data:
@@ -142,7 +164,7 @@ Staging models provide 1:1 cleaned representations of source data:
 | `stg_dimensions__cmhc_zones` | raw.cmhc_zones | CMHC zone boundaries |
 | `stg_cmhc__zone_crosswalk` | raw.crosswalk | Zone-neighbourhood mapping |
-### Marts Schema (dbt)
+### Marts Schema - mart_toronto (dbt)
 Analytical tables ready for dashboard consumption:

View File

@@ -76,7 +76,8 @@ portfolio_app/
 ├── components/          # Shared UI components
 ├── content/blog/        # Markdown blog articles
 ├── errors/              # Exception handling
-├── figures/             # Plotly figure factories
+├── figures/
+│   └── toronto/         # Toronto figure factories
 ├── pages/
 │   ├── home.py
 │   ├── about.py
@@ -96,11 +97,21 @@ portfolio_app/
 │   ├── parsers/         # API extraction (geo, toronto_open_data, toronto_police, cmhc)
 │   ├── loaders/         # Database operations (base, cmhc, cmhc_crosswalk)
 │   ├── schemas/         # Pydantic models
-│   ├── models/          # SQLAlchemy ORM
+│   ├── models/          # SQLAlchemy ORM (raw_toronto schema)
 │   ├── services/        # Query functions (neighbourhood_service, geometry_service)
 │   └── demo_data.py     # Sample data
 └── utils/
     └── markdown_loader.py  # Blog article loading
+dbt/                     # dbt project: portfolio
+├── models/
+│   ├── shared/          # Cross-domain dimensions
+│   ├── staging/toronto/       # Toronto staging models
+│   ├── intermediate/toronto/  # Toronto intermediate models
+│   └── marts/toronto/         # Toronto mart tables
+notebooks/
+└── toronto/             # Toronto documentation notebooks
 ```
 ---
@@ -144,10 +155,20 @@ CMHC Zones (~20) ← Rental data (Census Tract aligned)
 | `fact_rentals` | Fact | Rental data by CMHC zone |
 | `fact_amenities` | Fact | Amenity counts by neighbourhood |
-### dbt Layers
+### dbt Project: `portfolio`
+**Model Structure:**
+```
+dbt/models/
+├── shared/                    # Cross-domain dimensions (stg_dimensions__time)
+├── staging/toronto/           # Toronto staging models
+├── intermediate/toronto/      # Toronto intermediate models
+└── marts/toronto/             # Toronto mart tables
+```
 | Layer | Naming | Example |
 |-------|--------|---------|
+| Shared | `stg_dimensions__*` | `stg_dimensions__time` |
 | Staging | `stg_{source}__{entity}` | `stg_toronto__neighbourhoods` |
 | Intermediate | `int_{domain}__{transform}` | `int_neighbourhood__demographics` |
 | Marts | `mart_{domain}` | `mart_neighbourhood_overview` |
@@ -248,7 +269,7 @@ LOG_LEVEL=INFO
 | `db-init` | Initialize database schema |
 | `db-reset` | Drop and recreate database (DESTRUCTIVE) |
 | `load-data` | Load Toronto data from APIs, seed dev data |
-| `load-data-only` | Load Toronto data without dbt or seeding |
+| `load-toronto-only` | Load Toronto data without dbt or seeding |
 | `seed-data` | Seed sample development data |
 | `run` | Start Dash dev server |
 | `test` | Run pytest |

View File

@@ -10,7 +10,9 @@ This runbook describes how to add a new data dashboard to the portfolio application.
 ## Directory Structure
-Create the following structure under `portfolio_app/`:
+Create the following structure:
+### Application Code (`portfolio_app/`)
 ```
 portfolio_app/
@@ -33,8 +35,40 @@ portfolio_app/
 │   │   └── __init__.py
 │   ├── schemas/          # Pydantic models
 │   │   └── __init__.py
-│   └── models/           # SQLAlchemy ORM
+│   └── models/           # SQLAlchemy ORM (schema: raw_{dashboard_name})
 │       └── __init__.py
+└── figures/
+    └── {dashboard_name}/ # Figure factories for this dashboard
+        ├── __init__.py
+        └── ...           # Chart modules
+```
+### dbt Models (`dbt/models/`)
+```
+dbt/models/
+├── staging/
+│   └── {dashboard_name}/ # Staging models
+│       ├── _sources.yml  # Source definitions (schema: raw_{dashboard_name})
+│       ├── _staging.yml  # Model tests/docs
+│       └── stg_*.sql     # Staging models
+├── intermediate/
+│   └── {dashboard_name}/ # Intermediate models
+│       ├── _intermediate.yml
+│       └── int_*.sql
+└── marts/
+    └── {dashboard_name}/ # Mart tables
+        ├── _marts.yml
+        └── mart_*.sql
+```
+### Documentation (`notebooks/`)
+```
+notebooks/
+└── {dashboard_name}/     # Domain subdirectories
+    ├── overview/
+    ├── ...
 ```
 ## Step-by-Step Checklist
@@ -47,24 +81,55 @@ portfolio_app/
 - [ ] Create loaders in `{dashboard_name}/loaders/`
 - [ ] Add database migrations if needed
-### 2. dbt Models
+### 2. Database Schema
+- [ ] Define schema constant in models (e.g., `RAW_FOOTBALL_SCHEMA = "raw_football"`)
+- [ ] Add `__table_args__ = {"schema": RAW_FOOTBALL_SCHEMA}` to all models
+- [ ] Update `scripts/db/init_schema.py` to create the new schema
+### 3. dbt Models
 Create dbt models in `dbt/models/`:
-- [ ] `staging/stg_{source}__{entity}.sql` - Raw data cleaning
-- [ ] `intermediate/int_{domain}__{transform}.sql` - Business logic
-- [ ] `marts/mart_{domain}.sql` - Final analytical tables
+- [ ] `staging/{dashboard_name}/_sources.yml` - Source definitions pointing to `raw_{dashboard_name}` schema
+- [ ] `staging/{dashboard_name}/stg_{source}__{entity}.sql` - Raw data cleaning
+- [ ] `intermediate/{dashboard_name}/int_{domain}__{transform}.sql` - Business logic
+- [ ] `marts/{dashboard_name}/mart_{domain}.sql` - Final analytical tables
+Update `dbt/dbt_project.yml` with new subdirectory config:
+```yaml
+models:
+  portfolio:
+    staging:
+      {dashboard_name}:
+        +materialized: view
+        +schema: stg_{dashboard_name}
+    intermediate:
+      {dashboard_name}:
+        +materialized: view
+        +schema: int_{dashboard_name}
+    marts:
+      {dashboard_name}:
+        +materialized: table
+        +schema: mart_{dashboard_name}
+```
 Follow naming conventions:
 - Staging: `stg_{source}__{entity}`
 - Intermediate: `int_{domain}__{transform}`
 - Marts: `mart_{domain}`
-### 3. Visualization Layer
-- [ ] Create figure factories in `figures/` (or reuse existing)
+### 4. Visualization Layer
+- [ ] Create figure factories in `figures/{dashboard_name}/`
+- [ ] Create `figures/{dashboard_name}/__init__.py` with exports
 - [ ] Follow the factory pattern: `create_{chart_type}_figure(data, **kwargs)`
+Import pattern:
+```python
+from portfolio_app.figures.{dashboard_name} import create_choropleth_figure
+```
 ### 4. Dashboard Pages
 #### Main Dashboard (`pages/{dashboard_name}/dashboard.py`)

View File

@@ -1,17 +1,18 @@
-# Toronto Neighbourhood Dashboard - Notebooks
-Documentation notebooks for the Toronto Neighbourhood Dashboard visualizations. Each notebook documents how data is queried, transformed, and visualized using the figure factory pattern.
+# Dashboard Documentation Notebooks
+Documentation notebooks organized by dashboard project. Each notebook documents how data is queried, transformed, and visualized using the figure factory pattern.
 ## Directory Structure
 ```
 notebooks/
 ├── README.md            # This file
-├── overview/            # Overview tab visualizations
-├── housing/             # Housing tab visualizations
-├── safety/              # Safety tab visualizations
-├── demographics/        # Demographics tab visualizations
-└── amenities/           # Amenities tab visualizations
+└── toronto/             # Toronto Neighbourhood Dashboard
+    ├── overview/        # Overview tab visualizations
+    ├── housing/         # Housing tab visualizations
+    ├── safety/          # Safety tab visualizations
+    ├── demographics/    # Demographics tab visualizations
+    └── amenities/       # Amenities tab visualizations
 ```
 ## Notebook Template

View File

@@ -1,123 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Amenity Radar Chart\n",
"\n",
"Spider/radar chart comparing amenity categories for selected neighbourhoods."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Data Reference\n",
"\n",
"### Source Tables\n",
"\n",
"| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n",
"| `mart_neighbourhood_amenities` | neighbourhood × year | parks_index, schools_index, transit_index |\n",
"\n",
"### SQL Query"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "import pandas as pd\nfrom sqlalchemy import create_engine\nfrom dotenv import load_dotenv\nimport os\n\n# Load .env from project root\nload_dotenv('../../.env')\n\nengine = create_engine(os.environ['DATABASE_URL'])\n\nquery = \"\"\"\nSELECT\n neighbourhood_name,\n parks_index,\n schools_index,\n transit_index,\n amenity_index,\n amenity_tier\nFROM public_marts.mart_neighbourhood_amenities\nWHERE year = (SELECT MAX(year) FROM public_marts.mart_neighbourhood_amenities)\nORDER BY amenity_index DESC\n\"\"\"\n\ndf = pd.read_sql(query, engine)\nprint(f\"Loaded {len(df)} neighbourhoods\")"
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Transformation Steps\n",
"\n",
"1. Select top 5 and bottom 5 neighbourhoods by amenity index\n",
"2. Reshape for radar chart format"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Select representative neighbourhoods\n",
"top_5 = df.head(5)\n",
"bottom_5 = df.tail(5)\n",
"\n",
"# Prepare radar data\n",
"categories = ['Parks', 'Schools', 'Transit']\n",
"index_columns = ['parks_index', 'schools_index', 'transit_index']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sample Output"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"Top 5 Amenity-Rich Neighbourhoods:\")\n",
"display(top_5[['neighbourhood_name', 'parks_index', 'schools_index', 'transit_index', 'amenity_index']])\n",
"print(\"\\nBottom 5 Underserved Neighbourhoods:\")\n",
"display(bottom_5[['neighbourhood_name', 'parks_index', 'schools_index', 'transit_index', 'amenity_index']])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Data Visualization\n",
"\n",
"### Figure Factory\n",
"\n",
"Uses `create_radar` from `portfolio_app.figures.radar`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": "import sys\nsys.path.insert(0, '../..')\n\nfrom portfolio_app.figures.radar import create_comparison_radar\n\n# Compare top neighbourhood vs city average (100)\ntop_hood = top_5.iloc[0]\nmetrics = ['parks_index', 'schools_index', 'transit_index']\n\nfig = create_comparison_radar(\n selected_data=top_hood.to_dict(),\n average_data={'parks_index': 100, 'schools_index': 100, 'transit_index': 100},\n metrics=metrics,\n selected_name=top_hood['neighbourhood_name'],\n average_name='City Average',\n title=f\"Amenity Profile: {top_hood['neighbourhood_name']} vs City Average\",\n)\n\nfig.show()"
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Index Interpretation\n",
"\n",
"| Value | Meaning |\n",
"|-------|--------|\n",
"| < 100 | Below city average |\n",
"| = 100 | City average |\n",
"| > 100 | Above city average |"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.11.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}


@@ -19,7 +19,7 @@
"\n",
"| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n",
"| `mart_neighbourhood_amenities` | neighbourhood × year | amenity_index, total_amenities_per_1000, amenity_tier, geometry |\n",
"\n",
"### SQL Query"
]
@@ -30,15 +30,16 @@
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import pandas as pd\n",
"from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n",
"# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n",
"query = \"\"\"\n",
"SELECT\n",
@@ -79,17 +80,16 @@
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"import geopandas as gpd\n",
"\n",
"gdf = gpd.GeoDataFrame(\n",
" df, geometry=gpd.GeoSeries.from_wkb(df[\"geometry\"]), crs=\"EPSG:4326\"\n",
")\n",
"\n",
"geojson = json.loads(gdf.to_json())\n",
"data = df.drop(columns=[\"geometry\"]).to_dict(\"records\")"
]
},
{
@@ -105,7 +105,9 @@
"metadata": {},
"outputs": [],
"source": [
"df[\n",
" [\"neighbourhood_name\", \"total_amenities_per_1000\", \"amenity_index\", \"amenity_tier\"]\n",
"].head(10)"
]
},
{
@@ -116,7 +118,7 @@
"\n",
"### Figure Factory\n",
"\n",
"Uses `create_choropleth_figure` from `portfolio_app.figures.toronto.choropleth`."
]
},
{
@@ -126,18 +128,24 @@
"outputs": [],
"source": [
"import sys\n",
"\n",
"sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.choropleth import create_choropleth_figure\n",
"\n",
"fig = create_choropleth_figure(\n",
" geojson=geojson,\n",
" data=data,\n",
" location_key=\"neighbourhood_id\",\n",
" color_column=\"total_amenities_per_1000\",\n",
" hover_data=[\n",
" \"neighbourhood_name\",\n",
" \"amenity_index\",\n",
" \"parks_per_1000\",\n",
" \"schools_per_1000\",\n",
" ],\n",
" color_scale=\"Greens\",\n",
" title=\"Toronto Amenities per 1,000 Population\",\n",
" zoom=10,\n",
")\n",
"\n",


@@ -0,0 +1,191 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Amenity Radar Chart\n",
"\n",
"Spider/radar chart comparing amenity categories for selected neighbourhoods."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Data Reference\n",
"\n",
"### Source Tables\n",
"\n",
"| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n",
"| `mart_neighbourhood_amenities` | neighbourhood × year | parks_index, schools_index, transit_index |\n",
"\n",
"### SQL Query"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import pandas as pd\n",
"from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n",
"# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n",
"query = \"\"\"\n",
"SELECT\n",
" neighbourhood_name,\n",
" parks_index,\n",
" schools_index,\n",
" transit_index,\n",
" amenity_index,\n",
" amenity_tier\n",
"FROM public_marts.mart_neighbourhood_amenities\n",
"WHERE year = (SELECT MAX(year) FROM public_marts.mart_neighbourhood_amenities)\n",
"ORDER BY amenity_index DESC\n",
"\"\"\"\n",
"\n",
"df = pd.read_sql(query, engine)\n",
"print(f\"Loaded {len(df)} neighbourhoods\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Transformation Steps\n",
"\n",
"1. Select top 5 and bottom 5 neighbourhoods by amenity index\n",
"2. Reshape for radar chart format"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Select representative neighbourhoods\n",
"top_5 = df.head(5)\n",
"bottom_5 = df.tail(5)\n",
"\n",
"# Prepare radar data\n",
"categories = [\"Parks\", \"Schools\", \"Transit\"]\n",
"index_columns = [\"parks_index\", \"schools_index\", \"transit_index\"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sample Output"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"Top 5 Amenity-Rich Neighbourhoods:\")\n",
"display(\n",
" top_5[\n",
" [\n",
" \"neighbourhood_name\",\n",
" \"parks_index\",\n",
" \"schools_index\",\n",
" \"transit_index\",\n",
" \"amenity_index\",\n",
" ]\n",
" ]\n",
")\n",
"print(\"\\nBottom 5 Underserved Neighbourhoods:\")\n",
"display(\n",
" bottom_5[\n",
" [\n",
" \"neighbourhood_name\",\n",
" \"parks_index\",\n",
" \"schools_index\",\n",
" \"transit_index\",\n",
" \"amenity_index\",\n",
" ]\n",
" ]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Data Visualization\n",
"\n",
"### Figure Factory\n",
"\n",
"Uses `create_radar` from `portfolio_app.figures.toronto.radar`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"\n",
"sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.radar import create_comparison_radar\n",
"\n",
"# Compare top neighbourhood vs city average (100)\n",
"top_hood = top_5.iloc[0]\n",
"metrics = [\"parks_index\", \"schools_index\", \"transit_index\"]\n",
"\n",
"fig = create_comparison_radar(\n",
" selected_data=top_hood.to_dict(),\n",
" average_data={\"parks_index\": 100, \"schools_index\": 100, \"transit_index\": 100},\n",
" metrics=metrics,\n",
" selected_name=top_hood[\"neighbourhood_name\"],\n",
" average_name=\"City Average\",\n",
" title=f\"Amenity Profile: {top_hood['neighbourhood_name']} vs City Average\",\n",
")\n",
"\n",
"fig.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Index Interpretation\n",
"\n",
"| Value | Meaning |\n",
"|-------|--------|\n",
"| < 100 | Below city average |\n",
"| = 100 | City average |\n",
"| > 100 | Above city average |"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.11.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
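The index interpretation table in the notebook above maps directly onto a small classifier; a sketch (the helper name `interpret_index` is hypothetical, not part of the project):

```python
def interpret_index(value: float) -> str:
    """Classify an amenity index against the city baseline of 100.

    Mirrors the notebook's interpretation table:
    < 100 below average, = 100 average, > 100 above average.
    """
    if value < 100:
        return "Below city average"
    if value > 100:
        return "Above city average"
    return "City average"


print(interpret_index(112.5))  # → Above city average
```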


@@ -19,7 +19,7 @@
"\n",
"| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n",
"| `mart_neighbourhood_amenities` | neighbourhood × year | transit_per_1000, transit_index, transit_count |\n",
"\n",
"### SQL Query"
]
@@ -30,15 +30,16 @@
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import pandas as pd\n",
"from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n",
"# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n",
"query = \"\"\"\n",
"SELECT\n",
@@ -74,7 +75,7 @@
"metadata": {},
"outputs": [],
"source": [
"data = df.head(20).to_dict(\"records\")"
]
},
{
@@ -90,7 +91,9 @@
"metadata": {},
"outputs": [],
"source": [
"df[[\"neighbourhood_name\", \"transit_per_1000\", \"transit_index\", \"transit_count\"]].head(\n",
" 10\n",
")"
]
},
{
@@ -101,7 +104,7 @@
"\n",
"### Figure Factory\n",
"\n",
"Uses `create_horizontal_bar` from `portfolio_app.figures.toronto.bar_charts`."
]
},
{
@@ -111,17 +114,18 @@
"outputs": [],
"source": [
"import sys\n",
"\n",
"sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.bar_charts import create_horizontal_bar\n",
"\n",
"fig = create_horizontal_bar(\n",
" data=data,\n",
" name_column=\"neighbourhood_name\",\n",
" value_column=\"transit_per_1000\",\n",
" title=\"Top 20 Neighbourhoods by Transit Accessibility\",\n",
" color=\"#00BCD4\",\n",
" value_format=\".2f\",\n",
")\n",
"\n",
"fig.show()"
@@ -140,7 +144,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"City-wide Transit Statistics:\")\n",
"print(f\" Total Transit Stops: {df['transit_count'].sum():,.0f}\")\n",
"print(f\" Average per 1,000 pop: {df['transit_per_1000'].mean():.2f}\")\n",
"print(f\" Median per 1,000 pop: {df['transit_per_1000'].median():.2f}\")\n",


@@ -19,7 +19,7 @@
"\n",
"| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n",
"| `mart_neighbourhood_demographics` | neighbourhood × year | median_age, age_index, city_avg_age |\n",
"\n",
"### SQL Query"
]
@@ -30,15 +30,16 @@
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import pandas as pd\n",
"from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n",
"# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n",
"query = \"\"\"\n",
"SELECT\n",
@@ -76,13 +77,13 @@
"metadata": {},
"outputs": [],
"source": [
"city_avg = df[\"city_avg_age\"].iloc[0]\n",
"df[\"age_category\"] = df[\"median_age\"].apply(\n",
" lambda x: \"Younger\" if x < city_avg else \"Older\"\n",
")\n",
"df[\"age_deviation\"] = df[\"median_age\"] - city_avg\n",
"\n",
"data = df.to_dict(\"records\")"
]
},
{
@@ -100,9 +101,13 @@
"source": [
"print(f\"City Average Age: {city_avg:.1f}\")\n",
"print(\"\\nYoungest Neighbourhoods:\")\n",
"display(\n",
" df.tail(5)[[\"neighbourhood_name\", \"median_age\", \"age_index\", \"pct_renter_occupied\"]]\n",
")\n",
"print(\"\\nOldest Neighbourhoods:\")\n",
"display(\n",
" df.head(5)[[\"neighbourhood_name\", \"median_age\", \"age_index\", \"pct_renter_occupied\"]]\n",
")"
]
},
{
@@ -113,7 +118,7 @@
"\n",
"### Figure Factory\n",
"\n",
"Uses `create_ranking_bar` from `portfolio_app.figures.toronto.bar_charts`."
]
},
{
@@ -123,20 +128,21 @@
"outputs": [],
"source": [
"import sys\n",
"\n",
"sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.bar_charts import create_ranking_bar\n",
"\n",
"fig = create_ranking_bar(\n",
" data=data,\n",
" name_column=\"neighbourhood_name\",\n",
" value_column=\"median_age\",\n",
" title=\"Youngest & Oldest Neighbourhoods (Median Age)\",\n",
" top_n=10,\n",
" bottom_n=10,\n",
" color_top=\"#FF9800\", # Orange for older\n",
" color_bottom=\"#2196F3\", # Blue for younger\n",
" value_format=\".1f\",\n",
")\n",
"\n",
"fig.show()"
@@ -157,7 +163,7 @@
"source": [
"# Age by income quintile\n",
"print(\"Median Age by Income Quintile:\")\n",
"df.groupby(\"income_quintile\")[\"median_age\"].mean().round(1)"
]
}
],


@@ -19,7 +19,7 @@
"\n",
"| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n",
"| `mart_neighbourhood_demographics` | neighbourhood × year | median_household_income, income_index, income_quintile, geometry |\n",
"\n",
"### SQL Query"
]
@@ -30,15 +30,16 @@
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import pandas as pd\n",
"from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n",
"# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n",
"query = \"\"\"\n",
"SELECT\n",
@@ -77,19 +78,18 @@
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"import geopandas as gpd\n",
"\n",
"df[\"income_thousands\"] = df[\"median_household_income\"] / 1000\n",
"\n",
"gdf = gpd.GeoDataFrame(\n",
" df, geometry=gpd.GeoSeries.from_wkb(df[\"geometry\"]), crs=\"EPSG:4326\"\n",
")\n",
"\n",
"geojson = json.loads(gdf.to_json())\n",
"data = df.drop(columns=[\"geometry\"]).to_dict(\"records\")"
]
},
{
@@ -105,7 +105,9 @@
"metadata": {},
"outputs": [],
"source": [
"df[\n",
" [\"neighbourhood_name\", \"median_household_income\", \"income_index\", \"income_quintile\"]\n",
"].head(10)"
]
},
{
@@ -116,7 +118,7 @@
"\n",
"### Figure Factory\n",
"\n",
"Uses `create_choropleth_figure` from `portfolio_app.figures.toronto.choropleth`."
]
},
{
@@ -126,18 +128,19 @@
"outputs": [],
"source": [
"import sys\n",
"\n",
"sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.choropleth import create_choropleth_figure\n",
"\n",
"fig = create_choropleth_figure(\n",
" geojson=geojson,\n",
" data=data,\n",
" location_key=\"neighbourhood_id\",\n",
" color_column=\"median_household_income\",\n",
" hover_data=[\"neighbourhood_name\", \"income_index\", \"income_quintile\"],\n",
" color_scale=\"Viridis\",\n",
" title=\"Toronto Median Household Income by Neighbourhood\",\n",
" zoom=10,\n",
")\n",
"\n",
@@ -157,7 +160,9 @@
"metadata": {},
"outputs": [],
"source": [
"df.groupby(\"income_quintile\")[\"median_household_income\"].agg(\n",
" [\"count\", \"mean\", \"min\", \"max\"]\n",
").round(0)"
]
}
],


@@ -19,7 +19,7 @@
"\n",
"| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n",
"| `mart_neighbourhood_demographics` | neighbourhood × year | population_density, population, land_area_sqkm |\n",
"\n",
"### SQL Query"
]
@@ -30,15 +30,16 @@
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import pandas as pd\n",
"from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n",
"# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n",
"query = \"\"\"\n",
"SELECT\n",
@@ -74,7 +75,7 @@
"metadata": {},
"outputs": [],
"source": [
"data = df.head(20).to_dict(\"records\")"
]
},
{
@@ -90,7 +91,9 @@
"metadata": {},
"outputs": [],
"source": [
"df[[\"neighbourhood_name\", \"population_density\", \"population\", \"land_area_sqkm\"]].head(\n",
" 10\n",
")"
]
},
{
@@ -101,7 +104,7 @@
"\n",
"### Figure Factory\n",
"\n",
"Uses `create_horizontal_bar` from `portfolio_app.figures.toronto.bar_charts`."
]
},
{
@@ -111,17 +114,18 @@
"outputs": [],
"source": [
"import sys\n",
"\n",
"sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.bar_charts import create_horizontal_bar\n",
"\n",
"fig = create_horizontal_bar(\n",
" data=data,\n",
" name_column=\"neighbourhood_name\",\n",
" value_column=\"population_density\",\n",
" title=\"Top 20 Most Dense Neighbourhoods\",\n",
" color=\"#9C27B0\",\n",
" value_format=\",.0f\",\n",
")\n",
"\n",
"fig.show()"
@@ -140,7 +144,7 @@
"metadata": {},
"outputs": [],
"source": [
"print(\"City-wide Statistics:\")\n",
"print(f\" Total Population: {df['population'].sum():,.0f}\")\n",
"print(f\" Total Area: {df['land_area_sqkm'].sum():,.1f} sq km\")\n",
"print(f\" Average Density: {df['population_density'].mean():,.0f} per sq km\")\n",


@@ -19,7 +19,7 @@
"\n",
"| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n",
"| `mart_neighbourhood_housing` | neighbourhood × year | affordability_index, rent_to_income_pct, avg_rent_2bed, geometry |\n",
"\n",
"### SQL Query"
]
@@ -30,15 +30,16 @@
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import pandas as pd\n",
"from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n",
"# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n",
"query = \"\"\"\n",
"SELECT\n",
@@ -77,17 +78,16 @@
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"import geopandas as gpd\n",
"\n",
"gdf = gpd.GeoDataFrame(\n",
" df, geometry=gpd.GeoSeries.from_wkb(df[\"geometry\"]), crs=\"EPSG:4326\"\n",
")\n",
"\n",
"geojson = json.loads(gdf.to_json())\n",
"data = df.drop(columns=[\"geometry\"]).to_dict(\"records\")"
]
},
{
@@ -103,7 +103,15 @@
"metadata": {},
"outputs": [],
"source": [
"df[\n",
" [\n",
" \"neighbourhood_name\",\n",
" \"affordability_index\",\n",
" \"rent_to_income_pct\",\n",
" \"avg_rent_2bed\",\n",
" \"is_affordable\",\n",
" ]\n",
"].head(10)"
]
},
{
@@ -114,7 +122,7 @@
"\n",
"### Figure Factory\n",
"\n",
"Uses `create_choropleth_figure` from `portfolio_app.figures.toronto.choropleth`.\n",
"\n",
"**Key Parameters:**\n",
"- `color_column`: 'affordability_index'\n",
@@ -128,18 +136,19 @@
"outputs": [],
"source": [
"import sys\n",
"\n",
"sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.choropleth import create_choropleth_figure\n",
"\n",
"fig = create_choropleth_figure(\n",
" geojson=geojson,\n",
" data=data,\n",
" location_key=\"neighbourhood_id\",\n",
" color_column=\"affordability_index\",\n",
" hover_data=[\"neighbourhood_name\", \"rent_to_income_pct\", \"avg_rent_2bed\"],\n",
" color_scale=\"RdYlGn_r\", # Reversed: lower index (affordable) = green\n",
" title=\"Toronto Housing Affordability Index\",\n",
" zoom=10,\n",
")\n",
"\n",


@@ -19,7 +19,7 @@
"\n",
"| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n",
"| `mart_neighbourhood_housing` | neighbourhood × year | year, avg_rent_2bed, rent_yoy_change_pct |\n",
"\n",
"### SQL Query"
]
@@ -30,15 +30,16 @@
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import pandas as pd\n",
"from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n",
"# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n",
"# City-wide average rent by year\n",
"query = \"\"\"\n",
@@ -77,23 +78,25 @@
"outputs": [],
"source": [
"# Create date column from year\n",
"df[\"date\"] = pd.to_datetime(df[\"year\"].astype(str) + \"-01-01\")\n",
"\n",
"# Melt for multi-line chart\n",
"df_melted = df.melt(\n",
" id_vars=[\"year\", \"date\"],\n",
" value_vars=[\"avg_rent_bachelor\", \"avg_rent_1bed\", \"avg_rent_2bed\", \"avg_rent_3bed\"],\n",
" var_name=\"bedroom_type\",\n",
" value_name=\"avg_rent\",\n",
")\n",
"\n",
"# Clean labels\n",
"df_melted[\"bedroom_type\"] = df_melted[\"bedroom_type\"].map(\n",
" {\n",
" \"avg_rent_bachelor\": \"Bachelor\",\n",
" \"avg_rent_1bed\": \"1 Bedroom\",\n",
" \"avg_rent_2bed\": \"2 Bedroom\",\n",
" \"avg_rent_3bed\": \"3 Bedroom\",\n",
" }\n",
")"
]
},
{
@@ -109,7 +112,16 @@
"metadata": {},
"outputs": [],
"source": [
"df[\n",
" [\n",
" \"year\",\n",
" \"avg_rent_bachelor\",\n",
" \"avg_rent_1bed\",\n",
" \"avg_rent_2bed\",\n",
" \"avg_rent_3bed\",\n",
" \"avg_yoy_change\",\n",
" ]\n",
"]"
]
},
{
@@ -120,7 +132,7 @@
"\n",
"### Figure Factory\n",
"\n",
"Uses `create_price_time_series` from `portfolio_app.figures.toronto.time_series`.\n",
"\n",
"**Key Parameters:**\n",
"- `date_column`: 'date'\n",
@@ -135,18 +147,19 @@
"outputs": [],
"source": [
"import sys\n",
"\n",
"sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.time_series import create_price_time_series\n",
"\n",
"data = df_melted.to_dict(\"records\")\n",
"\n",
"fig = create_price_time_series(\n",
" data=data,\n",
" date_column=\"date\",\n",
" price_column=\"avg_rent\",\n",
" group_column=\"bedroom_type\",\n",
" title=\"Toronto Average Rent Trend (5 Years)\",\n",
")\n",
"\n",
"fig.show()"
@@ -167,7 +180,7 @@
"source": [
"# Show year-over-year changes\n",
"print(\"Year-over-Year Rent Change (%)\")\n", "print(\"Year-over-Year Rent Change (%)\")\n",
"df[['year', 'avg_yoy_change']].dropna()" "df[[\"year\", \"avg_yoy_change\"]].dropna()"
] ]
} }
], ],

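The melt-and-relabel pattern that recurs throughout these notebook diffs (wide mart columns reshaped into long form, then mapped to display labels) can be sketched as follows; the frame and values below are illustrative stand-ins, not the real mart data.

```python
import pandas as pd

# Toy frame standing in for the mart query result (values are illustrative)
df = pd.DataFrame(
    {
        "year": [2022, 2023],
        "avg_rent_1bed": [1800, 1900],
        "avg_rent_2bed": [2300, 2450],
    }
)

# Wide -> long: one row per (year, bedroom_type)
df_melted = df.melt(
    id_vars=["year"],
    value_vars=["avg_rent_1bed", "avg_rent_2bed"],
    var_name="bedroom_type",
    value_name="avg_rent",
)

# Map raw column names to display labels for the legend
df_melted["bedroom_type"] = df_melted["bedroom_type"].map(
    {"avg_rent_1bed": "1 Bedroom", "avg_rent_2bed": "2 Bedroom"}
)
```

The figure factories then consume `df_melted.to_dict("records")`, as the diffs show.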
@@ -19,7 +19,7 @@
"\n", "\n",
"| Table | Grain | Key Columns |\n", "| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n", "|-------|-------|-------------|\n",
"| `mart_neighbourhood_housing` | neighbourhood \u00d7 year | pct_owner_occupied, pct_renter_occupied, income_quintile |\n", "| `mart_neighbourhood_housing` | neighbourhood × year | pct_owner_occupied, pct_renter_occupied, income_quintile |\n",
"\n", "\n",
"### SQL Query" "### SQL Query"
] ]
@@ -30,15 +30,16 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"import pandas as pd\n",
"from sqlalchemy import create_engine\n",
"from dotenv import load_dotenv\n",
"import os\n", "import os\n",
"\n", "\n",
"# Load .env from project root\n", "import pandas as pd\n",
"load_dotenv('../../.env')\n", "from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n", "\n",
"engine = create_engine(os.environ['DATABASE_URL'])\n", "# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n", "\n",
"query = \"\"\"\n", "query = \"\"\"\n",
"SELECT\n", "SELECT\n",
@@ -77,18 +78,17 @@
"source": [ "source": [
"# Prepare for stacked bar\n", "# Prepare for stacked bar\n",
"df_stacked = df.melt(\n", "df_stacked = df.melt(\n",
" id_vars=['neighbourhood_name', 'income_quintile'],\n", " id_vars=[\"neighbourhood_name\", \"income_quintile\"],\n",
" value_vars=['pct_owner_occupied', 'pct_renter_occupied'],\n", " value_vars=[\"pct_owner_occupied\", \"pct_renter_occupied\"],\n",
" var_name='tenure_type',\n", " var_name=\"tenure_type\",\n",
" value_name='percentage'\n", " value_name=\"percentage\",\n",
")\n", ")\n",
"\n", "\n",
"df_stacked['tenure_type'] = df_stacked['tenure_type'].map({\n", "df_stacked[\"tenure_type\"] = df_stacked[\"tenure_type\"].map(\n",
" 'pct_owner_occupied': 'Owner',\n", " {\"pct_owner_occupied\": \"Owner\", \"pct_renter_occupied\": \"Renter\"}\n",
" 'pct_renter_occupied': 'Renter'\n", ")\n",
"})\n",
"\n", "\n",
"data = df_stacked.to_dict('records')" "data = df_stacked.to_dict(\"records\")"
] ]
}, },
{ {
@@ -105,7 +105,14 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"print(\"Highest Renter Neighbourhoods:\")\n", "print(\"Highest Renter Neighbourhoods:\")\n",
"df[['neighbourhood_name', 'pct_renter_occupied', 'pct_owner_occupied', 'income_quintile']].head(10)" "df[\n",
" [\n",
" \"neighbourhood_name\",\n",
" \"pct_renter_occupied\",\n",
" \"pct_owner_occupied\",\n",
" \"income_quintile\",\n",
" ]\n",
"].head(10)"
] ]
}, },
{ {
@@ -116,7 +123,7 @@
"\n", "\n",
"### Figure Factory\n", "### Figure Factory\n",
"\n", "\n",
"Uses `create_stacked_bar` from `portfolio_app.figures.bar_charts`.\n", "Uses `create_stacked_bar` from `portfolio_app.figures.toronto.bar_charts`.\n",
"\n", "\n",
"**Key Parameters:**\n", "**Key Parameters:**\n",
"- `x_column`: 'neighbourhood_name'\n", "- `x_column`: 'neighbourhood_name'\n",
@@ -132,21 +139,22 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"import sys\n", "import sys\n",
"sys.path.insert(0, '../..')\n",
"\n", "\n",
"from portfolio_app.figures.bar_charts import create_stacked_bar\n", "sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.bar_charts import create_stacked_bar\n",
"\n", "\n",
"# Show top 20 by renter percentage\n", "# Show top 20 by renter percentage\n",
"top_20_names = df.head(20)['neighbourhood_name'].tolist()\n", "top_20_names = df.head(20)[\"neighbourhood_name\"].tolist()\n",
"data_filtered = [d for d in data if d['neighbourhood_name'] in top_20_names]\n", "data_filtered = [d for d in data if d[\"neighbourhood_name\"] in top_20_names]\n",
"\n", "\n",
"fig = create_stacked_bar(\n", "fig = create_stacked_bar(\n",
" data=data_filtered,\n", " data=data_filtered,\n",
" x_column='neighbourhood_name',\n", " x_column=\"neighbourhood_name\",\n",
" value_column='percentage',\n", " value_column=\"percentage\",\n",
" category_column='tenure_type',\n", " category_column=\"tenure_type\",\n",
" title='Housing Tenure Mix - Top 20 Renter Neighbourhoods',\n", " title=\"Housing Tenure Mix - Top 20 Renter Neighbourhoods\",\n",
" color_map={'Owner': '#4CAF50', 'Renter': '#2196F3'},\n", " color_map={\"Owner\": \"#4CAF50\", \"Renter\": \"#2196F3\"},\n",
" show_percentages=True,\n", " show_percentages=True,\n",
")\n", ")\n",
"\n", "\n",
@@ -172,7 +180,9 @@
"\n", "\n",
"# By income quintile\n", "# By income quintile\n",
"print(\"\\nTenure by Income Quintile:\")\n", "print(\"\\nTenure by Income Quintile:\")\n",
"df.groupby('income_quintile')[['pct_owner_occupied', 'pct_renter_occupied']].mean().round(1)" "df.groupby(\"income_quintile\")[\n",
" [\"pct_owner_occupied\", \"pct_renter_occupied\"]\n",
"].mean().round(1)"
] ]
} }
], ],

@@ -19,7 +19,7 @@
"\n", "\n",
"| Table | Grain | Key Columns |\n", "| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n", "|-------|-------|-------------|\n",
"| `mart_neighbourhood_overview` | neighbourhood \u00d7 year | neighbourhood_name, median_household_income, safety_score, population |\n", "| `mart_neighbourhood_overview` | neighbourhood × year | neighbourhood_name, median_household_income, safety_score, population |\n",
"\n", "\n",
"### SQL Query" "### SQL Query"
] ]
@@ -30,15 +30,16 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"import pandas as pd\n",
"from sqlalchemy import create_engine\n",
"from dotenv import load_dotenv\n",
"import os\n", "import os\n",
"\n", "\n",
"# Load .env from project root\n", "import pandas as pd\n",
"load_dotenv('../../.env')\n", "from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n", "\n",
"engine = create_engine(os.environ['DATABASE_URL'])\n", "# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n", "\n",
"query = \"\"\"\n", "query = \"\"\"\n",
"SELECT\n", "SELECT\n",
@@ -77,10 +78,10 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# Scale income to thousands for better axis readability\n", "# Scale income to thousands for better axis readability\n",
"df['income_thousands'] = df['median_household_income'] / 1000\n", "df[\"income_thousands\"] = df[\"median_household_income\"] / 1000\n",
"\n", "\n",
"# Prepare data for figure factory\n", "# Prepare data for figure factory\n",
"data = df.to_dict('records')" "data = df.to_dict(\"records\")"
] ]
}, },
{ {
@@ -96,7 +97,14 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"df[['neighbourhood_name', 'median_household_income', 'safety_score', 'crime_rate_per_100k']].head(10)" "df[\n",
" [\n",
" \"neighbourhood_name\",\n",
" \"median_household_income\",\n",
" \"safety_score\",\n",
" \"crime_rate_per_100k\",\n",
" ]\n",
"].head(10)"
] ]
}, },
{ {
@@ -107,7 +115,7 @@
"\n", "\n",
"### Figure Factory\n", "### Figure Factory\n",
"\n", "\n",
"Uses `create_scatter_figure` from `portfolio_app.figures.scatter`.\n", "Uses `create_scatter_figure` from `portfolio_app.figures.toronto.scatter`.\n",
"\n", "\n",
"**Key Parameters:**\n", "**Key Parameters:**\n",
"- `x_column`: 'income_thousands' (median household income in $K)\n", "- `x_column`: 'income_thousands' (median household income in $K)\n",
@@ -124,19 +132,20 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"import sys\n", "import sys\n",
"sys.path.insert(0, '../..')\n",
"\n", "\n",
"from portfolio_app.figures.scatter import create_scatter_figure\n", "sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.scatter import create_scatter_figure\n",
"\n", "\n",
"fig = create_scatter_figure(\n", "fig = create_scatter_figure(\n",
" data=data,\n", " data=data,\n",
" x_column='income_thousands',\n", " x_column=\"income_thousands\",\n",
" y_column='safety_score',\n", " y_column=\"safety_score\",\n",
" name_column='neighbourhood_name',\n", " name_column=\"neighbourhood_name\",\n",
" size_column='population',\n", " size_column=\"population\",\n",
" title='Income vs Safety by Neighbourhood',\n", " title=\"Income vs Safety by Neighbourhood\",\n",
" x_title='Median Household Income ($K)',\n", " x_title=\"Median Household Income ($K)\",\n",
" y_title='Safety Score (0-100)',\n", " y_title=\"Safety Score (0-100)\",\n",
" trendline=True,\n", " trendline=True,\n",
")\n", ")\n",
"\n", "\n",
@@ -166,7 +175,7 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# Calculate correlation coefficient\n", "# Calculate correlation coefficient\n",
"correlation = df['median_household_income'].corr(df['safety_score'])\n", "correlation = df[\"median_household_income\"].corr(df[\"safety_score\"])\n",
"print(f\"Correlation coefficient (Income vs Safety): {correlation:.3f}\")" "print(f\"Correlation coefficient (Income vs Safety): {correlation:.3f}\")"
] ]
} }

@@ -29,7 +29,38 @@
"execution_count": null, "execution_count": null,
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": "import pandas as pd\nfrom sqlalchemy import create_engine\nfrom dotenv import load_dotenv\nimport os\n\n# Load .env from project root\nload_dotenv('../../.env')\n\nengine = create_engine(os.environ['DATABASE_URL'])\n\nquery = \"\"\"\nSELECT\n neighbourhood_id,\n neighbourhood_name,\n geometry,\n year,\n livability_score,\n safety_score,\n affordability_score,\n amenity_score,\n population,\n median_household_income\nFROM public_marts.mart_neighbourhood_overview\nWHERE year = (SELECT MAX(year) FROM public_marts.mart_neighbourhood_overview)\nORDER BY livability_score DESC\n\"\"\"\n\ndf = pd.read_sql(query, engine)\nprint(f\"Loaded {len(df)} neighbourhoods\")" "source": [
"import os\n",
"\n",
"import pandas as pd\n",
"from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n",
"# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n",
"query = \"\"\"\n",
"SELECT\n",
" neighbourhood_id,\n",
" neighbourhood_name,\n",
" geometry,\n",
" year,\n",
" livability_score,\n",
" safety_score,\n",
" affordability_score,\n",
" amenity_score,\n",
" population,\n",
" median_household_income\n",
"FROM public_marts.mart_neighbourhood_overview\n",
"WHERE year = (SELECT MAX(year) FROM public_marts.mart_neighbourhood_overview)\n",
"ORDER BY livability_score DESC\n",
"\"\"\"\n",
"\n",
"df = pd.read_sql(query, engine)\n",
"print(f\"Loaded {len(df)} neighbourhoods\")"
]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
@@ -49,21 +80,20 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# Transform geometry to GeoJSON\n", "# Transform geometry to GeoJSON\n",
"import geopandas as gpd\n",
"import json\n", "import json\n",
"\n", "\n",
"import geopandas as gpd\n",
"\n",
"# Convert WKB geometry to GeoDataFrame\n", "# Convert WKB geometry to GeoDataFrame\n",
"gdf = gpd.GeoDataFrame(\n", "gdf = gpd.GeoDataFrame(\n",
" df,\n", " df, geometry=gpd.GeoSeries.from_wkb(df[\"geometry\"]), crs=\"EPSG:4326\"\n",
" geometry=gpd.GeoSeries.from_wkb(df['geometry']),\n",
" crs='EPSG:4326'\n",
")\n", ")\n",
"\n", "\n",
"# Create GeoJSON FeatureCollection\n", "# Create GeoJSON FeatureCollection\n",
"geojson = json.loads(gdf.to_json())\n", "geojson = json.loads(gdf.to_json())\n",
"\n", "\n",
"# Prepare data for figure factory\n", "# Prepare data for figure factory\n",
"data = df.drop(columns=['geometry']).to_dict('records')" "data = df.drop(columns=[\"geometry\"]).to_dict(\"records\")"
] ]
}, },
{ {
@@ -79,7 +109,15 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"df[['neighbourhood_name', 'livability_score', 'safety_score', 'affordability_score', 'amenity_score']].head(10)" "df[\n",
" [\n",
" \"neighbourhood_name\",\n",
" \"livability_score\",\n",
" \"safety_score\",\n",
" \"affordability_score\",\n",
" \"amenity_score\",\n",
" ]\n",
"].head(10)"
] ]
}, },
{ {
@@ -90,7 +128,7 @@
"\n", "\n",
"### Figure Factory\n", "### Figure Factory\n",
"\n", "\n",
"Uses `create_choropleth_figure` from `portfolio_app.figures.choropleth`.\n", "Uses `create_choropleth_figure` from `portfolio_app.figures.toronto.choropleth`.\n",
"\n", "\n",
"**Key Parameters:**\n", "**Key Parameters:**\n",
"- `geojson`: GeoJSON FeatureCollection with neighbourhood boundaries\n", "- `geojson`: GeoJSON FeatureCollection with neighbourhood boundaries\n",
@@ -107,18 +145,24 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"import sys\n", "import sys\n",
"sys.path.insert(0, '../..')\n",
"\n", "\n",
"from portfolio_app.figures.choropleth import create_choropleth_figure\n", "sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.choropleth import create_choropleth_figure\n",
"\n", "\n",
"fig = create_choropleth_figure(\n", "fig = create_choropleth_figure(\n",
" geojson=geojson,\n", " geojson=geojson,\n",
" data=data,\n", " data=data,\n",
" location_key='neighbourhood_id',\n", " location_key=\"neighbourhood_id\",\n",
" color_column='livability_score',\n", " color_column=\"livability_score\",\n",
" hover_data=['neighbourhood_name', 'safety_score', 'affordability_score', 'amenity_score'],\n", " hover_data=[\n",
" color_scale='RdYlGn',\n", " \"neighbourhood_name\",\n",
" title='Toronto Neighbourhood Livability Score',\n", " \"safety_score\",\n",
" \"affordability_score\",\n",
" \"amenity_score\",\n",
" ],\n",
" color_scale=\"RdYlGn\",\n",
" title=\"Toronto Neighbourhood Livability Score\",\n",
" zoom=10,\n", " zoom=10,\n",
")\n", ")\n",
"\n", "\n",

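In the choropleth cells, `json.loads(gdf.to_json())` yields a standard GeoJSON FeatureCollection, which presumably is what `create_choropleth_figure` matches `location_key` against. A stdlib-only sketch of that structure (coordinates and property values are illustrative):

```python
import json

feature = {
    "type": "Feature",
    "id": "0",
    "properties": {"neighbourhood_id": 1, "livability_score": 72.5},
    "geometry": {
        "type": "Polygon",
        # A small triangle of lon/lat pairs (illustrative only)
        "coordinates": [
            [[-79.40, 43.65], [-79.38, 43.65], [-79.39, 43.67], [-79.40, 43.65]]
        ],
    },
}
geojson = {"type": "FeatureCollection", "features": [feature]}

# Round-trip through JSON, mirroring json.loads(gdf.to_json())
round_tripped = json.loads(json.dumps(geojson))
```

For the join to work, the key named by `location_key` (here `neighbourhood_id`) has to be present both in the feature properties and in each record of `data`.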
@@ -19,7 +19,7 @@
"\n", "\n",
"| Table | Grain | Key Columns |\n", "| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n", "|-------|-------|-------------|\n",
"| `mart_neighbourhood_overview` | neighbourhood \u00d7 year | neighbourhood_name, livability_score |\n", "| `mart_neighbourhood_overview` | neighbourhood × year | neighbourhood_name, livability_score |\n",
"\n", "\n",
"### SQL Query" "### SQL Query"
] ]
@@ -30,15 +30,16 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"import pandas as pd\n",
"from sqlalchemy import create_engine\n",
"from dotenv import load_dotenv\n",
"import os\n", "import os\n",
"\n", "\n",
"# Load .env from project root\n", "import pandas as pd\n",
"load_dotenv('../../.env')\n", "from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n", "\n",
"engine = create_engine(os.environ['DATABASE_URL'])\n", "# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n", "\n",
"query = \"\"\"\n", "query = \"\"\"\n",
"SELECT\n", "SELECT\n",
@@ -76,7 +77,7 @@
"source": [ "source": [
"# The figure factory handles top/bottom selection internally\n", "# The figure factory handles top/bottom selection internally\n",
"# Just prepare as list of dicts\n", "# Just prepare as list of dicts\n",
"data = df.to_dict('records')" "data = df.to_dict(\"records\")"
] ]
}, },
{ {
@@ -106,7 +107,7 @@
"\n", "\n",
"### Figure Factory\n", "### Figure Factory\n",
"\n", "\n",
"Uses `create_ranking_bar` from `portfolio_app.figures.bar_charts`.\n", "Uses `create_ranking_bar` from `portfolio_app.figures.toronto.bar_charts`.\n",
"\n", "\n",
"**Key Parameters:**\n", "**Key Parameters:**\n",
"- `data`: List of dicts with all neighbourhoods\n", "- `data`: List of dicts with all neighbourhoods\n",
@@ -123,20 +124,21 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"import sys\n", "import sys\n",
"sys.path.insert(0, '../..')\n",
"\n", "\n",
"from portfolio_app.figures.bar_charts import create_ranking_bar\n", "sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.bar_charts import create_ranking_bar\n",
"\n", "\n",
"fig = create_ranking_bar(\n", "fig = create_ranking_bar(\n",
" data=data,\n", " data=data,\n",
" name_column='neighbourhood_name',\n", " name_column=\"neighbourhood_name\",\n",
" value_column='livability_score',\n", " value_column=\"livability_score\",\n",
" title='Top & Bottom 10 Neighbourhoods by Livability',\n", " title=\"Top & Bottom 10 Neighbourhoods by Livability\",\n",
" top_n=10,\n", " top_n=10,\n",
" bottom_n=10,\n", " bottom_n=10,\n",
" color_top='#4CAF50', # Green for top performers\n", " color_top=\"#4CAF50\", # Green for top performers\n",
" color_bottom='#F44336', # Red for bottom performers\n", " color_bottom=\"#F44336\", # Red for bottom performers\n",
" value_format='.1f',\n", " value_format=\".1f\",\n",
")\n", ")\n",
"\n", "\n",
"fig.show()" "fig.show()"


@@ -19,7 +19,7 @@
"\n", "\n",
"| Table | Grain | Key Columns |\n", "| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n", "|-------|-------|-------------|\n",
"| `mart_neighbourhood_safety` | neighbourhood \u00d7 year | assault_count, auto_theft_count, break_enter_count, robbery_count, etc. |\n", "| `mart_neighbourhood_safety` | neighbourhood × year | assault_count, auto_theft_count, break_enter_count, robbery_count, etc. |\n",
"\n", "\n",
"### SQL Query" "### SQL Query"
] ]
@@ -30,15 +30,16 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"import pandas as pd\n",
"from sqlalchemy import create_engine\n",
"from dotenv import load_dotenv\n",
"import os\n", "import os\n",
"\n", "\n",
"# Load .env from project root\n", "import pandas as pd\n",
"load_dotenv('../../.env')\n", "from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n", "\n",
"engine = create_engine(os.environ['DATABASE_URL'])\n", "# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n", "\n",
"query = \"\"\"\n", "query = \"\"\"\n",
"SELECT\n", "SELECT\n",
@@ -79,17 +80,25 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"df_melted = df.melt(\n", "df_melted = df.melt(\n",
" id_vars=['neighbourhood_name', 'total_incidents'],\n", " id_vars=[\"neighbourhood_name\", \"total_incidents\"],\n",
" value_vars=['assault_count', 'auto_theft_count', 'break_enter_count', \n", " value_vars=[\n",
" 'robbery_count', 'theft_over_count', 'homicide_count'],\n", " \"assault_count\",\n",
" var_name='crime_type',\n", " \"auto_theft_count\",\n",
" value_name='count'\n", " \"break_enter_count\",\n",
" \"robbery_count\",\n",
" \"theft_over_count\",\n",
" \"homicide_count\",\n",
" ],\n",
" var_name=\"crime_type\",\n",
" value_name=\"count\",\n",
")\n", ")\n",
"\n", "\n",
"# Clean labels\n", "# Clean labels\n",
"df_melted['crime_type'] = df_melted['crime_type'].str.replace('_count', '').str.replace('_', ' ').str.title()\n", "df_melted[\"crime_type\"] = (\n",
" df_melted[\"crime_type\"].str.replace(\"_count\", \"\").str.replace(\"_\", \" \").str.title()\n",
")\n",
"\n", "\n",
"data = df_melted.to_dict('records')" "data = df_melted.to_dict(\"records\")"
] ]
}, },
{ {
@@ -105,7 +114,15 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"df[['neighbourhood_name', 'assault_count', 'auto_theft_count', 'break_enter_count', 'total_incidents']].head(10)" "df[\n",
" [\n",
" \"neighbourhood_name\",\n",
" \"assault_count\",\n",
" \"auto_theft_count\",\n",
" \"break_enter_count\",\n",
" \"total_incidents\",\n",
" ]\n",
"].head(10)"
] ]
}, },
{ {
@@ -116,7 +133,7 @@
"\n", "\n",
"### Figure Factory\n", "### Figure Factory\n",
"\n", "\n",
"Uses `create_stacked_bar` from `portfolio_app.figures.bar_charts`." "Uses `create_stacked_bar` from `portfolio_app.figures.toronto.bar_charts`."
] ]
}, },
{ {
@@ -126,23 +143,24 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"import sys\n", "import sys\n",
"sys.path.insert(0, '../..')\n",
"\n", "\n",
"from portfolio_app.figures.bar_charts import create_stacked_bar\n", "sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.bar_charts import create_stacked_bar\n",
"\n", "\n",
"fig = create_stacked_bar(\n", "fig = create_stacked_bar(\n",
" data=data,\n", " data=data,\n",
" x_column='neighbourhood_name',\n", " x_column=\"neighbourhood_name\",\n",
" value_column='count',\n", " value_column=\"count\",\n",
" category_column='crime_type',\n", " category_column=\"crime_type\",\n",
" title='Crime Type Breakdown - Top 15 Neighbourhoods',\n", " title=\"Crime Type Breakdown - Top 15 Neighbourhoods\",\n",
" color_map={\n", " color_map={\n",
" 'Assault': '#d62728',\n", " \"Assault\": \"#d62728\",\n",
" 'Auto Theft': '#ff7f0e',\n", " \"Auto Theft\": \"#ff7f0e\",\n",
" 'Break Enter': '#9467bd',\n", " \"Break Enter\": \"#9467bd\",\n",
" 'Robbery': '#8c564b',\n", " \"Robbery\": \"#8c564b\",\n",
" 'Theft Over': '#e377c2',\n", " \"Theft Over\": \"#e377c2\",\n",
" 'Homicide': '#1f77b4'\n", " \"Homicide\": \"#1f77b4\",\n",
" },\n", " },\n",
")\n", ")\n",
"\n", "\n",

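The crime-type label cleanup in the diff above chains three string methods on the melted column. The same logic in plain Python, for reference:

```python
def clean_label(column_name: str) -> str:
    # Strip the "_count" suffix, turn underscores into spaces, title-case
    return column_name.replace("_count", "").replace("_", " ").title()

labels = [
    clean_label(c)
    for c in ["assault_count", "auto_theft_count", "break_enter_count"]
]
# labels -> ["Assault", "Auto Theft", "Break Enter"]
```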
@@ -19,7 +19,7 @@
"\n", "\n",
"| Table | Grain | Key Columns |\n", "| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n", "|-------|-------|-------------|\n",
"| `mart_neighbourhood_safety` | neighbourhood \u00d7 year | crime_rate_per_100k, crime_index, safety_tier, geometry |\n", "| `mart_neighbourhood_safety` | neighbourhood × year | crime_rate_per_100k, crime_index, safety_tier, geometry |\n",
"\n", "\n",
"### SQL Query" "### SQL Query"
] ]
@@ -30,15 +30,16 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"import pandas as pd\n",
"from sqlalchemy import create_engine\n",
"from dotenv import load_dotenv\n",
"import os\n", "import os\n",
"\n", "\n",
"# Load .env from project root\n", "import pandas as pd\n",
"load_dotenv('../../.env')\n", "from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n", "\n",
"engine = create_engine(os.environ['DATABASE_URL'])\n", "# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n", "\n",
"query = \"\"\"\n", "query = \"\"\"\n",
"SELECT\n", "SELECT\n",
@@ -77,17 +78,16 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"import geopandas as gpd\n",
"import json\n", "import json\n",
"\n", "\n",
"import geopandas as gpd\n",
"\n",
"gdf = gpd.GeoDataFrame(\n", "gdf = gpd.GeoDataFrame(\n",
" df,\n", " df, geometry=gpd.GeoSeries.from_wkb(df[\"geometry\"]), crs=\"EPSG:4326\"\n",
" geometry=gpd.GeoSeries.from_wkb(df['geometry']),\n",
" crs='EPSG:4326'\n",
")\n", ")\n",
"\n", "\n",
"geojson = json.loads(gdf.to_json())\n", "geojson = json.loads(gdf.to_json())\n",
"data = df.drop(columns=['geometry']).to_dict('records')" "data = df.drop(columns=[\"geometry\"]).to_dict(\"records\")"
] ]
}, },
{ {
@@ -103,7 +103,15 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"df[['neighbourhood_name', 'crime_rate_per_100k', 'crime_index', 'safety_tier', 'total_incidents']].head(10)" "df[\n",
" [\n",
" \"neighbourhood_name\",\n",
" \"crime_rate_per_100k\",\n",
" \"crime_index\",\n",
" \"safety_tier\",\n",
" \"total_incidents\",\n",
" ]\n",
"].head(10)"
] ]
}, },
{ {
@@ -114,7 +122,7 @@
"\n", "\n",
"### Figure Factory\n", "### Figure Factory\n",
"\n", "\n",
"Uses `create_choropleth_figure` from `portfolio_app.figures.choropleth`.\n", "Uses `create_choropleth_figure` from `portfolio_app.figures.toronto.choropleth`.\n",
"\n", "\n",
"**Key Parameters:**\n", "**Key Parameters:**\n",
"- `color_column`: 'crime_rate_per_100k'\n", "- `color_column`: 'crime_rate_per_100k'\n",
@@ -128,18 +136,19 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"import sys\n", "import sys\n",
"sys.path.insert(0, '../..')\n",
"\n", "\n",
"from portfolio_app.figures.choropleth import create_choropleth_figure\n", "sys.path.insert(0, \"../..\")\n",
"\n",
"from portfolio_app.figures.toronto.choropleth import create_choropleth_figure\n",
"\n", "\n",
"fig = create_choropleth_figure(\n", "fig = create_choropleth_figure(\n",
" geojson=geojson,\n", " geojson=geojson,\n",
" data=data,\n", " data=data,\n",
" location_key='neighbourhood_id',\n", " location_key=\"neighbourhood_id\",\n",
" color_column='crime_rate_per_100k',\n", " color_column=\"crime_rate_per_100k\",\n",
" hover_data=['neighbourhood_name', 'crime_index', 'total_incidents'],\n", " hover_data=[\"neighbourhood_name\", \"crime_index\", \"total_incidents\"],\n",
" color_scale='RdYlGn_r',\n", " color_scale=\"RdYlGn_r\",\n",
" title='Toronto Crime Rate per 100,000 Population',\n", " title=\"Toronto Crime Rate per 100,000 Population\",\n",
" zoom=10,\n", " zoom=10,\n",
")\n", ")\n",
"\n", "\n",

@@ -19,7 +19,7 @@
"\n", "\n",
"| Table | Grain | Key Columns |\n", "| Table | Grain | Key Columns |\n",
"|-------|-------|-------------|\n", "|-------|-------|-------------|\n",
"| `mart_neighbourhood_safety` | neighbourhood \u00d7 year | year, crime_rate_per_100k, crime_yoy_change_pct |\n", "| `mart_neighbourhood_safety` | neighbourhood × year | year, crime_rate_per_100k, crime_yoy_change_pct |\n",
"\n", "\n",
"### SQL Query" "### SQL Query"
] ]
@@ -30,15 +30,16 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"import pandas as pd\n",
"from sqlalchemy import create_engine\n",
"from dotenv import load_dotenv\n",
"import os\n", "import os\n",
"\n", "\n",
"# Load .env from project root\n", "import pandas as pd\n",
"load_dotenv('../../.env')\n", "from dotenv import load_dotenv\n",
"from sqlalchemy import create_engine\n",
"\n", "\n",
"engine = create_engine(os.environ['DATABASE_URL'])\n", "# Load .env from project root\n",
"load_dotenv(\"../../.env\")\n",
"\n",
"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
"\n", "\n",
"query = \"\"\"\n", "query = \"\"\"\n",
"SELECT\n", "SELECT\n",
@@ -76,21 +77,23 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"df['date'] = pd.to_datetime(df['year'].astype(str) + '-01-01')\n", "df[\"date\"] = pd.to_datetime(df[\"year\"].astype(str) + \"-01-01\")\n",
"\n", "\n",
"# Melt for multi-line\n", "# Melt for multi-line\n",
"df_melted = df.melt(\n", "df_melted = df.melt(\n",
" id_vars=['year', 'date'],\n", " id_vars=[\"year\", \"date\"],\n",
" value_vars=['avg_assault_rate', 'avg_auto_theft_rate', 'avg_break_enter_rate'],\n", " value_vars=[\"avg_assault_rate\", \"avg_auto_theft_rate\", \"avg_break_enter_rate\"],\n",
" var_name='crime_type',\n", " var_name=\"crime_type\",\n",
" value_name='rate_per_100k'\n", " value_name=\"rate_per_100k\",\n",
")\n", ")\n",
"\n", "\n",
"df_melted['crime_type'] = df_melted['crime_type'].map({\n", "df_melted[\"crime_type\"] = df_melted[\"crime_type\"].map(\n",
" 'avg_assault_rate': 'Assault',\n", " {\n",
" 'avg_auto_theft_rate': 'Auto Theft',\n", " \"avg_assault_rate\": \"Assault\",\n",
" 'avg_break_enter_rate': 'Break & Enter'\n", " \"avg_auto_theft_rate\": \"Auto Theft\",\n",
"})" " \"avg_break_enter_rate\": \"Break & Enter\",\n",
" }\n",
")"
] ]
}, },
{ {
@@ -106,7 +109,7 @@
"metadata": {}, "metadata": {},
"outputs": [], "outputs": [],
"source": [ "source": [
"df[['year', 'avg_crime_rate', 'total_city_incidents', 'avg_yoy_change']]" "df[[\"year\", \"avg_crime_rate\", \"total_city_incidents\", \"avg_yoy_change\"]]"
] ]
}, },
{ {
@@ -127,22 +130,23 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"import sys\n", "import sys\n",
"sys.path.insert(0, '../..')\n",
"\n", "\n",
"from portfolio_app.figures.time_series import create_price_time_series\n", "sys.path.insert(0, \"../..\")\n",
"\n", "\n",
"data = df_melted.to_dict('records')\n", "from portfolio_app.figures.toronto.time_series import create_price_time_series\n",
"\n",
"data = df_melted.to_dict(\"records\")\n",
"\n", "\n",
"fig = create_price_time_series(\n", "fig = create_price_time_series(\n",
" data=data,\n", " data=data,\n",
" date_column='date',\n", " date_column=\"date\",\n",
" price_column='rate_per_100k',\n", " price_column=\"rate_per_100k\",\n",
" group_column='crime_type',\n", " group_column=\"crime_type\",\n",
" title='Toronto Crime Trends by Type (5 Years)',\n", " title=\"Toronto Crime Trends by Type (5 Years)\",\n",
")\n", ")\n",
"\n", "\n",
"# Remove dollar sign formatting since this is rate data\n", "# Remove dollar sign formatting since this is rate data\n",
"fig.update_layout(yaxis_tickprefix='', yaxis_title='Rate per 100K')\n", "fig.update_layout(yaxis_tickprefix=\"\", yaxis_title=\"Rate per 100K\")\n",
"\n", "\n",
"fig.show()" "fig.show()"
] ]
@@ -161,15 +165,19 @@
"outputs": [], "outputs": [],
"source": [ "source": [
"# Total crime rate trend\n", "# Total crime rate trend\n",
"total_data = df[['date', 'avg_crime_rate']].rename(columns={'avg_crime_rate': 'total_rate'}).to_dict('records')\n", "total_data = (\n",
" df[[\"date\", \"avg_crime_rate\"]]\n",
" .rename(columns={\"avg_crime_rate\": \"total_rate\"})\n",
" .to_dict(\"records\")\n",
")\n",
"\n", "\n",
"fig2 = create_price_time_series(\n", "fig2 = create_price_time_series(\n",
" data=total_data,\n", " data=total_data,\n",
" date_column='date',\n", " date_column=\"date\",\n",
" price_column='total_rate',\n", " price_column=\"total_rate\",\n",
" title='Toronto Overall Crime Rate Trend',\n", " title=\"Toronto Overall Crime Rate Trend\",\n",
")\n", ")\n",
"fig2.update_layout(yaxis_tickprefix='', yaxis_title='Rate per 100K')\n", "fig2.update_layout(yaxis_tickprefix=\"\", yaxis_title=\"Rate per 100K\")\n",
"fig2.show()" "fig2.show()"
] ]
} }

@@ -28,7 +28,7 @@ def create_metric_selector(
label=label, label=label,
data=options, data=options,
value=default_value or (options[0]["value"] if options else None), value=default_value or (options[0]["value"] if options else None),
style={"width": "200px"}, w=200,
) )
@@ -64,7 +64,7 @@ def create_map_controls(
id=f"{id_prefix}-layer-toggle", id=f"{id_prefix}-layer-toggle",
label="Show Boundaries", label="Show Boundaries",
checked=True, checked=True,
style={"marginTop": "10px"}, mt="sm",
) )
) )


@@ -5,7 +5,7 @@ from typing import Any
import dash_mantine_components as dmc import dash_mantine_components as dmc
from dash import dcc from dash import dcc
from portfolio_app.figures.summary_cards import create_metric_card_figure from portfolio_app.figures.toronto.summary_cards import create_metric_card_figure
class MetricCard: class MetricCard:


@@ -38,7 +38,7 @@ def create_year_selector(
label=label, label=label,
data=options, data=options,
value=str(default_year), value=str(default_year),
style={"width": "120px"}, w=120,
) )
@@ -83,7 +83,8 @@ def create_time_slider(
marks=marks, marks=marks,
step=1, step=1,
minRange=1, minRange=1,
style={"marginTop": "20px", "marginBottom": "10px"}, mt="md",
mb="sm",
), ),
], ],
p="md", p="md",
@@ -131,5 +132,5 @@ def create_month_selector(
label=label, label=label,
data=options, data=options,
value=str(default_month), value=str(default_month),
style={"width": "140px"}, w=140,
) )


@@ -0,0 +1,48 @@
"""Design system tokens and utilities."""
from .tokens import (
CHART_PALETTE,
COLOR_ACCENT,
COLOR_NEGATIVE,
COLOR_POSITIVE,
COLOR_WARNING,
GRID_COLOR,
GRID_COLOR_DARK,
PALETTE_COMPARISON,
PALETTE_GENDER,
PALETTE_TREND,
PAPER_BG,
PLOT_BG,
POLICY_COLORS,
TEXT_MUTED,
TEXT_PRIMARY,
TEXT_SECONDARY,
get_colorbar_defaults,
get_default_layout,
)
__all__ = [
# Text colors
"TEXT_PRIMARY",
"TEXT_SECONDARY",
"TEXT_MUTED",
# Chart backgrounds
"GRID_COLOR",
"GRID_COLOR_DARK",
"PAPER_BG",
"PLOT_BG",
# Semantic colors
"COLOR_POSITIVE",
"COLOR_NEGATIVE",
"COLOR_WARNING",
"COLOR_ACCENT",
# Palettes
"CHART_PALETTE",
"PALETTE_COMPARISON",
"PALETTE_GENDER",
"PALETTE_TREND",
"POLICY_COLORS",
# Utility functions
"get_default_layout",
"get_colorbar_defaults",
]


@@ -0,0 +1,162 @@
"""Centralized design tokens for consistent styling across the application.
This module provides a single source of truth for colors, ensuring:
- Consistent styling across all Plotly figures and components
- Accessibility compliance (WCAG color contrast)
- Easy theme updates without hunting through multiple files
Usage:
from portfolio_app.design import TEXT_PRIMARY, CHART_PALETTE
fig.update_layout(font_color=TEXT_PRIMARY)
"""
from typing import Any
# =============================================================================
# TEXT COLORS (Dark Theme)
# =============================================================================
TEXT_PRIMARY = "#c9c9c9"
"""Primary text color for labels, titles, and body text."""
TEXT_SECONDARY = "#888888"
"""Secondary text color for subtitles, captions, and muted text."""
TEXT_MUTED = "#666666"
"""Muted text color for disabled states and placeholders."""
# =============================================================================
# CHART BACKGROUND & GRID
# =============================================================================
GRID_COLOR = "rgba(128, 128, 128, 0.2)"
"""Standard grid line color with transparency."""
GRID_COLOR_DARK = "rgba(128, 128, 128, 0.3)"
"""Darker grid for radar charts and polar plots."""
PAPER_BG = "rgba(0, 0, 0, 0)"
"""Transparent paper background for charts."""
PLOT_BG = "rgba(0, 0, 0, 0)"
"""Transparent plot background for charts."""
# =============================================================================
# SEMANTIC COLORS
# =============================================================================
COLOR_POSITIVE = "#40c057"
"""Positive/success indicator (Mantine green-6)."""
COLOR_NEGATIVE = "#fa5252"
"""Negative/error indicator (Mantine red-6)."""
COLOR_WARNING = "#fab005"
"""Warning indicator (Mantine yellow-6)."""
COLOR_ACCENT = "#228be6"
"""Primary accent color (Mantine blue-6)."""
# =============================================================================
# ACCESSIBLE CHART PALETTE
# =============================================================================
# Okabe-Ito palette - designed to stay distinguishable under common color vision deficiencies
# Reference: https://jfly.uni-koeln.de/color/
CHART_PALETTE = [
"#0072B2", # Blue (primary data series)
"#E69F00", # Orange
"#56B4E9", # Sky blue
"#009E73", # Teal/green
"#F0E442", # Yellow
"#D55E00", # Vermillion
"#CC79A7", # Pink
"#000000", # Black (use sparingly)
]
"""
Accessible categorical palette (Okabe-Ito).
Distinguishable for deuteranopia, protanopia, and tritanopia.
Use indices 0-6 for most charts; index 7 (black) for emphasis only.
"""
# Semantic subsets for specific use cases
PALETTE_COMPARISON = [CHART_PALETTE[0], CHART_PALETTE[1]]
"""Two-color palette for A/B comparisons."""
PALETTE_GENDER = {
"male": "#56B4E9", # Sky blue
"female": "#CC79A7", # Pink
}
"""Gender-specific colors (accessible contrast)."""
PALETTE_TREND = {
"positive": COLOR_POSITIVE,
"negative": COLOR_NEGATIVE,
"neutral": TEXT_SECONDARY,
}
"""Trend indicator colors for sparklines and deltas."""
# =============================================================================
# POLICY/EVENT MARKERS (Time Series)
# =============================================================================
POLICY_COLORS = {
"policy_change": "#E69F00", # Orange - policy changes
"major_event": "#D55E00", # Vermillion - major events
"data_note": "#56B4E9", # Sky blue - data annotations
"forecast": "#009E73", # Teal - forecast periods
"highlight": "#F0E442", # Yellow - highlighted regions
}
"""Colors for policy markers and event annotations on time series."""
# =============================================================================
# CHART LAYOUT DEFAULTS
# =============================================================================
def get_default_layout() -> dict[str, Any]:
"""Return default Plotly layout settings with design tokens.
Returns:
dict: Layout configuration for fig.update_layout()
Example:
fig.update_layout(**get_default_layout())
"""
return {
"paper_bgcolor": PAPER_BG,
"plot_bgcolor": PLOT_BG,
"font": {"color": TEXT_PRIMARY},
"title": {"font": {"color": TEXT_PRIMARY}},
"legend": {"font": {"color": TEXT_PRIMARY}},
"xaxis": {
"gridcolor": GRID_COLOR,
"linecolor": GRID_COLOR,
"tickfont": {"color": TEXT_PRIMARY},
"title": {"font": {"color": TEXT_PRIMARY}},
},
"yaxis": {
"gridcolor": GRID_COLOR,
"linecolor": GRID_COLOR,
"tickfont": {"color": TEXT_PRIMARY},
"title": {"font": {"color": TEXT_PRIMARY}},
},
}
def get_colorbar_defaults() -> dict[str, Any]:
"""Return default colorbar settings with design tokens.
Returns:
dict: Colorbar configuration for choropleth/heatmap traces
"""
return {
"tickfont": {"color": TEXT_PRIMARY},
"title": {"font": {"color": TEXT_PRIMARY}},
}
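The two helpers above return plain keyword arguments meant to be splatted into `fig.update_layout(**...)`. A minimal stand-alone sketch of the pattern (token values copied from the diff; the helper body is condensed, and no Plotly import is needed to see the shape):

```python
# Condensed re-creation of the token/layout-helper pattern above; hex and
# rgba values are copied verbatim from the diff.
from typing import Any

TEXT_PRIMARY = "#c9c9c9"
GRID_COLOR = "rgba(128, 128, 128, 0.2)"
PAPER_BG = "rgba(0, 0, 0, 0)"
PLOT_BG = "rgba(0, 0, 0, 0)"

def get_default_layout() -> dict[str, Any]:
    # One axis config, shallow-copied into xaxis and yaxis.
    axis = {
        "gridcolor": GRID_COLOR,
        "linecolor": GRID_COLOR,
        "tickfont": {"color": TEXT_PRIMARY},
        "title": {"font": {"color": TEXT_PRIMARY}},
    }
    return {
        "paper_bgcolor": PAPER_BG,
        "plot_bgcolor": PLOT_BG,
        "font": {"color": TEXT_PRIMARY},
        "xaxis": dict(axis),
        "yaxis": dict(axis),
    }

layout = get_default_layout()
print(layout["xaxis"]["gridcolor"])  # -> rgba(128, 128, 128, 0.2)
```

With Plotly available this splats straight into `fig.update_layout(**get_default_layout())`, keeping per-figure overrides to one-liners.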


@@ -1,61 +1,15 @@
"""Plotly figure factories for data visualization.""" """Plotly figure factories for data visualization.
from .bar_charts import ( Figure factories are organized by dashboard domain:
create_horizontal_bar, - toronto/ : Toronto Neighbourhood Dashboard figures
create_ranking_bar,
create_stacked_bar, Usage:
) from portfolio_app.figures.toronto import create_choropleth_figure
from .choropleth import ( from portfolio_app.figures.toronto import create_ranking_bar
create_choropleth_figure, """
create_zone_map,
) from . import toronto
from .demographics import (
create_age_pyramid,
create_donut_chart,
create_income_distribution,
)
from .radar import (
create_comparison_radar,
create_radar_figure,
)
from .scatter import (
create_bubble_chart,
create_scatter_figure,
)
from .summary_cards import create_metric_card_figure, create_summary_metrics
from .time_series import (
add_policy_markers,
create_market_comparison_chart,
create_price_time_series,
create_time_series_with_events,
create_volume_time_series,
)
__all__ = [ __all__ = [
# Choropleth "toronto",
"create_choropleth_figure",
"create_zone_map",
# Time series
"create_price_time_series",
"create_volume_time_series",
"create_market_comparison_chart",
"create_time_series_with_events",
"add_policy_markers",
# Summary
"create_metric_card_figure",
"create_summary_metrics",
# Bar charts
"create_ranking_bar",
"create_stacked_bar",
"create_horizontal_bar",
# Scatter plots
"create_scatter_figure",
"create_bubble_chart",
# Radar charts
"create_radar_figure",
"create_comparison_radar",
# Demographics
"create_age_pyramid",
"create_donut_chart",
"create_income_distribution",
] ]
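The slimmed-down `figures/__init__.py` above is just a namespace re-export of the domain subpackage. A stand-alone sketch of the same shape, built in a temp directory so it can be imported without `portfolio_app` (package contents here are illustrative stand-ins, not the real factories):

```python
# Illustrative stand-in for the figures/ -> figures/toronto/ layout in the
# diff: the parent package only re-exports the domain subpackage.
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
(root / "figures" / "toronto").mkdir(parents=True)
(root / "figures" / "__init__.py").write_text(
    '"""Plotly figure factories for data visualization."""\n'
    "from . import toronto\n"
    '__all__ = ["toronto"]\n'
)
(root / "figures" / "toronto" / "__init__.py").write_text(
    "def create_ranking_bar(data, **kwargs):\n"
    "    return f'ranking bar with {len(data)} rows'\n"
    '__all__ = ["create_ranking_bar"]\n'
)

sys.path.insert(0, str(root))
from figures.toronto import create_ranking_bar  # noqa: E402

print(create_ranking_bar([{"name": "a"}, {"name": "b"}]))
```

Call sites then migrate from `portfolio_app.figures` to `portfolio_app.figures.toronto`, which is exactly the one-line import change the diff makes in `MetricCard`.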


@@ -0,0 +1,61 @@
"""Plotly figure factories for Toronto dashboard visualizations."""
from .bar_charts import (
create_horizontal_bar,
create_ranking_bar,
create_stacked_bar,
)
from .choropleth import (
create_choropleth_figure,
create_zone_map,
)
from .demographics import (
create_age_pyramid,
create_donut_chart,
create_income_distribution,
)
from .radar import (
create_comparison_radar,
create_radar_figure,
)
from .scatter import (
create_bubble_chart,
create_scatter_figure,
)
from .summary_cards import create_metric_card_figure, create_summary_metrics
from .time_series import (
add_policy_markers,
create_market_comparison_chart,
create_price_time_series,
create_time_series_with_events,
create_volume_time_series,
)
__all__ = [
# Choropleth
"create_choropleth_figure",
"create_zone_map",
# Time series
"create_price_time_series",
"create_volume_time_series",
"create_market_comparison_chart",
"create_time_series_with_events",
"add_policy_markers",
# Summary
"create_metric_card_figure",
"create_summary_metrics",
# Bar charts
"create_ranking_bar",
"create_stacked_bar",
"create_horizontal_bar",
# Scatter plots
"create_scatter_figure",
"create_bubble_chart",
# Radar charts
"create_radar_figure",
"create_comparison_radar",
# Demographics
"create_age_pyramid",
"create_donut_chart",
"create_income_distribution",
]


@@ -6,6 +6,17 @@ import pandas as pd
import plotly.express as px import plotly.express as px
import plotly.graph_objects as go import plotly.graph_objects as go
from portfolio_app.design import (
CHART_PALETTE,
COLOR_NEGATIVE,
COLOR_POSITIVE,
GRID_COLOR,
PAPER_BG,
PLOT_BG,
TEXT_PRIMARY,
TEXT_SECONDARY,
)
def create_ranking_bar( def create_ranking_bar(
data: list[dict[str, Any]], data: list[dict[str, Any]],
@@ -14,8 +25,8 @@ def create_ranking_bar(
title: str | None = None, title: str | None = None,
top_n: int = 10, top_n: int = 10,
bottom_n: int = 10, bottom_n: int = 10,
color_top: str = "#4CAF50", color_top: str = COLOR_POSITIVE,
color_bottom: str = "#F44336", color_bottom: str = COLOR_NEGATIVE,
value_format: str = ",.0f", value_format: str = ",.0f",
) -> go.Figure: ) -> go.Figure:
"""Create horizontal bar chart showing top and bottom rankings. """Create horizontal bar chart showing top and bottom rankings.
@@ -87,10 +98,10 @@ def create_ranking_bar(
barmode="group", barmode="group",
showlegend=True, showlegend=True,
legend={"orientation": "h", "yanchor": "bottom", "y": 1.02}, legend={"orientation": "h", "yanchor": "bottom", "y": 1.02},
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
xaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": None}, xaxis={"gridcolor": GRID_COLOR, "title": None},
yaxis={"autorange": "reversed", "title": None}, yaxis={"autorange": "reversed", "title": None},
margin={"l": 10, "r": 10, "t": 40, "b": 10}, margin={"l": 10, "r": 10, "t": 40, "b": 10},
) )
@@ -126,10 +137,10 @@ def create_stacked_bar(
df = pd.DataFrame(data) df = pd.DataFrame(data)
# Default color scheme # Default color scheme using accessible palette
if color_map is None: if color_map is None:
categories = df[category_column].unique() categories = df[category_column].unique()
colors = px.colors.qualitative.Set2[: len(categories)] colors = CHART_PALETTE[: len(categories)]
color_map = dict(zip(categories, colors, strict=False)) color_map = dict(zip(categories, colors, strict=False))
fig = px.bar( fig = px.bar(
@@ -147,11 +158,11 @@ def create_stacked_bar(
fig.update_layout( fig.update_layout(
title=title, title=title,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
xaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": None}, xaxis={"gridcolor": GRID_COLOR, "title": None},
yaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": None}, yaxis={"gridcolor": GRID_COLOR, "title": None},
legend={"orientation": "h", "yanchor": "bottom", "y": 1.02}, legend={"orientation": "h", "yanchor": "bottom", "y": 1.02},
margin={"l": 10, "r": 10, "t": 60, "b": 10}, margin={"l": 10, "r": 10, "t": 60, "b": 10},
) )
@@ -164,7 +175,7 @@ def create_horizontal_bar(
name_column: str, name_column: str,
value_column: str, value_column: str,
title: str | None = None, title: str | None = None,
color: str = "#2196F3", color: str = CHART_PALETTE[0],
value_format: str = ",.0f", value_format: str = ",.0f",
sort: bool = True, sort: bool = True,
) -> go.Figure: ) -> go.Figure:
@@ -204,10 +215,10 @@ def create_horizontal_bar(
fig.update_layout( fig.update_layout(
title=title, title=title,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
xaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": None}, xaxis={"gridcolor": GRID_COLOR, "title": None},
yaxis={"title": None}, yaxis={"title": None},
margin={"l": 10, "r": 10, "t": 40, "b": 10}, margin={"l": 10, "r": 10, "t": 40, "b": 10},
) )
@@ -225,13 +236,13 @@ def _create_empty_figure(title: str) -> go.Figure:
x=0.5, x=0.5,
y=0.5, y=0.5,
showarrow=False, showarrow=False,
font={"size": 14, "color": "#888888"}, font={"size": 14, "color": TEXT_SECONDARY},
) )
fig.update_layout( fig.update_layout(
title=title, title=title,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
xaxis={"visible": False}, xaxis={"visible": False},
yaxis={"visible": False}, yaxis={"visible": False},
) )


@@ -5,6 +5,13 @@ from typing import Any
import plotly.express as px import plotly.express as px
import plotly.graph_objects as go import plotly.graph_objects as go
from portfolio_app.design import (
PAPER_BG,
PLOT_BG,
TEXT_PRIMARY,
TEXT_SECONDARY,
)
def create_choropleth_figure( def create_choropleth_figure(
geojson: dict[str, Any] | None, geojson: dict[str, Any] | None,
@@ -55,9 +62,9 @@ def create_choropleth_figure(
margin={"l": 0, "r": 0, "t": 40, "b": 0}, margin={"l": 0, "r": 0, "t": 40, "b": 0},
title=title or "Toronto Housing Map", title=title or "Toronto Housing Map",
height=500, height=500,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
) )
fig.add_annotation( fig.add_annotation(
text="No geometry data available. Complete QGIS digitization to enable map.", text="No geometry data available. Complete QGIS digitization to enable map.",
@@ -66,7 +73,7 @@ def create_choropleth_figure(
x=0.5, x=0.5,
y=0.5, y=0.5,
showarrow=False, showarrow=False,
font={"size": 14, "color": "#888888"}, font={"size": 14, "color": TEXT_SECONDARY},
) )
return fig return fig
@@ -98,17 +105,17 @@ def create_choropleth_figure(
margin={"l": 0, "r": 0, "t": 40, "b": 0}, margin={"l": 0, "r": 0, "t": 40, "b": 0},
title=title, title=title,
height=500, height=500,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
coloraxis_colorbar={ coloraxis_colorbar={
"title": { "title": {
"text": color_column.replace("_", " ").title(), "text": color_column.replace("_", " ").title(),
"font": {"color": "#c9c9c9"}, "font": {"color": TEXT_PRIMARY},
}, },
"thickness": 15, "thickness": 15,
"len": 0.7, "len": 0.7,
"tickfont": {"color": "#c9c9c9"}, "tickfont": {"color": TEXT_PRIMARY},
}, },
) )


@@ -5,6 +5,16 @@ from typing import Any
import pandas as pd import pandas as pd
import plotly.graph_objects as go import plotly.graph_objects as go
from portfolio_app.design import (
CHART_PALETTE,
GRID_COLOR,
PALETTE_GENDER,
PAPER_BG,
PLOT_BG,
TEXT_PRIMARY,
TEXT_SECONDARY,
)
def create_age_pyramid( def create_age_pyramid(
data: list[dict[str, Any]], data: list[dict[str, Any]],
@@ -52,7 +62,7 @@ def create_age_pyramid(
x=male_values_neg, x=male_values_neg,
orientation="h", orientation="h",
name="Male", name="Male",
marker_color="#2196F3", marker_color=PALETTE_GENDER["male"],
hovertemplate="%{y}<br>Male: %{customdata:,}<extra></extra>", hovertemplate="%{y}<br>Male: %{customdata:,}<extra></extra>",
customdata=male_values, customdata=male_values,
) )
@@ -65,7 +75,7 @@ def create_age_pyramid(
x=female_values, x=female_values,
orientation="h", orientation="h",
name="Female", name="Female",
marker_color="#E91E63", marker_color=PALETTE_GENDER["female"],
hovertemplate="%{y}<br>Female: %{x:,}<extra></extra>", hovertemplate="%{y}<br>Female: %{x:,}<extra></extra>",
) )
) )
@@ -77,12 +87,12 @@ def create_age_pyramid(
title=title, title=title,
barmode="overlay", barmode="overlay",
bargap=0.1, bargap=0.1,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
xaxis={ xaxis={
"title": "Population", "title": "Population",
"gridcolor": "rgba(128,128,128,0.2)", "gridcolor": GRID_COLOR,
"range": [-max_val * 1.1, max_val * 1.1], "range": [-max_val * 1.1, max_val * 1.1],
"tickvals": [-max_val, -max_val / 2, 0, max_val / 2, max_val], "tickvals": [-max_val, -max_val / 2, 0, max_val / 2, max_val],
"ticktext": [ "ticktext": [
@@ -93,7 +103,7 @@ def create_age_pyramid(
f"{max_val:,.0f}", f"{max_val:,.0f}",
], ],
}, },
yaxis={"title": None, "gridcolor": "rgba(128,128,128,0.2)"}, yaxis={"title": None, "gridcolor": GRID_COLOR},
legend={"orientation": "h", "yanchor": "bottom", "y": 1.02}, legend={"orientation": "h", "yanchor": "bottom", "y": 1.02},
margin={"l": 10, "r": 10, "t": 60, "b": 10}, margin={"l": 10, "r": 10, "t": 60, "b": 10},
) )
@@ -127,17 +137,9 @@ def create_donut_chart(
df = pd.DataFrame(data) df = pd.DataFrame(data)
# Use accessible palette by default
if colors is None: if colors is None:
colors = [ colors = CHART_PALETTE
"#2196F3",
"#4CAF50",
"#FF9800",
"#E91E63",
"#9C27B0",
"#00BCD4",
"#FFC107",
"#795548",
]
fig = go.Figure( fig = go.Figure(
go.Pie( go.Pie(
@@ -153,8 +155,8 @@ def create_donut_chart(
fig.update_layout( fig.update_layout(
title=title, title=title,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
showlegend=False, showlegend=False,
margin={"l": 10, "r": 10, "t": 60, "b": 10}, margin={"l": 10, "r": 10, "t": 60, "b": 10},
) )
@@ -167,7 +169,7 @@ def create_income_distribution(
bracket_column: str, bracket_column: str,
count_column: str, count_column: str,
title: str | None = None, title: str | None = None,
color: str = "#4CAF50", color: str = CHART_PALETTE[3], # Teal
) -> go.Figure: ) -> go.Figure:
"""Create histogram-style bar chart for income distribution. """Create histogram-style bar chart for income distribution.
@@ -199,17 +201,17 @@ def create_income_distribution(
fig.update_layout( fig.update_layout(
title=title, title=title,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
xaxis={ xaxis={
"title": "Income Bracket", "title": "Income Bracket",
"gridcolor": "rgba(128,128,128,0.2)", "gridcolor": GRID_COLOR,
"tickangle": -45, "tickangle": -45,
}, },
yaxis={ yaxis={
"title": "Households", "title": "Households",
"gridcolor": "rgba(128,128,128,0.2)", "gridcolor": GRID_COLOR,
}, },
margin={"l": 10, "r": 10, "t": 60, "b": 80}, margin={"l": 10, "r": 10, "t": 60, "b": 80},
) )
@@ -227,13 +229,13 @@ def _create_empty_figure(title: str) -> go.Figure:
x=0.5, x=0.5,
y=0.5, y=0.5,
showarrow=False, showarrow=False,
font={"size": 14, "color": "#888888"}, font={"size": 14, "color": TEXT_SECONDARY},
) )
fig.update_layout( fig.update_layout(
title=title, title=title,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
xaxis={"visible": False}, xaxis={"visible": False},
yaxis={"visible": False}, yaxis={"visible": False},
) )


@@ -4,6 +4,14 @@ from typing import Any
import plotly.graph_objects as go import plotly.graph_objects as go
from portfolio_app.design import (
CHART_PALETTE,
GRID_COLOR_DARK,
PAPER_BG,
TEXT_PRIMARY,
TEXT_SECONDARY,
)
def create_radar_figure( def create_radar_figure(
data: list[dict[str, Any]], data: list[dict[str, Any]],
@@ -32,16 +40,9 @@ def create_radar_figure(
if not data or not metrics: if not data or not metrics:
return _create_empty_figure(title or "Radar Chart") return _create_empty_figure(title or "Radar Chart")
# Default colors # Use accessible palette by default
if colors is None: if colors is None:
colors = [ colors = CHART_PALETTE
"#2196F3",
"#4CAF50",
"#FF9800",
"#E91E63",
"#9C27B0",
"#00BCD4",
]
fig = go.Figure() fig = go.Figure()
@@ -78,19 +79,19 @@ def create_radar_figure(
polar={ polar={
"radialaxis": { "radialaxis": {
"visible": True, "visible": True,
"gridcolor": "rgba(128,128,128,0.3)", "gridcolor": GRID_COLOR_DARK,
"linecolor": "rgba(128,128,128,0.3)", "linecolor": GRID_COLOR_DARK,
"tickfont": {"color": "#c9c9c9"}, "tickfont": {"color": TEXT_PRIMARY},
}, },
"angularaxis": { "angularaxis": {
"gridcolor": "rgba(128,128,128,0.3)", "gridcolor": GRID_COLOR_DARK,
"linecolor": "rgba(128,128,128,0.3)", "linecolor": GRID_COLOR_DARK,
"tickfont": {"color": "#c9c9c9"}, "tickfont": {"color": TEXT_PRIMARY},
}, },
"bgcolor": "rgba(0,0,0,0)", "bgcolor": PAPER_BG,
}, },
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
showlegend=len(data) > 1, showlegend=len(data) > 1,
legend={"orientation": "h", "yanchor": "bottom", "y": -0.2}, legend={"orientation": "h", "yanchor": "bottom", "y": -0.2},
margin={"l": 40, "r": 40, "t": 60, "b": 40}, margin={"l": 40, "r": 40, "t": 60, "b": 40},
@@ -133,7 +134,7 @@ def create_comparison_radar(
metrics=metrics, metrics=metrics,
name_column="__name__", name_column="__name__",
title=title, title=title,
colors=["#4CAF50", "#9E9E9E"], colors=[CHART_PALETTE[3], TEXT_SECONDARY], # Teal for selected, gray for avg
) )
@@ -156,11 +157,11 @@ def _create_empty_figure(title: str) -> go.Figure:
x=0.5, x=0.5,
y=0.5, y=0.5,
showarrow=False, showarrow=False,
font={"size": 14, "color": "#888888"}, font={"size": 14, "color": TEXT_SECONDARY},
) )
fig.update_layout( fig.update_layout(
title=title, title=title,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
) )
return fig return fig


@@ -6,6 +6,15 @@ import pandas as pd
import plotly.express as px import plotly.express as px
import plotly.graph_objects as go import plotly.graph_objects as go
from portfolio_app.design import (
CHART_PALETTE,
GRID_COLOR,
PAPER_BG,
PLOT_BG,
TEXT_PRIMARY,
TEXT_SECONDARY,
)
def create_scatter_figure( def create_scatter_figure(
data: list[dict[str, Any]], data: list[dict[str, Any]],
@@ -72,21 +81,21 @@ def create_scatter_figure(
if trendline: if trendline:
fig.update_traces( fig.update_traces(
selector={"mode": "lines"}, selector={"mode": "lines"},
line={"color": "#FF9800", "dash": "dash", "width": 2}, line={"color": CHART_PALETTE[1], "dash": "dash", "width": 2},
) )
fig.update_layout( fig.update_layout(
title=title, title=title,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
xaxis={ xaxis={
"gridcolor": "rgba(128,128,128,0.2)", "gridcolor": GRID_COLOR,
"title": x_title or x_column.replace("_", " ").title(), "title": x_title or x_column.replace("_", " ").title(),
"zeroline": False, "zeroline": False,
}, },
yaxis={ yaxis={
"gridcolor": "rgba(128,128,128,0.2)", "gridcolor": GRID_COLOR,
"title": y_title or y_column.replace("_", " ").title(), "title": y_title or y_column.replace("_", " ").title(),
"zeroline": False, "zeroline": False,
}, },
@@ -140,19 +149,20 @@ def create_bubble_chart(
hover_name=name_column, hover_name=name_column,
size_max=size_max, size_max=size_max,
opacity=0.7, opacity=0.7,
color_discrete_sequence=CHART_PALETTE,
) )
fig.update_layout( fig.update_layout(
title=title, title=title,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
xaxis={ xaxis={
"gridcolor": "rgba(128,128,128,0.2)", "gridcolor": GRID_COLOR,
"title": x_title or x_column.replace("_", " ").title(), "title": x_title or x_column.replace("_", " ").title(),
}, },
yaxis={ yaxis={
"gridcolor": "rgba(128,128,128,0.2)", "gridcolor": GRID_COLOR,
"title": y_title or y_column.replace("_", " ").title(), "title": y_title or y_column.replace("_", " ").title(),
}, },
margin={"l": 10, "r": 10, "t": 40, "b": 10}, margin={"l": 10, "r": 10, "t": 40, "b": 10},
@@ -171,13 +181,13 @@ def _create_empty_figure(title: str) -> go.Figure:
x=0.5, x=0.5,
y=0.5, y=0.5,
showarrow=False, showarrow=False,
font={"size": 14, "color": "#888888"}, font={"size": 14, "color": TEXT_SECONDARY},
) )
fig.update_layout( fig.update_layout(
title=title, title=title,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
xaxis={"visible": False}, xaxis={"visible": False},
yaxis={"visible": False}, yaxis={"visible": False},
) )


@@ -4,6 +4,14 @@ from typing import Any
import plotly.graph_objects as go import plotly.graph_objects as go
from portfolio_app.design import (
COLOR_NEGATIVE,
COLOR_POSITIVE,
PAPER_BG,
PLOT_BG,
TEXT_PRIMARY,
)
def create_metric_card_figure( def create_metric_card_figure(
value: float | int | str, value: float | int | str,
@@ -59,8 +67,12 @@ def create_metric_card_figure(
"relative": False, "relative": False,
"valueformat": ".1f", "valueformat": ".1f",
"suffix": delta_suffix, "suffix": delta_suffix,
"increasing": {"color": "green" if positive_is_good else "red"}, "increasing": {
"decreasing": {"color": "red" if positive_is_good else "green"}, "color": COLOR_POSITIVE if positive_is_good else COLOR_NEGATIVE
},
"decreasing": {
"color": COLOR_NEGATIVE if positive_is_good else COLOR_POSITIVE
},
} }
fig.add_trace(go.Indicator(**indicator_config)) fig.add_trace(go.Indicator(**indicator_config))
@@ -68,9 +80,9 @@ def create_metric_card_figure(
fig.update_layout( fig.update_layout(
height=120, height=120,
margin={"l": 20, "r": 20, "t": 40, "b": 20}, margin={"l": 20, "r": 20, "t": 40, "b": 20},
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font={"family": "Inter, sans-serif", "color": "#c9c9c9"}, font={"family": "Inter, sans-serif", "color": TEXT_PRIMARY},
) )
return fig return fig
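The flipped increasing/decreasing mapping above is the whole point of `positive_is_good`: on a crime-rate card a rising delta should render red, on an income card green. A self-contained sketch (token values copied from the diff; `delta_colors` is a hypothetical helper distilling the logic, not part of the app):

```python
# Hypothetical helper condensing the indicator delta-colour logic above.
COLOR_POSITIVE = "#40c057"  # Mantine green-6
COLOR_NEGATIVE = "#fa5252"  # Mantine red-6

def delta_colors(positive_is_good: bool) -> dict[str, dict[str, str]]:
    return {
        "increasing": {
            "color": COLOR_POSITIVE if positive_is_good else COLOR_NEGATIVE
        },
        "decreasing": {
            "color": COLOR_NEGATIVE if positive_is_good else COLOR_POSITIVE
        },
    }

# Crime rate: an increase is bad, so it renders red.
print(delta_colors(False)["increasing"]["color"])  # -> #fa5252
```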


@@ -5,6 +5,15 @@ from typing import Any
import plotly.express as px import plotly.express as px
import plotly.graph_objects as go import plotly.graph_objects as go
from portfolio_app.design import (
CHART_PALETTE,
GRID_COLOR,
PAPER_BG,
PLOT_BG,
TEXT_PRIMARY,
TEXT_SECONDARY,
)
def create_price_time_series( def create_price_time_series(
data: list[dict[str, Any]], data: list[dict[str, Any]],
@@ -38,14 +47,14 @@ def create_price_time_series(
x=0.5, x=0.5,
y=0.5, y=0.5,
showarrow=False, showarrow=False,
font={"color": "#888888"}, font={"color": TEXT_SECONDARY},
) )
fig.update_layout( fig.update_layout(
title=title, title=title,
height=350, height=350,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
) )
return fig return fig
@@ -59,6 +68,7 @@ def create_price_time_series(
y=price_column, y=price_column,
color=group_column, color=group_column,
title=title, title=title,
color_discrete_sequence=CHART_PALETTE,
) )
else: else:
fig = px.line( fig = px.line(
@@ -67,6 +77,7 @@ def create_price_time_series(
y=price_column, y=price_column,
title=title, title=title,
) )
fig.update_traces(line_color=CHART_PALETTE[0])
fig.update_layout( fig.update_layout(
height=350, height=350,
@@ -76,11 +87,11 @@ def create_price_time_series(
yaxis_tickprefix="$", yaxis_tickprefix="$",
yaxis_tickformat=",", yaxis_tickformat=",",
hovermode="x unified", hovermode="x unified",
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
xaxis={"gridcolor": "#333333", "linecolor": "#444444"}, xaxis={"gridcolor": GRID_COLOR, "linecolor": GRID_COLOR},
yaxis={"gridcolor": "#333333", "linecolor": "#444444"}, yaxis={"gridcolor": GRID_COLOR, "linecolor": GRID_COLOR},
) )
return fig return fig
@@ -118,14 +129,14 @@ def create_volume_time_series(
x=0.5, x=0.5,
y=0.5, y=0.5,
showarrow=False, showarrow=False,
font={"color": "#888888"}, font={"color": TEXT_SECONDARY},
) )
fig.update_layout( fig.update_layout(
title=title, title=title,
height=350, height=350,
paper_bgcolor="rgba(0,0,0,0)", paper_bgcolor=PAPER_BG,
plot_bgcolor="rgba(0,0,0,0)", plot_bgcolor=PLOT_BG,
font_color="#c9c9c9", font_color=TEXT_PRIMARY,
) )
return fig return fig
@@ -140,6 +151,7 @@ def create_volume_time_series(
y=volume_column, y=volume_column,
color=group_column, color=group_column,
title=title, title=title,
color_discrete_sequence=CHART_PALETTE,
) )
else: else:
fig = px.bar( fig = px.bar(
@@ -148,6 +160,7 @@ def create_volume_time_series(
y=volume_column, y=volume_column,
title=title, title=title,
) )
fig.update_traces(marker_color=CHART_PALETTE[0])
else: else:
if group_column and group_column in df.columns: if group_column and group_column in df.columns:
fig = px.line( fig = px.line(
@@ -156,6 +169,7 @@ def create_volume_time_series(
y=volume_column, y=volume_column,
color=group_column, color=group_column,
title=title, title=title,
color_discrete_sequence=CHART_PALETTE,
) )
else: else:
fig = px.line( fig = px.line(
@@ -164,6 +178,7 @@ def create_volume_time_series(
         y=volume_column,
         title=title,
     )
+    fig.update_traces(line_color=CHART_PALETTE[0])
     fig.update_layout(
         height=350,
@@ -172,11 +187,11 @@ def create_volume_time_series(
         yaxis_title=volume_column.replace("_", " ").title(),
         yaxis_tickformat=",",
         hovermode="x unified",
-        paper_bgcolor="rgba(0,0,0,0)",
-        plot_bgcolor="rgba(0,0,0,0)",
-        font_color="#c9c9c9",
-        xaxis={"gridcolor": "#333333", "linecolor": "#444444"},
-        yaxis={"gridcolor": "#333333", "linecolor": "#444444"},
+        paper_bgcolor=PAPER_BG,
+        plot_bgcolor=PLOT_BG,
+        font_color=TEXT_PRIMARY,
+        xaxis={"gridcolor": GRID_COLOR, "linecolor": GRID_COLOR},
+        yaxis={"gridcolor": GRID_COLOR, "linecolor": GRID_COLOR},
     )
     return fig
@@ -211,14 +226,14 @@ def create_market_comparison_chart(
         x=0.5,
         y=0.5,
         showarrow=False,
-        font={"color": "#888888"},
+        font={"color": TEXT_SECONDARY},
     )
     fig.update_layout(
         title=title,
         height=400,
-        paper_bgcolor="rgba(0,0,0,0)",
-        plot_bgcolor="rgba(0,0,0,0)",
-        font_color="#c9c9c9",
+        paper_bgcolor=PAPER_BG,
+        plot_bgcolor=PLOT_BG,
+        font_color=TEXT_PRIMARY,
     )
     return fig
@@ -230,8 +245,6 @@ def create_market_comparison_chart(
     fig = make_subplots(specs=[[{"secondary_y": True}]])
 
-    colors = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]
-
     for i, metric in enumerate(metrics[:4]):
         if metric not in df.columns:
             continue
@@ -242,7 +255,7 @@ def create_market_comparison_chart(
                 x=df[date_column],
                 y=df[metric],
                 name=metric.replace("_", " ").title(),
-                line={"color": colors[i % len(colors)]},
+                line={"color": CHART_PALETTE[i % len(CHART_PALETTE)]},
             ),
             secondary_y=secondary,
         )
@@ -252,18 +265,18 @@ def create_market_comparison_chart(
         height=400,
         margin={"l": 40, "r": 40, "t": 50, "b": 40},
         hovermode="x unified",
-        paper_bgcolor="rgba(0,0,0,0)",
-        plot_bgcolor="rgba(0,0,0,0)",
-        font_color="#c9c9c9",
-        xaxis={"gridcolor": "#333333", "linecolor": "#444444"},
-        yaxis={"gridcolor": "#333333", "linecolor": "#444444"},
+        paper_bgcolor=PAPER_BG,
+        plot_bgcolor=PLOT_BG,
+        font_color=TEXT_PRIMARY,
+        xaxis={"gridcolor": GRID_COLOR, "linecolor": GRID_COLOR},
+        yaxis={"gridcolor": GRID_COLOR, "linecolor": GRID_COLOR},
         legend={
             "orientation": "h",
             "yanchor": "bottom",
             "y": 1.02,
             "xanchor": "right",
             "x": 1,
-            "font": {"color": "#c9c9c9"},
+            "font": {"color": TEXT_PRIMARY},
         },
     )
@@ -290,13 +303,13 @@ def add_policy_markers(
     if not policy_events:
         return fig
 
-    # Color mapping for policy categories
+    # Color mapping for policy categories using design tokens
     category_colors = {
-        "monetary": "#1f77b4",  # Blue
-        "tax": "#2ca02c",  # Green
-        "regulatory": "#ff7f0e",  # Orange
-        "supply": "#9467bd",  # Purple
-        "economic": "#d62728",  # Red
+        "monetary": CHART_PALETTE[0],  # Blue
+        "tax": CHART_PALETTE[3],  # Teal/green
+        "regulatory": CHART_PALETTE[1],  # Orange
+        "supply": CHART_PALETTE[6],  # Pink
+        "economic": CHART_PALETTE[5],  # Vermillion
     }
 
     # Symbol mapping for expected direction
@@ -313,7 +326,7 @@ def add_policy_markers(
         title = event.get("title", "Policy Event")
         level = event.get("level", "federal")
-        color = category_colors.get(category, "#666666")
+        color = category_colors.get(category, TEXT_SECONDARY)
         symbol = direction_symbols.get(direction, "circle")
 
         # Add vertical line for the event
@@ -335,7 +348,7 @@ def add_policy_markers(
                 "symbol": symbol,
                 "size": 12,
                 "color": color,
-                "line": {"width": 1, "color": "white"},
+                "line": {"width": 1, "color": TEXT_PRIMARY},
             },
             name=title,
             hovertemplate=(
View File
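The token names above (PAPER_BG, PLOT_BG, TEXT_PRIMARY, TEXT_SECONDARY, GRID_COLOR, CHART_PALETTE) come from the centralized `portfolio_app.design` module this PR introduces, but the module itself is not shown in the diff. A minimal sketch of what it might contain, assuming the old literals became the token values and that CHART_PALETTE follows the Okabe-Ito colorblind-safe palette (the literal `rgba(213, 94, 0, 0.1)` later in the diff matches Okabe-Ito vermillion #D55E00, and the index comments name Blue, Orange, Sky, Teal, Yellow, Vermillion, Pink):

```python
# Hypothetical sketch of portfolio_app/design.py - not the actual module.
# Values inferred from the hard-coded literals these tokens replace.

PAPER_BG = "rgba(0,0,0,0)"   # was paper_bgcolor="rgba(0,0,0,0)"
PLOT_BG = "rgba(0,0,0,0)"    # was plot_bgcolor="rgba(0,0,0,0)"
TEXT_PRIMARY = "#c9c9c9"     # was font_color="#c9c9c9"
TEXT_SECONDARY = "#888888"   # was font={"color": "#888888"}
GRID_COLOR = "#333333"       # replaces both "#333333" grid and "#444444" line colors

# Assumed Okabe-Ito palette; indices match the comments in the diff.
CHART_PALETTE = [
    "#0072B2",  # 0 Blue
    "#E69F00",  # 1 Orange
    "#56B4E9",  # 2 Sky
    "#009E73",  # 3 Teal/green
    "#F0E442",  # 4 Yellow
    "#D55E00",  # 5 Vermillion
    "#CC79A7",  # 6 Pink
]
```

Keeping these in one module means a theme change touches a single file instead of every figure factory and callback.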

@@ -5,7 +5,15 @@ import pandas as pd
 import plotly.graph_objects as go
 from dash import Input, Output, callback
-from portfolio_app.figures import (
+from portfolio_app.design import (
+    CHART_PALETTE,
+    GRID_COLOR,
+    PAPER_BG,
+    PLOT_BG,
+    TEXT_PRIMARY,
+    TEXT_SECONDARY,
+)
+from portfolio_app.figures.toronto import (
     create_donut_chart,
     create_horizontal_bar,
     create_radar_figure,
@@ -109,18 +117,18 @@ def update_housing_trend(year: str, neighbourhood_id: int | None) -> go.Figure:
             x=[d["year"] for d in data],
             y=[d["avg_rent"] for d in data],
             mode="lines+markers",
-            line={"color": "#2196F3", "width": 2},
+            line={"color": CHART_PALETTE[0], "width": 2},
             marker={"size": 8},
             name="City Average",
         )
     )
     fig.update_layout(
-        paper_bgcolor="rgba(0,0,0,0)",
-        plot_bgcolor="rgba(0,0,0,0)",
-        font_color="#c9c9c9",
-        xaxis={"gridcolor": "rgba(128,128,128,0.2)"},
-        yaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": "Avg Rent (2BR)"},
+        paper_bgcolor=PAPER_BG,
+        plot_bgcolor=PLOT_BG,
+        font_color=TEXT_PRIMARY,
+        xaxis={"gridcolor": GRID_COLOR},
+        yaxis={"gridcolor": GRID_COLOR, "title": "Avg Rent (2BR)"},
         showlegend=False,
         margin={"l": 40, "r": 10, "t": 10, "b": 30},
     )
@@ -153,7 +161,7 @@ def update_housing_types(year: str) -> go.Figure:
         data=data,
         name_column="type",
         value_column="percentage",
-        colors=["#4CAF50", "#2196F3"],
+        colors=[CHART_PALETTE[3], CHART_PALETTE[0]],  # Teal for owner, blue for renter
     )
@@ -178,19 +186,19 @@ def update_safety_trend(year: str) -> go.Figure:
             x=[d["year"] for d in data],
             y=[d["crime_rate"] for d in data],
             mode="lines+markers",
-            line={"color": "#FF5722", "width": 2},
+            line={"color": CHART_PALETTE[5], "width": 2},  # Vermillion
             marker={"size": 8},
             fill="tozeroy",
-            fillcolor="rgba(255,87,34,0.1)",
+            fillcolor="rgba(213, 94, 0, 0.1)",  # Vermillion with opacity
         )
     )
     fig.update_layout(
-        paper_bgcolor="rgba(0,0,0,0)",
-        plot_bgcolor="rgba(0,0,0,0)",
-        font_color="#c9c9c9",
-        xaxis={"gridcolor": "rgba(128,128,128,0.2)"},
-        yaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": "Crime Rate per 100K"},
+        paper_bgcolor=PAPER_BG,
+        plot_bgcolor=PLOT_BG,
+        font_color=TEXT_PRIMARY,
+        xaxis={"gridcolor": GRID_COLOR},
+        yaxis={"gridcolor": GRID_COLOR, "title": "Crime Rate per 100K"},
         showlegend=False,
         margin={"l": 40, "r": 10, "t": 10, "b": 30},
     )
@@ -233,7 +241,7 @@ def update_safety_types(year: str) -> go.Figure:
         data=data,
         name_column="category",
         value_column="count",
-        color="#FF5722",
+        color=CHART_PALETTE[5],  # Vermillion for crime
     )
@@ -264,7 +272,11 @@ def update_demographics_age(year: str) -> go.Figure:
         data=data,
         name_column="age_group",
         value_column="percentage",
-        colors=["#9C27B0", "#673AB7", "#3F51B5"],
+        colors=[
+            CHART_PALETTE[2],
+            CHART_PALETTE[0],
+            CHART_PALETTE[4],
+        ],  # Sky, Blue, Yellow
     )
@@ -301,7 +313,7 @@ def update_demographics_income(year: str) -> go.Figure:
         data=data,
         name_column="bracket",
         value_column="count",
-        color="#4CAF50",
+        color=CHART_PALETTE[3],  # Teal
         sort=False,
     )
@@ -333,7 +345,7 @@ def update_amenities_breakdown(year: str) -> go.Figure:
         data=data,
         name_column="type",
         value_column="count",
-        color="#4CAF50",
+        color=CHART_PALETTE[3],  # Teal
     )
@@ -387,9 +399,9 @@ def _empty_chart(message: str) -> go.Figure:
     """Create an empty chart with a message."""
     fig = go.Figure()
     fig.update_layout(
-        paper_bgcolor="rgba(0,0,0,0)",
-        plot_bgcolor="rgba(0,0,0,0)",
-        font_color="#c9c9c9",
+        paper_bgcolor=PAPER_BG,
+        plot_bgcolor=PLOT_BG,
+        font_color=TEXT_PRIMARY,
         xaxis={"visible": False},
         yaxis={"visible": False},
     )
@@ -400,6 +412,6 @@ def _empty_chart(message: str) -> go.Figure:
         x=0.5,
         y=0.5,
         showarrow=False,
-        font={"size": 14, "color": "#888888"},
+        font={"size": 14, "color": TEXT_SECONDARY},
     )
     return fig

View File
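The callbacks above rely on two small idioms with the palette: modular indexing so any number of traces cycles through the list safely (`CHART_PALETTE[i % len(CHART_PALETTE)]` in the figure code), and `dict.get` with a token fallback for unmapped categories. A standalone illustration, using a stand-in palette list rather than the real design module:

```python
# Stand-in values; the real ones live in portfolio_app.design.
CHART_PALETTE = ["#0072B2", "#E69F00", "#56B4E9", "#009E73"]
TEXT_SECONDARY = "#888888"

def series_color(i: int) -> str:
    """Cycle through the palette for an arbitrary number of traces."""
    return CHART_PALETTE[i % len(CHART_PALETTE)]

category_colors = {"monetary": CHART_PALETTE[0], "regulatory": CHART_PALETTE[1]}

def category_color(category: str) -> str:
    """Fall back to the secondary text token for unknown categories."""
    return category_colors.get(category, TEXT_SECONDARY)

# series_color(5) wraps around to CHART_PALETTE[1];
# category_color("supply") falls back to "#888888".
```

The wrap-around means adding a fifth metric never raises IndexError, and the fallback keeps unmapped policy categories visible in a muted tone instead of failing.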

@@ -4,7 +4,13 @@
 import plotly.graph_objects as go
 from dash import Input, Output, State, callback, no_update
-from portfolio_app.figures import create_choropleth_figure, create_ranking_bar
+from portfolio_app.design import (
+    PAPER_BG,
+    PLOT_BG,
+    TEXT_PRIMARY,
+    TEXT_SECONDARY,
+)
+from portfolio_app.figures.toronto import create_choropleth_figure, create_ranking_bar
 from portfolio_app.toronto.services import (
     get_amenities_data,
     get_demographics_data,
@@ -267,8 +273,8 @@ def _empty_map(message: str) -> go.Figure:
             "zoom": 9.5,
         },
         margin={"l": 0, "r": 0, "t": 0, "b": 0},
-        paper_bgcolor="rgba(0,0,0,0)",
-        font_color="#c9c9c9",
+        paper_bgcolor=PAPER_BG,
+        font_color=TEXT_PRIMARY,
     )
     fig.add_annotation(
         text=message,
@@ -277,7 +283,7 @@ def _empty_map(message: str) -> go.Figure:
         x=0.5,
         y=0.5,
         showarrow=False,
-        font={"size": 14, "color": "#888888"},
+        font={"size": 14, "color": TEXT_SECONDARY},
     )
     return fig
@@ -286,9 +292,9 @@ def _empty_chart(message: str) -> go.Figure:
     """Create an empty chart with a message."""
     fig = go.Figure()
     fig.update_layout(
-        paper_bgcolor="rgba(0,0,0,0)",
-        plot_bgcolor="rgba(0,0,0,0)",
-        font_color="#c9c9c9",
+        paper_bgcolor=PAPER_BG,
+        plot_bgcolor=PLOT_BG,
+        font_color=TEXT_PRIMARY,
         xaxis={"visible": False},
         yaxis={"visible": False},
     )
@@ -299,6 +305,6 @@ def _empty_chart(message: str) -> go.Figure:
         x=0.5,
         y=0.5,
         showarrow=False,
-        font={"size": 14, "color": "#888888"},
+        font={"size": 14, "color": TEXT_SECONDARY},
     )
     return fig

View File

@@ -8,11 +8,18 @@ from sqlalchemy.orm import Mapped, mapped_column
 from .base import Base
 
+# Schema constants
+RAW_TORONTO_SCHEMA = "raw_toronto"
+
 class DimTime(Base):
-    """Time dimension table."""
+    """Time dimension table (shared across all projects).
+
+    Note: Stays in public schema as it's a shared dimension.
+    """
 
     __tablename__ = "dim_time"
+    __table_args__ = {"schema": "public"}
 
     date_key: Mapped[int] = mapped_column(Integer, primary_key=True)
     full_date: Mapped[date] = mapped_column(Date, nullable=False, unique=True)
@@ -27,6 +34,7 @@ class DimCMHCZone(Base):
     """CMHC zone dimension table with PostGIS geometry."""
 
     __tablename__ = "dim_cmhc_zone"
+    __table_args__ = {"schema": RAW_TORONTO_SCHEMA}
 
     zone_key: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     zone_code: Mapped[str] = mapped_column(String(10), nullable=False, unique=True)
@@ -41,6 +49,7 @@ class DimNeighbourhood(Base):
     """
 
     __tablename__ = "dim_neighbourhood"
+    __table_args__ = {"schema": RAW_TORONTO_SCHEMA}
 
     neighbourhood_id: Mapped[int] = mapped_column(Integer, primary_key=True)
     name: Mapped[str] = mapped_column(String(100), nullable=False)
@@ -69,6 +78,7 @@ class DimPolicyEvent(Base):
     """Policy event dimension for time-series annotation."""
 
     __tablename__ = "dim_policy_event"
+    __table_args__ = {"schema": RAW_TORONTO_SCHEMA}
 
     event_id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     event_date: Mapped[date] = mapped_column(Date, nullable=False)

View File
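In these models, `__table_args__` takes two shapes: a bare dict of options (`{"schema": RAW_TORONTO_SCHEMA}`) for tables with no extra constraints, or, in the fact models, a tuple of `Index` objects with a trailing options dict. SQLAlchemy accepts both forms. A rough pure-Python sketch of how such a value splits into positional arguments and keyword options (a hypothetical helper, not SQLAlchemy's internal code):

```python
def split_table_args(table_args):
    """Split a __table_args__-style value into (positional, options).

    Mirrors SQLAlchemy's documented convention: a bare dict is all
    options; in the tuple form, a trailing dict (if present) carries
    options like "schema", and everything before it is positional
    (Index objects, constraints, ...).
    """
    if isinstance(table_args, dict):
        return (), table_args
    if table_args and isinstance(table_args[-1], dict):
        return tuple(table_args[:-1]), table_args[-1]
    return tuple(table_args), {}

# Dict-only form, as on the dimension tables:
args, opts = split_table_args({"schema": "raw_toronto"})
# Tuple form with trailing options dict, as on the fact tables:
args2, opts2 = split_table_args(("ix_a", "ix_b", {"schema": "raw_toronto"}))
```

The trailing-dict convention is why the refactor below can fold the old standalone index tuple and the new schema option into a single `__table_args__` assignment.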

@@ -4,6 +4,7 @@ from sqlalchemy import ForeignKey, Index, Integer, Numeric, String
 from sqlalchemy.orm import Mapped, mapped_column, relationship
 
 from .base import Base
+from .dimensions import RAW_TORONTO_SCHEMA
 
 class BridgeCMHCNeighbourhood(Base):
@@ -14,6 +15,11 @@ class BridgeCMHCNeighbourhood(Base):
     """
 
     __tablename__ = "bridge_cmhc_neighbourhood"
+    __table_args__ = (
+        Index("ix_bridge_cmhc_zone", "cmhc_zone_code"),
+        Index("ix_bridge_neighbourhood", "neighbourhood_id"),
+        {"schema": RAW_TORONTO_SCHEMA},
+    )
 
     id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     cmhc_zone_code: Mapped[str] = mapped_column(String(10), nullable=False)
@@ -22,11 +28,6 @@ class BridgeCMHCNeighbourhood(Base):
         Numeric(5, 4), nullable=False
     )  # 0.0000 to 1.0000
 
-    __table_args__ = (
-        Index("ix_bridge_cmhc_zone", "cmhc_zone_code"),
-        Index("ix_bridge_neighbourhood", "neighbourhood_id"),
-    )
 
 class FactCensus(Base):
     """Census statistics by neighbourhood and year.
@@ -35,6 +36,10 @@ class FactCensus(Base):
     """
 
     __tablename__ = "fact_census"
+    __table_args__ = (
+        Index("ix_fact_census_neighbourhood_year", "neighbourhood_id", "census_year"),
+        {"schema": RAW_TORONTO_SCHEMA},
+    )
 
     id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     neighbourhood_id: Mapped[int] = mapped_column(Integer, nullable=False)
@@ -66,10 +71,6 @@ class FactCensus(Base):
         Numeric(12, 2), nullable=True
     )
 
-    __table_args__ = (
-        Index("ix_fact_census_neighbourhood_year", "neighbourhood_id", "census_year"),
-    )
 
 class FactCrime(Base):
     """Crime statistics by neighbourhood and year.
@@ -78,6 +79,11 @@ class FactCrime(Base):
     """
 
     __tablename__ = "fact_crime"
+    __table_args__ = (
+        Index("ix_fact_crime_neighbourhood_year", "neighbourhood_id", "year"),
+        Index("ix_fact_crime_type", "crime_type"),
+        {"schema": RAW_TORONTO_SCHEMA},
+    )
 
     id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     neighbourhood_id: Mapped[int] = mapped_column(Integer, nullable=False)
@@ -86,11 +92,6 @@ class FactCrime(Base):
     count: Mapped[int] = mapped_column(Integer, nullable=False)
     rate_per_100k: Mapped[float | None] = mapped_column(Numeric(10, 2), nullable=True)
 
-    __table_args__ = (
-        Index("ix_fact_crime_neighbourhood_year", "neighbourhood_id", "year"),
-        Index("ix_fact_crime_type", "crime_type"),
-    )
 
 class FactAmenities(Base):
     """Amenity counts by neighbourhood.
@@ -99,6 +100,11 @@ class FactAmenities(Base):
     """
 
     __tablename__ = "fact_amenities"
+    __table_args__ = (
+        Index("ix_fact_amenities_neighbourhood_year", "neighbourhood_id", "year"),
+        Index("ix_fact_amenities_type", "amenity_type"),
+        {"schema": RAW_TORONTO_SCHEMA},
+    )
 
     id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     neighbourhood_id: Mapped[int] = mapped_column(Integer, nullable=False)
@@ -106,11 +112,6 @@ class FactAmenities(Base):
     count: Mapped[int] = mapped_column(Integer, nullable=False)
     year: Mapped[int] = mapped_column(Integer, nullable=False)
 
-    __table_args__ = (
-        Index("ix_fact_amenities_neighbourhood_year", "neighbourhood_id", "year"),
-        Index("ix_fact_amenities_type", "amenity_type"),
-    )
 
 class FactRentals(Base):
     """Fact table for CMHC rental market data.
@@ -119,13 +120,16 @@ class FactRentals(Base):
     """
 
     __tablename__ = "fact_rentals"
+    __table_args__ = {"schema": RAW_TORONTO_SCHEMA}
 
     id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     date_key: Mapped[int] = mapped_column(
-        Integer, ForeignKey("dim_time.date_key"), nullable=False
+        Integer, ForeignKey("public.dim_time.date_key"), nullable=False
     )
     zone_key: Mapped[int] = mapped_column(
-        Integer, ForeignKey("dim_cmhc_zone.zone_key"), nullable=False
+        Integer,
+        ForeignKey(f"{RAW_TORONTO_SCHEMA}.dim_cmhc_zone.zone_key"),
+        nullable=False,
     )
     bedroom_type: Mapped[str] = mapped_column(String(20), nullable=False)
     universe: Mapped[int | None] = mapped_column(Integer, nullable=True)
@@ -139,6 +143,6 @@ class FactRentals(Base):
     rent_change_pct: Mapped[float | None] = mapped_column(Numeric(5, 2), nullable=True)
     reliability_code: Mapped[str | None] = mapped_column(String(2), nullable=True)
 
-    # Relationships
-    time = relationship("DimTime", backref="rentals")
-    zone = relationship("DimCMHCZone", backref="rentals")
+    # Relationships - explicit foreign_keys needed for cross-schema joins
+    time = relationship("DimTime", foreign_keys=[date_key], backref="rentals")
+    zone = relationship("DimCMHCZone", foreign_keys=[zone_key], backref="rentals")

View File

@@ -15,6 +15,7 @@ from pathlib import Path
 sys.path.insert(0, str(Path(__file__).parent.parent.parent))
 
 from portfolio_app.toronto.models import create_tables, get_engine  # noqa: E402
+from portfolio_app.toronto.models.dimensions import RAW_TORONTO_SCHEMA  # noqa: E402
 
 def main() -> int:
@@ -32,16 +33,30 @@ def main() -> int:
             result.fetchone()
         print("Database connection successful")
 
+        # Create domain-specific schemas
+        with engine.connect() as conn:
+            conn.execute(text(f"CREATE SCHEMA IF NOT EXISTS {RAW_TORONTO_SCHEMA}"))
+            conn.commit()
+        print(f"Created schema: {RAW_TORONTO_SCHEMA}")
+
         # Create all tables
         create_tables()
         print("Schema created successfully")
 
-        # List created tables
+        # List created tables by schema
         from sqlalchemy import inspect
 
         inspector = inspect(engine)
-        tables = inspector.get_table_names()
-        print(f"Created tables: {', '.join(tables)}")
+
+        # Public schema tables
+        public_tables = inspector.get_table_names(schema="public")
+        if public_tables:
+            print(f"Public schema tables: {', '.join(public_tables)}")
+
+        # raw_toronto schema tables
+        toronto_tables = inspector.get_table_names(schema=RAW_TORONTO_SCHEMA)
+        if toronto_tables:
+            print(f"{RAW_TORONTO_SCHEMA} schema tables: {', '.join(toronto_tables)}")
 
         return 0
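The reporting step at the end of the script queries the inspector once per schema and prints only the non-empty groups. The same grouping logic, factored into a pure function purely for illustration (hypothetical, not part of the PR):

```python
def format_table_report(tables_by_schema: dict[str, list[str]]) -> list[str]:
    """One summary line per schema that actually contains tables."""
    return [
        f"{schema} schema tables: {', '.join(tables)}"
        for schema, tables in tables_by_schema.items()
        if tables  # skip schemas with nothing created yet
    ]

report = format_table_report(
    {
        "public": ["dim_time"],
        "raw_toronto": ["dim_cmhc_zone", "fact_rentals"],
        "empty_schema": [],
    }
)
# -> ["public schema tables: dim_time",
#     "raw_toronto schema tables: dim_cmhc_zone, fact_rentals"]
```

Because `inspector.get_table_names()` defaults to the connection's default schema, the per-schema calls are what make the new `raw_toronto` tables show up in the output at all.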