docs: Rewrite documentation with accurate project state

- Delete obsolete change proposals and bio content source
- Rewrite README.md with correct features, data sources, structure
- Update PROJECT_REFERENCE.md with accurate status and completed work
- Update CLAUDE.md references and sprint status
- Add docs/CONTRIBUTING.md developer guide with:
  - How to add blog posts (frontmatter, markdown)
  - How to add new pages (Dash routing)
  - How to add dashboard tabs
  - How to create figure factories
  - Branch workflow and code standards

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
480
docs/CONTRIBUTING.md
Normal file
@@ -0,0 +1,480 @@
|
||||
# Developer Guide
|
||||
|
||||
Instructions for contributing to the Analytics Portfolio project.
|
||||
|
||||
---
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Development Setup](#development-setup)
|
||||
2. [Adding a Blog Post](#adding-a-blog-post)
|
||||
3. [Adding a New Page](#adding-a-new-page)
|
||||
4. [Adding a Dashboard Tab](#adding-a-dashboard-tab)
|
||||
5. [Creating Figure Factories](#creating-figure-factories)
|
||||
6. [Branch Workflow](#branch-workflow)
|
||||
7. [Code Standards](#code-standards)
|
||||
|
||||
---
|
||||
|
||||
## Development Setup
|
||||
|
||||
### Prerequisites
|
||||
|
||||
- Python 3.11+ (via pyenv)
|
||||
- Docker and Docker Compose
|
||||
- Git
|
||||
|
||||
### Initial Setup
|
||||
|
||||
```bash
|
||||
# Clone repository
|
||||
git clone https://gitea.hotserv.cloud/lmiranda/personal-portfolio.git
|
||||
cd personal-portfolio
|
||||
|
||||
# Run setup (creates venv, installs deps, copies .env.example)
|
||||
make setup
|
||||
|
||||
# Start PostgreSQL + PostGIS
|
||||
make docker-up
|
||||
|
||||
# Initialize database
|
||||
make db-init
|
||||
|
||||
# Start development server
|
||||
make run
|
||||
```
|
||||
|
||||
The app runs at `http://localhost:8050`.
|
||||
|
||||
### Useful Commands
|
||||
|
||||
```bash
|
||||
make test # Run tests
|
||||
make lint # Check code style
|
||||
make format # Auto-format code
|
||||
make ci # Run all checks (lint + test)
|
||||
make dbt-run # Run dbt transformations
|
||||
make dbt-test # Run dbt tests
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Adding a Blog Post
|
||||
|
||||
Blog posts are Markdown files with YAML frontmatter, stored in `portfolio_app/content/blog/`.
|
||||
|
||||
### Step 1: Create the Markdown File
|
||||
|
||||
Create a new file in `portfolio_app/content/blog/`:
|
||||
|
||||
```bash
|
||||
touch portfolio_app/content/blog/your-article-slug.md
|
||||
```
|
||||
|
||||
The filename becomes the URL slug: `/blog/your-article-slug`
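
The filename-to-URL mapping can be sketched in a few lines. This is only an illustration: `BLOG_DIR`, `slug_for`, and `url_for` are hypothetical names, not functions taken from the project.

```python
from pathlib import Path

# Blog directory as described in this guide.
BLOG_DIR = Path("portfolio_app/content/blog")


def slug_for(path: Path) -> str:
    """Return the slug for a post file: its filename without the .md suffix."""
    return path.stem


def url_for(path: Path) -> str:
    """Return the URL under which the post is expected to be served."""
    return f"/blog/{slug_for(path)}"


print(url_for(BLOG_DIR / "your-article-slug.md"))  # /blog/your-article-slug
```

Because the slug comes straight from the filename, renaming a file changes its URL.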
|
||||
|
||||
### Step 2: Add Frontmatter
|
||||
|
||||
Every blog post requires YAML frontmatter at the top:
|
||||
|
||||
```markdown
|
||||
---
|
||||
title: "Your Article Title"
|
||||
date: "2026-01-17"
|
||||
description: "A brief description for the article card (1-2 sentences)"
|
||||
tags:
|
||||
- data-engineering
|
||||
- python
|
||||
- lessons-learned
|
||||
status: published
|
||||
---
|
||||
|
||||
Your article content starts here...
|
||||
```
|
||||
|
||||
**Required fields:**
|
||||
|
||||
| Field | Description |
|-------|-------------|
| `title` | Article title (displayed on cards and page) |
| `date` | Publication date in `YYYY-MM-DD` format |
| `description` | Short summary for article listing cards |
| `tags` | List of tags (displayed as badges) |
| `status` | `published` or `draft` (drafts are hidden from listing) |
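
To check a post against the required fields before committing, something like the following sketch may help. It is an assumption-laden illustration, not the project's loader: `parse_frontmatter`, `validate_frontmatter`, and `is_listed` are hypothetical helpers, and the hand-rolled parser only understands the flat `key: value` lines and `- item` lists shown above (a real loader would more likely use a YAML library).

```python
from datetime import date

REQUIRED_FIELDS = {"title", "date", "description", "tags", "status"}
VALID_STATUSES = {"published", "draft"}


def parse_frontmatter(text: str) -> tuple[dict[str, object], str]:
    """Split a post into (metadata, body). Hypothetical, deliberately minimal."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("post must start with a '---' line")
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("frontmatter is not closed by a second '---' line")

    meta: dict[str, object] = {}
    current_list: list[str] | None = None
    for raw in lines[1:end]:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("- "):
            if current_list is None:
                raise ValueError(f"list item without a key: {raw!r}")
            current_list.append(line[2:].strip())
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {raw!r}")
        value = value.strip().strip('"')
        if value:
            meta[key.strip()] = value
            current_list = None
        else:
            # A bare "key:" opens a list, as with `tags:` in the example above.
            current_list = []
            meta[key.strip()] = current_list
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


def validate_frontmatter(meta: dict[str, object]) -> None:
    """Raise ValueError if a required field is missing or malformed."""
    missing = REQUIRED_FIELDS - meta.keys()
    if missing:
        raise ValueError(f"missing frontmatter fields: {sorted(missing)}")
    date.fromisoformat(str(meta["date"]))  # raises ValueError on a bad date
    if meta["status"] not in VALID_STATUSES:
        raise ValueError(f"status must be one of {sorted(VALID_STATUSES)}")


def is_listed(meta: dict[str, object]) -> bool:
    """Drafts are hidden from the listing; only published posts appear."""
    return meta["status"] == "published"
```

A quick check would be `meta, body = parse_frontmatter(Path(post_path).read_text())` followed by `validate_frontmatter(meta)`, where `post_path` is whichever file is being added.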
|
||||
|
||||
### Step 3: Write Content
|
||||
|
||||
Use standard Markdown:
|
||||
|
||||
````markdown
## Section Heading

Regular paragraph text.

### Subsection

- Bullet points
- Another point

```python
# Code blocks with syntax highlighting
def example():
    return "Hello"
```

**Bold text** and *italic text*.

> Blockquotes for callouts
````
|
||||
|
||||
### Step 4: Test Locally
|
||||
|
||||
```bash
|
||||
make run
|
||||
```
|
||||
|
||||
Visit `http://localhost:8050/blog` to see the article listing.
|
||||
Visit `http://localhost:8050/blog/your-article-slug` for the full article.
|
||||
|
||||
### Example: Complete Blog Post
|
||||
|
||||
````markdown
---
title: "Building ETL Pipelines with Python"
date: "2026-01-17"
description: "Lessons from building production data pipelines at scale"
tags:
- python
- etl
- data-engineering
status: published
---

When I started building data pipelines, I made every mistake possible...

## The Problem

Most tutorials show toy examples. Real pipelines are different.

### Error Handling

```python
def safe_transform(df: pd.DataFrame) -> pd.DataFrame:
    try:
        return df.apply(transform_row, axis=1)
    except ValueError as e:
        logger.error(f"Transform failed: {e}")
        raise
```

## Conclusion

Ship something that works, then iterate.
````
|
||||
|
||||
---
|
||||
|
||||
## Adding a New Page
|
||||
|
||||
Pages use Dash's automatic routing based on file location in `portfolio_app/pages/`.
|
||||
|
||||
### Step 1: Create the Page File
|
||||
|
||||
```bash
|
||||
touch portfolio_app/pages/your_page.py
|
||||
```
|
||||
|
||||
### Step 2: Register the Page
|
||||
|
||||
Every page must call `dash.register_page()`:
|
||||
|
||||
```python
"""Your page description."""

import dash
import dash_mantine_components as dmc

dash.register_page(
    __name__,
    path="/your-page",        # URL path
    name="Your Page",         # Display name (for nav)
    title="Your Page Title",  # Browser tab title
)


def layout() -> dmc.Container:
    """Page layout function."""
    return dmc.Container(
        dmc.Stack(
            [
                dmc.Title("Your Page", order=1),
                dmc.Text("Page content here."),
            ],
            gap="lg",
        ),
        size="md",
        py="xl",
    )
```
|
||||
|
||||
### Step 3: Page with Dynamic Content
|
||||
|
||||
For pages with URL parameters:
|
||||
|
||||
```python
# pages/blog/article.py
import dash
import dash_mantine_components as dmc

# Assumed import path: get_article is expected to come from the blog loader module.
from portfolio_app.utils.markdown_loader import get_article

dash.register_page(
    __name__,
    path_template="/blog/<slug>",  # Dynamic parameter
    name="Article",
)


def layout(slug: str = "") -> dmc.Container:
    """Layout receives URL parameters as keyword arguments."""
    article = get_article(slug)
    if not article:
        # Wrap the message so the return type matches the annotation.
        return dmc.Container(dmc.Text("Article not found"))

    return dmc.Container(
        dmc.Title(article["meta"]["title"]),
        # ...
    )
```
|
||||
|
||||
### Step 4: Add Navigation (Optional)
|
||||
|
||||
To add the page to the sidebar, edit `portfolio_app/components/sidebar.py`:
|
||||
|
||||
```python
NAV_ITEMS = [
    {"label": "Home", "href": "/", "icon": "tabler:home"},
    {"label": "Your Page", "href": "/your-page", "icon": "tabler:star"},
    # ...
]
```
|
||||
|
||||
### URL Routing Summary
|
||||
|
||||
| File Location | URL |
|---------------|-----|
| `pages/home.py` | `/` (if `path="/"`) |
| `pages/about.py` | `/about` |
| `pages/blog/index.py` | `/blog` |
| `pages/blog/article.py` | `/blog/<slug>` |
| `pages/toronto/dashboard.py` | `/toronto` |
|
||||
|
||||
---
|
||||
|
||||
## Adding a Dashboard Tab
|
||||
|
||||
Dashboard tabs are in `portfolio_app/pages/toronto/tabs/`.
|
||||
|
||||
### Step 1: Create Tab Layout
|
||||
|
||||
```python
# pages/toronto/tabs/your_tab.py
"""Your tab description."""

import dash_mantine_components as dmc

from portfolio_app.figures.choropleth import create_choropleth
from portfolio_app.toronto.demo_data import get_demo_data

# create_kpi_cards and create_supporting_charts stand in for tab-specific
# helpers; define them in this module or import them before use.


def create_your_tab_layout() -> dmc.Stack:
    """Create the tab layout."""
    data = get_demo_data()

    return dmc.Stack(
        [
            dmc.Grid(
                [
                    dmc.GridCol(
                        # Map on left
                        create_choropleth(data, "your_metric"),
                        span=8,
                    ),
                    dmc.GridCol(
                        # KPI cards on right
                        create_kpi_cards(data),
                        span=4,
                    ),
                ],
            ),
            # Charts below
            create_supporting_charts(data),
        ],
        gap="lg",
    )
```
|
||||
|
||||
### Step 2: Register in Dashboard
|
||||
|
||||
Edit `pages/toronto/dashboard.py` to add the tab:
|
||||
|
||||
```python
|
||||
from portfolio_app.pages.toronto.tabs.your_tab import create_your_tab_layout
|
||||
|
||||
# In the tabs list:
|
||||
dmc.TabsTab("Your Tab", value="your-tab"),
|
||||
|
||||
# In the panels:
|
||||
dmc.TabsPanel(create_your_tab_layout(), value="your-tab"),
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Creating Figure Factories
|
||||
|
||||
Figure factories are in `portfolio_app/figures/`. They create reusable Plotly figures.
|
||||
|
||||
### Pattern
|
||||
|
||||
```python
# figures/your_chart.py
"""Your chart type factory."""

import plotly.express as px
import plotly.graph_objects as go
import pandas as pd


def create_your_chart(
    df: pd.DataFrame,
    x_col: str,
    y_col: str,
    title: str = "",
) -> go.Figure:
    """Create a your_chart figure.

    Args:
        df: DataFrame with data.
        x_col: Column for x-axis.
        y_col: Column for y-axis.
        title: Optional chart title.

    Returns:
        Configured Plotly figure.
    """
    fig = px.bar(df, x=x_col, y=y_col, title=title)

    fig.update_layout(
        template="plotly_white",
        margin=dict(l=40, r=40, t=40, b=40),
    )

    return fig
```
|
||||
|
||||
### Export from `__init__.py`
|
||||
|
||||
```python
# figures/__init__.py
from .your_chart import create_your_chart

__all__ = [
    "create_your_chart",
    # ...
]
```
|
||||
|
||||
---
|
||||
|
||||
## Branch Workflow
|
||||
|
||||
```
|
||||
main (production)
|
||||
↑
|
||||
staging (pre-production)
|
||||
↑
|
||||
development (integration)
|
||||
↑
|
||||
feature/XX-description (your work)
|
||||
```
|
||||
|
||||
### Creating a Feature Branch
|
||||
|
||||
```bash
|
||||
# Start from development
|
||||
git checkout development
|
||||
git pull origin development
|
||||
|
||||
# Create feature branch
|
||||
git checkout -b feature/10-add-new-page
|
||||
|
||||
# Work, commit, push
|
||||
git add .
|
||||
git commit -m "feat: Add new page"
|
||||
git push -u origin feature/10-add-new-page
|
||||
```
|
||||
|
||||
### Merging
|
||||
|
||||
```bash
|
||||
# Merge into development
|
||||
git checkout development
|
||||
git merge feature/10-add-new-page
|
||||
git push origin development
|
||||
|
||||
# Delete feature branch
|
||||
git branch -d feature/10-add-new-page
|
||||
git push origin --delete feature/10-add-new-page
|
||||
```
|
||||
|
||||
**Rules:**
|
||||
- Never commit directly to `main` or `staging`
|
||||
- Never delete `development`
|
||||
- Feature branches are temporary
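
The rules above can be expressed as a small pre-push style check. The sketch below is illustrative only: the function names are hypothetical, and the exact pattern (a numeric prefix followed by lowercase hyphen-separated words) is an assumption inferred from examples such as `feature/10-add-new-page`.

```python
import re

# Assumed shape of a feature branch name: feature/<number>-<hyphenated-words>
FEATURE_RE = re.compile(r"feature/\d+-[a-z0-9]+(?:-[a-z0-9]+)*")

# Branches that only receive merges, per the rules above.
NO_DIRECT_COMMITS = {"main", "staging"}

# Branches that must never be deleted, per the rules above.
NEVER_DELETE = {"development"}


def is_feature_branch(name: str) -> bool:
    """True if the name follows the assumed feature branch convention."""
    return FEATURE_RE.fullmatch(name) is not None


def allows_direct_commits(name: str) -> bool:
    """True unless the branch is main or staging."""
    return name not in NO_DIRECT_COMMITS


def may_delete(name: str) -> bool:
    """True unless the branch is development."""
    return name not in NEVER_DELETE


print(is_feature_branch("feature/10-add-new-page"))  # True
print(allows_direct_commits("main"))                 # False
```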
|
||||
|
||||
---
|
||||
|
||||
## Code Standards
|
||||
|
||||
### Type Hints
|
||||
|
||||
Use Python 3.10+ style:
|
||||
|
||||
```python
def process(items: list[str], config: dict[str, int] | None = None) -> bool:
    ...
```
|
||||
|
||||
### Imports
|
||||
|
||||
| Context | Style |
|---------|-------|
| Same directory | `from .module import X` |
| Sibling directory | `from ..schemas.model import Y` |
| External packages | `import pandas as pd` |
|
||||
|
||||
### Formatting
|
||||
|
||||
```bash
|
||||
make format # Runs ruff formatter
|
||||
make lint # Checks style
|
||||
```
|
||||
|
||||
### Docstrings
|
||||
|
||||
Google style, only for non-obvious functions:
|
||||
|
||||
```python
def calculate_score(values: list[float], weights: list[float]) -> float:
    """Calculate weighted score.

    Args:
        values: Raw metric values.
        weights: Weight for each metric.

    Returns:
        Weighted average score.
    """
    ...
```
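
The body of `calculate_score` is elided above because the example is about docstring style. For completeness, one plausible body is sketched here, assuming "weighted average" means the weighted sum divided by the total weight; the error handling is likewise an assumption rather than documented project behaviour.

```python
def calculate_score(values: list[float], weights: list[float]) -> float:
    """Calculate weighted score.

    Args:
        values: Raw metric values.
        weights: Weight for each metric.

    Returns:
        Weighted average score.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    # Weighted sum normalised by the total weight.
    return sum(v * w for v, w in zip(values, weights)) / total_weight


print(calculate_score([80.0, 60.0], [3.0, 1.0]))  # 75.0
```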
|
||||
|
||||
---
|
||||
|
||||
## Questions?
|
||||
|
||||
Check `CLAUDE.md` for AI assistant context and architectural decisions.
|
||||
@@ -1,21 +1,171 @@
|
||||
# Portfolio Project Reference
|
||||
|
||||
**Project**: Analytics Portfolio
|
||||
**Owner**: Leo
|
||||
**Status**: Ready for Sprint 1
|
||||
**Owner**: Leo Miranda
|
||||
**Status**: Sprint 9 Complete (Dashboard Implementation Done)
|
||||
**Last Updated**: January 2026
|
||||
|
||||
---
|
||||
|
||||
## Project Overview
|
||||
|
||||
Two-project analytics portfolio demonstrating end-to-end data engineering, visualization, and ML capabilities.
|
||||
Personal portfolio website with an interactive Toronto Neighbourhood Dashboard demonstrating data engineering, visualization, and analytics capabilities.
|
||||
|
||||
| Project | Domain | Key Skills | Phase |
|
||||
|---------|--------|------------|-------|
|
||||
| **Toronto Housing Dashboard** | Real estate | ETL, dimensional modeling, geospatial, choropleth | Phase 1 (Active) |
|
||||
| **Energy Pricing Analysis** | Utility markets | Time series, ML prediction, API integration | Phase 3 (Future) |
|
||||
| Component | Description | Status |
|
||||
|-----------|-------------|--------|
|
||||
| Portfolio Website | Bio, About, Projects, Resume, Contact, Blog | Complete |
|
||||
| Toronto Dashboard | 5-tab neighbourhood analysis | Complete |
|
||||
| Data Pipeline | dbt models, figure factories | Complete |
|
||||
| Deployment | Production deployment | Pending |
|
||||
|
||||
**Platform**: Monolithic Dash application on self-hosted VPS (bio landing page + dashboards).
|
||||
---
|
||||
|
||||
## Completed Work
|
||||
|
||||
### Sprint 1-6: Foundation
|
||||
- Repository setup, Docker, PostgreSQL + PostGIS
|
||||
- Bio landing page implementation
|
||||
- Initial data model design
|
||||
|
||||
### Sprint 7: Navigation & Theme
|
||||
- Sidebar navigation
|
||||
- Dark/light theme toggle
|
||||
- dash-mantine-components integration
|
||||
|
||||
### Sprint 8: Portfolio Website
|
||||
- About, Contact, Projects, Resume pages
|
||||
- Blog system with Markdown/frontmatter
|
||||
- Health endpoint
|
||||
|
||||
### Sprint 9: Neighbourhood Dashboard Transition
|
||||
- Phase 1: Deleted legacy TRREB code
|
||||
- Phase 2: Documentation cleanup
|
||||
- Phase 3: New neighbourhood-centric data model
|
||||
- Phase 4: dbt model restructuring
|
||||
- Phase 5: 5-tab dashboard implementation
|
||||
- Phase 6: 15 documentation notebooks
|
||||
- Phase 7: Final documentation review
|
||||
|
||||
---
|
||||
|
||||
## Application Architecture
|
||||
|
||||
### URL Routes
|
||||
|
||||
| URL | Page | File |
|
||||
|-----|------|------|
|
||||
| `/` | Home | `pages/home.py` |
|
||||
| `/about` | About | `pages/about.py` |
|
||||
| `/contact` | Contact | `pages/contact.py` |
|
||||
| `/projects` | Projects | `pages/projects.py` |
|
||||
| `/resume` | Resume | `pages/resume.py` |
|
||||
| `/blog` | Blog listing | `pages/blog/index.py` |
|
||||
| `/blog/{slug}` | Article | `pages/blog/article.py` |
|
||||
| `/toronto` | Dashboard | `pages/toronto/dashboard.py` |
|
||||
| `/toronto/methodology` | Methodology | `pages/toronto/methodology.py` |
|
||||
| `/health` | Health check | `pages/health.py` |
|
||||
|
||||
### Directory Structure
|
||||
|
||||
```
|
||||
portfolio_app/
|
||||
├── app.py # Dash app factory
|
||||
├── config.py # Pydantic BaseSettings
|
||||
├── assets/ # CSS, images
|
||||
├── callbacks/ # Global callbacks (sidebar, theme)
|
||||
├── components/ # Shared UI components
|
||||
├── content/blog/ # Markdown blog articles
|
||||
├── errors/ # Exception handling
|
||||
├── figures/ # Plotly figure factories
|
||||
├── pages/
|
||||
│ ├── home.py
|
||||
│ ├── about.py
|
||||
│ ├── contact.py
|
||||
│ ├── projects.py
|
||||
│ ├── resume.py
|
||||
│ ├── health.py
|
||||
│ ├── blog/
|
||||
│ │ ├── index.py
|
||||
│ │ └── article.py
|
||||
│ └── toronto/
|
||||
│ ├── dashboard.py
|
||||
│ ├── methodology.py
|
||||
│ ├── tabs/ # 5 tab layouts
|
||||
│ └── callbacks/ # Dashboard interactions
|
||||
├── toronto/ # Data logic
|
||||
│ ├── parsers/ # API extraction
|
||||
│ ├── loaders/ # Database operations
|
||||
│ ├── schemas/ # Pydantic models
|
||||
│ ├── models/ # SQLAlchemy ORM
|
||||
│ └── demo_data.py # Sample data
|
||||
└── utils/
|
||||
└── markdown_loader.py # Blog article loading
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Toronto Dashboard
|
||||
|
||||
### Data Sources
|
||||
|
||||
| Source | Data | Format |
|
||||
|--------|------|--------|
|
||||
| City of Toronto Open Data | Neighbourhoods (158), Census profiles, Parks, Schools, Childcare, TTC | GeoJSON, CSV, API |
|
||||
| Toronto Police Service | Crime rates, MCI, Shootings | CSV, API |
|
||||
| CMHC | Rental Market Survey | CSV |
|
||||
|
||||
### Geographic Model
|
||||
|
||||
```
|
||||
City of Toronto Neighbourhoods (158) ← Primary analysis unit
|
||||
CMHC Zones (~20) ← Rental data (Census Tract aligned)
|
||||
```
|
||||
|
||||
### Dashboard Tabs
|
||||
|
||||
| Tab | Choropleth Metric | Supporting Charts |
|
||||
|-----|-------------------|-------------------|
|
||||
| Overview | Livability score | Top/Bottom 10 bar, Income vs Safety scatter |
|
||||
| Housing | Affordability index | Rent trend line, Tenure breakdown bar |
|
||||
| Safety | Crime rate per 100K | Crime breakdown bar, Crime trend line |
|
||||
| Demographics | Median income | Age distribution, Population density bar |
|
||||
| Amenities | Amenity index | Amenity radar, Transit accessibility bar |
|
||||
|
||||
### Star Schema
|
||||
|
||||
| Table | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `dim_neighbourhood` | Dimension | 158 neighbourhoods with geometry |
|
||||
| `dim_time` | Dimension | Date dimension |
|
||||
| `dim_cmhc_zone` | Dimension | ~20 CMHC zones with geometry |
|
||||
| `fact_census` | Fact | Census indicators by neighbourhood |
|
||||
| `fact_crime` | Fact | Crime stats by neighbourhood |
|
||||
| `fact_rentals` | Fact | Rental data by CMHC zone |
|
||||
| `fact_amenities` | Fact | Amenity counts by neighbourhood |
|
||||
|
||||
### dbt Layers
|
||||
|
||||
| Layer | Naming | Example |
|
||||
|-------|--------|---------|
|
||||
| Staging | `stg_{source}__{entity}` | `stg_toronto__neighbourhoods` |
|
||||
| Intermediate | `int_{domain}__{transform}` | `int_neighbourhood__demographics` |
|
||||
| Marts | `mart_{domain}` | `mart_neighbourhood_overview` |
|
||||
|
||||
---
|
||||
|
||||
## Tech Stack
|
||||
|
||||
| Layer | Technology | Version |
|
||||
|-------|------------|---------|
|
||||
| Database | PostgreSQL + PostGIS | 16.x |
|
||||
| Validation | Pydantic | 2.x |
|
||||
| ORM | SQLAlchemy | 2.x |
|
||||
| Transformation | dbt-postgres | 1.7+ |
|
||||
| Data Processing | Pandas, GeoPandas | Latest |
|
||||
| Visualization | Dash + Plotly | 2.14+ |
|
||||
| UI Components | dash-mantine-components | Latest |
|
||||
| Testing | pytest | 7.0+ |
|
||||
| Python | 3.11+ | Via pyenv |
|
||||
|
||||
---
|
||||
|
||||
@@ -23,293 +173,51 @@ Two-project analytics portfolio demonstrating end-to-end data engineering, visua
|
||||
|
||||
| Branch | Purpose | Deploys To |
|
||||
|--------|---------|------------|
|
||||
| `main` | Production releases only | VPS (production) |
|
||||
| `main` | Production releases | VPS (production) |
|
||||
| `staging` | Pre-production testing | VPS (staging) |
|
||||
| `development` | Active development | Local only |
|
||||
|
||||
**Rules**:
|
||||
- All feature branches created FROM `development`
|
||||
- All feature branches merge INTO `development`
|
||||
- `development` → `staging` for testing
|
||||
- `staging` → `main` for release
|
||||
- Direct commits to `main` or `staging` are forbidden
|
||||
- Branch naming: `feature/{sprint}-{description}` or `fix/{issue-id}`
|
||||
**Rules:**
|
||||
- Feature branches from `development`: `feature/{sprint}-{description}`
|
||||
- Merge into `development` when complete
|
||||
- `development` → `staging` → `main` for releases
|
||||
- Never delete `development`
|
||||
|
||||
---
|
||||
|
||||
## Tech Stack (Locked)
|
||||
## Code Standards
|
||||
|
||||
| Layer | Technology | Version |
|
||||
|-------|------------|---------|
|
||||
| Database | PostgreSQL + PostGIS | 16.x |
|
||||
| Validation | Pydantic | ≥2.0 |
|
||||
| ORM | SQLAlchemy | ≥2.0 (2.0-style API only) |
|
||||
| Transformation | dbt-postgres | ≥1.7 |
|
||||
| Data Processing | Pandas | ≥2.1 |
|
||||
| Geospatial | GeoPandas + Shapely | ≥0.14 |
|
||||
| Visualization | Dash + Plotly | ≥2.14 |
|
||||
| UI Components | dash-mantine-components | Latest stable |
|
||||
| Testing | pytest | ≥7.0 |
|
||||
| Python | 3.11+ | Via pyenv |
|
||||
### Type Hints (Python 3.10+)
|
||||
|
||||
**Compatibility Notes**:
|
||||
- SQLAlchemy 2.0 + Pydantic 2.0 integrate well—never mix 1.x APIs
|
||||
- PostGIS extension required—enable during db init
|
||||
- Docker Compose V2 (no `version` field in compose files)
|
||||
```python
def process(items: list[str], config: dict[str, int] | None = None) -> bool:
    ...
```
|
||||
|
||||
---
|
||||
### Imports
|
||||
|
||||
## Code Conventions
|
||||
|
||||
### Import Style
|
||||
|
||||
| Context | Style | Example |
|
||||
|---------|-------|---------|
|
||||
| Same directory | Single dot | `from .neighbourhood import NeighbourhoodParser` |
|
||||
| Sibling directory | Double dot | `from ..schemas.neighbourhood import CensusRecord` |
|
||||
| External packages | Absolute | `import pandas as pd` |
|
||||
|
||||
### Module Separation
|
||||
|
||||
| Directory | Contains | Purpose |
|
||||
|-----------|----------|---------|
|
||||
| `schemas/` | Pydantic models | Data validation |
|
||||
| `models/` | SQLAlchemy ORM | Database persistence |
|
||||
| `parsers/` | API/CSV extraction | Raw data ingestion |
|
||||
| `loaders/` | Database operations | Data loading |
|
||||
| `figures/` | Chart factories | Plotly figure generation |
|
||||
| `callbacks/` | Dash callbacks | Per-dashboard, in `pages/{dashboard}/callbacks/` |
|
||||
| `errors/` | Exceptions + handlers | Error handling |
|
||||
|
||||
### Code Standards
|
||||
|
||||
- **Type hints**: Mandatory, Python 3.10+ style (`list[str]`, `dict[str, int]`, `X | None`)
|
||||
- **Functions**: Single responsibility, verb naming, early returns over nesting
|
||||
- **Docstrings**: Google style, minimal—only for non-obvious behavior
|
||||
- **Constants**: Module-level for magic values, Pydantic BaseSettings for runtime config
|
||||
| Context | Style |
|
||||
|---------|-------|
|
||||
| Same directory | `from .module import X` |
|
||||
| Sibling directory | `from ..schemas.model import Y` |
|
||||
| External | `import pandas as pd` |
|
||||
|
||||
### Error Handling
|
||||
|
||||
```python
|
||||
# errors/exceptions.py
|
||||
class PortfolioError(Exception):
|
||||
"""Base exception."""
|
||||
|
||||
class ParseError(PortfolioError):
|
||||
"""PDF/CSV parsing failed."""
|
||||
"""Data parsing failed."""
|
||||
|
||||
class ValidationError(PortfolioError):
|
||||
"""Pydantic or business rule validation failed."""
|
||||
"""Validation failed."""
|
||||
|
||||
class LoadError(PortfolioError):
|
||||
"""Database load operation failed."""
|
||||
"""Database load failed."""
|
||||
```
|
||||
|
||||
- Decorators for infrastructure concerns (logging, retry, transactions)
|
||||
- Explicit handling for domain logic (business rules, recovery strategies)
|
||||
|
||||
---
|
||||
|
||||
## Application Architecture
|
||||
|
||||
### Dash Pages Structure
|
||||
|
||||
```
|
||||
portfolio_app/
|
||||
├── app.py # Dash app factory with Pages routing
|
||||
├── config.py # Pydantic BaseSettings
|
||||
├── assets/ # CSS, images (auto-served by Dash)
|
||||
├── pages/
|
||||
│ ├── home.py # Bio landing page → /
|
||||
│ ├── toronto/
|
||||
│ │ ├── dashboard.py # Layout only → /toronto
|
||||
│ │ └── callbacks/ # Interaction logic
|
||||
│ └── energy/ # Phase 3
|
||||
├── components/ # Shared UI (navbar, footer, cards)
|
||||
├── figures/ # Shared chart factories
|
||||
├── toronto/ # Toronto data logic
|
||||
│ ├── parsers/
|
||||
│ ├── loaders/
|
||||
│ ├── schemas/ # Pydantic
|
||||
│ └── models/ # SQLAlchemy
|
||||
└── errors/
|
||||
```
|
||||
|
||||
### URL Routing (Automatic)
|
||||
|
||||
| URL | Page | Status |
|
||||
|-----|------|--------|
|
||||
| `/` | Bio landing page | Sprint 2 |
|
||||
| `/toronto` | Toronto Housing Dashboard | Sprint 6 |
|
||||
| `/energy` | Energy Pricing Dashboard | Phase 3 |
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Toronto Neighbourhood Dashboard
|
||||
|
||||
### Data Sources
|
||||
|
||||
| Track | Source | Format | Geography | Frequency |
|
||||
|-------|--------|--------|-----------|-----------|
|
||||
| Rentals | CMHC Rental Market Survey | API/CSV | ~20 Zones | Annual |
|
||||
| Neighbourhoods | City of Toronto Open Data | GeoJSON/CSV | 158 Neighbourhoods | Census |
|
||||
| Policy Events | Curated list | CSV | N/A | Event-based |
|
||||
|
||||
### Geographic Reality
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ City of Toronto Neighbourhoods (158) │ ← Primary analysis unit
|
||||
├─────────────────────────────────────────────────────────────────┤
|
||||
│ CMHC Zones (~20) — Census Tract aligned │ ← Rental data
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Data Model (Star Schema)
|
||||
|
||||
| Table | Type | Keys |
|
||||
|-------|------|------|
|
||||
| `fact_rentals` | Fact | → dim_time, dim_cmhc_zone |
|
||||
| `dim_time` | Dimension | date_key (PK) |
|
||||
| `dim_cmhc_zone` | Dimension | zone_key (PK), geometry |
|
||||
| `dim_neighbourhood` | Dimension | neighbourhood_id (PK), geometry |
|
||||
| `dim_policy_event` | Dimension | event_id (PK) |
|
||||
|
||||
### dbt Layer Structure
|
||||
|
||||
| Layer | Naming | Purpose |
|
||||
|-------|--------|---------|
|
||||
| Staging | `stg_{source}__{entity}` | 1:1 source, cleaned, typed |
|
||||
| Intermediate | `int_{domain}__{transform}` | Business logic, filtering |
|
||||
| Marts | `mart_{domain}` | Final analytical tables |
|
||||
|
||||
---
|
||||
|
||||
## Sprint Overview
|
||||
|
||||
| Sprint | Focus | Milestone |
|
||||
|--------|-------|-----------|
|
||||
| 1-6 | Foundation and initial dashboard | **Launch 1: Bio Live** |
|
||||
| 7 | Navigation & theme modernization | — |
|
||||
| 8 | Portfolio website expansion | **Launch 2: Website Live** |
|
||||
| 9 | Neighbourhood dashboard transition | Cleanup complete |
|
||||
| 10+ | Dashboard implementation | **Launch 3: Dashboard Live** |
|
||||
|
||||
---
|
||||
|
||||
## Scope Boundaries
|
||||
|
||||
### Phase 1 — Build These
|
||||
|
||||
- Bio landing page and portfolio website
|
||||
- CMHC rental data processor
|
||||
- Toronto neighbourhood data integration
|
||||
- PostgreSQL + PostGIS database layer
|
||||
- Star schema (facts + dimensions)
|
||||
- dbt models with tests
|
||||
- Choropleth visualization (Dash)
|
||||
- Policy event annotation layer
|
||||
|
||||
### Deferred Features
|
||||
|
||||
| Feature | Reason | When |
|
||||
|---------|--------|------|
|
||||
| Historical boundary reconciliation (140→158) | 2021+ data only for V1 | Future phase |
|
||||
| ML prediction models | Energy project scope | Phase 3 |
|
||||
| Multi-project shared infrastructure | Build first, abstract second | Future |
|
||||
|
||||
If a task seems to require deferred features, **stop and flag it**.
|
||||
|
||||
---
|
||||
|
||||
## File Structure
|
||||
|
||||
### Root-Level Files (Allowed)
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `README.md` | Project overview |
|
||||
| `CLAUDE.md` | AI assistant context |
|
||||
| `pyproject.toml` | Python packaging |
|
||||
| `.gitignore` | Git ignore rules |
|
||||
| `.env.example` | Environment template |
|
||||
| `.python-version` | pyenv version |
|
||||
| `.pre-commit-config.yaml` | Pre-commit hooks |
|
||||
| `docker-compose.yml` | Container orchestration |
|
||||
| `Makefile` | Task automation |
|
||||
|
||||
### Directory Structure
|
||||
|
||||
```
|
||||
portfolio/
|
||||
├── portfolio_app/ # Monolithic Dash application
|
||||
│ ├── app.py
|
||||
│ ├── config.py
|
||||
│ ├── assets/
|
||||
│ ├── pages/
|
||||
│ ├── components/
|
||||
│ ├── figures/
|
||||
│ ├── toronto/
|
||||
│ └── errors/
|
||||
├── tests/
|
||||
├── dbt/
|
||||
├── data/
|
||||
│ └── toronto/
|
||||
│ ├── raw/
|
||||
│ ├── processed/ # gitignored
|
||||
│ └── reference/
|
||||
├── scripts/
|
||||
│ ├── db/
|
||||
│ ├── docker/
|
||||
│ ├── deploy/
|
||||
│ ├── dbt/
|
||||
│ └── dev/
|
||||
├── docs/
|
||||
├── notebooks/
|
||||
├── backups/ # gitignored
|
||||
└── reports/ # gitignored
|
||||
```
|
||||
|
||||
### Gitignored Directories
|
||||
|
||||
- `data/*/processed/`
|
||||
- `reports/`
|
||||
- `backups/`
|
||||
- `notebooks/*.html`
|
||||
- `.env`
|
||||
- `__pycache__/`
|
||||
- `.venv/`
|
||||
|
||||
---
|
||||
|
||||
## Makefile Targets
|
||||
|
||||
| Target | Purpose |
|
||||
|--------|---------|
|
||||
| `setup` | Install deps, create .env, init pre-commit |
|
||||
| `docker-up` | Start PostgreSQL + PostGIS |
|
||||
| `docker-down` | Stop containers |
|
||||
| `db-init` | Initialize database schema |
|
||||
| `run` | Start Dash dev server |
|
||||
| `test` | Run pytest |
|
||||
| `dbt-run` | Run dbt models |
|
||||
| `dbt-test` | Run dbt tests |
|
||||
| `lint` | Run ruff linter |
|
||||
| `format` | Run ruff formatter |
|
||||
| `ci` | Run all checks |
|
||||
| `deploy` | Deploy to production |
|
||||
|
||||
---
|
||||
|
||||
## Script Standards
|
||||
|
||||
All scripts in `scripts/`:
|
||||
- Include usage comments at top
|
||||
- Idempotent where possible
|
||||
- Exit codes: 0 = success, 1 = error
|
||||
- Use `set -euo pipefail` for bash
|
||||
- Log to stdout, errors to stderr
|
||||
|
||||
---
|
||||
|
||||
## Environment Variables
|
||||
@@ -328,41 +236,52 @@ LOG_LEVEL=INFO
|
||||
|
||||
---
|
||||
|
||||
## Success Criteria
|
||||
## Makefile Targets
|
||||
|
||||
### Launch 1 (Bio Live)
|
||||
- [x] Bio page accessible via HTTPS
|
||||
- [x] All bio content rendered
|
||||
- [x] No placeholder text visible
|
||||
- [x] Mobile responsive
|
||||
- [x] Social links functional
|
||||
|
||||
### Launch 2 (Website Live)
|
||||
- [x] Full portfolio website with navigation
|
||||
- [x] About, Contact, Projects, Resume, Blog pages
|
||||
- [x] Dark mode theme support
|
||||
- [x] Sidebar navigation
|
||||
|
||||
### Launch 3 (Dashboard Live)
|
||||
- [ ] Choropleth renders neighbourhoods and CMHC zones
|
||||
- [ ] Rental data visualization works
|
||||
- [ ] Time navigation works
|
||||
- [ ] Policy event markers visible
|
||||
- [ ] Methodology documentation published
|
||||
- [ ] Data sources cited
|
||||
| Target | Purpose |
|
||||
|--------|---------|
|
||||
| `setup` | Install deps, create .env, init pre-commit |
|
||||
| `docker-up` | Start PostgreSQL + PostGIS |
|
||||
| `docker-down` | Stop containers |
|
||||
| `db-init` | Initialize database schema |
|
||||
| `run` | Start Dash dev server |
|
||||
| `test` | Run pytest |
|
||||
| `dbt-run` | Run dbt models |
|
||||
| `dbt-test` | Run dbt tests |
|
||||
| `lint` | Run ruff linter |
|
||||
| `format` | Run ruff formatter |
|
||||
| `ci` | Run all checks |
|
||||
|
||||
---
|
||||
|
||||
## Reference Documents
|
||||
## Next Steps
|
||||
|
||||
For detailed specifications, see:
|
||||
### Deployment (Sprint 10+)
|
||||
- [ ] Production Docker configuration
|
||||
- [ ] CI/CD pipeline
|
||||
- [ ] HTTPS/SSL setup
|
||||
- [ ] Domain configuration
|
||||
|
||||
| Document | Location | Use When |
|
||||
|----------|----------|----------|
|
||||
| Dashboard vision | `docs/changes/Change-Toronto-Analysis.md` | Dashboard specification |
|
||||
| Implementation plan | `docs/changes/Change-Toronto-Analysis-Reviewed.md` | Sprint planning |
|
||||
### Data Enhancement
|
||||
- [ ] Connect to live APIs (currently using demo data)
|
||||
- [ ] Data refresh automation
|
||||
- [ ] Historical data loading
|
||||
|
||||
### Future Projects
|
||||
- Energy Pricing Analysis dashboard (planned)
|
||||
|
||||
---
|
||||
|
||||
*Reference Version: 2.0*
|
||||
*Updated: Sprint 9*
|
||||
## Related Documents
|
||||
|
||||
| Document | Purpose |
|
||||
|----------|---------|
|
||||
| `README.md` | Quick start guide |
|
||||
| `CLAUDE.md` | AI assistant context |
|
||||
| `docs/CONTRIBUTING.md` | Developer guide |
|
||||
| `notebooks/README.md` | Notebook documentation |
|
||||
|
||||
---
|
||||
|
||||
*Reference Version: 3.0*
|
||||
*Updated: January 2026*
|
||||
|
||||
@@ -1,134 +0,0 @@
|
||||
# Portfolio Bio Content
|
||||
|
||||
**Version**: 2.0
|
||||
**Last Updated**: January 2026
|
||||
**Purpose**: Content source for `portfolio_app/pages/home.py`
|
||||
|
||||
---
|
||||
|
||||
## Document Context
|
||||
|
||||
| Attribute | Value |
|
||||
|-----------|-------|
|
||||
| **Parent Document** | `portfolio_project_plan_v5.md` |
|
||||
| **Role** | Bio content and social links for landing page |
|
||||
| **Consumed By** | `portfolio_app/pages/home.py` |
|
||||
|
||||
---
|
||||
|
||||
## Headline
|
||||
|
||||
**Primary**: Leo | Data Engineer & Analytics Developer
|
||||
|
||||
**Tagline**: I build data infrastructure that actually gets used.
|
||||
|
||||
---
|
||||
|
||||
## Professional Summary
|
||||
|
||||
Over the past 5 years, I've designed and evolved an enterprise analytics platform from scratch—now processing 1B+ rows across 21 tables with Python-based ETL pipelines and dbt-style SQL transformations. The result: 40% efficiency gains, 30% reduction in call abandon rates, and dashboards that executives actually open.
|
||||
|
||||
My approach: dimensional modeling (star schema), layered transformations (staging → intermediate → marts), and automation that eliminates manual work. I've built everything from self-service analytics portals to OCR-powered receipt processing systems.
|
||||
|
||||
Currently at Summitt Energy supporting multi-market operations across Canada and 8 US states. Previously cut my teeth on IT infrastructure projects at Petrobras (Fortune 500) and the Project Management Institute.
|
||||
|
||||
---
|
||||
|
||||
## Tech Stack
|
||||
|
||||
| Category | Technologies |
|
||||
|----------|--------------|
|
||||
| **Languages** | Python, SQL |
|
||||
| **Data Processing** | Pandas, SQLAlchemy, FastAPI |
|
||||
| **Databases** | PostgreSQL, MSSQL |
|
||||
| **Visualization** | Power BI, Plotly, Dash |
|
||||
| **Patterns** | dbt, dimensional modeling, star schema |
|
||||
| **Other** | Genesys Cloud |
|
||||
|
||||
**Display Format** (for landing page):
|
||||
```
Python (Pandas, SQLAlchemy, FastAPI) • SQL (MSSQL, PostgreSQL) • Power BI • Plotly/Dash • Genesys Cloud • dbt patterns
```
|
||||
|
||||
---
|
||||
|
||||
## Side Project
|
||||
|
||||
**Bandit Labs** — Building automation and AI tooling for small businesses.
|
||||
|
||||
*Note: Keep this brief on the portfolio; link to it only if a separate landing page exists.*
|
||||
|
||||
---
|
||||
|
||||
## Social Links
|
||||
|
||||
| Platform | URL | Icon |
|
||||
|----------|-----|------|
|
||||
| **LinkedIn** | `https://linkedin.com/in/[USERNAME]` | `lucide-react: Linkedin` |
|
||||
| **GitHub** | `https://github.com/[USERNAME]` | `lucide-react: Github` |
|
||||
|
||||
> **TODO**: Replace `[USERNAME]` placeholders with actual URLs before bio page launch.
|
||||
|
||||
---
|
||||
|
||||
## Availability Statement
|
||||
|
||||
Open to **Senior Data Analyst**, **Analytics Engineer**, and **BI Developer** opportunities in Toronto or remote.
|
||||
|
||||
---
|
||||
|
||||
## Portfolio Projects Section
|
||||
|
||||
*Dynamically populated based on deployed projects.*
|
||||
|
||||
| Project | Status | Link |
|
||||
|---------|--------|------|
|
||||
| Toronto Housing Dashboard | In Development | `/toronto` |
|
||||
| Energy Pricing Analysis | Planned | `/energy` |
|
||||
|
||||
**Display Logic**:
|
||||
- Show only projects with `status = deployed`
|
||||
- "In Development" projects can show as coming soon or be hidden (user preference)
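
The display logic above can be sketched as a small filter. The `Project` class, the status strings, and the `show_in_development` flag are assumptions made for illustration; they are not taken from `home.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str
    status: str  # assumed values: "deployed", "in_development", "planned"
    link: str


def visible_projects(projects, show_in_development=False):
    """Return the projects the landing page should render.

    Deployed projects are always shown; in-development projects are
    shown only when the (hypothetical) preference flag is enabled.
    """
    allowed = {"deployed"}
    if show_in_development:
        allowed.add("in_development")
    return [p for p in projects if p.status in allowed]


PROJECTS = [
    Project("Toronto Housing Dashboard", "in_development", "/toronto"),
    Project("Energy Pricing Analysis", "planned", "/energy"),
]
```

With the statuses listed in the table, `visible_projects(PROJECTS)` returns an empty list, and enabling the flag returns only the Toronto dashboard.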
|
||||
|
||||
---
|
||||
|
||||
## Implementation Notes
|
||||
|
||||
### Content Hierarchy for `home.py`
|
||||
|
||||
```
1. Name + Tagline (hero section)
2. Professional Summary (2-3 paragraphs)
3. Tech Stack (horizontal chips or inline list)
4. Portfolio Projects (cards linking to dashboards)
5. Social Links (icon buttons)
6. Availability statement (subtle, bottom)
```
|
||||
|
||||
### Styling Recommendations
|
||||
|
||||
- Clean, minimal — let the projects speak
|
||||
- Dark/light mode support via dash-mantine-components theme
|
||||
- No headshot required (optional)
|
||||
- Mobile-responsive layout
|
||||
|
||||
### Content Updates
|
||||
|
||||
When updating bio content:
|
||||
1. Edit this document
|
||||
2. Update `home.py` to reflect changes
|
||||
3. Redeploy
|
||||
|
||||
---
|
||||
|
||||
## Related Documents
|
||||
|
||||
| Document | Relationship |
|
||||
|----------|--------------|
|
||||
| `portfolio_project_plan_v5.md` | Parent — references this for bio content |
|
||||
| `portfolio_app/pages/home.py` | Consumer — implements this content |
|
||||
|
||||
---
|
||||
|
||||
*Document Version: 2.0*
|
||||
*Updated: January 2026*
|
||||
@@ -1,276 +0,0 @@
|
||||
# Toronto Neighbourhood Dashboard — Implementation Plan
|
||||
|
||||
**Document Type:** Execution Guide
|
||||
**Target:** Transition from TRREB-based to Neighbourhood-based Dashboard
|
||||
**Version:** 2.0 | January 2026
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Transition from the TRREB district-based housing dashboard to a comprehensive Toronto Neighbourhood Dashboard built around the city's 158 official neighbourhoods.
|
||||
|
||||
**Key Changes:**
|
||||
- Geographic foundation: TRREB districts (~35) → City Neighbourhoods (158)
|
||||
- Data sources: PDF parsing → Open APIs (Toronto Open Data, Toronto Police, CMHC)
|
||||
- Scope: Housing-only → 5 thematic tabs (Overview, Housing, Safety, Demographics, Amenities)
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Repository Cleanup
|
||||
|
||||
### Files to DELETE
|
||||
|
||||
| File | Reason |
|
||||
|------|--------|
|
||||
| `portfolio_app/toronto/schemas/trreb.py` | TRREB schema obsolete |
|
||||
| `portfolio_app/toronto/parsers/trreb.py` | PDF parsing no longer needed |
|
||||
| `portfolio_app/toronto/loaders/trreb.py` | TRREB loading logic obsolete |
|
||||
| `dbt/models/staging/stg_trreb__purchases.sql` | TRREB staging obsolete |
|
||||
| `dbt/models/intermediate/int_purchases__monthly.sql` | TRREB intermediate obsolete |
|
||||
| `dbt/models/marts/mart_toronto_purchases.sql` | Will rebuild for neighbourhood grain |
|
||||
|
||||
### Files to MODIFY (Remove TRREB References)
|
||||
|
||||
| File | Action |
|
||||
|------|--------|
|
||||
| `portfolio_app/toronto/schemas/__init__.py` | Remove TRREB imports |
|
||||
| `portfolio_app/toronto/parsers/__init__.py` | Remove TRREB parser imports |
|
||||
| `portfolio_app/toronto/loaders/__init__.py` | Remove TRREB loader imports |
|
||||
| `portfolio_app/toronto/models/facts.py` | Remove `FactPurchases` model |
|
||||
| `portfolio_app/toronto/models/dimensions.py` | Remove `DimTRREBDistrict` model |
|
||||
| `portfolio_app/toronto/demo_data.py` | Remove TRREB demo data |
|
||||
| `dbt/models/sources.yml` | Remove TRREB source definitions |
|
||||
| `dbt/models/schema.yml` | Remove TRREB model documentation |
|
||||
|
||||
### Files to KEEP (Reusable)
|
||||
|
||||
| File | Why |
|
||||
|------|-----|
|
||||
| `portfolio_app/toronto/schemas/cmhc.py` | CMHC data still used |
|
||||
| `portfolio_app/toronto/parsers/cmhc.py` | Reusable with modifications |
|
||||
| `portfolio_app/toronto/loaders/base.py` | Generic database utilities |
|
||||
| `portfolio_app/toronto/loaders/dimensions.py` | Dimension loading patterns |
|
||||
| `portfolio_app/toronto/models/base.py` | SQLAlchemy base class |
|
||||
| `portfolio_app/figures/*.py` | All chart factories reusable |
|
||||
| `portfolio_app/components/*.py` | All UI components reusable |
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Documentation Updates
|
||||
|
||||
| Document | Action |
|
||||
|----------|--------|
|
||||
| `CLAUDE.md` | Update data model section, mark transition complete |
|
||||
| `docs/PROJECT_REFERENCE.md` | Update architecture, data sources |
|
||||
| `docs/toronto_housing_dashboard_spec_v5.md` | Archive or delete |
|
||||
| `docs/wbs_sprint_plan_v4.md` | Archive or delete |
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: New Data Model
|
||||
|
||||
### Star Schema (Neighbourhood-Centric)
|
||||
|
||||
| Table | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `dim_neighbourhood` | Central Dimension | 158 neighbourhoods with geometry |
|
||||
| `dim_time` | Dimension | Date dimension (keep existing) |
|
||||
| `dim_cmhc_zone` | Bridge Dimension | 15 CMHC zones with neighbourhood mapping |
|
||||
| `bridge_cmhc_neighbourhood` | Bridge | Zone-to-neighbourhood area weights |
|
||||
| `fact_census` | Fact | Census indicators by neighbourhood |
|
||||
| `fact_crime` | Fact | Crime stats by neighbourhood |
|
||||
| `fact_rentals` | Fact | Rental data by CMHC zone (keep existing) |
|
||||
| `fact_amenities` | Fact | Amenity counts by neighbourhood |
|
||||
|
||||
### New Schema Files
|
||||
|
||||
| File | Contains |
|
||||
|------|----------|
|
||||
| `toronto/schemas/neighbourhood.py` | NeighbourhoodRecord, CensusRecord, CrimeRecord |
|
||||
| `toronto/schemas/amenities.py` | AmenityType enum, AmenityRecord |
|
||||
|
||||
### New Parser Files
|
||||
|
||||
| File | Data Source | API |
|
||||
|------|-------------|-----|
|
||||
| `toronto/parsers/toronto_open_data.py` | Neighbourhoods, Census, Parks, Schools, Childcare | Toronto Open Data Portal |
|
||||
| `toronto/parsers/toronto_police.py` | Crime Rates, MCI, Shootings | Toronto Police Portal |
|
||||
|
||||
### New Loader Files
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `toronto/loaders/neighbourhoods.py` | Load GeoJSON boundaries |
|
||||
| `toronto/loaders/census.py` | Load neighbourhood profiles |
|
||||
| `toronto/loaders/crime.py` | Load crime statistics |
|
||||
| `toronto/loaders/amenities.py` | Load parks, schools, childcare |
|
||||
| `toronto/loaders/cmhc_crosswalk.py` | Build CMHC-neighbourhood bridge |
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: dbt Restructuring
|
||||
|
||||
### Staging Layer
|
||||
|
||||
| Model | Source |
|
||||
|-------|--------|
|
||||
| `stg_toronto__neighbourhoods` | dim_neighbourhood |
|
||||
| `stg_toronto__census` | fact_census |
|
||||
| `stg_toronto__crime` | fact_crime |
|
||||
| `stg_toronto__amenities` | fact_amenities |
|
||||
| `stg_cmhc__rentals` | fact_rentals (modify existing) |
|
||||
| `stg_cmhc__zone_crosswalk` | bridge_cmhc_neighbourhood |
|
||||
|
||||
### Intermediate Layer
|
||||
|
||||
| Model | Purpose |
|
||||
|-------|---------|
|
||||
| `int_neighbourhood__demographics` | Combined census demographics |
|
||||
| `int_neighbourhood__housing` | Housing indicators |
|
||||
| `int_neighbourhood__crime_summary` | Aggregated crime by type |
|
||||
| `int_neighbourhood__amenity_scores` | Normalized amenity metrics |
|
||||
| `int_rentals__neighbourhood_allocated` | CMHC rentals allocated to neighbourhoods |
|
||||
|
||||
### Mart Layer (One per Tab)
|
||||
|
||||
| Model | Tab | Key Metrics |
|
||||
|-------|-----|-------------|
|
||||
| `mart_neighbourhood_overview` | Overview | Composite livability score |
|
||||
| `mart_neighbourhood_housing` | Housing | Affordability index, rent-to-income |
|
||||
| `mart_neighbourhood_safety` | Safety | Crime rates, YoY change |
|
||||
| `mart_neighbourhood_demographics` | Demographics | Income, age, diversity |
|
||||
| `mart_neighbourhood_amenities` | Amenities | Parks, schools, transit per capita |
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: Dashboard Implementation
|
||||
|
||||
### Tab Structure
|
||||
|
||||
```
pages/toronto/
├── dashboard.py            # Main layout with tab navigation
├── tabs/
│   ├── overview.py         # Composite livability
│   ├── housing.py          # Affordability
│   ├── safety.py           # Crime
│   ├── demographics.py     # Population
│   └── amenities.py        # Services
└── callbacks/
    ├── map_callbacks.py
    ├── chart_callbacks.py
    └── selection_callbacks.py
```
|
||||
|
||||
### Layout Pattern (All Tabs)
|
||||
|
||||
Each tab follows the same structure:
|
||||
1. **Choropleth Map** (left) — 158 neighbourhoods, click to select
|
||||
2. **KPI Cards** (right) — 3-4 contextual metrics
|
||||
3. **Supporting Charts** (bottom) — Trend + comparison visualizations
|
||||
4. **Details Panel** (collapsible) — All metrics for selected neighbourhood
|
||||
|
||||
### Graphs by Tab
|
||||
|
||||
| Tab | Choropleth Metric | Chart 1 | Chart 2 |
|
||||
|-----|-------------------|---------|---------|
|
||||
| Overview | Livability score | Top/Bottom 10 bar | Income vs Crime scatter |
|
||||
| Housing | Affordability index | Rent trend (5yr line) | Dwelling types (pie/bar) |
|
||||
| Safety | Crime rate per 100K | Crime breakdown (stacked bar) | Crime trend (5yr line) |
|
||||
| Demographics | Median income | Age pyramid | Top languages (bar) |
|
||||
| Amenities | Park area per capita | Amenity radar | Transit accessibility (bar) |
|
||||
|
||||
---
|
||||
|
||||
## Phase 6: Jupyter Notebooks
|
||||
|
||||
### Purpose
|
||||
|
||||
One notebook per graph to document:
|
||||
1. **Data Reference** — How the data was built (query, transformation steps, sample output)
|
||||
2. **Data Visualization** — Import figure factory, render the graph
|
||||
|
||||
### Directory Structure
|
||||
|
||||
```
notebooks/
├── README.md
├── overview/
├── housing/
├── safety/
├── demographics/
└── amenities/
```
|
||||
|
||||
### Notebook Template
|
||||
|
||||
````markdown
# [Graph Name]

## 1. Data Reference

### Source Tables
- List tables/marts used
- Grain of each table

### Query
```sql
SELECT ... FROM ...
```

### Transformation Steps
1. Step description
2. Step description

### Sample Data
```python
df = pd.read_sql(query, engine)
df.head(10)
```

## 2. Data Visualization

```python
from portfolio_app.figures.choropleth import create_choropleth_figure
fig = create_choropleth_figure(...)
fig.show()
```
````
|
||||
|
||||
Create one notebook per graph as each is implemented (15 total across 5 tabs).
|
||||
|
||||
---
|
||||
|
||||
## Phase 7: Final Documentation Review
|
||||
|
||||
After implementation is complete, audit and update:
|
||||
|
||||
- [ ] `CLAUDE.md` — Project status, app structure, data model, URL routes
|
||||
- [ ] `README.md` — Project description, installation, quick start
|
||||
- [ ] `docs/PROJECT_REFERENCE.md` — Architecture matches implementation
|
||||
- [ ] Remove or archive legacy spec documents
|
||||
|
||||
---
|
||||
|
||||
## Data Source Reference
|
||||
|
||||
| Source | Datasets | URL |
|
||||
|--------|----------|-----|
|
||||
| Toronto Open Data | Neighbourhoods, Census Profiles, Parks, Schools, Childcare, TTC | open.toronto.ca |
|
||||
| Toronto Police | Crime Rates, MCI, Shootings | data.torontopolice.on.ca |
|
||||
| CMHC | Rental Market Survey | cmhc-schl.gc.ca |
|
||||
|
||||
---
|
||||
|
||||
## CMHC Zone Mapping Note
|
||||
|
||||
CMHC uses 15 zones that don't align with 158 neighbourhoods. Strategy:
|
||||
- Create `bridge_cmhc_neighbourhood` with area weights
|
||||
- Allocate rental metrics proportionally to overlapping neighbourhoods
|
||||
- Document methodology in `/toronto/methodology` page
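
The allocation strategy above can be sketched in plain Python. This is a minimal illustration, not the project's loader: it assumes each bridge row is `(zone_id, neighbourhood_id, weight)`, where `weight` is the fraction of the neighbourhood's area that falls inside the zone, and that the metric is an average (such as rent) that should be blended rather than summed. The function name and sample identifiers are hypothetical.

```python
from collections import defaultdict


def allocate_to_neighbourhoods(zone_values, bridge):
    """Blend zone-level averages into neighbourhood-level estimates.

    zone_values: {zone_id: metric value, e.g. average rent}
    bridge: iterable of (zone_id, neighbourhood_id, weight) rows
    Returns {neighbourhood_id: area-weighted average of overlapping zones}.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for zone_id, neighbourhood_id, weight in bridge:
        if zone_id not in zone_values:
            continue  # no data published for this zone
        weighted_sum[neighbourhood_id] += zone_values[zone_id] * weight
        weight_total[neighbourhood_id] += weight
    # Dividing by the total weight renormalises when a zone is missing.
    return {
        n: weighted_sum[n] / weight_total[n]
        for n in weighted_sum
        if weight_total[n] > 0
    }


# Illustrative values only: neighbourhood 2 straddles two zones.
zone_rent = {"Z01": 2000.0, "Z02": 1600.0}
bridge_rows = [("Z01", 1, 1.0), ("Z01", 2, 0.25), ("Z02", 2, 0.75)]
estimates = allocate_to_neighbourhoods(zone_rent, bridge_rows)
```

Here neighbourhood 1 inherits the Z01 rent unchanged, while neighbourhood 2 gets 0.25 × 2000 + 0.75 × 1600 = 1700. Count-like metrics (for example, rental universe) would need weights defined relative to the zone instead, which this sketch does not cover.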
|
||||
|
||||
---
|
||||
|
||||
*Document Version: 2.0*
|
||||
*Trimmed from v1.0 for execution clarity*
|
||||
@@ -1,423 +0,0 @@
|
||||
# Toronto Neighbourhood Dashboard — Deliverables
|
||||
|
||||
**Project Type:** Interactive Data Visualization Dashboard
|
||||
**Geographic Scope:** City of Toronto, 158 Official Neighbourhoods
|
||||
**Author:** Leo Miranda
|
||||
**Version:** 1.0 | January 2026
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
Multi-tab analytics dashboard built around Toronto's official neighbourhood boundaries. The core interaction is a choropleth map where users explore the city through different thematic lenses—housing affordability, safety, demographics, amenities—with supporting visualizations that tell a cohesive story per theme.
|
||||
|
||||
**Primary Goals:**
|
||||
1. Demonstrate interactive data visualization skills (Plotly/Dash)
|
||||
2. Showcase data engineering capabilities (multi-source ETL, dimensional modeling)
|
||||
3. Create a portfolio piece with genuine analytical value
|
||||
|
||||
---
|
||||
|
||||
## Part 1: Geographic Foundation (Required First)
|
||||
|
||||
| Dataset | Source | Format | Last Updated | Download |
|
||||
|---------|--------|--------|--------------|----------|
|
||||
| **Neighbourhoods Boundaries** | Toronto Open Data | GeoJSON | 2024 | [Link](https://open.toronto.ca/dataset/neighbourhoods/) |
|
||||
| **Neighbourhood Profiles** | Toronto Open Data | CSV | 2021 Census | [Link](https://open.toronto.ca/dataset/neighbourhood-profiles/) |
|
||||
|
||||
**Critical Notes:**
|
||||
- Toronto uses 158 official neighbourhoods (updated 2024, was 140)
|
||||
- GeoJSON includes `AREA_ID` for joining to tabular data
|
||||
- Neighbourhood Profiles has 2,400+ indicators per neighbourhood from Census
|
||||
|
||||
---
|
||||
|
||||
## Part 2: Tier 1 — MVP Datasets
|
||||
|
||||
| Dataset | Source | Measures Available | Update Freq | Granularity |
|
||||
|---------|--------|-------------------|-------------|-------------|
|
||||
| **Neighbourhoods GeoJSON** | Toronto Open Data | Boundary polygons, area IDs | Static | Neighbourhood |
|
||||
| **Neighbourhood Profiles (full)** | Toronto Open Data | 2,400+ Census indicators | Every 5 years | Neighbourhood |
|
||||
| **Neighbourhood Crime Rates** | Toronto Police Portal | MCI rates per 100K by year | Annual | Neighbourhood |
|
||||
| **CMHC Rental Market Survey** | CMHC Portal | Avg rent by bedroom, vacancy rate | Annual (Oct) | 15 CMHC Zones |
|
||||
| **Parks** | Toronto Open Data | Park locations, area, type | Annual | Point/Polygon |
|
||||
|
||||
**Total API/Download Calls:** 5
|
||||
**Data Volume:** ~50MB combined
|
||||
|
||||
### Tier 1 Measures to Extract
|
||||
|
||||
**From Neighbourhood Profiles:**
|
||||
- Population, population density
|
||||
- Median household income
|
||||
- Age distribution (0-14, 15-24, 25-44, 45-64, 65+)
|
||||
- % Immigrants, % Visible minorities
|
||||
- Top languages spoken
|
||||
- Unemployment rate
|
||||
- Education attainment (% with post-secondary)
|
||||
- Housing tenure (own vs rent %)
|
||||
- Dwelling types distribution
|
||||
- Average rent, housing costs as % of income
|
||||
|
||||
**From Crime Rates:**
|
||||
- Total MCI rate per 100K population
|
||||
- Year-over-year crime trend
|
||||
|
||||
**From CMHC:**
|
||||
- Average monthly rent (1BR, 2BR, 3BR)
|
||||
- Vacancy rates
|
||||
|
||||
**From Parks:**
|
||||
- Park count per neighbourhood
|
||||
- Park area per capita
|
||||
|
||||
---
|
||||
|
||||
## Part 3: Tier 2 — Expansion Datasets
|
||||
|
||||
| Dataset | Source | Measures Available | Update Freq | Granularity |
|
||||
|---------|--------|-------------------|-------------|-------------|
|
||||
| **Major Crime Indicators (MCI)** | Toronto Police Portal | Assault, B&E, auto theft, robbery, theft over | Quarterly | Neighbourhood |
|
||||
| **Shootings & Firearm Discharges** | Toronto Police Portal | Shooting incidents, injuries, fatalities | Quarterly | Neighbourhood |
|
||||
| **Building Permits** | Toronto Open Data | New construction, permits by type | Monthly | Address-level |
|
||||
| **Schools** | Toronto Open Data | Public/Catholic, elementary/secondary | Annual | Point |
|
||||
| **TTC Routes & Stops** | Toronto Open Data | Route geometry, stop locations | Static | Route/Stop |
|
||||
| **Licensed Child Care Centres** | Toronto Open Data | Capacity, ages served, locations | Annual | Point |
|
||||
|
||||
### Tier 2 Measures to Extract
|
||||
|
||||
**From MCI Details:**
|
||||
- Breakdown by crime type (assault, B&E, auto theft, robbery, theft over)
|
||||
|
||||
**From Shootings:**
|
||||
- Shooting incidents count
|
||||
- Injuries/fatalities
|
||||
|
||||
**From Building Permits:**
|
||||
- New construction permits (trailing 12 months)
|
||||
- Permit types distribution
|
||||
|
||||
**From Schools:**
|
||||
- Schools per 1000 children
|
||||
- School type breakdown
|
||||
|
||||
**From TTC:**
|
||||
- Transit stops within neighbourhood
|
||||
- Transit accessibility score
|
||||
|
||||
**From Child Care:**
|
||||
- Child care spaces per capita
|
||||
- Coverage by age group
|
||||
|
||||
---
|
||||
|
||||
## Part 4: Data Sources by Thematic Group
|
||||
|
||||
### GROUP A: Housing & Affordability
|
||||
|
||||
| Dataset | Tier | Measures | Update Freq |
|
||||
|---------|------|----------|-------------|
|
||||
| Neighbourhood Profiles (Housing) | 1 | Avg rent, ownership %, dwelling types, housing costs as % of income | Every 5 years |
|
||||
| CMHC Rental Market Survey | 1 | Avg rent by bedroom, vacancy rate, rental universe | Annual |
|
||||
| Building Permits | 2 | New construction, permits by type | Monthly |
|
||||
|
||||
**Calculated Metrics:**
|
||||
- Rent-to-Income Ratio (CMHC rent ÷ Census income)
|
||||
- Affordability Index (% of income spent on housing)
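
A minimal sketch of the rent-to-income calculation, under the assumption that CMHC rent is monthly and Census income is annual, so rent is annualised before dividing. The function name and the sample figures are illustrative, not taken from the project.

```python
def rent_to_income_ratio(avg_monthly_rent, median_annual_income):
    """Annual rent divided by annual household income.

    Returns None when income is missing or non-positive, since the
    ratio is undefined in that case.
    """
    if median_annual_income is None or median_annual_income <= 0:
        return None
    return (avg_monthly_rent * 12) / median_annual_income


# Illustrative numbers: $2,000/month rent against $80,000/year income.
ratio = rent_to_income_ratio(2000, 80000)
share_of_income_pct = ratio * 100
```

Multiplying the ratio by 100 gives the share of income spent on rent, which is the quantity the affordability thresholds are expressed in.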
|
||||
|
||||
---
|
||||
|
||||
### GROUP B: Safety & Crime
|
||||
|
||||
| Dataset | Tier | Measures | Update Freq |
|
||||
|---------|------|----------|-------------|
|
||||
| Neighbourhood Crime Rates | 1 | MCI rates per 100K pop by year | Annual |
|
||||
| Major Crime Indicators (MCI) | 2 | Assault, B&E, auto theft, robbery, theft over | Quarterly |
|
||||
| Shootings & Firearm Discharges | 2 | Shooting incidents, injuries, fatalities | Quarterly |
|
||||
|
||||
**Calculated Metrics:**
|
||||
- Year-over-year crime change %
|
||||
- Crime type distribution
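
Both calculated metrics are simple enough to sketch directly. The function names are hypothetical, and the inputs can be either counts or rates as long as both years use the same unit.

```python
def yoy_change_pct(current, previous):
    """Percentage change from the previous year to the current year."""
    if previous == 0:
        return None  # undefined without a non-zero baseline
    return (current - previous) / previous * 100


def crime_type_shares(counts):
    """Share of each crime type in the total, as fractions summing to 1."""
    total = sum(counts.values())
    if total == 0:
        return {crime: 0.0 for crime in counts}
    return {crime: n / total for crime, n in counts.items()}


# Illustrative counts only.
shares = crime_type_shares({"Assault": 50, "Robbery": 25, "Auto Theft": 25})
```

Returning `None` for a zero baseline lets the dashboard show "n/a" instead of an infinite or misleading percentage.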
|
||||
|
||||
---
|
||||
|
||||
### GROUP C: Demographics & Community
|
||||
|
||||
| Dataset | Tier | Measures | Update Freq |
|
||||
|---------|------|----------|-------------|
|
||||
| Neighbourhood Profiles (Demographics) | 1 | Age distribution, household composition, income | Every 5 years |
|
||||
| Neighbourhood Profiles (Immigration) | 1 | Immigration status, visible minorities, languages | Every 5 years |
|
||||
| Neighbourhood Profiles (Education) | 1 | Education attainment, field of study | Every 5 years |
|
||||
| Neighbourhood Profiles (Labour) | 1 | Employment rate, occupation, industry | Every 5 years |
|
||||
|
||||
---
|
||||
|
||||
### GROUP D: Transportation & Mobility
|
||||
|
||||
| Dataset | Tier | Measures | Update Freq |
|
||||
|---------|------|----------|-------------|
|
||||
| Commute Mode (Census) | 1 | % car, transit, walk, bike | Every 5 years |
|
||||
| TTC Routes & Stops | 2 | Route geometry, stop locations | Static |
|
||||
|
||||
**Calculated Metrics:**
|
||||
- Transit accessibility (stops within 500m of neighbourhood centroid)
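
One way to compute this metric without a spatial database is a great-circle distance check, sketched below. The haversine formula and the 6,371 km mean Earth radius are standard; the function names and coordinates are illustrative assumptions, and in the project itself a PostGIS distance query could serve the same purpose.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def stops_within(centroid, stops, radius_m=500):
    """Count stops whose distance to the centroid is at most radius_m."""
    lat, lon = centroid
    return sum(
        1 for s_lat, s_lon in stops
        if haversine_m(lat, lon, s_lat, s_lon) <= radius_m
    )


# Illustrative coordinates near downtown Toronto.
centroid = (43.65, -79.38)
stops = [
    (43.65, -79.38),   # at the centroid
    (43.653, -79.38),  # roughly 330 m north
    (43.66, -79.38),   # roughly 1.1 km north
]
nearby = stops_within(centroid, stops)
```

A centroid-based count is a coarse proxy: large or irregularly shaped neighbourhoods may have many stops that sit more than 500 m from the centroid.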
|
||||
|
||||
---
|
||||
|
||||
### GROUP E: Amenities & Services
|
||||
|
||||
| Dataset | Tier | Measures | Update Freq |
|
||||
|---------|------|----------|-------------|
|
||||
| Parks | 1 | Park locations, area, type | Annual |
|
||||
| Schools | 2 | Public/Catholic, elementary/secondary | Annual |
|
||||
| Licensed Child Care Centres | 2 | Capacity, ages served | Annual |
|
||||
|
||||
**Calculated Metrics:**
|
||||
- Park area per capita
|
||||
- Schools per 1000 children (ages 5-17)
|
||||
- Child care spaces per 1000 children (ages 0-4)
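
The per-capita metrics above share one normalisation step, sketched here with a hypothetical helper. The sample numbers are made up for illustration.

```python
def rate_per(count, population, per=1000):
    """Normalise a count to a rate per `per` people.

    Returns None when the population is missing or non-positive.
    """
    if population is None or population <= 0:
        return None
    return count / population * per


# Illustrative: 3 schools for 1,500 children aged 5-17.
schools_per_1000 = rate_per(3, 1500)
# The same helper covers crime rates per 100K population.
crime_per_100k = rate_per(45, 30000, per=100_000)
```

Using one helper for every per-capita figure keeps the denominator handling (zero or missing population) consistent across tabs.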
|
||||
|
||||
---
|
||||
|
||||
## Part 5: Tab Structure
|
||||
|
||||
### Tab Architecture
|
||||
|
||||
```
┌────────────────────────────────────────────────────────────────┐
│  [Overview]  [Housing]  [Safety]  [Demographics]  [Amenities]  │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  ┌─────────────────────────────────┐    ┌────────────────┐     │
│  │                                 │    │  KPI Card 1    │     │
│  │         CHOROPLETH MAP          │    ├────────────────┤     │
│  │      (158 Neighbourhoods)       │    │  KPI Card 2    │     │
│  │                                 │    ├────────────────┤     │
│  │         Click to select         │    │  KPI Card 3    │     │
│  │                                 │    └────────────────┘     │
│  └─────────────────────────────────┘                           │
│                                                                │
│  ┌─────────────────────┐    ┌─────────────────────┐            │
│  │ Supporting Chart 1  │    │ Supporting Chart 2  │            │
│  │ (Context/Trend)     │    │ (Comparison/Rank)   │            │
│  └─────────────────────┘    └─────────────────────┘            │
│                                                                │
│  [Neighbourhood: Selected Name] ────────────────────────       │
│  Details panel with all metrics for selected area              │
└────────────────────────────────────────────────────────────────┘
```
|
||||
|
||||
---
|
||||
|
||||
### Tab 1: Overview (Default Landing)
|
||||
|
||||
**Story:** "How do Toronto neighbourhoods compare across key livability metrics?"
|
||||
|
||||
| Element | Content | Data Source |
|
||||
|---------|---------|-------------|
|
||||
| Map Colour | Composite livability score | Calculated from weighted metrics |
|
||||
| KPI Cards | Population, Median Income, Avg Crime Rate | Neighbourhood Profiles, Crime Rates |
|
||||
| Chart 1 | Top 10 / Bottom 10 by livability score | Calculated |
|
||||
| Chart 2 | Income vs Crime scatter plot | Neighbourhood Profiles, Crime Rates |
|
||||
|
||||
**Metric Selector:** Allow user to change map colour by any single metric.
|
||||
|
||||
---
|
||||
|
||||
### Tab 2: Housing & Affordability
|
||||
|
||||
**Story:** "Where can you afford to live, and what's being built?"
|
||||
|
||||
| Element | Content | Data Source |
|
||||
|---------|---------|-------------|
|
||||
| Map Colour | Rent-to-Income Ratio (Affordability Index) | CMHC + Census income |
|
||||
| KPI Cards | Median Rent (1BR), Vacancy Rate, New Permits (12mo) | CMHC, Building Permits |
|
||||
| Chart 1 | Rent trend (5-year line chart by bedroom) | CMHC historical |
|
||||
| Chart 2 | Dwelling type breakdown (pie/bar) | Neighbourhood Profiles |
|
||||
|
||||
**Metric Selector:** Toggle between rent, ownership %, dwelling types.
|
||||
|
||||
---
|
||||
|
||||
### Tab 3: Safety
|
||||
|
||||
**Story:** "How safe is each neighbourhood, and what crimes are most common?"
|
||||
|
||||
| Element | Content | Data Source |
|
||||
|---------|---------|-------------|
|
||||
| Map Colour | Total MCI Rate per 100K | Crime Rates |
|
||||
| KPI Cards | Total Crimes, YoY Change %, Shooting Incidents | Crime Rates, Shootings |
|
||||
| Chart 1 | Crime type breakdown (stacked bar) | MCI Details |
|
||||
| Chart 2 | 5-year crime trend (line chart) | Crime Rates historical |
|
||||
|
||||
**Metric Selector:** Toggle between total crime, specific crime types, shootings.
|
||||
|
||||
---
|
||||
|
||||
### Tab 4: Demographics
|
||||
|
||||
**Story:** "Who lives here? Age, income, diversity."
|
||||
|
||||
| Element | Content | Data Source |
|
||||
|---------|---------|-------------|
|
||||
| Map Colour | Median Household Income | Neighbourhood Profiles |
|
||||
| KPI Cards | Population, % Immigrant, Unemployment Rate | Neighbourhood Profiles |
|
||||
| Chart 1 | Age distribution (population pyramid or bar) | Neighbourhood Profiles |
|
||||
| Chart 2 | Top languages spoken (horizontal bar) | Neighbourhood Profiles |
|
||||
|
||||
**Metric Selector:** Income, immigrant %, age groups, education.
|
||||
|
||||
---
|
||||
|
||||
### Tab 5: Amenities & Services
|
||||
|
||||
**Story:** "What's nearby? Parks, schools, child care, transit."
|
||||
|
||||
| Element | Content | Data Source |
|
||||
|---------|---------|-------------|
|
||||
| Map Colour | Park Area per Capita | Parks + Population |
|
||||
| KPI Cards | Parks Count, Schools Count, Child Care Spaces | Multiple datasets |
|
||||
| Chart 1 | Amenity density comparison (radar or bar) | Calculated |
|
||||
| Chart 2 | Transit accessibility (stops within 500m) | TTC Stops |
|
||||
|
||||
**Metric Selector:** Parks, schools, child care, transit access.
|
||||
|
||||
---
|
||||
|
||||
## Part 6: Data Pipeline Architecture
|
||||
|
||||
### ETL Flow
|
||||
|
||||
```
┌─────────────────┐     ┌─────────────────┐     ┌──────────────────┐
│  DATA SOURCES   │     │  STAGING LAYER  │     │    MART LAYER    │
│                 │     │                 │     │                  │
│ Toronto Open    │────▶│ stg_geography   │────▶│ dim_neighbourhood│
│ Data Portal     │     │ stg_census      │     │ fact_crime       │
│                 │     │ stg_crime       │     │ fact_housing     │
│ CMHC Portal     │────▶│ stg_rental      │     │ fact_amenities   │
│                 │     │ stg_permits     │     │                  │
│ Toronto Police  │────▶│ stg_amenities   │     │ agg_dashboard    │
│ Portal          │     │ stg_childcare   │     │ (pre-computed)   │
└─────────────────┘     └─────────────────┘     └──────────────────┘
```
|
||||
|
||||
### Key Transformations
|
||||
|
||||
| Transformation | Description |
|
||||
|----------------|-------------|
|
||||
| **Geography Standardization** | Ensure all datasets use `neighbourhood_id` (AREA_ID from GeoJSON) |
|
||||
| **Census Pivot** | Neighbourhood Profiles is wide format — pivot to metrics per neighbourhood |
|
||||
| **CMHC Zone Mapping** | Create crosswalk from 15 CMHC zones to 158 neighbourhoods |
|
||||
| **Amenity Aggregation** | Spatial join point data (schools, parks, child care) to neighbourhood polygons |
|
||||
| **Rate Calculations** | Normalize counts to per-capita or per-100K |
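
As an illustration of the Census Pivot row, the sketch below reshapes wide rows into one record per neighbourhood and indicator. It assumes each input row is one indicator with one column per neighbourhood; the key name `Characteristic` and the sample values are hypothetical, and the actual file layout should be checked before reusing this.

```python
def pivot_profiles(rows, indicator_key="Characteristic"):
    """Reshape wide rows (one indicator per row, one column per
    neighbourhood) into long records.

    rows: list of dicts, e.g. as produced by csv.DictReader
    Returns a list of {"neighbourhood", "indicator", "value"} dicts.
    """
    records = []
    for row in rows:
        indicator = row[indicator_key]
        for column, value in row.items():
            if column == indicator_key:
                continue  # skip the label column itself
            records.append(
                {"neighbourhood": column, "indicator": indicator, "value": value}
            )
    return records


# Illustrative input with made-up values.
wide_rows = [
    {"Characteristic": "Population", "Annex": 30000, "Agincourt North": 29000},
    {"Characteristic": "Median income", "Annex": 90000, "Agincourt North": 70000},
]
long_records = pivot_profiles(wide_rows)
```

The long shape has one row per neighbourhood and indicator, which is straightforward to load into a fact table keyed on `neighbourhood_id`.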
|
||||
|
||||
### Data Refresh Schedule
|
||||
|
||||
| Layer | Frequency | Trigger |
|
||||
|-------|-----------|---------|
|
||||
| Staging (API pulls) | Weekly | Scheduled job |
|
||||
| Marts (transforms) | Weekly | Post-staging |
|
||||
| Dashboard cache | On-demand | User refresh button |
|
||||
|
||||
---
|
||||
|
||||
## Part 7: Technical Stack
|
||||
|
||||
### Core Stack
|
||||
|
||||
| Component | Technology | Rationale |
|
||||
|-----------|------------|-----------|
|
||||
| **Frontend** | Plotly Dash | Production-ready, rapid iteration |
|
||||
| **Mapping** | Plotly `choropleth_mapbox` | Native Dash integration |
|
||||
| **Data Store** | PostgreSQL + PostGIS | Spatial queries, existing expertise |
|
||||
| **ETL** | Python (Pandas, SQLAlchemy) | Existing stack |
|
||||
| **Deployment** | Render / Railway | Free tier, easy Dash hosting |
|
||||
|
||||
### Alternative (Portfolio Stretch)
|
||||
|
||||
| Component | Technology | Why Consider |
|
||||
|-----------|------------|--------------|
|
||||
| **Frontend** | React + deck.gl | More "modern" for portfolio |
|
||||
| **Data Store** | DuckDB | Serverless, embeddable |
|
||||
| **ETL** | dbt | Aligns with skills roadmap |
|
||||
|
||||
---
|
||||
|
||||
## Appendix A: Data Source URLs

| Source | URL |
|--------|-----|
| Toronto Open Data: Neighbourhoods | https://open.toronto.ca/dataset/neighbourhoods/ |
| Toronto Open Data: Neighbourhood Profiles | https://open.toronto.ca/dataset/neighbourhood-profiles/ |
| Toronto Police: Neighbourhood Crime Rates | https://data.torontopolice.on.ca/datasets/neighbourhood-crime-rates-open-data |
| Toronto Police: MCI | https://data.torontopolice.on.ca/datasets/major-crime-indicators-open-data |
| Toronto Police: Shootings | https://data.torontopolice.on.ca/datasets/shootings-firearm-discharges-open-data |
| CMHC Rental Market Survey | https://www.cmhc-schl.gc.ca/professionals/housing-markets-data-and-research/housing-data/data-tables/rental-market |
| Toronto Open Data: Parks | https://open.toronto.ca/dataset/parks/ |
| Toronto Open Data: Schools | https://open.toronto.ca/dataset/school-locations-all-types/ |
| Toronto Open Data: Building Permits | https://open.toronto.ca/dataset/building-permits-cleared-permits/ |
| Toronto Open Data: Child Care | https://open.toronto.ca/dataset/licensed-child-care-centres/ |
| Toronto Open Data: TTC Routes | https://open.toronto.ca/dataset/ttc-routes-and-schedules/ |

---

## Appendix B: Colour Palettes

### Affordability (Diverging)

| Status | Hex | Colour |
|--------|-----|--------|
| Affordable (<30% income) | `#2ecc71` | Green |
| Stretched (30-50%) | `#f1c40f` | Yellow |
| Unaffordable (>50%) | `#e74c3c` | Red |

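One way to turn the affordability bands above into map colours is sketched below. The thresholds and hex values come from the table; the function names and the sample rent and income figures are hypothetical, and treating exactly 30% and exactly 50% as "Stretched" is an assumption based on the 30-50% band.

```python
# Hex values copied from the affordability palette table.
AFFORDABILITY_COLOURS = {
    "Affordable": "#2ecc71",
    "Stretched": "#f1c40f",
    "Unaffordable": "#e74c3c",
}


def rent_to_income_pct(monthly_rent: float, monthly_income: float) -> float:
    """Monthly rent as a percentage of monthly household income."""
    if monthly_income <= 0:
        raise ValueError("monthly_income must be positive")
    return monthly_rent * 100 / monthly_income


def affordability_status(pct: float) -> str:
    """Map a rent-to-income percentage to one of the three bands."""
    if pct < 30:
        return "Affordable"
    if pct <= 50:  # assumption: both band edges count as "Stretched"
        return "Stretched"
    return "Unaffordable"


# Hypothetical household: $1,800 rent on $5,000 monthly income.
pct = rent_to_income_pct(monthly_rent=1800, monthly_income=5000)
status = affordability_status(pct)
colour = AFFORDABILITY_COLOURS[status]
```

Keeping the band logic separate from the colour lookup means the thresholds can change without touching the palette.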
### Safety (Sequential)

| Status | Hex | Colour |
|--------|-----|--------|
| Safest (lowest crime) | `#27ae60` | Dark green |
| Moderate | `#f39c12` | Orange |
| Highest Crime | `#c0392b` | Dark red |

### Demographics: Income (Sequential)

| Level | Hex | Colour |
|-------|-----|--------|
| Highest Income | `#1a5276` | Dark blue |
| Mid Income | `#5dade2` | Light blue |
| Lowest Income | `#ecf0f1` | Light gray |

### General Recommendation

Use **Viridis** or **Plasma** colorscales for perceptually uniform gradients on continuous metrics.

---

## Appendix C: Glossary

| Term | Definition |
|------|------------|
| **MCI** | Major Crime Indicators: Assault, B&E, Auto Theft, Robbery, Theft Over |
| **CMHC Zone** | Canada Mortgage and Housing Corporation rental market survey zones (15 in Toronto) |
| **Rent-to-Income Ratio** | Monthly rent ÷ monthly household income; <30% is considered affordable |
| **PostGIS** | PostgreSQL extension for geographic data |
| **Choropleth** | Thematic map where areas are shaded based on a statistical variable |

---

## Appendix D: Interview Talking Points

When discussing this project in interviews, emphasize:

1. **Data Engineering:** "I built a multi-source ETL pipeline that standardizes geographic keys across Census data, police data, and CMHC rental surveys: three different granularities I had to reconcile."

2. **Dimensional Modeling:** "The data model follows star schema patterns with a central neighbourhood dimension table and fact tables for crime, housing, and amenities."

3. **dbt Patterns:** "The transformation layer uses staging → intermediate → mart patterns, which I've documented for maintainability."

4. **Business Value:** "The dashboard answers questions like 'Where can a young professional afford to live that's safe and has good transit?', turning raw data into actionable insights."

5. **Technical Decisions:** "I chose Plotly Dash over a React frontend because it let me iterate faster while maintaining production-quality interactivity. For a portfolio piece, speed to working demo matters."

---

*Document Version: 1.0*
*Created: January 2026*
*Author: Leo Miranda / Claude*