refactor: multi-dashboard structural migration
Some checks failed
CI / lint-and-test (pull_request) Has been cancelled
- Rename dbt project from toronto_housing to portfolio
- Restructure dbt models into domain subdirectories:
  - shared/ for cross-domain dimensions (dim_time)
  - staging/toronto/, intermediate/toronto/, marts/toronto/
- Update SQLAlchemy models for raw_toronto schema
- Add explicit cross-schema FK relationships for FactRentals
- Namespace figure factories under figures/toronto/
- Namespace notebooks under notebooks/toronto/
- Update Makefile with domain-specific targets and env loading
- Update all documentation for multi-dashboard structure

This enables adding new dashboard projects (e.g., /football, /energy) without structural conflicts or naming collisions.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
CLAUDE.md (79 changed lines)

@@ -1,5 +1,37 @@
 # CLAUDE.md
+
+## ⛔ MANDATORY BEHAVIOR RULES - READ FIRST
+
+**These rules are NON-NEGOTIABLE. Violating them wastes the user's time and money.**
+
+### 1. WHEN USER ASKS YOU TO CHECK SOMETHING - CHECK EVERYTHING
+- Search ALL locations, not just where you think it is
+- Check cache directories: `~/.claude/plugins/cache/`
+- Check installed: `~/.claude/plugins/marketplaces/`
+- Check source directories
+- **NEVER say "no" or "that's not the issue" without exhaustive verification**
+
+### 2. WHEN USER SAYS SOMETHING IS WRONG - BELIEVE THEM
+- The user knows their system better than you
+- Investigate thoroughly before disagreeing
+- **Your confidence is often wrong. User's instincts are often right.**
+
+### 3. NEVER SAY "DONE" WITHOUT VERIFICATION
+- Run the actual command/script to verify
+- Show the output to the user
+- **"Done" means VERIFIED WORKING, not "I made changes"**
+
+### 4. SHOW EXACTLY WHAT USER ASKS FOR
+- If user asks for messages, show the MESSAGES
+- If user asks for code, show the CODE
+- **Do not interpret or summarize unless asked**
+
+**FAILURE TO FOLLOW THESE RULES = WASTED USER TIME = UNACCEPTABLE**
+
+---
+
+
 Working context for Claude Code on the Analytics Portfolio project.

 ---
@@ -26,8 +58,9 @@ make db-init # Initialize database schema
 make db-reset          # Drop and recreate database (DESTRUCTIVE)

 # Data Loading
-make load-data         # Load Toronto data from APIs, seed dev data
-make load-data-only    # Load Toronto data without dbt or seeding
+make load-data         # Load all project data (currently: Toronto)
+make load-toronto      # Load Toronto data from APIs
+make load-toronto-only # Load Toronto data without dbt or seeding
 make seed-data         # Seed sample development data

 # Application
@@ -127,13 +160,21 @@ class LoadError(PortfolioError):
 | `pages/` | Dash Pages (file-based routing) | URLs match file paths |
 | `pages/toronto/` | Toronto Dashboard | `tabs/` for layouts, `callbacks/` for interactions |
 | `components/` | Shared UI components | metric_card, sidebar, map_controls, time_slider |
-| `figures/` | Plotly chart factories | choropleth, bar_charts, scatter, radar, time_series |
+| `figures/toronto/` | Toronto chart factories | choropleth, bar_charts, scatter, radar, time_series |
 | `toronto/` | Toronto data logic | parsers/, loaders/, schemas/, models/ |
 | `content/blog/` | Markdown blog articles | Processed by `utils/markdown_loader.py` |
-| `notebooks/` | Data documentation | 5 domains: overview, housing, safety, demographics, amenities |
+| `notebooks/toronto/` | Toronto documentation | 5 domains: overview, housing, safety, demographics, amenities |

 **Key URLs:** `/` (home), `/toronto` (dashboard), `/blog` (listing), `/blog/{slug}` (articles)
+
+### Multi-Dashboard Architecture
+
+The codebase is structured to support multiple dashboard projects:
+- **figures/**: Domain-namespaced figure factories (`figures/toronto/`, future: `figures/football/`)
+- **notebooks/**: Domain-namespaced documentation (`notebooks/toronto/`, future: `notebooks/football/`)
+- **dbt models**: Domain subdirectories (`staging/toronto/`, `marts/toronto/`)
+- **Database schemas**: Domain-specific raw data (`raw_toronto`, future: `raw_football`)

 ---

 ## Tech Stack (Locked)
@@ -161,6 +202,16 @@ class LoadError(PortfolioError):

 ## Data Model Overview
+
+### Database Schemas
+
+| Schema | Purpose |
+|--------|---------|
+| `public` | Shared dimensions (dim_time) |
+| `raw_toronto` | Toronto-specific raw/dimension tables |
+| `staging` | dbt staging views |
+| `intermediate` | dbt intermediate views |
+| `marts` | dbt mart tables |

 ### Geographic Reality (Toronto Housing)

 ```
@@ -168,20 +219,31 @@ City Neighbourhoods (158) - Primary geographic unit for analysis
 CMHC Zones (~20) - Rental data (Census Tract aligned)
 ```

-### Star Schema
+### Star Schema (raw_toronto)

 | Table | Type | Keys |
 |-------|------|------|
 | `fact_rentals` | Fact | -> dim_time, dim_cmhc_zone |
-| `dim_time` | Dimension | date_key (PK) |
+| `dim_time` | Dimension (public) | date_key (PK) - shared |
 | `dim_cmhc_zone` | Dimension | zone_key (PK), geometry |
 | `dim_neighbourhood` | Dimension | neighbourhood_id (PK), geometry |
 | `dim_policy_event` | Dimension | event_id (PK) |

-### dbt Layers
+### dbt Project: `portfolio`

+**Model Structure:**
+```
+dbt/models/
+├── shared/                # Cross-domain dimensions
+│   └── stg_dimensions__time.sql
+├── staging/toronto/       # Toronto staging models
+├── intermediate/toronto/  # Toronto intermediate models
+└── marts/toronto/         # Toronto mart tables
+```
+
 | Layer | Naming | Purpose |
 |-------|--------|---------|
+| Shared | `stg_dimensions__*` | Cross-domain dimensions |
 | Staging | `stg_{source}__{entity}` | 1:1 source, cleaned, typed |
 | Intermediate | `int_{domain}__{transform}` | Business logic |
 | Marts | `mart_{domain}` | Final analytical tables |
@@ -196,7 +258,6 @@ CMHC Zones (~20) - Rental data (Census Tract aligned)
 |---------|--------|
 | Historical boundary reconciliation (140->158) | 2021+ data only for V1 |
 | ML prediction models | Energy project scope (future phase) |
-| Multi-project shared infrastructure | Build first, abstract second |

 ---

@@ -351,4 +412,4 @@ Use for git operations assistance.

 ---

-*Last Updated: January 2026 (Post-Sprint 9)*
+*Last Updated: February 2026 (Multi-Dashboard Architecture)*
Makefile (26 changed lines)

@@ -1,11 +1,12 @@
-.PHONY: setup docker-up docker-down db-init load-data seed-data run test dbt-run dbt-test lint format ci deploy clean help logs run-detached etl-toronto
+.PHONY: setup docker-up docker-down db-init load-data load-all load-toronto load-toronto-only seed-data run test dbt-run dbt-test lint format ci deploy clean help logs run-detached etl-toronto

 # Default target
 .DEFAULT_GOAL := help

 # Environment
-PYTHON := python3
-PIP := pip
+VENV := .venv
+PYTHON := $(VENV)/bin/python3
+PIP := $(VENV)/bin/pip
 DOCKER_COMPOSE := docker compose

 # Architecture detection for Docker images
@@ -79,16 +80,23 @@ db-reset: ## Drop and recreate database (DESTRUCTIVE)
 	@sleep 3
 	$(MAKE) db-init

-load-data: ## Load Toronto data from APIs, seed dev data, run dbt
+# Domain-specific data loading
+load-toronto: ## Load Toronto data from APIs
 	@echo "$(GREEN)Loading Toronto neighbourhood data...$(NC)"
 	$(PYTHON) scripts/data/load_toronto_data.py
-	@echo "$(GREEN)Seeding development data...$(NC)"
+	@echo "$(GREEN)Seeding Toronto development data...$(NC)"
 	$(PYTHON) scripts/data/seed_amenity_data.py

-load-data-only: ## Load Toronto data without running dbt or seeding
+load-toronto-only: ## Load Toronto data without running dbt or seeding
 	@echo "$(GREEN)Loading Toronto data (skip dbt)...$(NC)"
 	$(PYTHON) scripts/data/load_toronto_data.py --skip-dbt

+# Aggregate data loading
+load-data: load-toronto ## Load all project data (currently: Toronto)
+	@echo "$(GREEN)All data loaded!$(NC)"
+
+load-all: load-data ## Alias for load-data
+
 seed-data: ## Seed sample development data (amenities, median_age)
 	@echo "$(GREEN)Seeding development data...$(NC)"
 	$(PYTHON) scripts/data/seed_amenity_data.py
@@ -119,15 +127,15 @@ test-cov: ## Run pytest with coverage

 dbt-run: ## Run dbt models
 	@echo "$(GREEN)Running dbt models...$(NC)"
-	cd dbt && dbt run --profiles-dir .
+	@set -a && . ./.env && set +a && cd dbt && dbt run --profiles-dir .

 dbt-test: ## Run dbt tests
 	@echo "$(GREEN)Running dbt tests...$(NC)"
-	cd dbt && dbt test --profiles-dir .
+	@set -a && . ./.env && set +a && cd dbt && dbt test --profiles-dir .

 dbt-docs: ## Generate dbt documentation
 	@echo "$(GREEN)Generating dbt docs...$(NC)"
-	cd dbt && dbt docs generate --profiles-dir . && dbt docs serve --profiles-dir .
+	@set -a && . ./.env && set +a && cd dbt && dbt docs generate --profiles-dir . && dbt docs serve --profiles-dir .

 # =============================================================================
 # Code Quality
README.md (27 changed lines)

@@ -115,28 +115,31 @@ portfolio_app/
 │   ├── tabs/            # Tab layouts (5)
 │   └── callbacks/       # Interaction logic
 ├── components/          # Shared UI components
-├── figures/             # Plotly figure factories
+├── figures/
+│   └── toronto/         # Toronto figure factories
 ├── content/
 │   └── blog/            # Markdown blog articles
 ├── toronto/             # Toronto data logic
 │   ├── parsers/         # API data extraction
 │   ├── loaders/         # Database operations
 │   ├── schemas/         # Pydantic models
-│   └── models/          # SQLAlchemy ORM
+│   └── models/          # SQLAlchemy ORM (raw_toronto schema)
 └── errors/              # Exception handling

-dbt/
+dbt/                     # dbt project: portfolio
 ├── models/
-│   ├── staging/         # 1:1 source tables
-│   ├── intermediate/    # Business logic
-│   └── marts/           # Analytical tables
+│   ├── shared/               # Cross-domain dimensions
+│   ├── staging/toronto/      # Toronto staging models
+│   ├── intermediate/toronto/ # Toronto intermediate models
+│   └── marts/toronto/        # Toronto analytical tables

-notebooks/               # Data documentation (15 notebooks)
-├── overview/            # Overview tab visualizations
-├── housing/             # Housing tab visualizations
-├── safety/              # Safety tab visualizations
-├── demographics/        # Demographics tab visualizations
-└── amenities/           # Amenities tab visualizations
+notebooks/
+└── toronto/             # Toronto documentation (15 notebooks)
+    ├── overview/        # Overview tab visualizations
+    ├── housing/         # Housing tab visualizations
+    ├── safety/          # Safety tab visualizations
+    ├── demographics/    # Demographics tab visualizations
+    └── amenities/       # Amenities tab visualizations

 docs/
 ├── PROJECT_REFERENCE.md # Architecture reference
@@ -1,8 +1,7 @@
-name: 'toronto_housing'
-version: '1.0.0'
+name: 'portfolio'
 config-version: 2

-profile: 'toronto_housing'
+profile: 'portfolio'

 model-paths: ["models"]
 analysis-paths: ["analyses"]
@@ -16,13 +15,19 @@ clean-targets:
   - "dbt_packages"

 models:
-  toronto_housing:
+  portfolio:
+    shared:
+      +materialized: view
+      +schema: shared
     staging:
+      toronto:
       +materialized: view
       +schema: staging
     intermediate:
+      toronto:
       +materialized: view
       +schema: intermediate
     marts:
+      toronto:
       +materialized: table
       +schema: marts
dbt/models/shared/_shared.yml (new file, 33 lines)

version: 2

models:
  - name: stg_dimensions__time
    description: "Staged time dimension - shared across all projects"
    columns:
      - name: date_key
        description: "Primary key (YYYYMM format)"
        data_tests:
          - unique
          - not_null
      - name: full_date
        description: "First day of month"
        data_tests:
          - not_null
      - name: year
        description: "Calendar year"
        data_tests:
          - not_null
      - name: month
        description: "Month number (1-12)"
        data_tests:
          - not_null
      - name: quarter
        description: "Quarter (1-4)"
        data_tests:
          - not_null
      - name: month_name
        description: "Month name"
        data_tests:
          - not_null
      - name: is_month_start
        description: "Always true (monthly grain)"

dbt/models/shared/_sources.yml (new file, 25 lines)

version: 2

sources:
  - name: shared
    description: "Shared dimension tables used across all dashboards"
    database: portfolio
    schema: public
    tables:
      - name: dim_time
        description: "Time dimension (monthly grain) - shared across all projects"
        columns:
          - name: date_key
            description: "Primary key (YYYYMM format)"
          - name: full_date
            description: "First day of month"
          - name: year
            description: "Calendar year"
          - name: month
            description: "Month number (1-12)"
          - name: quarter
            description: "Quarter (1-4)"
          - name: month_name
            description: "Month name"
          - name: is_month_start
            description: "Always true (monthly grain)"
@@ -1,9 +1,10 @@
 -- Staged time dimension
--- Source: dim_time table
+-- Source: shared.dim_time table
 -- Grain: One row per month
+-- Note: Shared dimension used across all dashboard projects

 with source as (
-    select * from {{ source('toronto_housing', 'dim_time') }}
+    select * from {{ source('shared', 'dim_time') }}
 ),

 staged as (
@@ -1,10 +1,10 @@
 version: 2

 sources:
-  - name: toronto_housing
-    description: "Toronto housing data loaded from CMHC and City of Toronto sources"
+  - name: toronto
+    description: "Toronto data loaded from CMHC and City of Toronto sources"
     database: portfolio
-    schema: public
+    schema: raw_toronto
     tables:
       - name: fact_rentals
         description: "CMHC annual rental survey data by zone and bedroom type"
@@ -16,12 +16,6 @@ sources:
           - name: zone_key
             description: "Foreign key to dim_cmhc_zone"

-      - name: dim_time
-        description: "Time dimension (monthly grain)"
-        columns:
-          - name: date_key
-            description: "Primary key (YYYYMMDD format)"
-
       - name: dim_cmhc_zone
         description: "CMHC zone dimension with geometry"
         columns:
@@ -18,15 +18,6 @@ models:
         tests:
           - not_null

-  - name: stg_dimensions__time
-    description: "Staged time dimension"
-    columns:
-      - name: date_key
-        description: "Date dimension key (YYYYMMDD)"
-        tests:
-          - unique
-          - not_null
-
   - name: stg_dimensions__cmhc_zones
     description: "Staged CMHC zone dimension"
     columns:
@@ -6,8 +6,8 @@ with source as (
     select
         f.*,
         t.year as survey_year
-    from {{ source('toronto_housing', 'fact_rentals') }} f
-    join {{ source('toronto_housing', 'dim_time') }} t on f.date_key = t.date_key
+    from {{ source('toronto', 'fact_rentals') }} f
+    join {{ source('shared', 'dim_time') }} t on f.date_key = t.date_key
 ),

 staged as (
@@ -3,7 +3,7 @@
 -- Grain: One row per zone-neighbourhood intersection

 with source as (
-    select * from {{ source('toronto_housing', 'bridge_cmhc_neighbourhood') }}
+    select * from {{ source('toronto', 'bridge_cmhc_neighbourhood') }}
 ),

 staged as (
@@ -3,7 +3,7 @@
 -- Grain: One row per zone

 with source as (
-    select * from {{ source('toronto_housing', 'dim_cmhc_zone') }}
+    select * from {{ source('toronto', 'dim_cmhc_zone') }}
 ),

 staged as (
@@ -3,7 +3,7 @@
 -- Grain: One row per neighbourhood per amenity type per year

 with source as (
-    select * from {{ source('toronto_housing', 'fact_amenities') }}
+    select * from {{ source('toronto', 'fact_amenities') }}
 ),

 staged as (
@@ -3,7 +3,7 @@
 -- Grain: One row per neighbourhood per census year

 with source as (
-    select * from {{ source('toronto_housing', 'fact_census') }}
+    select * from {{ source('toronto', 'fact_census') }}
 ),

 staged as (
@@ -3,7 +3,7 @@
 -- Grain: One row per neighbourhood per year per crime type

 with source as (
-    select * from {{ source('toronto_housing', 'fact_crime') }}
+    select * from {{ source('toronto', 'fact_crime') }}
 ),

 staged as (
@@ -3,7 +3,7 @@
 -- Grain: One row per neighbourhood (158 total)

 with source as (
-    select * from {{ source('toronto_housing', 'dim_neighbourhood') }}
+    select * from {{ source('toronto', 'dim_neighbourhood') }}
 ),

 staged as (
@@ -1,4 +1,4 @@
-toronto_housing:
+portfolio:
   target: dev
   outputs:
     dev:
@@ -290,7 +290,7 @@ Dashboard tabs are in `portfolio_app/pages/toronto/tabs/`.

 import dash_mantine_components as dmc

-from portfolio_app.figures.choropleth import create_choropleth
+from portfolio_app.figures.toronto.choropleth import create_choropleth
 from portfolio_app.toronto.demo_data import get_demo_data


@@ -339,13 +339,13 @@ dmc.TabsPanel(create_your_tab_layout(), value="your-tab"),

 ## Creating Figure Factories

-Figure factories are in `portfolio_app/figures/`. They create reusable Plotly figures.
+Figure factories are organized by dashboard domain under `portfolio_app/figures/{domain}/`.

 ### Pattern

 ```python
-# figures/your_chart.py
-"""Your chart type factory."""
+# figures/toronto/your_chart.py
+"""Your chart type factory for Toronto dashboard."""

 import plotly.express as px
 import plotly.graph_objects as go
@@ -382,7 +382,7 @@ def create_your_chart(
 ### Export from `__init__.py`

 ```python
-# figures/__init__.py
+# figures/toronto/__init__.py
 from .your_chart import create_your_chart

 __all__ = [
@@ -391,6 +391,14 @@ __all__ = [
 ]
 ```
+
+### Importing Figure Factories
+
+```python
+# In callbacks or tabs
+from portfolio_app.figures.toronto import create_choropleth_figure
+from portfolio_app.figures.toronto.bar_charts import create_ranking_bar
+```

 ---

 ## Branch Workflow
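To make the factory pattern described in this section concrete, here is a minimal sketch of a domain-namespaced factory. The function name, the `neighbourhood_name` column, and the styling defaults are illustrative assumptions, not the repository's actual implementation.

```python
# Hypothetical sketch of figures/toronto/your_chart.py (assumed names, not project code)
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go


def create_your_chart_figure(df: pd.DataFrame, value_column: str, title: str = "") -> go.Figure:
    """Build a styled bar chart from a tidy DataFrame of neighbourhood metrics."""
    fig = px.bar(df, x="neighbourhood_name", y=value_column, title=title)
    # Shared layout defaults keep figures visually consistent across dashboard tabs
    fig.update_layout(template="plotly_white", margin=dict(l=20, r=20, t=40, b=20))
    return fig
```

Exporting the function from `figures/toronto/__init__.py`, as the section above shows, keeps call sites on the short `from portfolio_app.figures.toronto import ...` form.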
@@ -116,16 +116,38 @@ erDiagram

 ## Schema Layers

-### Raw Schema
+### Database Schemas

-Raw data is loaded directly from external sources without transformation:
+| Schema | Purpose | Managed By |
+|--------|---------|------------|
+| `public` | Shared dimensions (dim_time) | SQLAlchemy |
+| `raw_toronto` | Toronto dimension and fact tables | SQLAlchemy |
+| `staging` | Staging models | dbt |
+| `intermediate` | Intermediate models | dbt |
+| `marts` | Analytical tables | dbt |
+
+### Raw Toronto Schema (raw_toronto)
+
+Toronto-specific tables loaded by SQLAlchemy:

 | Table | Source | Description |
 |-------|--------|-------------|
-| `raw.neighbourhoods` | City of Toronto API | GeoJSON neighbourhood boundaries |
-| `raw.census_profiles` | City of Toronto API | Census profile data |
-| `raw.crime_data` | Toronto Police API | Crime statistics by neighbourhood |
-| `raw.cmhc_rentals` | CMHC Data Files | Rental market survey data |
+| `dim_neighbourhood` | City of Toronto API | 158 neighbourhood boundaries |
+| `dim_cmhc_zone` | CMHC | ~20 rental market zones |
+| `dim_policy_event` | Manual | Policy events for annotation |
+| `fact_census` | City of Toronto API | Census profile data |
+| `fact_crime` | Toronto Police API | Crime statistics |
+| `fact_amenities` | City of Toronto API | Amenity counts |
+| `fact_rentals` | CMHC Data Files | Rental market survey data |
+| `bridge_cmhc_neighbourhood` | Computed | Zone-neighbourhood mapping |
+
+### Public Schema
+
+Shared dimensions used across all projects:
+
+| Table | Description |
+|-------|-------------|
+| `dim_time` | Time dimension (monthly grain) |

 ### Staging Schema (dbt)

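The commit message also mentions explicit cross-schema foreign keys for FactRentals. As a rough illustration of how a fact table in `raw_toronto` can reference the shared `public.dim_time` dimension, here is a hedged SQLAlchemy sketch; the class name matches the commit message, but the column names and types are assumptions drawn from the tables above rather than the project's real model definitions.

```python
# Illustrative sketch only; the repository's actual models may differ.
from sqlalchemy import Column, ForeignKey, Integer, Numeric, String
from sqlalchemy.orm import DeclarativeBase


class Base(DeclarativeBase):
    pass


class FactRentals(Base):
    """CMHC rental facts stored in raw_toronto, joined to shared dimensions in public."""

    __tablename__ = "fact_rentals"
    __table_args__ = {"schema": "raw_toronto"}

    id = Column(Integer, primary_key=True)
    # Cross-schema FK to the shared time dimension
    date_key = Column(Integer, ForeignKey("public.dim_time.date_key"), nullable=False)
    # FK within the raw_toronto schema
    zone_key = Column(Integer, ForeignKey("raw_toronto.dim_cmhc_zone.zone_key"), nullable=False)
    bedroom_type = Column(String, nullable=False)
    average_rent = Column(Numeric)
```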
@@ -76,7 +76,8 @@ portfolio_app/
 ├── components/          # Shared UI components
 ├── content/blog/        # Markdown blog articles
 ├── errors/              # Exception handling
-├── figures/             # Plotly figure factories
+├── figures/
+│   └── toronto/         # Toronto figure factories
 ├── pages/
 │   ├── home.py
 │   ├── about.py
@@ -96,11 +97,21 @@ portfolio_app/
 │   ├── parsers/         # API extraction (geo, toronto_open_data, toronto_police, cmhc)
 │   ├── loaders/         # Database operations (base, cmhc, cmhc_crosswalk)
 │   ├── schemas/         # Pydantic models
-│   ├── models/          # SQLAlchemy ORM
+│   ├── models/          # SQLAlchemy ORM (raw_toronto schema)
 │   ├── services/        # Query functions (neighbourhood_service, geometry_service)
 │   └── demo_data.py     # Sample data
 └── utils/
     └── markdown_loader.py # Blog article loading

+dbt/                     # dbt project: portfolio
+├── models/
+│   ├── shared/               # Cross-domain dimensions
+│   ├── staging/toronto/      # Toronto staging models
+│   ├── intermediate/toronto/ # Toronto intermediate models
+│   └── marts/toronto/        # Toronto mart tables
+
+notebooks/
+└── toronto/             # Toronto documentation notebooks
 ```

 ---
@@ -144,10 +155,20 @@ CMHC Zones (~20) ← Rental data (Census Tract aligned)
 | `fact_rentals` | Fact | Rental data by CMHC zone |
 | `fact_amenities` | Fact | Amenity counts by neighbourhood |

-### dbt Layers
+### dbt Project: `portfolio`

+**Model Structure:**
+```
+dbt/models/
+├── shared/                # Cross-domain dimensions (stg_dimensions__time)
+├── staging/toronto/       # Toronto staging models
+├── intermediate/toronto/  # Toronto intermediate models
+└── marts/toronto/         # Toronto mart tables
+```
+
 | Layer | Naming | Example |
 |-------|--------|---------|
+| Shared | `stg_dimensions__*` | `stg_dimensions__time` |
 | Staging | `stg_{source}__{entity}` | `stg_toronto__neighbourhoods` |
 | Intermediate | `int_{domain}__{transform}` | `int_neighbourhood__demographics` |
 | Marts | `mart_{domain}` | `mart_neighbourhood_overview` |
@@ -10,7 +10,9 @@ This runbook describes how to add a new data dashboard to the portfolio applicat

 ## Directory Structure

-Create the following structure under `portfolio_app/`:
+Create the following structure:
+
+### Application Code (`portfolio_app/`)

 ```
 portfolio_app/
@@ -33,8 +35,40 @@ portfolio_app/
 │   │   └── __init__.py
 │   ├── schemas/         # Pydantic models
 │   │   └── __init__.py
-│   └── models/          # SQLAlchemy ORM
+│   └── models/          # SQLAlchemy ORM (schema: raw_{dashboard_name})
 │       └── __init__.py
+└── figures/
+    └── {dashboard_name}/    # Figure factories for this dashboard
+        ├── __init__.py
+        └── ...              # Chart modules
+```
+
+### dbt Models (`dbt/models/`)
+
+```
+dbt/models/
+├── staging/
+│   └── {dashboard_name}/    # Staging models
+│       ├── _sources.yml     # Source definitions (schema: raw_{dashboard_name})
+│       ├── _staging.yml     # Model tests/docs
+│       └── stg_*.sql        # Staging models
+├── intermediate/
+│   └── {dashboard_name}/    # Intermediate models
+│       ├── _intermediate.yml
+│       └── int_*.sql
+└── marts/
+    └── {dashboard_name}/    # Mart tables
+        ├── _marts.yml
+        └── mart_*.sql
+```
+
+### Documentation (`notebooks/`)
+
+```
+notebooks/
+└── {dashboard_name}/        # Domain subdirectories
+    ├── overview/
+    ├── ...
 ```

 ## Step-by-Step Checklist
@@ -47,24 +81,47 @@ portfolio_app/
 - [ ] Create loaders in `{dashboard_name}/loaders/`
 - [ ] Add database migrations if needed

-### 2. dbt Models
+### 2. Database Schema
+
+- [ ] Define schema constant in models (e.g., `RAW_FOOTBALL_SCHEMA = "raw_football"`)
+- [ ] Add `__table_args__ = {"schema": RAW_FOOTBALL_SCHEMA}` to all models
+- [ ] Update `scripts/db/init_schema.py` to create the new schema
+
+### 3. dbt Models

 Create dbt models in `dbt/models/`:

-- [ ] `staging/stg_{source}__{entity}.sql` - Raw data cleaning
-- [ ] `intermediate/int_{domain}__{transform}.sql` - Business logic
-- [ ] `marts/mart_{domain}.sql` - Final analytical tables
+- [ ] `staging/{dashboard_name}/_sources.yml` - Source definitions pointing to `raw_{dashboard_name}` schema
+- [ ] `staging/{dashboard_name}/stg_{source}__{entity}.sql` - Raw data cleaning
+- [ ] `intermediate/{dashboard_name}/int_{domain}__{transform}.sql` - Business logic
+- [ ] `marts/{dashboard_name}/mart_{domain}.sql` - Final analytical tables
+
+Update `dbt/dbt_project.yml` with new subdirectory config:
+```yaml
+models:
+  portfolio:
+    staging:
+      {dashboard_name}:
+        +materialized: view
+        +schema: staging
+```

 Follow naming conventions:
 - Staging: `stg_{source}__{entity}`
 - Intermediate: `int_{domain}__{transform}`
 - Marts: `mart_{domain}`

-### 3. Visualization Layer
+### 4. Visualization Layer

-- [ ] Create figure factories in `figures/` (or reuse existing)
+- [ ] Create figure factories in `figures/{dashboard_name}/`
+- [ ] Create `figures/{dashboard_name}/__init__.py` with exports
 - [ ] Follow the factory pattern: `create_{chart_type}_figure(data, **kwargs)`
+
+Import pattern:
+```python
+from portfolio_app.figures.{dashboard_name} import create_choropleth_figure
+```

 ### 4. Dashboard Pages

 #### Main Dashboard (`pages/{dashboard_name}/dashboard.py`)
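As a companion to the "Database Schema" checklist in the runbook above, here is a minimal sketch of the schema-constant pattern plus a schema-creation step. The `raw_football` domain, the `DimTeam` model, and the function shown are hypothetical, and the real `scripts/db/init_schema.py` may be organized differently.

```python
# Hypothetical sketch; names and layout are assumptions, not the project's actual code.
from sqlalchemy import Column, Integer, String, create_engine, text
from sqlalchemy.orm import DeclarativeBase

RAW_FOOTBALL_SCHEMA = "raw_football"  # domain-specific schema constant


class Base(DeclarativeBase):
    pass


class DimTeam(Base):
    """Example dimension table namespaced under the new domain schema."""

    __tablename__ = "dim_team"
    __table_args__ = {"schema": RAW_FOOTBALL_SCHEMA}

    team_id = Column(Integer, primary_key=True)
    team_name = Column(String, nullable=False)


def init_schema(database_url: str) -> None:
    """Create the domain schema, then create its tables."""
    engine = create_engine(database_url)
    with engine.begin() as conn:
        conn.execute(text(f"CREATE SCHEMA IF NOT EXISTS {RAW_FOOTBALL_SCHEMA}"))
    Base.metadata.create_all(engine)
```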
@@ -1,17 +1,18 @@
-# Toronto Neighbourhood Dashboard - Notebooks
+# Dashboard Documentation Notebooks

-Documentation notebooks for the Toronto Neighbourhood Dashboard visualizations. Each notebook documents how data is queried, transformed, and visualized using the figure factory pattern.
+Documentation notebooks organized by dashboard project. Each notebook documents how data is queried, transformed, and visualized using the figure factory pattern.

 ## Directory Structure

 ```
 notebooks/
 ├── README.md            # This file
-├── overview/            # Overview tab visualizations
-├── housing/             # Housing tab visualizations
-├── safety/              # Safety tab visualizations
-├── demographics/        # Demographics tab visualizations
-└── amenities/           # Amenities tab visualizations
+└── toronto/             # Toronto Neighbourhood Dashboard
+    ├── overview/        # Overview tab visualizations
+    ├── housing/         # Housing tab visualizations
+    ├── safety/          # Safety tab visualizations
+    ├── demographics/    # Demographics tab visualizations
+    └── amenities/       # Amenities tab visualizations
 ```

 ## Notebook Template
@@ -1,123 +0,0 @@
(entire file removed; its former content follows)
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Amenity Radar Chart\n",
    "\n",
    "Spider/radar chart comparing amenity categories for selected neighbourhoods."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Data Reference\n",
    "\n",
    "### Source Tables\n",
    "\n",
    "| Table | Grain | Key Columns |\n",
    "|-------|-------|-------------|\n",
    "| `mart_neighbourhood_amenities` | neighbourhood × year | parks_index, schools_index, transit_index |\n",
    "\n",
    "### SQL Query"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "import pandas as pd\nfrom sqlalchemy import create_engine\nfrom dotenv import load_dotenv\nimport os\n\n# Load .env from project root\nload_dotenv('../../.env')\n\nengine = create_engine(os.environ['DATABASE_URL'])\n\nquery = \"\"\"\nSELECT\n neighbourhood_name,\n parks_index,\n schools_index,\n transit_index,\n amenity_index,\n amenity_tier\nFROM public_marts.mart_neighbourhood_amenities\nWHERE year = (SELECT MAX(year) FROM public_marts.mart_neighbourhood_amenities)\nORDER BY amenity_index DESC\n\"\"\"\n\ndf = pd.read_sql(query, engine)\nprint(f\"Loaded {len(df)} neighbourhoods\")"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Transformation Steps\n",
    "\n",
    "1. Select top 5 and bottom 5 neighbourhoods by amenity index\n",
    "2. Reshape for radar chart format"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Select representative neighbourhoods\n",
    "top_5 = df.head(5)\n",
    "bottom_5 = df.tail(5)\n",
    "\n",
    "# Prepare radar data\n",
    "categories = ['Parks', 'Schools', 'Transit']\n",
    "index_columns = ['parks_index', 'schools_index', 'transit_index']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Sample Output"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Top 5 Amenity-Rich Neighbourhoods:\")\n",
    "display(top_5[['neighbourhood_name', 'parks_index', 'schools_index', 'transit_index', 'amenity_index']])\n",
    "print(\"\\nBottom 5 Underserved Neighbourhoods:\")\n",
    "display(bottom_5[['neighbourhood_name', 'parks_index', 'schools_index', 'transit_index', 'amenity_index']])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Data Visualization\n",
    "\n",
    "### Figure Factory\n",
    "\n",
    "Uses `create_radar` from `portfolio_app.figures.radar`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "import sys\nsys.path.insert(0, '../..')\n\nfrom portfolio_app.figures.radar import create_comparison_radar\n\n# Compare top neighbourhood vs city average (100)\ntop_hood = top_5.iloc[0]\nmetrics = ['parks_index', 'schools_index', 'transit_index']\n\nfig = create_comparison_radar(\n    selected_data=top_hood.to_dict(),\n    average_data={'parks_index': 100, 'schools_index': 100, 'transit_index': 100},\n    metrics=metrics,\n    selected_name=top_hood['neighbourhood_name'],\n    average_name='City Average',\n    title=f\"Amenity Profile: {top_hood['neighbourhood_name']} vs City Average\",\n)\n\nfig.show()"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Index Interpretation\n",
    "\n",
    "| Value | Meaning |\n",
    "|-------|--------|\n",
    "| < 100 | Below city average |\n",
    "| = 100 | City average |\n",
    "| > 100 | Above city average |"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.11.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
@@ -19,7 +19,7 @@
 "\n",
 "| Table | Grain | Key Columns |\n",
 "|-------|-------|-------------|\n",
-"| `mart_neighbourhood_amenities` | neighbourhood \u00d7 year | amenity_index, total_amenities_per_1000, amenity_tier, geometry |\n",
+"| `mart_neighbourhood_amenities` | neighbourhood × year | amenity_index, total_amenities_per_1000, amenity_tier, geometry |\n",
 "\n",
 "### SQL Query"
 ]
@@ -30,15 +30,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import pandas as pd\n",
-"from sqlalchemy import create_engine\n",
-"from dotenv import load_dotenv\n",
 "import os\n",
 "\n",
-"# Load .env from project root\n",
-"load_dotenv('../../.env')\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
 "\n",
-"engine = create_engine(os.environ['DATABASE_URL'])\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
 "\n",
 "query = \"\"\"\n",
 "SELECT\n",
@@ -79,17 +80,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import geopandas as gpd\n",
 "import json\n",
 "\n",
+"import geopandas as gpd\n",
+"\n",
 "gdf = gpd.GeoDataFrame(\n",
-" df,\n",
-" geometry=gpd.GeoSeries.from_wkb(df['geometry']),\n",
-" crs='EPSG:4326'\n",
+" df, geometry=gpd.GeoSeries.from_wkb(df[\"geometry\"]), crs=\"EPSG:4326\"\n",
 ")\n",
 "\n",
 "geojson = json.loads(gdf.to_json())\n",
-"data = df.drop(columns=['geometry']).to_dict('records')"
+"data = df.drop(columns=[\"geometry\"]).to_dict(\"records\")"
 ]
 },
 {
@@ -105,7 +105,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df[['neighbourhood_name', 'total_amenities_per_1000', 'amenity_index', 'amenity_tier']].head(10)"
+"df[\n",
+" [\"neighbourhood_name\", \"total_amenities_per_1000\", \"amenity_index\", \"amenity_tier\"]\n",
+"].head(10)"
 ]
 },
 {
@@ -116,7 +118,7 @@
 "\n",
 "### Figure Factory\n",
 "\n",
-"Uses `create_choropleth_figure` from `portfolio_app.figures.choropleth`."
+"Uses `create_choropleth_figure` from `portfolio_app.figures.toronto.choropleth`."
 ]
 },
 {
@@ -126,18 +128,24 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.choropleth import create_choropleth_figure\n",
+"sys.path.insert(0, \"../..\")\n",
+"\n",
+"from portfolio_app.figures.toronto.choropleth import create_choropleth_figure\n",
 "\n",
 "fig = create_choropleth_figure(\n",
 " geojson=geojson,\n",
 " data=data,\n",
-" location_key='neighbourhood_id',\n",
-" color_column='total_amenities_per_1000',\n",
-" hover_data=['neighbourhood_name', 'amenity_index', 'parks_per_1000', 'schools_per_1000'],\n",
-" color_scale='Greens',\n",
-" title='Toronto Amenities per 1,000 Population',\n",
+" location_key=\"neighbourhood_id\",\n",
+" color_column=\"total_amenities_per_1000\",\n",
+" hover_data=[\n",
+" \"neighbourhood_name\",\n",
+" \"amenity_index\",\n",
+" \"parks_per_1000\",\n",
+" \"schools_per_1000\",\n",
+" ],\n",
+" color_scale=\"Greens\",\n",
+" title=\"Toronto Amenities per 1,000 Population\",\n",
 " zoom=10,\n",
 ")\n",
 "\n",
notebooks/toronto/amenities/amenity_radar.ipynb (new file, 191 lines)

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Amenity Radar Chart\n",
    "\n",
    "Spider/radar chart comparing amenity categories for selected neighbourhoods."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Data Reference\n",
    "\n",
    "### Source Tables\n",
    "\n",
    "| Table | Grain | Key Columns |\n",
    "|-------|-------|-------------|\n",
    "| `mart_neighbourhood_amenities` | neighbourhood × year | parks_index, schools_index, transit_index |\n",
    "\n",
    "### SQL Query"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "import pandas as pd\n",
    "from dotenv import load_dotenv\n",
    "from sqlalchemy import create_engine\n",
    "\n",
    "# Load .env from project root\n",
    "load_dotenv(\"../../.env\")\n",
    "\n",
    "engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
    "\n",
    "query = \"\"\"\n",
    "SELECT\n",
    " neighbourhood_name,\n",
    " parks_index,\n",
    " schools_index,\n",
    " transit_index,\n",
    " amenity_index,\n",
    " amenity_tier\n",
    "FROM public_marts.mart_neighbourhood_amenities\n",
    "WHERE year = (SELECT MAX(year) FROM public_marts.mart_neighbourhood_amenities)\n",
    "ORDER BY amenity_index DESC\n",
    "\"\"\"\n",
    "\n",
    "df = pd.read_sql(query, engine)\n",
    "print(f\"Loaded {len(df)} neighbourhoods\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Transformation Steps\n",
    "\n",
    "1. Select top 5 and bottom 5 neighbourhoods by amenity index\n",
    "2. Reshape for radar chart format"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Select representative neighbourhoods\n",
    "top_5 = df.head(5)\n",
    "bottom_5 = df.tail(5)\n",
    "\n",
    "# Prepare radar data\n",
    "categories = [\"Parks\", \"Schools\", \"Transit\"]\n",
    "index_columns = [\"parks_index\", \"schools_index\", \"transit_index\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Sample Output"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Top 5 Amenity-Rich Neighbourhoods:\")\n",
    "display(\n",
    " top_5[\n",
    " [\n",
    " \"neighbourhood_name\",\n",
    " \"parks_index\",\n",
    " \"schools_index\",\n",
    " \"transit_index\",\n",
    " \"amenity_index\",\n",
    " ]\n",
    " ]\n",
    ")\n",
    "print(\"\\nBottom 5 Underserved Neighbourhoods:\")\n",
    "display(\n",
    " bottom_5[\n",
    " [\n",
    " \"neighbourhood_name\",\n",
    " \"parks_index\",\n",
    " \"schools_index\",\n",
    " \"transit_index\",\n",
    " \"amenity_index\",\n",
    " ]\n",
    " ]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Data Visualization\n",
    "\n",
    "### Figure Factory\n",
    "\n",
    "Uses `create_radar` from `portfolio_app.figures.toronto.radar`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "\n",
    "sys.path.insert(0, \"../..\")\n",
    "\n",
    "from portfolio_app.figures.toronto.radar import create_comparison_radar\n",
    "\n",
    "# Compare top neighbourhood vs city average (100)\n",
    "top_hood = top_5.iloc[0]\n",
    "metrics = [\"parks_index\", \"schools_index\", \"transit_index\"]\n",
    "\n",
    "fig = create_comparison_radar(\n",
    " selected_data=top_hood.to_dict(),\n",
    " average_data={\"parks_index\": 100, \"schools_index\": 100, \"transit_index\": 100},\n",
    " metrics=metrics,\n",
    " selected_name=top_hood[\"neighbourhood_name\"],\n",
    " average_name=\"City Average\",\n",
    " title=f\"Amenity Profile: {top_hood['neighbourhood_name']} vs City Average\",\n",
    ")\n",
    "\n",
    "fig.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Index Interpretation\n",
    "\n",
    "| Value | Meaning |\n",
    "|-------|--------|\n",
    "| < 100 | Below city average |\n",
    "| = 100 | City average |\n",
    "| > 100 | Above city average |"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.11.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
@@ -19,7 +19,7 @@
 "\n",
 "| Table | Grain | Key Columns |\n",
 "|-------|-------|-------------|\n",
-"| `mart_neighbourhood_amenities` | neighbourhood \u00d7 year | transit_per_1000, transit_index, transit_count |\n",
+"| `mart_neighbourhood_amenities` | neighbourhood × year | transit_per_1000, transit_index, transit_count |\n",
 "\n",
 "### SQL Query"
 ]
@@ -30,15 +30,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import pandas as pd\n",
-"from sqlalchemy import create_engine\n",
-"from dotenv import load_dotenv\n",
 "import os\n",
 "\n",
-"# Load .env from project root\n",
-"load_dotenv('../../.env')\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
 "\n",
-"engine = create_engine(os.environ['DATABASE_URL'])\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
 "\n",
 "query = \"\"\"\n",
 "SELECT\n",
@@ -74,7 +75,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"data = df.head(20).to_dict('records')"
+"data = df.head(20).to_dict(\"records\")"
 ]
 },
 {
@@ -90,7 +91,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df[['neighbourhood_name', 'transit_per_1000', 'transit_index', 'transit_count']].head(10)"
+"df[[\"neighbourhood_name\", \"transit_per_1000\", \"transit_index\", \"transit_count\"]].head(\n",
+" 10\n",
+")"
 ]
 },
 {
@@ -101,7 +104,7 @@
 "\n",
 "### Figure Factory\n",
 "\n",
-"Uses `create_horizontal_bar` from `portfolio_app.figures.bar_charts`."
+"Uses `create_horizontal_bar` from `portfolio_app.figures.toronto.bar_charts`."
 ]
 },
 {
@@ -111,17 +114,18 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.bar_charts import create_horizontal_bar\n",
+"sys.path.insert(0, \"../..\")\n",
+"\n",
+"from portfolio_app.figures.toronto.bar_charts import create_horizontal_bar\n",
 "\n",
 "fig = create_horizontal_bar(\n",
 " data=data,\n",
-" name_column='neighbourhood_name',\n",
-" value_column='transit_per_1000',\n",
-" title='Top 20 Neighbourhoods by Transit Accessibility',\n",
-" color='#00BCD4',\n",
-" value_format='.2f',\n",
+" name_column=\"neighbourhood_name\",\n",
+" value_column=\"transit_per_1000\",\n",
+" title=\"Top 20 Neighbourhoods by Transit Accessibility\",\n",
+" color=\"#00BCD4\",\n",
+" value_format=\".2f\",\n",
 ")\n",
 "\n",
 "fig.show()"
@@ -140,7 +144,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(f\"City-wide Transit Statistics:\")\n",
+"print(\"City-wide Transit Statistics:\")\n",
 "print(f\" Total Transit Stops: {df['transit_count'].sum():,.0f}\")\n",
 "print(f\" Average per 1,000 pop: {df['transit_per_1000'].mean():.2f}\")\n",
 "print(f\" Median per 1,000 pop: {df['transit_per_1000'].median():.2f}\")\n",
@@ -19,7 +19,7 @@
 "\n",
 "| Table | Grain | Key Columns |\n",
 "|-------|-------|-------------|\n",
-"| `mart_neighbourhood_demographics` | neighbourhood \u00d7 year | median_age, age_index, city_avg_age |\n",
+"| `mart_neighbourhood_demographics` | neighbourhood × year | median_age, age_index, city_avg_age |\n",
 "\n",
 "### SQL Query"
 ]
@@ -30,15 +30,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import pandas as pd\n",
-"from sqlalchemy import create_engine\n",
-"from dotenv import load_dotenv\n",
 "import os\n",
 "\n",
-"# Load .env from project root\n",
-"load_dotenv('../../.env')\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
 "\n",
-"engine = create_engine(os.environ['DATABASE_URL'])\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
 "\n",
 "query = \"\"\"\n",
 "SELECT\n",
@@ -76,13 +77,13 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"city_avg = df['city_avg_age'].iloc[0]\n",
-"df['age_category'] = df['median_age'].apply(\n",
-" lambda x: 'Younger' if x < city_avg else 'Older'\n",
+"city_avg = df[\"city_avg_age\"].iloc[0]\n",
+"df[\"age_category\"] = df[\"median_age\"].apply(\n",
+" lambda x: \"Younger\" if x < city_avg else \"Older\"\n",
 ")\n",
-"df['age_deviation'] = df['median_age'] - city_avg\n",
+"df[\"age_deviation\"] = df[\"median_age\"] - city_avg\n",
 "\n",
-"data = df.to_dict('records')"
+"data = df.to_dict(\"records\")"
 ]
 },
 {
@@ -100,9 +101,13 @@
 "source": [
 "print(f\"City Average Age: {city_avg:.1f}\")\n",
 "print(\"\\nYoungest Neighbourhoods:\")\n",
-"display(df.tail(5)[['neighbourhood_name', 'median_age', 'age_index', 'pct_renter_occupied']])\n",
+"display(\n",
+" df.tail(5)[[\"neighbourhood_name\", \"median_age\", \"age_index\", \"pct_renter_occupied\"]]\n",
+")\n",
 "print(\"\\nOldest Neighbourhoods:\")\n",
-"display(df.head(5)[['neighbourhood_name', 'median_age', 'age_index', 'pct_renter_occupied']])"
+"display(\n",
+" df.head(5)[[\"neighbourhood_name\", \"median_age\", \"age_index\", \"pct_renter_occupied\"]]\n",
+")"
 ]
 },
 {
@@ -113,7 +118,7 @@
 "\n",
 "### Figure Factory\n",
 "\n",
-"Uses `create_ranking_bar` from `portfolio_app.figures.bar_charts`."
+"Uses `create_ranking_bar` from `portfolio_app.figures.toronto.bar_charts`."
 ]
 },
 {
@@ -123,20 +128,21 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.bar_charts import create_ranking_bar\n",
+"sys.path.insert(0, \"../..\")\n",
+"\n",
+"from portfolio_app.figures.toronto.bar_charts import create_ranking_bar\n",
 "\n",
 "fig = create_ranking_bar(\n",
 " data=data,\n",
-" name_column='neighbourhood_name',\n",
-" value_column='median_age',\n",
-" title='Youngest & Oldest Neighbourhoods (Median Age)',\n",
+" name_column=\"neighbourhood_name\",\n",
+" value_column=\"median_age\",\n",
+" title=\"Youngest & Oldest Neighbourhoods (Median Age)\",\n",
 " top_n=10,\n",
 " bottom_n=10,\n",
-" color_top='#FF9800', # Orange for older\n",
-" color_bottom='#2196F3', # Blue for younger\n",
-" value_format='.1f',\n",
+" color_top=\"#FF9800\", # Orange for older\n",
+" color_bottom=\"#2196F3\", # Blue for younger\n",
+" value_format=\".1f\",\n",
 ")\n",
 "\n",
 "fig.show()"
@@ -157,7 +163,7 @@
 "source": [
 "# Age by income quintile\n",
 "print(\"Median Age by Income Quintile:\")\n",
-"df.groupby('income_quintile')['median_age'].mean().round(1)"
+"df.groupby(\"income_quintile\")[\"median_age\"].mean().round(1)"
 ]
 }
 ],
@@ -19,7 +19,7 @@
 "\n",
 "| Table | Grain | Key Columns |\n",
 "|-------|-------|-------------|\n",
-"| `mart_neighbourhood_demographics` | neighbourhood \u00d7 year | median_household_income, income_index, income_quintile, geometry |\n",
+"| `mart_neighbourhood_demographics` | neighbourhood × year | median_household_income, income_index, income_quintile, geometry |\n",
 "\n",
 "### SQL Query"
 ]
@@ -30,15 +30,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import pandas as pd\n",
-"from sqlalchemy import create_engine\n",
-"from dotenv import load_dotenv\n",
 "import os\n",
 "\n",
-"# Load .env from project root\n",
-"load_dotenv('../../.env')\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
 "\n",
-"engine = create_engine(os.environ['DATABASE_URL'])\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
 "\n",
 "query = \"\"\"\n",
 "SELECT\n",
@@ -77,19 +78,18 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import geopandas as gpd\n",
 "import json\n",
 "\n",
-"df['income_thousands'] = df['median_household_income'] / 1000\n",
+"import geopandas as gpd\n",
+"\n",
+"df[\"income_thousands\"] = df[\"median_household_income\"] / 1000\n",
 "\n",
 "gdf = gpd.GeoDataFrame(\n",
-" df,\n",
-" geometry=gpd.GeoSeries.from_wkb(df['geometry']),\n",
-" crs='EPSG:4326'\n",
+" df, geometry=gpd.GeoSeries.from_wkb(df[\"geometry\"]), crs=\"EPSG:4326\"\n",
 ")\n",
 "\n",
 "geojson = json.loads(gdf.to_json())\n",
-"data = df.drop(columns=['geometry']).to_dict('records')"
+"data = df.drop(columns=[\"geometry\"]).to_dict(\"records\")"
 ]
 },
 {
@@ -105,7 +105,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df[['neighbourhood_name', 'median_household_income', 'income_index', 'income_quintile']].head(10)"
+"df[\n",
+" [\"neighbourhood_name\", \"median_household_income\", \"income_index\", \"income_quintile\"]\n",
+"].head(10)"
 ]
 },
 {
@@ -116,7 +118,7 @@
 "\n",
 "### Figure Factory\n",
 "\n",
-"Uses `create_choropleth_figure` from `portfolio_app.figures.choropleth`."
+"Uses `create_choropleth_figure` from `portfolio_app.figures.toronto.choropleth`."
 ]
 },
 {
@@ -126,18 +128,19 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.choropleth import create_choropleth_figure\n",
+"sys.path.insert(0, \"../..\")\n",
+"\n",
+"from portfolio_app.figures.toronto.choropleth import create_choropleth_figure\n",
 "\n",
 "fig = create_choropleth_figure(\n",
 " geojson=geojson,\n",
 " data=data,\n",
-" location_key='neighbourhood_id',\n",
-" color_column='median_household_income',\n",
-" hover_data=['neighbourhood_name', 'income_index', 'income_quintile'],\n",
-" color_scale='Viridis',\n",
-" title='Toronto Median Household Income by Neighbourhood',\n",
+" location_key=\"neighbourhood_id\",\n",
+" color_column=\"median_household_income\",\n",
+" hover_data=[\"neighbourhood_name\", \"income_index\", \"income_quintile\"],\n",
+" color_scale=\"Viridis\",\n",
+" title=\"Toronto Median Household Income by Neighbourhood\",\n",
 " zoom=10,\n",
 ")\n",
 "\n",
@@ -157,7 +160,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df.groupby('income_quintile')['median_household_income'].agg(['count', 'mean', 'min', 'max']).round(0)"
+"df.groupby(\"income_quintile\")[\"median_household_income\"].agg(\n",
+" [\"count\", \"mean\", \"min\", \"max\"]\n",
+").round(0)"
 ]
 }
 ],
@@ -19,7 +19,7 @@
 "\n",
 "| Table | Grain | Key Columns |\n",
 "|-------|-------|-------------|\n",
-"| `mart_neighbourhood_demographics` | neighbourhood \u00d7 year | population_density, population, land_area_sqkm |\n",
+"| `mart_neighbourhood_demographics` | neighbourhood × year | population_density, population, land_area_sqkm |\n",
 "\n",
 "### SQL Query"
 ]
@@ -30,15 +30,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import pandas as pd\n",
-"from sqlalchemy import create_engine\n",
-"from dotenv import load_dotenv\n",
 "import os\n",
 "\n",
-"# Load .env from project root\n",
-"load_dotenv('../../.env')\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
 "\n",
-"engine = create_engine(os.environ['DATABASE_URL'])\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
 "\n",
 "query = \"\"\"\n",
 "SELECT\n",
@@ -74,7 +75,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"data = df.head(20).to_dict('records')"
+"data = df.head(20).to_dict(\"records\")"
 ]
 },
 {
@@ -90,7 +91,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df[['neighbourhood_name', 'population_density', 'population', 'land_area_sqkm']].head(10)"
+"df[[\"neighbourhood_name\", \"population_density\", \"population\", \"land_area_sqkm\"]].head(\n",
+" 10\n",
+")"
 ]
 },
 {
@@ -101,7 +104,7 @@
 "\n",
 "### Figure Factory\n",
 "\n",
-"Uses `create_horizontal_bar` from `portfolio_app.figures.bar_charts`."
+"Uses `create_horizontal_bar` from `portfolio_app.figures.toronto.bar_charts`."
 ]
 },
 {
@@ -111,17 +114,18 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.bar_charts import create_horizontal_bar\n",
+"sys.path.insert(0, \"../..\")\n",
+"\n",
+"from portfolio_app.figures.toronto.bar_charts import create_horizontal_bar\n",
 "\n",
 "fig = create_horizontal_bar(\n",
 " data=data,\n",
-" name_column='neighbourhood_name',\n",
-" value_column='population_density',\n",
-" title='Top 20 Most Dense Neighbourhoods',\n",
-" color='#9C27B0',\n",
-" value_format=',.0f',\n",
+" name_column=\"neighbourhood_name\",\n",
+" value_column=\"population_density\",\n",
+" title=\"Top 20 Most Dense Neighbourhoods\",\n",
+" color=\"#9C27B0\",\n",
+" value_format=\",.0f\",\n",
 ")\n",
 "\n",
 "fig.show()"
@@ -140,7 +144,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(f\"City-wide Statistics:\")\n",
+"print(\"City-wide Statistics:\")\n",
 "print(f\" Total Population: {df['population'].sum():,.0f}\")\n",
 "print(f\" Total Area: {df['land_area_sqkm'].sum():,.1f} sq km\")\n",
 "print(f\" Average Density: {df['population_density'].mean():,.0f} per sq km\")\n",
@@ -19,7 +19,7 @@
 "\n",
 "| Table | Grain | Key Columns |\n",
 "|-------|-------|-------------|\n",
-"| `mart_neighbourhood_housing` | neighbourhood \u00d7 year | affordability_index, rent_to_income_pct, avg_rent_2bed, geometry |\n",
+"| `mart_neighbourhood_housing` | neighbourhood × year | affordability_index, rent_to_income_pct, avg_rent_2bed, geometry |\n",
 "\n",
 "### SQL Query"
 ]
@@ -30,15 +30,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import pandas as pd\n",
-"from sqlalchemy import create_engine\n",
-"from dotenv import load_dotenv\n",
 "import os\n",
 "\n",
-"# Load .env from project root\n",
-"load_dotenv('../../.env')\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
 "\n",
-"engine = create_engine(os.environ['DATABASE_URL'])\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
 "\n",
 "query = \"\"\"\n",
 "SELECT\n",
@@ -77,17 +78,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import geopandas as gpd\n",
 "import json\n",
 "\n",
+"import geopandas as gpd\n",
+"\n",
 "gdf = gpd.GeoDataFrame(\n",
-" df,\n",
-" geometry=gpd.GeoSeries.from_wkb(df['geometry']),\n",
-" crs='EPSG:4326'\n",
+" df, geometry=gpd.GeoSeries.from_wkb(df[\"geometry\"]), crs=\"EPSG:4326\"\n",
 ")\n",
 "\n",
 "geojson = json.loads(gdf.to_json())\n",
-"data = df.drop(columns=['geometry']).to_dict('records')"
+"data = df.drop(columns=[\"geometry\"]).to_dict(\"records\")"
 ]
 },
 {
@@ -103,7 +103,15 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df[['neighbourhood_name', 'affordability_index', 'rent_to_income_pct', 'avg_rent_2bed', 'is_affordable']].head(10)"
+"df[\n",
+" [\n",
+" \"neighbourhood_name\",\n",
+" \"affordability_index\",\n",
+" \"rent_to_income_pct\",\n",
+" \"avg_rent_2bed\",\n",
+" \"is_affordable\",\n",
+" ]\n",
+"].head(10)"
 ]
 },
 {
@@ -114,7 +122,7 @@
 "\n",
 "### Figure Factory\n",
 "\n",
-"Uses `create_choropleth_figure` from `portfolio_app.figures.choropleth`.\n",
+"Uses `create_choropleth_figure` from `portfolio_app.figures.toronto.choropleth`.\n",
 "\n",
 "**Key Parameters:**\n",
 "- `color_column`: 'affordability_index'\n",
@@ -128,18 +136,19 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.choropleth import create_choropleth_figure\n",
+"sys.path.insert(0, \"../..\")\n",
+"\n",
+"from portfolio_app.figures.toronto.choropleth import create_choropleth_figure\n",
 "\n",
 "fig = create_choropleth_figure(\n",
 " geojson=geojson,\n",
 " data=data,\n",
-" location_key='neighbourhood_id',\n",
-" color_column='affordability_index',\n",
-" hover_data=['neighbourhood_name', 'rent_to_income_pct', 'avg_rent_2bed'],\n",
-" color_scale='RdYlGn_r', # Reversed: lower index (affordable) = green\n",
-" title='Toronto Housing Affordability Index',\n",
+" location_key=\"neighbourhood_id\",\n",
+" color_column=\"affordability_index\",\n",
+" hover_data=[\"neighbourhood_name\", \"rent_to_income_pct\", \"avg_rent_2bed\"],\n",
+" color_scale=\"RdYlGn_r\", # Reversed: lower index (affordable) = green\n",
+" title=\"Toronto Housing Affordability Index\",\n",
 " zoom=10,\n",
 ")\n",
 "\n",
@@ -19,7 +19,7 @@
 "\n",
 "| Table | Grain | Key Columns |\n",
 "|-------|-------|-------------|\n",
-"| `mart_neighbourhood_housing` | neighbourhood \u00d7 year | year, avg_rent_2bed, rent_yoy_change_pct |\n",
+"| `mart_neighbourhood_housing` | neighbourhood × year | year, avg_rent_2bed, rent_yoy_change_pct |\n",
 "\n",
 "### SQL Query"
 ]
@@ -30,15 +30,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import pandas as pd\n",
-"from sqlalchemy import create_engine\n",
-"from dotenv import load_dotenv\n",
 "import os\n",
 "\n",
-"# Load .env from project root\n",
-"load_dotenv('../../.env')\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
 "\n",
-"engine = create_engine(os.environ['DATABASE_URL'])\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
 "\n",
 "# City-wide average rent by year\n",
 "query = \"\"\"\n",
@@ -77,23 +78,25 @@
 "outputs": [],
 "source": [
 "# Create date column from year\n",
-"df['date'] = pd.to_datetime(df['year'].astype(str) + '-01-01')\n",
+"df[\"date\"] = pd.to_datetime(df[\"year\"].astype(str) + \"-01-01\")\n",
 "\n",
 "# Melt for multi-line chart\n",
 "df_melted = df.melt(\n",
-" id_vars=['year', 'date'],\n",
-" value_vars=['avg_rent_bachelor', 'avg_rent_1bed', 'avg_rent_2bed', 'avg_rent_3bed'],\n",
-" var_name='bedroom_type',\n",
-" value_name='avg_rent'\n",
+" id_vars=[\"year\", \"date\"],\n",
+" value_vars=[\"avg_rent_bachelor\", \"avg_rent_1bed\", \"avg_rent_2bed\", \"avg_rent_3bed\"],\n",
+" var_name=\"bedroom_type\",\n",
+" value_name=\"avg_rent\",\n",
 ")\n",
 "\n",
 "# Clean labels\n",
-"df_melted['bedroom_type'] = df_melted['bedroom_type'].map({\n",
-" 'avg_rent_bachelor': 'Bachelor',\n",
-" 'avg_rent_1bed': '1 Bedroom',\n",
-" 'avg_rent_2bed': '2 Bedroom',\n",
-" 'avg_rent_3bed': '3 Bedroom'\n",
-"})"
+"df_melted[\"bedroom_type\"] = df_melted[\"bedroom_type\"].map(\n",
+" {\n",
+" \"avg_rent_bachelor\": \"Bachelor\",\n",
+" \"avg_rent_1bed\": \"1 Bedroom\",\n",
+" \"avg_rent_2bed\": \"2 Bedroom\",\n",
+" \"avg_rent_3bed\": \"3 Bedroom\",\n",
+" }\n",
+")"
 ]
 },
 {
@@ -109,7 +112,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df[['year', 'avg_rent_bachelor', 'avg_rent_1bed', 'avg_rent_2bed', 'avg_rent_3bed', 'avg_yoy_change']]"
+"df[\n",
+" [\n",
+" \"year\",\n",
+" \"avg_rent_bachelor\",\n",
+" \"avg_rent_1bed\",\n",
+" \"avg_rent_2bed\",\n",
+" \"avg_rent_3bed\",\n",
+" \"avg_yoy_change\",\n",
+" ]\n",
+"]"
 ]
 },
 {
@@ -120,7 +132,7 @@
 "\n",
 "### Figure Factory\n",
 "\n",
-"Uses `create_price_time_series` from `portfolio_app.figures.time_series`.\n",
+"Uses `create_price_time_series` from `portfolio_app.figures.toronto.time_series`.\n",
 "\n",
 "**Key Parameters:**\n",
 "- `date_column`: 'date'\n",
@@ -135,18 +147,19 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.time_series import create_price_time_series\n",
+"sys.path.insert(0, \"../..\")\n",
 "\n",
-"data = df_melted.to_dict('records')\n",
+"from portfolio_app.figures.toronto.time_series import create_price_time_series\n",
+"\n",
+"data = df_melted.to_dict(\"records\")\n",
 "\n",
 "fig = create_price_time_series(\n",
 " data=data,\n",
-" date_column='date',\n",
-" price_column='avg_rent',\n",
-" group_column='bedroom_type',\n",
-" title='Toronto Average Rent Trend (5 Years)',\n",
+" date_column=\"date\",\n",
+" price_column=\"avg_rent\",\n",
+" group_column=\"bedroom_type\",\n",
+" title=\"Toronto Average Rent Trend (5 Years)\",\n",
 ")\n",
 "\n",
 "fig.show()"
@@ -167,7 +180,7 @@
 "source": [
 "# Show year-over-year changes\n",
 "print(\"Year-over-Year Rent Change (%)\")\n",
-"df[['year', 'avg_yoy_change']].dropna()"
+"df[[\"year\", \"avg_yoy_change\"]].dropna()"
 ]
 }
 ],
@@ -19,7 +19,7 @@
 "\n",
 "| Table | Grain | Key Columns |\n",
 "|-------|-------|-------------|\n",
-"| `mart_neighbourhood_housing` | neighbourhood \u00d7 year | pct_owner_occupied, pct_renter_occupied, income_quintile |\n",
+"| `mart_neighbourhood_housing` | neighbourhood × year | pct_owner_occupied, pct_renter_occupied, income_quintile |\n",
 "\n",
 "### SQL Query"
 ]
@@ -30,15 +30,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import pandas as pd\n",
-"from sqlalchemy import create_engine\n",
-"from dotenv import load_dotenv\n",
 "import os\n",
 "\n",
-"# Load .env from project root\n",
-"load_dotenv('../../.env')\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
 "\n",
-"engine = create_engine(os.environ['DATABASE_URL'])\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
 "\n",
 "query = \"\"\"\n",
 "SELECT\n",
@@ -77,18 +78,17 @@
 "source": [
 "# Prepare for stacked bar\n",
 "df_stacked = df.melt(\n",
-" id_vars=['neighbourhood_name', 'income_quintile'],\n",
-" value_vars=['pct_owner_occupied', 'pct_renter_occupied'],\n",
-" var_name='tenure_type',\n",
-" value_name='percentage'\n",
+" id_vars=[\"neighbourhood_name\", \"income_quintile\"],\n",
+" value_vars=[\"pct_owner_occupied\", \"pct_renter_occupied\"],\n",
+" var_name=\"tenure_type\",\n",
+" value_name=\"percentage\",\n",
 ")\n",
 "\n",
-"df_stacked['tenure_type'] = df_stacked['tenure_type'].map({\n",
-" 'pct_owner_occupied': 'Owner',\n",
-" 'pct_renter_occupied': 'Renter'\n",
-"})\n",
+"df_stacked[\"tenure_type\"] = df_stacked[\"tenure_type\"].map(\n",
+" {\"pct_owner_occupied\": \"Owner\", \"pct_renter_occupied\": \"Renter\"}\n",
+")\n",
 "\n",
-"data = df_stacked.to_dict('records')"
+"data = df_stacked.to_dict(\"records\")"
 ]
 },
 {
@@ -105,7 +105,14 @@
 "outputs": [],
 "source": [
 "print(\"Highest Renter Neighbourhoods:\")\n",
-"df[['neighbourhood_name', 'pct_renter_occupied', 'pct_owner_occupied', 'income_quintile']].head(10)"
+"df[\n",
+" [\n",
+" \"neighbourhood_name\",\n",
+" \"pct_renter_occupied\",\n",
+" \"pct_owner_occupied\",\n",
+" \"income_quintile\",\n",
+" ]\n",
+"].head(10)"
 ]
 },
 {
@@ -116,7 +123,7 @@
 "\n",
 "### Figure Factory\n",
 "\n",
-"Uses `create_stacked_bar` from `portfolio_app.figures.bar_charts`.\n",
+"Uses `create_stacked_bar` from `portfolio_app.figures.toronto.bar_charts`.\n",
 "\n",
 "**Key Parameters:**\n",
 "- `x_column`: 'neighbourhood_name'\n",
@@ -132,21 +139,22 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.bar_charts import create_stacked_bar\n",
+"sys.path.insert(0, \"../..\")\n",
+"\n",
+"from portfolio_app.figures.toronto.bar_charts import create_stacked_bar\n",
 "\n",
 "# Show top 20 by renter percentage\n",
-"top_20_names = df.head(20)['neighbourhood_name'].tolist()\n",
-"data_filtered = [d for d in data if d['neighbourhood_name'] in top_20_names]\n",
+"top_20_names = df.head(20)[\"neighbourhood_name\"].tolist()\n",
+"data_filtered = [d for d in data if d[\"neighbourhood_name\"] in top_20_names]\n",
 "\n",
 "fig = create_stacked_bar(\n",
 " data=data_filtered,\n",
-" x_column='neighbourhood_name',\n",
-" value_column='percentage',\n",
-" category_column='tenure_type',\n",
-" title='Housing Tenure Mix - Top 20 Renter Neighbourhoods',\n",
-" color_map={'Owner': '#4CAF50', 'Renter': '#2196F3'},\n",
+" x_column=\"neighbourhood_name\",\n",
+" value_column=\"percentage\",\n",
+" category_column=\"tenure_type\",\n",
+" title=\"Housing Tenure Mix - Top 20 Renter Neighbourhoods\",\n",
+" color_map={\"Owner\": \"#4CAF50\", \"Renter\": \"#2196F3\"},\n",
 " show_percentages=True,\n",
 ")\n",
 "\n",
@@ -172,7 +180,9 @@
 "\n",
 "# By income quintile\n",
 "print(\"\\nTenure by Income Quintile:\")\n",
-"df.groupby('income_quintile')[['pct_owner_occupied', 'pct_renter_occupied']].mean().round(1)"
+"df.groupby(\"income_quintile\")[\n",
+" [\"pct_owner_occupied\", \"pct_renter_occupied\"]\n",
+"].mean().round(1)"
 ]
 }
 ],
@@ -19,7 +19,7 @@
 "\n",
 "| Table | Grain | Key Columns |\n",
 "|-------|-------|-------------|\n",
-"| `mart_neighbourhood_overview` | neighbourhood \u00d7 year | neighbourhood_name, median_household_income, safety_score, population |\n",
+"| `mart_neighbourhood_overview` | neighbourhood × year | neighbourhood_name, median_household_income, safety_score, population |\n",
 "\n",
 "### SQL Query"
 ]
@@ -30,15 +30,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import pandas as pd\n",
-"from sqlalchemy import create_engine\n",
-"from dotenv import load_dotenv\n",
 "import os\n",
 "\n",
-"# Load .env from project root\n",
-"load_dotenv('../../.env')\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
 "\n",
-"engine = create_engine(os.environ['DATABASE_URL'])\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
 "\n",
 "query = \"\"\"\n",
 "SELECT\n",
@@ -77,10 +78,10 @@
 "outputs": [],
 "source": [
 "# Scale income to thousands for better axis readability\n",
-"df['income_thousands'] = df['median_household_income'] / 1000\n",
+"df[\"income_thousands\"] = df[\"median_household_income\"] / 1000\n",
 "\n",
 "# Prepare data for figure factory\n",
-"data = df.to_dict('records')"
+"data = df.to_dict(\"records\")"
 ]
 },
 {
@@ -96,7 +97,14 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df[['neighbourhood_name', 'median_household_income', 'safety_score', 'crime_rate_per_100k']].head(10)"
+"df[\n",
+" [\n",
+" \"neighbourhood_name\",\n",
+" \"median_household_income\",\n",
+" \"safety_score\",\n",
+" \"crime_rate_per_100k\",\n",
+" ]\n",
+"].head(10)"
 ]
 },
 {
@@ -107,7 +115,7 @@
 "\n",
 "### Figure Factory\n",
 "\n",
-"Uses `create_scatter_figure` from `portfolio_app.figures.scatter`.\n",
+"Uses `create_scatter_figure` from `portfolio_app.figures.toronto.scatter`.\n",
 "\n",
 "**Key Parameters:**\n",
 "- `x_column`: 'income_thousands' (median household income in $K)\n",
@@ -124,19 +132,20 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.scatter import create_scatter_figure\n",
+"sys.path.insert(0, \"../..\")\n",
+"\n",
+"from portfolio_app.figures.toronto.scatter import create_scatter_figure\n",
 "\n",
 "fig = create_scatter_figure(\n",
 " data=data,\n",
-" x_column='income_thousands',\n",
-" y_column='safety_score',\n",
-" name_column='neighbourhood_name',\n",
-" size_column='population',\n",
-" title='Income vs Safety by Neighbourhood',\n",
-" x_title='Median Household Income ($K)',\n",
-" y_title='Safety Score (0-100)',\n",
+" x_column=\"income_thousands\",\n",
+" y_column=\"safety_score\",\n",
+" name_column=\"neighbourhood_name\",\n",
+" size_column=\"population\",\n",
+" title=\"Income vs Safety by Neighbourhood\",\n",
+" x_title=\"Median Household Income ($K)\",\n",
+" y_title=\"Safety Score (0-100)\",\n",
 " trendline=True,\n",
 ")\n",
 "\n",
@@ -166,7 +175,7 @@
 "outputs": [],
 "source": [
 "# Calculate correlation coefficient\n",
-"correlation = df['median_household_income'].corr(df['safety_score'])\n",
+"correlation = df[\"median_household_income\"].corr(df[\"safety_score\"])\n",
 "print(f\"Correlation coefficient (Income vs Safety): {correlation:.3f}\")"
 ]
 }
@@ -29,7 +29,38 @@
 "execution_count": null,
 "metadata": {},
 "outputs": [],
-"source": "import pandas as pd\nfrom sqlalchemy import create_engine\nfrom dotenv import load_dotenv\nimport os\n\n# Load .env from project root\nload_dotenv('../../.env')\n\nengine = create_engine(os.environ['DATABASE_URL'])\n\nquery = \"\"\"\nSELECT\n neighbourhood_id,\n neighbourhood_name,\n geometry,\n year,\n livability_score,\n safety_score,\n affordability_score,\n amenity_score,\n population,\n median_household_income\nFROM public_marts.mart_neighbourhood_overview\nWHERE year = (SELECT MAX(year) FROM public_marts.mart_neighbourhood_overview)\nORDER BY livability_score DESC\n\"\"\"\n\ndf = pd.read_sql(query, engine)\nprint(f\"Loaded {len(df)} neighbourhoods\")"
+"source": [
+"import os\n",
+"\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
+"\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
+"\n",
+"query = \"\"\"\n",
+"SELECT\n",
+" neighbourhood_id,\n",
+" neighbourhood_name,\n",
+" geometry,\n",
+" year,\n",
+" livability_score,\n",
+" safety_score,\n",
+" affordability_score,\n",
+" amenity_score,\n",
+" population,\n",
+" median_household_income\n",
+"FROM public_marts.mart_neighbourhood_overview\n",
+"WHERE year = (SELECT MAX(year) FROM public_marts.mart_neighbourhood_overview)\n",
+"ORDER BY livability_score DESC\n",
+"\"\"\"\n",
+"\n",
+"df = pd.read_sql(query, engine)\n",
+"print(f\"Loaded {len(df)} neighbourhoods\")"
+]
 },
 {
 "cell_type": "markdown",
@@ -49,21 +80,20 @@
 "outputs": [],
 "source": [
 "# Transform geometry to GeoJSON\n",
-"import geopandas as gpd\n",
 "import json\n",
 "\n",
+"import geopandas as gpd\n",
+"\n",
 "# Convert WKB geometry to GeoDataFrame\n",
 "gdf = gpd.GeoDataFrame(\n",
-" df,\n",
-" geometry=gpd.GeoSeries.from_wkb(df['geometry']),\n",
-" crs='EPSG:4326'\n",
+" df, geometry=gpd.GeoSeries.from_wkb(df[\"geometry\"]), crs=\"EPSG:4326\"\n",
 ")\n",
 "\n",
 "# Create GeoJSON FeatureCollection\n",
 "geojson = json.loads(gdf.to_json())\n",
 "\n",
 "# Prepare data for figure factory\n",
-"data = df.drop(columns=['geometry']).to_dict('records')"
+"data = df.drop(columns=[\"geometry\"]).to_dict(\"records\")"
 ]
 },
 {
@@ -79,7 +109,15 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df[['neighbourhood_name', 'livability_score', 'safety_score', 'affordability_score', 'amenity_score']].head(10)"
+"df[\n",
+" [\n",
+" \"neighbourhood_name\",\n",
+" \"livability_score\",\n",
+" \"safety_score\",\n",
+" \"affordability_score\",\n",
+" \"amenity_score\",\n",
+" ]\n",
+"].head(10)"
 ]
 },
 {
@@ -90,7 +128,7 @@
 "\n",
 "### Figure Factory\n",
 "\n",
-"Uses `create_choropleth_figure` from `portfolio_app.figures.choropleth`.\n",
+"Uses `create_choropleth_figure` from `portfolio_app.figures.toronto.choropleth`.\n",
 "\n",
 "**Key Parameters:**\n",
 "- `geojson`: GeoJSON FeatureCollection with neighbourhood boundaries\n",
@@ -107,18 +145,24 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.choropleth import create_choropleth_figure\n",
+"sys.path.insert(0, \"../..\")\n",
+"\n",
+"from portfolio_app.figures.toronto.choropleth import create_choropleth_figure\n",
 "\n",
 "fig = create_choropleth_figure(\n",
 " geojson=geojson,\n",
 " data=data,\n",
-" location_key='neighbourhood_id',\n",
-" color_column='livability_score',\n",
-" hover_data=['neighbourhood_name', 'safety_score', 'affordability_score', 'amenity_score'],\n",
-" color_scale='RdYlGn',\n",
-" title='Toronto Neighbourhood Livability Score',\n",
+" location_key=\"neighbourhood_id\",\n",
+" color_column=\"livability_score\",\n",
+" hover_data=[\n",
+" \"neighbourhood_name\",\n",
+" \"safety_score\",\n",
+" \"affordability_score\",\n",
+" \"amenity_score\",\n",
+" ],\n",
+" color_scale=\"RdYlGn\",\n",
+" title=\"Toronto Neighbourhood Livability Score\",\n",
 " zoom=10,\n",
 ")\n",
 "\n",
@@ -19,7 +19,7 @@
 "\n",
 "| Table | Grain | Key Columns |\n",
 "|-------|-------|-------------|\n",
-"| `mart_neighbourhood_overview` | neighbourhood \u00d7 year | neighbourhood_name, livability_score |\n",
+"| `mart_neighbourhood_overview` | neighbourhood × year | neighbourhood_name, livability_score |\n",
 "\n",
 "### SQL Query"
 ]
@@ -30,15 +30,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import pandas as pd\n",
-"from sqlalchemy import create_engine\n",
-"from dotenv import load_dotenv\n",
 "import os\n",
 "\n",
-"# Load .env from project root\n",
-"load_dotenv('../../.env')\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
 "\n",
-"engine = create_engine(os.environ['DATABASE_URL'])\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
 "\n",
 "query = \"\"\"\n",
 "SELECT\n",
@@ -76,7 +77,7 @@
 "source": [
 "# The figure factory handles top/bottom selection internally\n",
 "# Just prepare as list of dicts\n",
-"data = df.to_dict('records')"
+"data = df.to_dict(\"records\")"
 ]
 },
 {
@@ -106,7 +107,7 @@
 "\n",
 "### Figure Factory\n",
 "\n",
-"Uses `create_ranking_bar` from `portfolio_app.figures.bar_charts`.\n",
+"Uses `create_ranking_bar` from `portfolio_app.figures.toronto.bar_charts`.\n",
 "\n",
 "**Key Parameters:**\n",
 "- `data`: List of dicts with all neighbourhoods\n",
@@ -123,20 +124,21 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.bar_charts import create_ranking_bar\n",
+"sys.path.insert(0, \"../..\")\n",
+"\n",
+"from portfolio_app.figures.toronto.bar_charts import create_ranking_bar\n",
 "\n",
 "fig = create_ranking_bar(\n",
 " data=data,\n",
-" name_column='neighbourhood_name',\n",
-" value_column='livability_score',\n",
-" title='Top & Bottom 10 Neighbourhoods by Livability',\n",
+" name_column=\"neighbourhood_name\",\n",
+" value_column=\"livability_score\",\n",
+" title=\"Top & Bottom 10 Neighbourhoods by Livability\",\n",
 " top_n=10,\n",
 " bottom_n=10,\n",
-" color_top='#4CAF50', # Green for top performers\n",
-" color_bottom='#F44336', # Red for bottom performers\n",
-" value_format='.1f',\n",
+" color_top=\"#4CAF50\", # Green for top performers\n",
+" color_bottom=\"#F44336\", # Red for bottom performers\n",
+" value_format=\".1f\",\n",
 ")\n",
 "\n",
 "fig.show()"
@@ -19,7 +19,7 @@
 "\n",
 "| Table | Grain | Key Columns |\n",
 "|-------|-------|-------------|\n",
-"| `mart_neighbourhood_safety` | neighbourhood \u00d7 year | assault_count, auto_theft_count, break_enter_count, robbery_count, etc. |\n",
+"| `mart_neighbourhood_safety` | neighbourhood × year | assault_count, auto_theft_count, break_enter_count, robbery_count, etc. |\n",
 "\n",
 "### SQL Query"
 ]
@@ -30,15 +30,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import pandas as pd\n",
-"from sqlalchemy import create_engine\n",
-"from dotenv import load_dotenv\n",
 "import os\n",
 "\n",
-"# Load .env from project root\n",
-"load_dotenv('../../.env')\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
 "\n",
-"engine = create_engine(os.environ['DATABASE_URL'])\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
 "\n",
 "query = \"\"\"\n",
 "SELECT\n",
@@ -79,17 +80,25 @@
 "outputs": [],
 "source": [
 "df_melted = df.melt(\n",
-" id_vars=['neighbourhood_name', 'total_incidents'],\n",
-" value_vars=['assault_count', 'auto_theft_count', 'break_enter_count', \n",
-" 'robbery_count', 'theft_over_count', 'homicide_count'],\n",
-" var_name='crime_type',\n",
-" value_name='count'\n",
+" id_vars=[\"neighbourhood_name\", \"total_incidents\"],\n",
+" value_vars=[\n",
+" \"assault_count\",\n",
+" \"auto_theft_count\",\n",
+" \"break_enter_count\",\n",
+" \"robbery_count\",\n",
+" \"theft_over_count\",\n",
+" \"homicide_count\",\n",
+" ],\n",
+" var_name=\"crime_type\",\n",
+" value_name=\"count\",\n",
 ")\n",
 "\n",
 "# Clean labels\n",
-"df_melted['crime_type'] = df_melted['crime_type'].str.replace('_count', '').str.replace('_', ' ').str.title()\n",
+"df_melted[\"crime_type\"] = (\n",
+" df_melted[\"crime_type\"].str.replace(\"_count\", \"\").str.replace(\"_\", \" \").str.title()\n",
+")\n",
 "\n",
-"data = df_melted.to_dict('records')"
+"data = df_melted.to_dict(\"records\")"
 ]
 },
 {
@@ -105,7 +114,15 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df[['neighbourhood_name', 'assault_count', 'auto_theft_count', 'break_enter_count', 'total_incidents']].head(10)"
+"df[\n",
+" [\n",
+" \"neighbourhood_name\",\n",
+" \"assault_count\",\n",
+" \"auto_theft_count\",\n",
+" \"break_enter_count\",\n",
+" \"total_incidents\",\n",
+" ]\n",
+"].head(10)"
 ]
 },
 {
@@ -116,7 +133,7 @@
 "\n",
 "### Figure Factory\n",
 "\n",
-"Uses `create_stacked_bar` from `portfolio_app.figures.bar_charts`."
+"Uses `create_stacked_bar` from `portfolio_app.figures.toronto.bar_charts`."
 ]
 },
 {
@@ -126,23 +143,24 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.bar_charts import create_stacked_bar\n",
+"sys.path.insert(0, \"../..\")\n",
+"\n",
+"from portfolio_app.figures.toronto.bar_charts import create_stacked_bar\n",
 "\n",
 "fig = create_stacked_bar(\n",
 " data=data,\n",
-" x_column='neighbourhood_name',\n",
-" value_column='count',\n",
-" category_column='crime_type',\n",
-" title='Crime Type Breakdown - Top 15 Neighbourhoods',\n",
+" x_column=\"neighbourhood_name\",\n",
+" value_column=\"count\",\n",
+" category_column=\"crime_type\",\n",
+" title=\"Crime Type Breakdown - Top 15 Neighbourhoods\",\n",
 " color_map={\n",
-" 'Assault': '#d62728',\n",
-" 'Auto Theft': '#ff7f0e',\n",
-" 'Break Enter': '#9467bd',\n",
-" 'Robbery': '#8c564b',\n",
-" 'Theft Over': '#e377c2',\n",
-" 'Homicide': '#1f77b4'\n",
+" \"Assault\": \"#d62728\",\n",
+" \"Auto Theft\": \"#ff7f0e\",\n",
+" \"Break Enter\": \"#9467bd\",\n",
+" \"Robbery\": \"#8c564b\",\n",
+" \"Theft Over\": \"#e377c2\",\n",
+" \"Homicide\": \"#1f77b4\",\n",
 " },\n",
 ")\n",
 "\n",
@@ -19,7 +19,7 @@
 "\n",
 "| Table | Grain | Key Columns |\n",
 "|-------|-------|-------------|\n",
-"| `mart_neighbourhood_safety` | neighbourhood \u00d7 year | crime_rate_per_100k, crime_index, safety_tier, geometry |\n",
+"| `mart_neighbourhood_safety` | neighbourhood × year | crime_rate_per_100k, crime_index, safety_tier, geometry |\n",
 "\n",
 "### SQL Query"
 ]
@@ -30,15 +30,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import pandas as pd\n",
-"from sqlalchemy import create_engine\n",
-"from dotenv import load_dotenv\n",
 "import os\n",
 "\n",
-"# Load .env from project root\n",
-"load_dotenv('../../.env')\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
 "\n",
-"engine = create_engine(os.environ['DATABASE_URL'])\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
 "\n",
 "query = \"\"\"\n",
 "SELECT\n",
@@ -77,17 +78,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import geopandas as gpd\n",
 "import json\n",
 "\n",
+"import geopandas as gpd\n",
+"\n",
 "gdf = gpd.GeoDataFrame(\n",
-" df,\n",
-" geometry=gpd.GeoSeries.from_wkb(df['geometry']),\n",
-" crs='EPSG:4326'\n",
+" df, geometry=gpd.GeoSeries.from_wkb(df[\"geometry\"]), crs=\"EPSG:4326\"\n",
 ")\n",
 "\n",
 "geojson = json.loads(gdf.to_json())\n",
-"data = df.drop(columns=['geometry']).to_dict('records')"
+"data = df.drop(columns=[\"geometry\"]).to_dict(\"records\")"
 ]
 },
 {
@@ -103,7 +103,15 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df[['neighbourhood_name', 'crime_rate_per_100k', 'crime_index', 'safety_tier', 'total_incidents']].head(10)"
+"df[\n",
+" [\n",
+" \"neighbourhood_name\",\n",
+" \"crime_rate_per_100k\",\n",
+" \"crime_index\",\n",
+" \"safety_tier\",\n",
+" \"total_incidents\",\n",
+" ]\n",
+"].head(10)"
 ]
 },
 {
@@ -114,7 +122,7 @@
 "\n",
 "### Figure Factory\n",
 "\n",
-"Uses `create_choropleth_figure` from `portfolio_app.figures.choropleth`.\n",
+"Uses `create_choropleth_figure` from `portfolio_app.figures.toronto.choropleth`.\n",
 "\n",
 "**Key Parameters:**\n",
 "- `color_column`: 'crime_rate_per_100k'\n",
@@ -128,18 +136,19 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.choropleth import create_choropleth_figure\n",
+"sys.path.insert(0, \"../..\")\n",
+"\n",
+"from portfolio_app.figures.toronto.choropleth import create_choropleth_figure\n",
 "\n",
 "fig = create_choropleth_figure(\n",
 " geojson=geojson,\n",
 " data=data,\n",
-" location_key='neighbourhood_id',\n",
-" color_column='crime_rate_per_100k',\n",
-" hover_data=['neighbourhood_name', 'crime_index', 'total_incidents'],\n",
-" color_scale='RdYlGn_r',\n",
-" title='Toronto Crime Rate per 100,000 Population',\n",
+" location_key=\"neighbourhood_id\",\n",
+" color_column=\"crime_rate_per_100k\",\n",
+" hover_data=[\"neighbourhood_name\", \"crime_index\", \"total_incidents\"],\n",
+" color_scale=\"RdYlGn_r\",\n",
+" title=\"Toronto Crime Rate per 100,000 Population\",\n",
 " zoom=10,\n",
 ")\n",
 "\n",
@@ -19,7 +19,7 @@
 "\n",
 "| Table | Grain | Key Columns |\n",
 "|-------|-------|-------------|\n",
-"| `mart_neighbourhood_safety` | neighbourhood \u00d7 year | year, crime_rate_per_100k, crime_yoy_change_pct |\n",
+"| `mart_neighbourhood_safety` | neighbourhood × year | year, crime_rate_per_100k, crime_yoy_change_pct |\n",
 "\n",
 "### SQL Query"
 ]
@@ -30,15 +30,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import pandas as pd\n",
-"from sqlalchemy import create_engine\n",
-"from dotenv import load_dotenv\n",
 "import os\n",
 "\n",
-"# Load .env from project root\n",
-"load_dotenv('../../.env')\n",
+"import pandas as pd\n",
+"from dotenv import load_dotenv\n",
+"from sqlalchemy import create_engine\n",
 "\n",
-"engine = create_engine(os.environ['DATABASE_URL'])\n",
+"# Load .env from project root\n",
+"load_dotenv(\"../../.env\")\n",
+"\n",
+"engine = create_engine(os.environ[\"DATABASE_URL\"])\n",
 "\n",
 "query = \"\"\"\n",
 "SELECT\n",
@@ -76,21 +77,23 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df['date'] = pd.to_datetime(df['year'].astype(str) + '-01-01')\n",
+"df[\"date\"] = pd.to_datetime(df[\"year\"].astype(str) + \"-01-01\")\n",
 "\n",
 "# Melt for multi-line\n",
 "df_melted = df.melt(\n",
-" id_vars=['year', 'date'],\n",
-" value_vars=['avg_assault_rate', 'avg_auto_theft_rate', 'avg_break_enter_rate'],\n",
-" var_name='crime_type',\n",
-" value_name='rate_per_100k'\n",
+" id_vars=[\"year\", \"date\"],\n",
+" value_vars=[\"avg_assault_rate\", \"avg_auto_theft_rate\", \"avg_break_enter_rate\"],\n",
+" var_name=\"crime_type\",\n",
+" value_name=\"rate_per_100k\",\n",
 ")\n",
 "\n",
-"df_melted['crime_type'] = df_melted['crime_type'].map({\n",
-" 'avg_assault_rate': 'Assault',\n",
-" 'avg_auto_theft_rate': 'Auto Theft',\n",
-" 'avg_break_enter_rate': 'Break & Enter'\n",
-"})"
+"df_melted[\"crime_type\"] = df_melted[\"crime_type\"].map(\n",
+" {\n",
+" \"avg_assault_rate\": \"Assault\",\n",
+" \"avg_auto_theft_rate\": \"Auto Theft\",\n",
+" \"avg_break_enter_rate\": \"Break & Enter\",\n",
+" }\n",
+")"
 ]
 },
 {
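The cell above is a standard wide-to-long reshape followed by a label map. On a toy frame (dummy values, for illustration only) the same calls behave like this, producing one row per (year, crime_type) with a single rate column, which is roughly the shape the figure factory consumes downstream:

    import pandas as pd

    wide = pd.DataFrame(
        {
            "year": [2001, 2002],  # dummy values for illustration
            "avg_assault_rate": [1.0, 2.0],
            "avg_auto_theft_rate": [3.0, 4.0],
            "avg_break_enter_rate": [5.0, 6.0],
        }
    )

    long = wide.melt(
        id_vars=["year"],
        value_vars=["avg_assault_rate", "avg_auto_theft_rate", "avg_break_enter_rate"],
        var_name="crime_type",
        value_name="rate_per_100k",
    )
    # Readable legend labels, as in the notebook cell.
    long["crime_type"] = long["crime_type"].map(
        {
            "avg_assault_rate": "Assault",
            "avg_auto_theft_rate": "Auto Theft",
            "avg_break_enter_rate": "Break & Enter",
        }
    )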
@@ -106,7 +109,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"df[['year', 'avg_crime_rate', 'total_city_incidents', 'avg_yoy_change']]"
+"df[[\"year\", \"avg_crime_rate\", \"total_city_incidents\", \"avg_yoy_change\"]]"
 ]
 },
 {
@@ -127,22 +130,23 @@
 "outputs": [],
 "source": [
 "import sys\n",
-"sys.path.insert(0, '../..')\n",
 "\n",
-"from portfolio_app.figures.time_series import create_price_time_series\n",
+"sys.path.insert(0, \"../..\")\n",
 "\n",
-"data = df_melted.to_dict('records')\n",
+"from portfolio_app.figures.toronto.time_series import create_price_time_series\n",
+"\n",
+"data = df_melted.to_dict(\"records\")\n",
 "\n",
 "fig = create_price_time_series(\n",
 " data=data,\n",
-" date_column='date',\n",
-" price_column='rate_per_100k',\n",
-" group_column='crime_type',\n",
-" title='Toronto Crime Trends by Type (5 Years)',\n",
+" date_column=\"date\",\n",
+" price_column=\"rate_per_100k\",\n",
+" group_column=\"crime_type\",\n",
+" title=\"Toronto Crime Trends by Type (5 Years)\",\n",
 ")\n",
 "\n",
 "# Remove dollar sign formatting since this is rate data\n",
-"fig.update_layout(yaxis_tickprefix='', yaxis_title='Rate per 100K')\n",
+"fig.update_layout(yaxis_tickprefix=\"\", yaxis_title=\"Rate per 100K\")\n",
 "\n",
 "fig.show()"
 ]
@@ -161,15 +165,19 @@
 "outputs": [],
 "source": [
 "# Total crime rate trend\n",
-"total_data = df[['date', 'avg_crime_rate']].rename(columns={'avg_crime_rate': 'total_rate'}).to_dict('records')\n",
+"total_data = (\n",
+" df[[\"date\", \"avg_crime_rate\"]]\n",
+" .rename(columns={\"avg_crime_rate\": \"total_rate\"})\n",
+" .to_dict(\"records\")\n",
+")\n",
 "\n",
 "fig2 = create_price_time_series(\n",
 " data=total_data,\n",
-" date_column='date',\n",
-" price_column='total_rate',\n",
-" title='Toronto Overall Crime Rate Trend',\n",
+" date_column=\"date\",\n",
+" price_column=\"total_rate\",\n",
+" title=\"Toronto Overall Crime Rate Trend\",\n",
 ")\n",
-"fig2.update_layout(yaxis_tickprefix='', yaxis_title='Rate per 100K')\n",
+"fig2.update_layout(yaxis_tickprefix=\"\", yaxis_title=\"Rate per 100K\")\n",
 "fig2.show()"
 ]
 }
@@ -5,7 +5,7 @@ from typing import Any
 import dash_mantine_components as dmc
 from dash import dcc
 
-from portfolio_app.figures.summary_cards import create_metric_card_figure
+from portfolio_app.figures.toronto.summary_cards import create_metric_card_figure
 
 
 class MetricCard:
@@ -1,61 +1,15 @@
-"""Plotly figure factories for data visualization."""
+"""Plotly figure factories for data visualization.
 
-from .bar_charts import (
-    create_horizontal_bar,
-    create_ranking_bar,
-    create_stacked_bar,
-)
-from .choropleth import (
-    create_choropleth_figure,
-    create_zone_map,
-)
-from .demographics import (
-    create_age_pyramid,
-    create_donut_chart,
-    create_income_distribution,
-)
-from .radar import (
-    create_comparison_radar,
-    create_radar_figure,
-)
-from .scatter import (
-    create_bubble_chart,
-    create_scatter_figure,
-)
-from .summary_cards import create_metric_card_figure, create_summary_metrics
-from .time_series import (
-    add_policy_markers,
-    create_market_comparison_chart,
-    create_price_time_series,
-    create_time_series_with_events,
-    create_volume_time_series,
-)
+Figure factories are organized by dashboard domain:
+- toronto/ : Toronto Neighbourhood Dashboard figures
+
+Usage:
+    from portfolio_app.figures.toronto import create_choropleth_figure
+    from portfolio_app.figures.toronto import create_ranking_bar
+"""
+
+from . import toronto
 
 __all__ = [
-    # Choropleth
-    "create_choropleth_figure",
-    "create_zone_map",
-    # Time series
-    "create_price_time_series",
-    "create_volume_time_series",
-    "create_market_comparison_chart",
-    "create_time_series_with_events",
-    "add_policy_markers",
-    # Summary
-    "create_metric_card_figure",
-    "create_summary_metrics",
-    # Bar charts
-    "create_ranking_bar",
-    "create_stacked_bar",
-    "create_horizontal_bar",
-    # Scatter plots
-    "create_scatter_figure",
-    "create_bubble_chart",
-    # Radar charts
-    "create_radar_figure",
-    "create_comparison_radar",
-    # Demographics
-    "create_age_pyramid",
-    "create_donut_chart",
-    "create_income_distribution",
+    "toronto",
 ]
portfolio_app/figures/toronto/__init__.py (new file, 61 lines)
@@ -0,0 +1,61 @@
+"""Plotly figure factories for Toronto dashboard visualizations."""
+
+from .bar_charts import (
+    create_horizontal_bar,
+    create_ranking_bar,
+    create_stacked_bar,
+)
+from .choropleth import (
+    create_choropleth_figure,
+    create_zone_map,
+)
+from .demographics import (
+    create_age_pyramid,
+    create_donut_chart,
+    create_income_distribution,
+)
+from .radar import (
+    create_comparison_radar,
+    create_radar_figure,
+)
+from .scatter import (
+    create_bubble_chart,
+    create_scatter_figure,
+)
+from .summary_cards import create_metric_card_figure, create_summary_metrics
+from .time_series import (
+    add_policy_markers,
+    create_market_comparison_chart,
+    create_price_time_series,
+    create_time_series_with_events,
+    create_volume_time_series,
+)
+
+__all__ = [
+    # Choropleth
+    "create_choropleth_figure",
+    "create_zone_map",
+    # Time series
+    "create_price_time_series",
+    "create_volume_time_series",
+    "create_market_comparison_chart",
+    "create_time_series_with_events",
+    "add_policy_markers",
+    # Summary
+    "create_metric_card_figure",
+    "create_summary_metrics",
+    # Bar charts
+    "create_ranking_bar",
+    "create_stacked_bar",
+    "create_horizontal_bar",
+    # Scatter plots
+    "create_scatter_figure",
+    "create_bubble_chart",
+    # Radar charts
+    "create_radar_figure",
+    "create_comparison_radar",
+    # Demographics
+    "create_age_pyramid",
+    "create_donut_chart",
+    "create_income_distribution",
+]
@@ -5,7 +5,7 @@ import pandas as pd
 import plotly.graph_objects as go
 from dash import Input, Output, callback
 
-from portfolio_app.figures import (
+from portfolio_app.figures.toronto import (
     create_donut_chart,
     create_horizontal_bar,
     create_radar_figure,
@@ -4,7 +4,7 @@
 import plotly.graph_objects as go
 from dash import Input, Output, State, callback, no_update
 
-from portfolio_app.figures import create_choropleth_figure, create_ranking_bar
+from portfolio_app.figures.toronto import create_choropleth_figure, create_ranking_bar
 from portfolio_app.toronto.services import (
     get_amenities_data,
     get_demographics_data,
@@ -8,11 +8,18 @@ from sqlalchemy.orm import Mapped, mapped_column
 
 from .base import Base
 
+# Schema constants
+RAW_TORONTO_SCHEMA = "raw_toronto"
+
 
 class DimTime(Base):
-    """Time dimension table."""
+    """Time dimension table (shared across all projects).
+
+    Note: Stays in public schema as it's a shared dimension.
+    """
 
     __tablename__ = "dim_time"
+    __table_args__ = {"schema": "public"}
 
     date_key: Mapped[int] = mapped_column(Integer, primary_key=True)
     full_date: Mapped[date] = mapped_column(Date, nullable=False, unique=True)
@@ -27,6 +34,7 @@ class DimCMHCZone(Base):
     """CMHC zone dimension table with PostGIS geometry."""
 
     __tablename__ = "dim_cmhc_zone"
+    __table_args__ = {"schema": RAW_TORONTO_SCHEMA}
 
     zone_key: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     zone_code: Mapped[str] = mapped_column(String(10), nullable=False, unique=True)
@@ -41,6 +49,7 @@ class DimNeighbourhood(Base):
     """
 
     __tablename__ = "dim_neighbourhood"
+    __table_args__ = {"schema": RAW_TORONTO_SCHEMA}
 
     neighbourhood_id: Mapped[int] = mapped_column(Integer, primary_key=True)
     name: Mapped[str] = mapped_column(String(100), nullable=False)
@@ -69,6 +78,7 @@ class DimPolicyEvent(Base):
     """Policy event dimension for time-series annotation."""
 
     __tablename__ = "dim_policy_event"
+    __table_args__ = {"schema": RAW_TORONTO_SCHEMA}
 
     event_id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     event_date: Mapped[date] = mapped_column(Date, nullable=False)
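The `__table_args__ = {"schema": RAW_TORONTO_SCHEMA}` additions above are what actually move each Toronto table out of `public`. A self-contained sketch of the same mechanism, using an invented model that is not part of the repo:

    from sqlalchemy import Integer, String
    from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

    RAW_TORONTO_SCHEMA = "raw_toronto"


    class Base(DeclarativeBase):
        pass


    class ExampleDim(Base):
        """Illustrative model only; not part of the repo."""

        __tablename__ = "dim_example"
        # The schema entry qualifies the table, so DDL and queries target
        # raw_toronto.dim_example instead of public.dim_example.
        __table_args__ = {"schema": RAW_TORONTO_SCHEMA}

        id: Mapped[int] = mapped_column(Integer, primary_key=True)
        name: Mapped[str] = mapped_column(String(100), nullable=False)

The schema itself still has to exist before `Base.metadata.create_all()` runs, which is what the init-script hunk further down handles.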
@@ -4,6 +4,7 @@ from sqlalchemy import ForeignKey, Index, Integer, Numeric, String
 from sqlalchemy.orm import Mapped, mapped_column, relationship
 
 from .base import Base
+from .dimensions import RAW_TORONTO_SCHEMA
 
 
 class BridgeCMHCNeighbourhood(Base):
@@ -14,6 +15,11 @@ class BridgeCMHCNeighbourhood(Base):
     """
 
     __tablename__ = "bridge_cmhc_neighbourhood"
+    __table_args__ = (
+        Index("ix_bridge_cmhc_zone", "cmhc_zone_code"),
+        Index("ix_bridge_neighbourhood", "neighbourhood_id"),
+        {"schema": RAW_TORONTO_SCHEMA},
+    )
 
     id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     cmhc_zone_code: Mapped[str] = mapped_column(String(10), nullable=False)
@@ -22,11 +28,6 @@ class BridgeCMHCNeighbourhood(Base):
         Numeric(5, 4), nullable=False
     )  # 0.0000 to 1.0000
 
-    __table_args__ = (
-        Index("ix_bridge_cmhc_zone", "cmhc_zone_code"),
-        Index("ix_bridge_neighbourhood", "neighbourhood_id"),
-    )
-
 
 class FactCensus(Base):
     """Census statistics by neighbourhood and year.
@@ -35,6 +36,10 @@ class FactCensus(Base):
     """
 
     __tablename__ = "fact_census"
+    __table_args__ = (
+        Index("ix_fact_census_neighbourhood_year", "neighbourhood_id", "census_year"),
+        {"schema": RAW_TORONTO_SCHEMA},
+    )
 
     id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     neighbourhood_id: Mapped[int] = mapped_column(Integer, nullable=False)
@@ -66,10 +71,6 @@ class FactCensus(Base):
         Numeric(12, 2), nullable=True
     )
 
-    __table_args__ = (
-        Index("ix_fact_census_neighbourhood_year", "neighbourhood_id", "census_year"),
-    )
-
 
 class FactCrime(Base):
     """Crime statistics by neighbourhood and year.
@@ -78,6 +79,11 @@ class FactCrime(Base):
     """
 
     __tablename__ = "fact_crime"
+    __table_args__ = (
+        Index("ix_fact_crime_neighbourhood_year", "neighbourhood_id", "year"),
+        Index("ix_fact_crime_type", "crime_type"),
+        {"schema": RAW_TORONTO_SCHEMA},
+    )
 
     id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     neighbourhood_id: Mapped[int] = mapped_column(Integer, nullable=False)
@@ -86,11 +92,6 @@ class FactCrime(Base):
     count: Mapped[int] = mapped_column(Integer, nullable=False)
     rate_per_100k: Mapped[float | None] = mapped_column(Numeric(10, 2), nullable=True)
 
-    __table_args__ = (
-        Index("ix_fact_crime_neighbourhood_year", "neighbourhood_id", "year"),
-        Index("ix_fact_crime_type", "crime_type"),
-    )
-
 
 class FactAmenities(Base):
     """Amenity counts by neighbourhood.
@@ -99,6 +100,11 @@ class FactAmenities(Base):
     """
 
     __tablename__ = "fact_amenities"
+    __table_args__ = (
+        Index("ix_fact_amenities_neighbourhood_year", "neighbourhood_id", "year"),
+        Index("ix_fact_amenities_type", "amenity_type"),
+        {"schema": RAW_TORONTO_SCHEMA},
+    )
 
     id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     neighbourhood_id: Mapped[int] = mapped_column(Integer, nullable=False)
@@ -106,11 +112,6 @@ class FactAmenities(Base):
     count: Mapped[int] = mapped_column(Integer, nullable=False)
     year: Mapped[int] = mapped_column(Integer, nullable=False)
 
-    __table_args__ = (
-        Index("ix_fact_amenities_neighbourhood_year", "neighbourhood_id", "year"),
-        Index("ix_fact_amenities_type", "amenity_type"),
-    )
-
 
 class FactRentals(Base):
     """Fact table for CMHC rental market data.
@@ -119,13 +120,16 @@ class FactRentals(Base):
     """
 
     __tablename__ = "fact_rentals"
+    __table_args__ = {"schema": RAW_TORONTO_SCHEMA}
 
     id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
     date_key: Mapped[int] = mapped_column(
-        Integer, ForeignKey("dim_time.date_key"), nullable=False
+        Integer, ForeignKey("public.dim_time.date_key"), nullable=False
     )
     zone_key: Mapped[int] = mapped_column(
-        Integer, ForeignKey("dim_cmhc_zone.zone_key"), nullable=False
+        Integer,
+        ForeignKey(f"{RAW_TORONTO_SCHEMA}.dim_cmhc_zone.zone_key"),
+        nullable=False,
     )
     bedroom_type: Mapped[str] = mapped_column(String(20), nullable=False)
     universe: Mapped[int | None] = mapped_column(Integer, nullable=True)
@@ -139,6 +143,6 @@ class FactRentals(Base):
     rent_change_pct: Mapped[float | None] = mapped_column(Numeric(5, 2), nullable=True)
     reliability_code: Mapped[str | None] = mapped_column(String(2), nullable=True)
 
-    # Relationships
-    time = relationship("DimTime", backref="rentals")
-    zone = relationship("DimCMHCZone", backref="rentals")
+    # Relationships - explicit foreign_keys needed for cross-schema joins
+    time = relationship("DimTime", foreign_keys=[date_key], backref="rentals")
+    zone = relationship("DimCMHCZone", foreign_keys=[zone_key], backref="rentals")
|
|||||||
sys.path.insert(0, str(Path(__file__).parent.parent.parent))
|
sys.path.insert(0, str(Path(__file__).parent.parent.parent))
|
||||||
|
|
||||||
from portfolio_app.toronto.models import create_tables, get_engine # noqa: E402
|
from portfolio_app.toronto.models import create_tables, get_engine # noqa: E402
|
||||||
|
from portfolio_app.toronto.models.dimensions import RAW_TORONTO_SCHEMA # noqa: E402
|
||||||
|
|
||||||
|
|
||||||
def main() -> int:
|
def main() -> int:
|
||||||
@@ -32,16 +33,30 @@ def main() -> int:
|
|||||||
result.fetchone()
|
result.fetchone()
|
||||||
print("Database connection successful")
|
print("Database connection successful")
|
||||||
|
|
||||||
|
# Create domain-specific schemas
|
||||||
|
with engine.connect() as conn:
|
||||||
|
conn.execute(text(f"CREATE SCHEMA IF NOT EXISTS {RAW_TORONTO_SCHEMA}"))
|
||||||
|
conn.commit()
|
||||||
|
print(f"Created schema: {RAW_TORONTO_SCHEMA}")
|
||||||
|
|
||||||
# Create all tables
|
# Create all tables
|
||||||
create_tables()
|
create_tables()
|
||||||
print("Schema created successfully")
|
print("Schema created successfully")
|
||||||
|
|
||||||
# List created tables
|
# List created tables by schema
|
||||||
from sqlalchemy import inspect
|
from sqlalchemy import inspect
|
||||||
|
|
||||||
inspector = inspect(engine)
|
inspector = inspect(engine)
|
||||||
tables = inspector.get_table_names()
|
|
||||||
print(f"Created tables: {', '.join(tables)}")
|
# Public schema tables
|
||||||
|
public_tables = inspector.get_table_names(schema="public")
|
||||||
|
if public_tables:
|
||||||
|
print(f"Public schema tables: {', '.join(public_tables)}")
|
||||||
|
|
||||||
|
# raw_toronto schema tables
|
||||||
|
toronto_tables = inspector.get_table_names(schema=RAW_TORONTO_SCHEMA)
|
||||||
|
if toronto_tables:
|
||||||
|
print(f"{RAW_TORONTO_SCHEMA} schema tables: {', '.join(toronto_tables)}")
|
||||||
|
|
||||||
return 0
|
return 0
|
||||||
|
|
||||||
|
|||||||
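The init-script change above also encodes an ordering constraint: `create_all()` will create schema-qualified tables but not the schemas themselves, so the `CREATE SCHEMA` must run first. A minimal sketch of that sequence (the connection URL is a placeholder):

    from sqlalchemy import create_engine, inspect, text

    RAW_TORONTO_SCHEMA = "raw_toronto"

    engine = create_engine("postgresql+psycopg2://localhost/portfolio")  # placeholder URL

    # 1. Ensure the schema exists; metadata.create_all() does not create schemas.
    with engine.connect() as conn:
        conn.execute(text(f"CREATE SCHEMA IF NOT EXISTS {RAW_TORONTO_SCHEMA}"))
        conn.commit()

    # 2. Schema-qualified tables can now be created, e.g.:
    # Base.metadata.create_all(engine)  # Base as defined in the models package

    # 3. Inspection is per schema, hence the split reporting in the script above.
    inspector = inspect(engine)
    print(inspector.get_table_names(schema="public"))
    print(inspector.get_table_names(schema=RAW_TORONTO_SCHEMA))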