14 Commits

Author SHA1 Message Date
d64f90b3d3 Merge branch 'feature/7-nav-theme-modernization' into development 2026-01-15 11:53:22 -05:00
b3fb94c7cb feat: Add floating sidebar navigation and dark theme support
- Add floating pill-shaped sidebar with navigation icons
- Implement dark/light theme toggle with localStorage persistence
- Update all figure factories for transparent backgrounds
- Use carto-darkmatter map style for choropleths
- Add methodology link button to Toronto dashboard header
- Add back to dashboard button on methodology page
- Remove social links from home page (now in sidebar)
- Update CLAUDE.md to Sprint 7

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 11:53:13 -05:00
1e0ea9cca2 Merge pull request 'feat: Add GeoJSON parsers and choropleth map visualization' (#26) from feature/geo-parsers-choropleth into development 2026-01-14 23:02:21 +00:00
9dfa24fb76 feat: add GeoJSON parsers and choropleth map visualization
- Add geo.py parser module with CMHCZoneParser, TRREBDistrictParser,
  and NeighbourhoodParser for loading geographic boundaries
- Add coordinate reprojection support (EPSG:3857 to WGS84)
- Organize geo data in data/toronto/raw/geo/ directory
- Add CMHC zones GeoJSON (31 zones) for rental market choropleth
- Add Toronto neighbourhoods GeoJSON (158) as purchase market proxy
- Update callbacks with real CMHC 2024 rental data
- Add sample purchase data for all 158 neighbourhoods
- Update pre-commit config to exclude geo data files

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-14 17:58:13 -05:00
8701a12b41 Merge pull request 'Upload files to "/"' (#24) from lmiranda-cmhc-zones into development
Reviewed-on: lmiranda/personal-portfolio#24
2026-01-14 21:04:24 +00:00
6ef5460ad0 Upload files to "/" 2026-01-14 21:04:06 +00:00
19ffc04573 Merge pull request 'fix: Toronto page registration for Dash Pages' (#23) from fix/toronto-page-registration into development 2026-01-12 03:19:49 +00:00
08aa61f85e fix: rename Toronto page __init__.py to dashboard.py for Dash Pages
Dash Pages does not auto-discover __init__.py files as page modules.
Renamed to dashboard.py so the page registers correctly at /toronto.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 22:08:49 -05:00
2a6db2a252 Merge pull request 'feat: Sprint 6 - Polish and deployment preparation' (#22) from feature/sprint6-polish-deploy into development 2026-01-12 02:51:14 +00:00
140d3085bf feat: Sprint 6 polish - methodology, demo data, deployment prep
- Add policy event markers to time series charts
- Create methodology page (/toronto/methodology) with data sources
- Add demo data module for testing without full pipeline
- Update README with project documentation
- Add health check endpoint (/health)
- Add database initialization script
- Export new figure factory functions

Closes #21

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 21:50:45 -05:00
ad6ee3d37f Merge pull request 'feat: Sprint 5 - Visualization' (#19) from feature/sprint5-visualization into development 2026-01-11 21:22:59 +00:00
077e426d34 feat: add Sprint 5 visualization components and Toronto dashboard
- Add figure factories: choropleth, time_series, summary_cards
- Add shared components: map_controls, time_slider, metric_card
- Create Toronto dashboard page with KPI cards, choropleth maps, and time series
- Add dashboard callbacks for interactivity
- Placeholder data for demonstration until QGIS boundaries are complete

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 16:20:01 -05:00
b7907e68e4 Merge pull request 'feat: Sprint 4 - Loaders and dbt models' (#17) from feature/sprint4-loaders-dbt into development 2026-01-11 21:08:01 +00:00
457bb49395 feat: add loaders and dbt models for Toronto housing data
Sprint 4 implementation:

Loaders:
- base.py: Session management, bulk insert, upsert utilities
- dimensions.py: Load time, district, zone, neighbourhood, policy dimensions
- trreb.py: Load TRREB purchase data to fact_purchases
- cmhc.py: Load CMHC rental data to fact_rentals

dbt Project:
- Project configuration (dbt_project.yml, packages.yml)
- Staging models for all fact and dimension tables
- Intermediate models with dimension enrichment
- Marts: purchase analysis, rental analysis, market summary

Closes #16

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 16:07:30 -05:00
52 changed files with 5845 additions and 43 deletions


@@ -7,7 +7,7 @@ repos:
       - id: check-yaml
       - id: check-added-large-files
         args: ['--maxkb=1000']
-        exclude: ^data/raw/
+        exclude: ^data/(raw/|toronto/raw/geo/)
       - id: check-merge-conflict
   - repo: https://github.com/astral-sh/ruff-pre-commit
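
Review note: the widened `exclude` pattern can be sanity-checked outside pre-commit. This is a minimal sketch that assumes the pattern is applied as a Python regular expression searched against repo-relative paths; the sample file names are hypothetical and only illustrate which directories are skipped.

```python
import re

# Pattern copied from the updated hook configuration.
EXCLUDE = re.compile(r"^data/(raw/|toronto/raw/geo/)")


def is_excluded(path: str) -> bool:
    """Return True when the large-file check would skip this path."""
    return EXCLUDE.search(path) is not None


# Hypothetical repo-relative paths, for illustration only.
samples = [
    "data/raw/report.pdf",
    "data/toronto/raw/geo/zones.geojson",
    "data/toronto/processed/summary.csv",
    "portfolio_app/app.py",
]
for sample in samples:
    status = "excluded" if is_excluded(sample) else "checked"
    print(f"{sample}: {status}")
```

Because the pattern is anchored with `^`, only paths that start at `data/` are skipped; a nested `something/data/raw/` directory would still be checked.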


@@ -6,7 +6,7 @@ Working context for Claude Code on the Analytics Portfolio project.
## Project Status
-**Current Sprint**: 1 (Project Bootstrap)
+**Current Sprint**: 7 (Navigation & Theme Modernization)
**Phase**: 1 - Toronto Housing Dashboard
**Branch**: `development` (feature branches merge here)
@@ -254,4 +254,4 @@ All scripts in `scripts/`:
---
-*Last Updated: Sprint 1*
+*Last Updated: Sprint 7*

120
README.md

@@ -1,2 +1,120 @@
-# personal-portfolio
+# Analytics Portfolio
A data analytics portfolio showcasing end-to-end data engineering, visualization, and analysis capabilities.
## Projects
### Toronto Housing Dashboard
An interactive choropleth dashboard analyzing Toronto's housing market using multi-source data integration.
**Features:**
- Purchase market analysis from TRREB monthly reports
- Rental market analysis from CMHC annual surveys
- Interactive choropleth maps by district/zone
- Time series visualization with policy event annotations
- Purchase/Rental mode toggle
**Data Sources:**
- [TRREB Market Watch](https://trreb.ca/market-data/market-watch/) - Monthly purchase statistics
- [CMHC Rental Market Survey](https://www.cmhc-schl.gc.ca/professionals/housing-markets-data-and-research/housing-data/data-tables/rental-market) - Annual rental data
**Tech Stack:**
- Python 3.11+ / Dash / Plotly
- PostgreSQL + PostGIS
- dbt for data transformation
- Pydantic for validation
- SQLAlchemy 2.0
## Quick Start
```bash
# Clone and setup
git clone https://github.com/lmiranda/personal-portfolio.git
cd personal-portfolio
# Install dependencies and configure environment
make setup
# Start database
make docker-up
# Initialize database schema
make db-init
# Run development server
make run
```
Visit `http://localhost:8050` to view the portfolio.
## Project Structure
```
portfolio_app/
├── app.py                 # Dash app factory
├── config.py              # Pydantic settings
├── pages/
│   ├── home.py            # Bio landing page (/)
│   └── toronto/           # Toronto dashboard (/toronto)
├── components/            # Shared UI components
├── figures/               # Plotly figure factories
└── toronto/               # Toronto data logic
    ├── parsers/           # PDF/CSV extraction
    ├── loaders/           # Database operations
    ├── schemas/           # Pydantic models
    └── models/            # SQLAlchemy ORM

dbt/
├── models/
│   ├── staging/           # 1:1 source tables
│   ├── intermediate/      # Business logic
│   └── marts/             # Analytical tables
```
## Development
```bash
make test # Run tests
make lint # Run linter
make format # Format code
make ci # Run all checks
```
## Data Pipeline
```
Raw Files (PDF/Excel)
Parsers (pdfplumber, pandas)
Pydantic Validation
SQLAlchemy Loaders
PostgreSQL + PostGIS
dbt Transformations
Dash Visualization
```
## Environment Variables
Copy `.env.example` to `.env` and configure:
```bash
DATABASE_URL=postgresql://user:pass@localhost:5432/portfolio
POSTGRES_USER=portfolio
POSTGRES_PASSWORD=<secure>
POSTGRES_DB=portfolio
DASH_DEBUG=true
```
## License
MIT
## Author
Leo Miranda - [GitHub](https://github.com/lmiranda) | [LinkedIn](https://linkedin.com/in/yourprofile)


File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

28
dbt/dbt_project.yml Normal file

@@ -0,0 +1,28 @@
name: 'toronto_housing'
version: '1.0.0'
config-version: 2

profile: 'toronto_housing'

model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

clean-targets:
  - "target"
  - "dbt_packages"

models:
  toronto_housing:
    staging:
      +materialized: view
      +schema: staging
    intermediate:
      +materialized: view
      +schema: intermediate
    marts:
      +materialized: table
      +schema: marts


@@ -0,0 +1,24 @@
version: 2

models:
  - name: int_purchases__monthly
    description: "Purchase data enriched with time and district dimensions"
    columns:
      - name: purchase_id
        tests:
          - unique
          - not_null
      - name: district_code
        tests:
          - not_null

  - name: int_rentals__annual
    description: "Rental data enriched with time and zone dimensions"
    columns:
      - name: rental_id
        tests:
          - unique
          - not_null
      - name: zone_code
        tests:
          - not_null


@@ -0,0 +1,62 @@
-- Intermediate: Monthly purchase data enriched with dimensions
-- Joins purchases with time and district dimensions for analysis
with purchases as (
select * from {{ ref('stg_trreb__purchases') }}
),
time_dim as (
select * from {{ ref('stg_dimensions__time') }}
),
district_dim as (
select * from {{ ref('stg_dimensions__trreb_districts') }}
),
enriched as (
select
p.purchase_id,
-- Time attributes
t.date_key,
t.full_date,
t.year,
t.month,
t.quarter,
t.month_name,
-- District attributes
d.district_key,
d.district_code,
d.district_name,
d.area_type,
-- Metrics
p.sales_count,
p.dollar_volume,
p.avg_price,
p.median_price,
p.new_listings,
p.active_listings,
p.days_on_market,
p.sale_to_list_ratio,
-- Calculated metrics
case
when p.active_listings > 0
then round(p.sales_count::numeric / p.active_listings, 3)
else null
end as absorption_rate,
case
when p.sales_count > 0
then round(p.active_listings::numeric / p.sales_count, 1)
else null
end as months_of_inventory
from purchases p
inner join time_dim t on p.date_key = t.date_key
inner join district_dim d on p.district_key = d.district_key
)
select * from enriched
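
Review note: the two calculated metrics in this model are guarded ratios. The sketch below mirrors them in plain Python to make the rounding and the zero/NULL handling explicit. It is an illustration only, not part of the PR; the helper names and the sample numbers are hypothetical, and `Decimal` with half-up rounding stands in for PostgreSQL's `round(numeric, n)`.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def _round(value: Decimal, places: int) -> Decimal:
    """Round half up to a fixed number of decimal places."""
    quantum = Decimal(1).scaleb(-places)  # places=3 gives 0.001
    return value.quantize(quantum, rounding=ROUND_HALF_UP)


def absorption_rate(
    sales_count: Optional[int], active_listings: Optional[int]
) -> Optional[Decimal]:
    """sales_count / active_listings, or None when the ratio is undefined."""
    if sales_count is None or active_listings is None or active_listings <= 0:
        return None
    return _round(Decimal(sales_count) / Decimal(active_listings), 3)


def months_of_inventory(
    sales_count: Optional[int], active_listings: Optional[int]
) -> Optional[Decimal]:
    """active_listings / sales_count, or None when the ratio is undefined."""
    if sales_count is None or active_listings is None or sales_count <= 0:
        return None
    return _round(Decimal(active_listings) / Decimal(sales_count), 1)


# Hypothetical district-month: 250 sales against 1,000 active listings.
print(absorption_rate(250, 1000))
print(months_of_inventory(250, 1000))
print(absorption_rate(10, 0))
```

The two metrics are reciprocals of each other, so a district with no sales in a month has an absorption rate of zero but no defined months of inventory.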


@@ -0,0 +1,57 @@
-- Intermediate: Annual rental data enriched with dimensions
-- Joins rentals with time and zone dimensions for analysis
with rentals as (
select * from {{ ref('stg_cmhc__rentals') }}
),
time_dim as (
select * from {{ ref('stg_dimensions__time') }}
),
zone_dim as (
select * from {{ ref('stg_dimensions__cmhc_zones') }}
),
enriched as (
select
r.rental_id,
-- Time attributes
t.date_key,
t.full_date,
t.year,
t.month,
t.quarter,
-- Zone attributes
z.zone_key,
z.zone_code,
z.zone_name,
-- Bedroom type
r.bedroom_type,
-- Metrics
r.rental_universe,
r.avg_rent,
r.median_rent,
r.vacancy_rate,
r.availability_rate,
r.turnover_rate,
r.year_over_year_rent_change,
r.reliability_code,
-- Calculated metrics
case
when r.rental_universe > 0 and r.vacancy_rate is not null
then round(r.rental_universe * (r.vacancy_rate / 100), 0)
else null
end as vacant_units_estimate
from rentals r
inner join time_dim t on r.date_key = t.date_key
inner join zone_dim z on r.zone_key = z.zone_key
)
select * from enriched


@@ -0,0 +1,23 @@
version: 2

models:
  - name: mart_toronto_purchases
    description: "Final mart for Toronto purchase/sales analysis by district and time"
    columns:
      - name: purchase_id
        description: "Unique purchase record identifier"
        tests:
          - unique
          - not_null

  - name: mart_toronto_rentals
    description: "Final mart for Toronto rental market analysis by zone and time"
    columns:
      - name: rental_id
        description: "Unique rental record identifier"
        tests:
          - unique
          - not_null

  - name: mart_toronto_market_summary
    description: "Combined market summary aggregating purchases and rentals at Toronto level"


@@ -0,0 +1,81 @@
-- Mart: Toronto Market Summary
-- Aggregated view combining purchase and rental market indicators
-- Grain: One row per year-month
with purchases_agg as (
select
year,
month,
month_name,
quarter,
-- Aggregate purchase metrics across all districts
sum(sales_count) as total_sales,
sum(dollar_volume) as total_dollar_volume,
round(avg(avg_price), 0) as avg_price_all_districts,
round(avg(median_price), 0) as median_price_all_districts,
sum(new_listings) as total_new_listings,
sum(active_listings) as total_active_listings,
round(avg(days_on_market), 0) as avg_days_on_market,
round(avg(sale_to_list_ratio), 2) as avg_sale_to_list_ratio,
round(avg(absorption_rate), 3) as avg_absorption_rate,
round(avg(months_of_inventory), 1) as avg_months_of_inventory,
round(avg(avg_price_yoy_pct), 2) as avg_price_yoy_pct
from {{ ref('mart_toronto_purchases') }}
group by year, month, month_name, quarter
),
rentals_agg as (
select
year,
-- Aggregate rental metrics across all zones (all bedroom types)
round(avg(avg_rent), 0) as avg_rent_all_zones,
round(avg(vacancy_rate), 2) as avg_vacancy_rate,
round(avg(rent_change_pct), 2) as avg_rent_change_pct,
sum(rental_universe) as total_rental_universe
from {{ ref('mart_toronto_rentals') }}
group by year
),
final as (
select
p.year,
p.month,
p.month_name,
p.quarter,
-- Purchase market indicators
p.total_sales,
p.total_dollar_volume,
p.avg_price_all_districts,
p.median_price_all_districts,
p.total_new_listings,
p.total_active_listings,
p.avg_days_on_market,
p.avg_sale_to_list_ratio,
p.avg_absorption_rate,
p.avg_months_of_inventory,
p.avg_price_yoy_pct,
-- Rental market indicators (annual, so join on year)
r.avg_rent_all_zones,
r.avg_vacancy_rate,
r.avg_rent_change_pct,
r.total_rental_universe,
-- Affordability indicator (price to rent ratio)
case
when r.avg_rent_all_zones > 0
then round(p.avg_price_all_districts / (r.avg_rent_all_zones * 12), 1)
else null
end as price_to_annual_rent_ratio
from purchases_agg p
left join rentals_agg r on p.year = r.year
)
select * from final
order by year desc, month desc
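
Review note: the summary joins monthly purchase rows to annual rental rows on `year`, so every month in a year is compared against the same rent figure, and months in a year without a rental survey get a NULL ratio. A minimal Python sketch of that behaviour follows; the function name and all figures are hypothetical.

```python
from typing import Optional


def price_to_annual_rent_ratio(
    avg_price: float, avg_monthly_rent: Optional[float]
) -> Optional[float]:
    """Average price divided by twelve months of average rent."""
    if avg_monthly_rent is not None and avg_monthly_rent > 0:
        return round(avg_price / (avg_monthly_rent * 12), 1)
    return None


# Hypothetical inputs: one annual rent figure, three monthly price rows.
rent_by_year = {2024: 2500.0}
purchase_rows = [
    (2024, 1, 1_080_000.0),
    (2024, 2, 1_140_000.0),
    (2025, 1, 1_200_000.0),  # no 2025 rental survey yet
]

for year, month, avg_price in purchase_rows:
    # dict.get() plays the role of the left join: missing years give None.
    ratio = price_to_annual_rent_ratio(avg_price, rent_by_year.get(year))
    print(year, month, ratio)
```

With these inputs the 2024 rows share the same annual rent of 30,000, and the 2025 row has no ratio until rental data for that year is loaded.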


@@ -0,0 +1,79 @@
-- Mart: Toronto Purchase Market Analysis
-- Final analytical table for purchase/sales data visualization
-- Grain: One row per district per month
with purchases as (
select * from {{ ref('int_purchases__monthly') }}
),
-- Add year-over-year calculations
with_yoy as (
select
p.*,
-- Previous year same month values
lag(p.avg_price, 12) over (
partition by p.district_code
order by p.date_key
) as avg_price_prev_year,
lag(p.sales_count, 12) over (
partition by p.district_code
order by p.date_key
) as sales_count_prev_year,
lag(p.median_price, 12) over (
partition by p.district_code
order by p.date_key
) as median_price_prev_year
from purchases p
),
final as (
select
purchase_id,
date_key,
full_date,
year,
month,
quarter,
month_name,
district_key,
district_code,
district_name,
area_type,
sales_count,
dollar_volume,
avg_price,
median_price,
new_listings,
active_listings,
days_on_market,
sale_to_list_ratio,
absorption_rate,
months_of_inventory,
-- Year-over-year changes
case
when avg_price_prev_year > 0
then round(((avg_price - avg_price_prev_year) / avg_price_prev_year) * 100, 2)
else null
end as avg_price_yoy_pct,
case
when sales_count_prev_year > 0
then round(((sales_count - sales_count_prev_year)::numeric / sales_count_prev_year) * 100, 2)
else null
end as sales_count_yoy_pct,
case
when median_price_prev_year > 0
then round(((median_price - median_price_prev_year) / median_price_prev_year) * 100, 2)
else null
end as median_price_yoy_pct
from with_yoy
)
select * from final
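
Review note: `lag(..., 12)` counts rows, not calendar months, so it returns the prior-year value only when each district has exactly one row for every month. Whether the TRREB loads are gap-free is not stated in this PR. The sketch below uses made-up data with one missing month to show how a row offset and a calendar lookup can diverge; both helper functions are hypothetical.

```python
from typing import Optional

Series = list[tuple[tuple[int, int], float]]  # ((year, month), avg_price)


def yoy_pct(current: float, previous: Optional[float]) -> Optional[float]:
    """Percentage change versus the previous value, guarded like the SQL."""
    if previous is not None and previous > 0:
        return round((current - previous) / previous * 100, 2)
    return None


def yoy_by_row_offset(series: Series, offset: int = 12) -> dict:
    """Mimic lag(value, 12): compare with the row 12 positions earlier."""
    ordered = sorted(series)
    result = {}
    for i, (key, value) in enumerate(ordered):
        previous = ordered[i - offset][1] if i >= offset else None
        result[key] = yoy_pct(value, previous)
    return result


def yoy_by_calendar_key(series: Series) -> dict:
    """Compare with the same month of the previous year, if it exists."""
    lookup = dict(series)
    return {
        (year, month): yoy_pct(value, lookup.get((year - 1, month)))
        for (year, month), value in series
    }


# Made-up prices: 2024 is 10% above 2023 in every month; June 2023 is missing.
prices_2023 = {m: 1000.0 + 100 * m for m in range(1, 13)}
series: Series = [((2023, m), p) for m, p in prices_2023.items() if m != 6]
series += [((2024, m), p * 11 / 10) for m, p in prices_2023.items()]

by_offset = yoy_by_row_offset(series)
by_calendar = yoy_by_calendar_key(series)
print("Feb 2024, row offset:     ", by_offset[(2024, 2)])
print("Feb 2024, calendar lookup:", by_calendar[(2024, 2)])
```

In this data the row-offset version compares February 2024 with January 2023, because the missing June shifts every later row by one position. If gaps are possible in the real data, a self-join on year and month, or a completeness test on the intermediate model, would avoid the shift.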


@@ -0,0 +1,64 @@
-- Mart: Toronto Rental Market Analysis
-- Final analytical table for rental market visualization
-- Grain: One row per zone per bedroom type per survey year
with rentals as (
select * from {{ ref('int_rentals__annual') }}
),
-- Add year-over-year calculations
with_yoy as (
select
r.*,
-- Previous year values
lag(r.avg_rent, 1) over (
partition by r.zone_code, r.bedroom_type
order by r.year
) as avg_rent_prev_year,
lag(r.vacancy_rate, 1) over (
partition by r.zone_code, r.bedroom_type
order by r.year
) as vacancy_rate_prev_year
from rentals r
),
final as (
select
rental_id,
date_key,
full_date,
year,
quarter,
zone_key,
zone_code,
zone_name,
bedroom_type,
rental_universe,
avg_rent,
median_rent,
vacancy_rate,
availability_rate,
turnover_rate,
year_over_year_rent_change,
reliability_code,
vacant_units_estimate,
-- Calculated year-over-year (if not provided)
coalesce(
year_over_year_rent_change,
case
when avg_rent_prev_year > 0
then round(((avg_rent - avg_rent_prev_year) / avg_rent_prev_year) * 100, 2)
else null
end
) as rent_change_pct,
vacancy_rate - vacancy_rate_prev_year as vacancy_rate_change
from with_yoy
)
select * from final
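
Review note: `rent_change_pct` prefers the change reported by the source and only falls back to a computed value when the reported one is NULL. A small Python sketch of that precedence, with hypothetical function names and sample values:

```python
from typing import Optional


def rent_change_pct(
    reported: Optional[float],
    avg_rent: Optional[float],
    avg_rent_prev_year: Optional[float],
) -> Optional[float]:
    """Return the reported change if present, else compute it from rents."""
    if reported is not None:
        return reported
    if (
        avg_rent is not None
        and avg_rent_prev_year is not None
        and avg_rent_prev_year > 0
    ):
        return round((avg_rent - avg_rent_prev_year) / avg_rent_prev_year * 100, 2)
    return None


def vacancy_rate_change(
    current: Optional[float], previous: Optional[float]
) -> Optional[float]:
    """Difference in percentage points; None if either side is missing."""
    if current is None or previous is None:
        return None
    return current - previous


print(rent_change_pct(3.5, 2000.0, 1900.0))    # reported value wins
print(rent_change_pct(None, 2100.0, 2000.0))   # computed fallback
print(rent_change_pct(None, 2100.0, None))     # first survey year
```

One consequence worth keeping in mind: when the reported and computed figures disagree, the mart silently keeps the reported one, so the column can mix two definitions across rows.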


@@ -0,0 +1,61 @@
version: 2

sources:
  - name: toronto_housing
    description: "Toronto housing data loaded from TRREB and CMHC sources"
    database: portfolio
    schema: public
    tables:
      - name: fact_purchases
        description: "TRREB monthly purchase/sales statistics by district"
        columns:
          - name: id
            description: "Primary key"
          - name: date_key
            description: "Foreign key to dim_time"
          - name: district_key
            description: "Foreign key to dim_trreb_district"

      - name: fact_rentals
        description: "CMHC annual rental survey data by zone and bedroom type"
        columns:
          - name: id
            description: "Primary key"
          - name: date_key
            description: "Foreign key to dim_time"
          - name: zone_key
            description: "Foreign key to dim_cmhc_zone"

      - name: dim_time
        description: "Time dimension (monthly grain)"
        columns:
          - name: date_key
            description: "Primary key (YYYYMMDD format)"

      - name: dim_trreb_district
        description: "TRREB district dimension with geometry"
        columns:
          - name: district_key
            description: "Primary key"
          - name: district_code
            description: "TRREB district code"

      - name: dim_cmhc_zone
        description: "CMHC zone dimension with geometry"
        columns:
          - name: zone_key
            description: "Primary key"
          - name: zone_code
            description: "CMHC zone code"

      - name: dim_neighbourhood
        description: "City of Toronto neighbourhoods (reference only)"
        columns:
          - name: neighbourhood_id
            description: "Primary key"

      - name: dim_policy_event
        description: "Housing policy events for annotation"
        columns:
          - name: event_id
            description: "Primary key"
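
Review note: `date_key` is documented as a `YYYYMMDD` key on a monthly-grain dimension. The sketch below shows one way such a key can be built and decoded. It assumes the key is stored as an integer and that each month is keyed on its first day; neither detail is confirmed by this diff, and the helper names are hypothetical.

```python
from datetime import date


def to_date_key(d: date) -> int:
    """Encode a date as a YYYYMMDD integer."""
    return d.year * 10000 + d.month * 100 + d.day


def from_date_key(key: int) -> date:
    """Decode a YYYYMMDD integer back into a date."""
    year, rest = divmod(key, 10000)
    month, day = divmod(rest, 100)
    return date(year, month, day)


def month_start_key(year: int, month: int) -> int:
    """Key for a monthly-grain row, assuming months are keyed on day 1."""
    return to_date_key(date(year, month, 1))


print(month_start_key(2024, 1))
print(from_date_key(20241201))
```

Integer keys in this form sort chronologically, which is what the `order by p.date_key` clauses in the marts rely on.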


@@ -0,0 +1,73 @@
version: 2

models:
  - name: stg_trreb__purchases
    description: "Staged TRREB purchase/sales data from fact_purchases"
    columns:
      - name: purchase_id
        description: "Unique identifier for purchase record"
        tests:
          - unique
          - not_null
      - name: date_key
        description: "Date dimension key (YYYYMMDD)"
        tests:
          - not_null
      - name: district_key
        description: "TRREB district dimension key"
        tests:
          - not_null

  - name: stg_cmhc__rentals
    description: "Staged CMHC rental market data from fact_rentals"
    columns:
      - name: rental_id
        description: "Unique identifier for rental record"
        tests:
          - unique
          - not_null
      - name: date_key
        description: "Date dimension key (YYYYMMDD)"
        tests:
          - not_null
      - name: zone_key
        description: "CMHC zone dimension key"
        tests:
          - not_null

  - name: stg_dimensions__time
    description: "Staged time dimension"
    columns:
      - name: date_key
        description: "Date dimension key (YYYYMMDD)"
        tests:
          - unique
          - not_null

  - name: stg_dimensions__trreb_districts
    description: "Staged TRREB district dimension"
    columns:
      - name: district_key
        description: "District dimension key"
        tests:
          - unique
          - not_null
      - name: district_code
        description: "TRREB district code (e.g., W01, C01)"
        tests:
          - unique
          - not_null

  - name: stg_dimensions__cmhc_zones
    description: "Staged CMHC zone dimension"
    columns:
      - name: zone_key
        description: "Zone dimension key"
        tests:
          - unique
          - not_null
      - name: zone_code
        description: "CMHC zone code"
        tests:
          - unique
          - not_null


@@ -0,0 +1,26 @@
-- Staged CMHC rental market survey data
-- Source: fact_rentals table loaded from CMHC CSV exports
-- Grain: One row per zone per bedroom type per survey year
with source as (
select * from {{ source('toronto_housing', 'fact_rentals') }}
),
staged as (
select
id as rental_id,
date_key,
zone_key,
bedroom_type,
universe as rental_universe,
avg_rent,
median_rent,
vacancy_rate,
availability_rate,
turnover_rate,
rent_change_pct as year_over_year_rent_change,
reliability_code
from source
)
select * from staged


@@ -0,0 +1,18 @@
-- Staged CMHC zone dimension
-- Source: dim_cmhc_zone table
-- Grain: One row per zone
with source as (
select * from {{ source('toronto_housing', 'dim_cmhc_zone') }}
),
staged as (
select
zone_key,
zone_code,
zone_name,
geometry
from source
)
select * from staged


@@ -0,0 +1,21 @@
-- Staged time dimension
-- Source: dim_time table
-- Grain: One row per month
with source as (
select * from {{ source('toronto_housing', 'dim_time') }}
),
staged as (
select
date_key,
full_date,
year,
month,
quarter,
month_name,
is_month_start
from source
)
select * from staged


@@ -0,0 +1,19 @@
-- Staged TRREB district dimension
-- Source: dim_trreb_district table
-- Grain: One row per district
with source as (
select * from {{ source('toronto_housing', 'dim_trreb_district') }}
),
staged as (
select
district_key,
district_code,
district_name,
area_type,
geometry
from source
)
select * from staged


@@ -0,0 +1,25 @@
-- Staged TRREB purchase/sales data
-- Source: fact_purchases table loaded from TRREB Market Watch PDFs
-- Grain: One row per district per month
with source as (
select * from {{ source('toronto_housing', 'fact_purchases') }}
),
staged as (
select
id as purchase_id,
date_key,
district_key,
sales_count,
dollar_volume,
avg_price,
median_price,
new_listings,
active_listings,
avg_dom as days_on_market,
avg_sp_lp as sale_to_list_ratio
from source
)
select * from staged

5
dbt/packages.yml Normal file

@@ -0,0 +1,5 @@
packages:
  - package: dbt-labs/dbt_utils
    version: ">=1.0.0"
  - package: calogica/dbt_expectations
    version: ">=0.10.0"

21
dbt/profiles.yml.example Normal file

@@ -0,0 +1,21 @@
toronto_housing:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      user: portfolio
      password: "{{ env_var('POSTGRES_PASSWORD') }}"
      port: 5432
      dbname: portfolio
      schema: public
      threads: 4
    prod:
      type: postgres
      host: "{{ env_var('POSTGRES_HOST') }}"
      user: "{{ env_var('POSTGRES_USER') }}"
      password: "{{ env_var('POSTGRES_PASSWORD') }}"
      port: 5432
      dbname: portfolio
      schema: public
      threads: 4


@@ -2,7 +2,9 @@
 import dash
 import dash_mantine_components as dmc
+from dash import dcc, html
+from .components import create_sidebar
 from .config import get_settings
@@ -17,14 +19,31 @@ def create_app() -> dash.Dash:
     )
     app.layout = dmc.MantineProvider(
-        dash.page_container,
+        id="mantine-provider",
+        children=[
+            dcc.Location(id="url", refresh=False),
+            dcc.Store(id="theme-store", storage_type="local", data="dark"),
+            dcc.Store(id="theme-init-dummy"),  # Dummy store for theme init callback
+            html.Div(
+                [
+                    create_sidebar(),
+                    html.Div(
+                        dash.page_container,
+                        className="page-content-wrapper",
+                    ),
+                ],
+            ),
+        ],
         theme={
             "primaryColor": "blue",
             "fontFamily": "'Inter', sans-serif",
         },
-        forceColorScheme="light",
+        defaultColorScheme="dark",
     )
+    # Import callbacks to register them
+    from . import callbacks  # noqa: F401
     return app


@@ -0,0 +1,139 @@
/* Floating sidebar navigation styles */
/* Sidebar container */
.floating-sidebar {
position: fixed;
left: 16px;
top: 50%;
transform: translateY(-50%);
width: 60px;
padding: 16px 8px;
border-radius: 32px;
z-index: 1000;
display: flex;
flex-direction: column;
align-items: center;
gap: 8px;
box-shadow: 0 4px 12px rgba(0, 0, 0, 0.15);
transition: background-color 0.2s ease;
}
/* Page content offset to prevent sidebar overlap */
.page-content-wrapper {
margin-left: 92px; /* sidebar width (60px) + left margin (16px) + gap (16px) */
min-height: 100vh;
}
/* Dark theme (default) */
[data-mantine-color-scheme="dark"] .floating-sidebar {
background-color: #141414;
}
[data-mantine-color-scheme="dark"] body {
background-color: #000000;
}
/* Light theme */
[data-mantine-color-scheme="light"] .floating-sidebar {
background-color: #f0f0f0;
}
[data-mantine-color-scheme="light"] body {
background-color: #ffffff;
}
/* Brand initials styling */
.sidebar-brand {
width: 40px;
height: 40px;
display: flex;
align-items: center;
justify-content: center;
border-radius: 50%;
background-color: var(--mantine-color-blue-filled);
margin-bottom: 4px;
transition: transform 0.2s ease;
}
.sidebar-brand:hover {
transform: scale(1.05);
}
.sidebar-brand-link {
font-weight: 700;
font-size: 16px;
color: white;
text-decoration: none;
line-height: 1;
}
/* Divider between sections */
.sidebar-divider {
width: 32px;
height: 1px;
background-color: var(--mantine-color-dimmed);
margin: 4px 0;
opacity: 0.3;
}
/* Active nav icon indicator */
.nav-icon-active {
background-color: var(--mantine-color-blue-filled) !important;
}
/* Navigation icon hover effects */
.floating-sidebar .mantine-ActionIcon-root {
transition: transform 0.15s ease, background-color 0.15s ease;
}
.floating-sidebar .mantine-ActionIcon-root:hover {
transform: scale(1.1);
}
/* Ensure links don't have underlines */
.floating-sidebar a {
text-decoration: none;
}
/* Theme toggle specific styling */
#theme-toggle {
transition: transform 0.3s ease;
}
#theme-toggle:hover {
transform: rotate(15deg) scale(1.1);
}
/* Responsive adjustments for smaller screens */
@media (max-width: 768px) {
.floating-sidebar {
left: 8px;
width: 50px;
padding: 12px 6px;
border-radius: 25px;
}
.page-content-wrapper {
margin-left: 70px;
}
.sidebar-brand {
width: 34px;
height: 34px;
}
.sidebar-brand-link {
font-size: 14px;
}
}
/* Very small screens - hide sidebar, show minimal navigation */
@media (max-width: 480px) {
.floating-sidebar {
display: none;
}
.page-content-wrapper {
margin-left: 0;
}
}


@@ -0,0 +1,5 @@
"""Application-level callbacks for the portfolio app."""
from . import theme
__all__ = ["theme"]


@@ -0,0 +1,38 @@
"""Theme toggle callbacks using clientside JavaScript."""

from dash import Input, Output, State, clientside_callback

# Toggle theme on button click
# Stores new theme value and updates the DOM attribute
clientside_callback(
    """
    function(n_clicks, currentTheme) {
        if (n_clicks === undefined || n_clicks === null) {
            return window.dash_clientside.no_update;
        }
        const newTheme = currentTheme === 'dark' ? 'light' : 'dark';
        document.documentElement.setAttribute('data-mantine-color-scheme', newTheme);
        return newTheme;
    }
    """,
    Output("theme-store", "data"),
    Input("theme-toggle", "n_clicks"),
    State("theme-store", "data"),
    prevent_initial_call=True,
)

# Initialize theme from localStorage on page load
# Uses a dummy output since we only need the side effect of setting the DOM attribute
clientside_callback(
    """
    function(theme) {
        if (theme) {
            document.documentElement.setAttribute('data-mantine-color-scheme', theme);
        }
        return theme;
    }
    """,
    Output("theme-init-dummy", "data"),
    Input("theme-store", "data"),
    prevent_initial_call=False,
)


@@ -0,0 +1,16 @@
"""Shared Dash components for the portfolio application."""

from .map_controls import create_map_controls, create_metric_selector
from .metric_card import MetricCard, create_metric_cards_row
from .sidebar import create_sidebar
from .time_slider import create_time_slider, create_year_selector

__all__ = [
    "create_map_controls",
    "create_metric_selector",
    "create_sidebar",
    "create_time_slider",
    "create_year_selector",
    "MetricCard",
    "create_metric_cards_row",
]


@@ -0,0 +1,79 @@
"""Map control components for choropleth visualizations."""

from typing import Any

import dash_mantine_components as dmc
from dash import html


def create_metric_selector(
    id_prefix: str,
    options: list[dict[str, str]],
    default_value: str | None = None,
    label: str = "Select Metric",
) -> dmc.Select:
    """Create a metric selector dropdown.

    Args:
        id_prefix: Prefix for component IDs.
        options: List of options with 'label' and 'value' keys.
        default_value: Initial selected value.
        label: Label text for the selector.

    Returns:
        Mantine Select component.
    """
    return dmc.Select(
        id=f"{id_prefix}-metric-selector",
        label=label,
        data=options,
        value=default_value or (options[0]["value"] if options else None),
        style={"width": "200px"},
    )


def create_map_controls(
    id_prefix: str,
    metric_options: list[dict[str, str]],
    default_metric: str | None = None,
    show_layer_toggle: bool = True,
) -> dmc.Paper:
    """Create a control panel for map visualizations.

    Args:
        id_prefix: Prefix for component IDs.
        metric_options: Options for metric selector.
        default_metric: Default selected metric.
        show_layer_toggle: Whether to show layer visibility toggle.

    Returns:
        Mantine Paper component containing controls.
    """
    controls: list[Any] = [
        create_metric_selector(
            id_prefix=id_prefix,
            options=metric_options,
            default_value=default_metric,
            label="Display Metric",
        ),
    ]

    if show_layer_toggle:
        controls.append(
            dmc.Switch(
                id=f"{id_prefix}-layer-toggle",
                label="Show Boundaries",
                checked=True,
                style={"marginTop": "10px"},
            )
        )

    return dmc.Paper(
        children=[
            dmc.Text("Map Controls", fw=500, size="sm", mb="xs"),
            html.Div(controls),
        ],
        p="md",
        radius="sm",
        withBorder=True,
    )


@@ -0,0 +1,115 @@
"""Metric card components for KPI display."""

from typing import Any

import dash_mantine_components as dmc
from dash import dcc

from portfolio_app.figures.summary_cards import create_metric_card_figure


class MetricCard:
    """A reusable metric card component."""

    def __init__(
        self,
        id_prefix: str,
        title: str,
        value: float | int | str = 0,
        delta: float | None = None,
        prefix: str = "",
        suffix: str = "",
        format_spec: str = ",.0f",
        positive_is_good: bool = True,
    ):
        """Initialize a metric card.

        Args:
            id_prefix: Prefix for component IDs.
            title: Card title.
            value: Main metric value.
            delta: Change value for delta indicator.
            prefix: Value prefix (e.g., '$').
            suffix: Value suffix.
            format_spec: Python format specification.
            positive_is_good: Whether positive delta is good.
        """
        self.id_prefix = id_prefix
        self.title = title
        self.value = value
        self.delta = delta
        self.prefix = prefix
        self.suffix = suffix
        self.format_spec = format_spec
        self.positive_is_good = positive_is_good

    def render(self) -> dmc.Paper:
        """Render the metric card component.

        Returns:
            Mantine Paper component with embedded graph.
        """
        fig = create_metric_card_figure(
            value=self.value,
            title=self.title,
            delta=self.delta,
            prefix=self.prefix,
            suffix=self.suffix,
            format_spec=self.format_spec,
            positive_is_good=self.positive_is_good,
        )

        return dmc.Paper(
            children=[
                dcc.Graph(
                    id=f"{self.id_prefix}-graph",
                    figure=fig,
                    config={"displayModeBar": False},
                    style={"height": "120px"},
                )
            ],
            p="xs",
            radius="sm",
            withBorder=True,
        )


def create_metric_cards_row(
    metrics: list[dict[str, Any]],
    id_prefix: str = "metric",
) -> dmc.SimpleGrid:
    """Create a row of metric cards.

    Args:
        metrics: List of metric configurations with keys:
            - title: Card title
            - value: Metric value
            - delta: Optional change value
            - prefix: Optional value prefix
            - suffix: Optional value suffix
            - format_spec: Optional format specification
            - positive_is_good: Optional delta color logic
        id_prefix: Prefix for component IDs.

    Returns:
        Mantine SimpleGrid component with metric cards.
    """
    cards = []
    for i, metric in enumerate(metrics):
        card = MetricCard(
            id_prefix=f"{id_prefix}-{i}",
            title=metric.get("title", ""),
            value=metric.get("value", 0),
            delta=metric.get("delta"),
            prefix=metric.get("prefix", ""),
            suffix=metric.get("suffix", ""),
            format_spec=metric.get("format_spec", ",.0f"),
            positive_is_good=metric.get("positive_is_good", True),
        )
        cards.append(card.render())

    return dmc.SimpleGrid(
        cols={"base": 1, "sm": 2, "md": len(cards)},
        spacing="md",
        children=cards,
    )


@@ -0,0 +1,179 @@
"""Floating sidebar navigation component."""

import dash_mantine_components as dmc
from dash import dcc, html
from dash_iconify import DashIconify

# Navigation items configuration
NAV_ITEMS = [
    {"path": "/", "icon": "tabler:home", "label": "Home"},
    {"path": "/toronto", "icon": "tabler:map-2", "label": "Toronto Housing"},
]

# External links configuration
EXTERNAL_LINKS = [
    {
        "url": "https://github.com/leomiranda",
        "icon": "tabler:brand-github",
        "label": "GitHub",
    },
    {
        "url": "https://linkedin.com/in/leobmiranda",
        "icon": "tabler:brand-linkedin",
        "label": "LinkedIn",
    },
]


def create_brand_logo() -> html.Div:
    """Create the brand initials logo."""
    return html.Div(
        dcc.Link(
            "LM",
            href="/",
            className="sidebar-brand-link",
        ),
        className="sidebar-brand",
    )


def create_nav_icon(
    icon: str,
    label: str,
    path: str,
    current_path: str,
) -> dmc.Tooltip:
    """Create a navigation icon with tooltip.

    Args:
        icon: Iconify icon string.
        label: Tooltip label.
        path: Navigation path.
        current_path: Current page path for active state.

    Returns:
        Tooltip-wrapped navigation icon.
    """
    is_active = current_path == path or (path != "/" and current_path.startswith(path))

    return dmc.Tooltip(
        dcc.Link(
            dmc.ActionIcon(
                DashIconify(icon=icon, width=20),
                variant="subtle" if not is_active else "filled",
                size="lg",
                radius="xl",
                color="blue" if is_active else "gray",
                className="nav-icon-active" if is_active else "",
            ),
            href=path,
        ),
        label=label,
        position="right",
        withArrow=True,
    )


def create_theme_toggle(current_theme: str = "dark") -> dmc.Tooltip:
    """Create the theme toggle button.

    Args:
        current_theme: Current theme ('dark' or 'light').

    Returns:
        Tooltip-wrapped theme toggle icon.
    """
    icon = "tabler:sun" if current_theme == "dark" else "tabler:moon"
    label = "Switch to light mode" if current_theme == "dark" else "Switch to dark mode"

    return dmc.Tooltip(
        dmc.ActionIcon(
            DashIconify(icon=icon, width=20, id="theme-toggle-icon"),
            id="theme-toggle",
            variant="subtle",
            size="lg",
            radius="xl",
            color="gray",
        ),
        label=label,
        position="right",
        withArrow=True,
    )


def create_external_link(url: str, icon: str, label: str) -> dmc.Tooltip:
    """Create an external link icon with tooltip.

    Args:
        url: External URL.
        icon: Iconify icon string.
        label: Tooltip label.

    Returns:
        Tooltip-wrapped external link icon.
    """
    return dmc.Tooltip(
        dmc.Anchor(
            dmc.ActionIcon(
                DashIconify(icon=icon, width=20),
                variant="subtle",
                size="lg",
                radius="xl",
                color="gray",
            ),
            href=url,
            target="_blank",
        ),
        label=label,
        position="right",
        withArrow=True,
    )


def create_sidebar_divider() -> html.Div:
    """Create a horizontal divider for the sidebar."""
    return html.Div(className="sidebar-divider")


def create_sidebar(current_path: str = "/", current_theme: str = "dark") -> html.Div:
    """Create the floating sidebar navigation.

    Args:
        current_path: Current page path for active state highlighting.
current_theme: Current theme for toggle icon state.
Returns:
Complete sidebar component.
"""
return html.Div(
[
# Brand logo
create_brand_logo(),
create_sidebar_divider(),
# Navigation icons
*[
create_nav_icon(
icon=item["icon"],
label=item["label"],
path=item["path"],
current_path=current_path,
)
for item in NAV_ITEMS
],
create_sidebar_divider(),
# Theme toggle
create_theme_toggle(current_theme),
create_sidebar_divider(),
# External links
*[
create_external_link(
url=link["url"],
icon=link["icon"],
label=link["label"],
)
for link in EXTERNAL_LINKS
],
],
className="floating-sidebar",
id="floating-sidebar",
)
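The active-state check in `create_nav_icon` is a plain prefix match, which can be explored in isolation. This is a sketch under assumptions: `is_active_strict` is a hypothetical alternative that is not in the source, and `/toronto-archive` is an invented route used only to show where the two checks differ.

```python
def is_active(path: str, current_path: str) -> bool:
    """Same expression as the check in create_nav_icon."""
    return current_path == path or (path != "/" and current_path.startswith(path))


def is_active_strict(path: str, current_path: str) -> bool:
    """Hypothetical variant: match only on whole path segments."""
    if path == "/":
        return current_path == "/"
    return current_path == path or current_path.startswith(path.rstrip("/") + "/")


# Sub-pages keep the parent icon highlighted.
print(is_active("/toronto", "/toronto/methodology"))
# The home icon is active only on the exact root path.
print(is_active("/", "/toronto"))
# A sibling route sharing the prefix also matches the original check...
print(is_active("/toronto", "/toronto-archive"))
# ...but not the segment-aware variant.
print(is_active_strict("/toronto", "/toronto-archive"))
```

With the two routes currently listed in `NAV_ITEMS` the difference never shows up; it would only matter if a future route shared a prefix with an existing one.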

@@ -0,0 +1,135 @@
"""Time selection components for temporal data filtering."""
from datetime import date
import dash_mantine_components as dmc
def create_year_selector(
id_prefix: str,
min_year: int = 2020,
max_year: int | None = None,
default_year: int | None = None,
label: str = "Select Year",
) -> dmc.Select:
"""Create a year selector dropdown.
Args:
id_prefix: Prefix for component IDs.
min_year: Minimum year option.
max_year: Maximum year option (defaults to current year).
default_year: Initial selected year.
label: Label text for the selector.
Returns:
Mantine Select component.
"""
if max_year is None:
max_year = date.today().year
if default_year is None:
default_year = max_year
years = list(range(max_year, min_year - 1, -1))
options = [{"label": str(year), "value": str(year)} for year in years]
return dmc.Select(
id=f"{id_prefix}-year-selector",
label=label,
data=options,
value=str(default_year),
style={"width": "120px"},
)
def create_time_slider(
id_prefix: str,
min_year: int = 2020,
max_year: int | None = None,
default_range: tuple[int, int] | None = None,
label: str = "Time Range",
) -> dmc.Paper:
"""Create a time range slider component.
Args:
id_prefix: Prefix for component IDs.
min_year: Minimum year for the slider.
max_year: Maximum year for the slider.
default_range: Default (start, end) year range.
label: Label text for the slider.
Returns:
Mantine Paper component containing the slider.
"""
if max_year is None:
max_year = date.today().year
if default_range is None:
default_range = (min_year, max_year)
# Create marks for every year
marks = [
{"value": year, "label": str(year)} for year in range(min_year, max_year + 1)
]
return dmc.Paper(
children=[
dmc.Text(label, fw=500, size="sm", mb="xs"),
dmc.RangeSlider(
id=f"{id_prefix}-time-slider",
min=min_year,
max=max_year,
value=list(default_range),
marks=marks,
step=1,
minRange=1,
style={"marginTop": "20px", "marginBottom": "10px"},
),
],
p="md",
radius="sm",
withBorder=True,
)
def create_month_selector(
id_prefix: str,
default_month: int | None = None,
label: str = "Select Month",
) -> dmc.Select:
"""Create a month selector dropdown.
Args:
id_prefix: Prefix for component IDs.
default_month: Initial selected month (1-12).
label: Label text for the selector.
Returns:
Mantine Select component.
"""
months = [
"January",
"February",
"March",
"April",
"May",
"June",
"July",
"August",
"September",
"October",
"November",
"December",
]
options = [{"label": month, "value": str(i + 1)} for i, month in enumerate(months)]
if default_month is None:
default_month = date.today().month
return dmc.Select(
id=f"{id_prefix}-month-selector",
label=label,
data=options,
value=str(default_month),
style={"width": "140px"},
)
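The option and mark lists built by the selectors above are ordinary Python lists, so they can be checked without rendering any component. A minimal sketch, assuming helper names (`build_year_options`, `build_slider_marks`) that do not exist in the source:

```python
from __future__ import annotations

from datetime import date


def build_year_options(
    min_year: int = 2020, max_year: int | None = None
) -> list[dict[str, str]]:
    """Newest-first options, as produced inside create_year_selector."""
    if max_year is None:
        max_year = date.today().year
    return [
        {"label": str(year), "value": str(year)}
        for year in range(max_year, min_year - 1, -1)
    ]


def build_slider_marks(min_year: int, max_year: int) -> list[dict[str, object]]:
    """One mark per year, as produced inside create_time_slider."""
    return [{"value": year, "label": str(year)} for year in range(min_year, max_year + 1)]


options = build_year_options(2020, 2024)
print([option["value"] for option in options])
print(build_slider_marks(2020, 2022))
```

Note that the select values are strings while the slider values are integers, so a callback reading the year selector would likely need to convert with `int(...)` before comparing against slider bounds.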

@@ -0,0 +1,31 @@
"""Plotly figure factories for data visualization."""
from .choropleth import (
create_choropleth_figure,
create_district_map,
create_zone_map,
)
from .summary_cards import create_metric_card_figure, create_summary_metrics
from .time_series import (
add_policy_markers,
create_market_comparison_chart,
create_price_time_series,
create_time_series_with_events,
create_volume_time_series,
)
__all__ = [
# Choropleth
"create_choropleth_figure",
"create_district_map",
"create_zone_map",
# Time series
"create_price_time_series",
"create_volume_time_series",
"create_market_comparison_chart",
"create_time_series_with_events",
"add_policy_markers",
# Summary
"create_metric_card_figure",
"create_summary_metrics",
]

@@ -0,0 +1,171 @@
"""Choropleth map figure factory for Toronto housing data."""
from typing import Any
import plotly.express as px
import plotly.graph_objects as go
def create_choropleth_figure(
geojson: dict[str, Any] | None,
data: list[dict[str, Any]],
location_key: str,
color_column: str,
hover_data: list[str] | None = None,
color_scale: str = "Blues",
title: str | None = None,
map_style: str = "carto-positron",
center: dict[str, float] | None = None,
zoom: float = 9.5,
) -> go.Figure:
"""Create a choropleth map figure.
Args:
geojson: GeoJSON FeatureCollection for boundaries.
data: List of data records with location keys and values.
location_key: Column name for location identifier.
color_column: Column name for color values.
hover_data: Additional columns to show on hover.
color_scale: Plotly color scale name.
title: Optional chart title.
map_style: Mapbox style (carto-positron, open-street-map, etc.).
center: Map center coordinates {"lat": float, "lon": float}.
zoom: Initial zoom level.
Returns:
Plotly Figure object.
"""
# Default center to Toronto
if center is None:
center = {"lat": 43.7, "lon": -79.4}
# Use dark-mode friendly map style by default
if map_style == "carto-positron":
map_style = "carto-darkmatter"
# If no geojson provided, create a placeholder map
if geojson is None or not data:
fig = go.Figure(go.Scattermapbox())
fig.update_layout(
mapbox={
"style": map_style,
"center": center,
"zoom": zoom,
},
margin={"l": 0, "r": 0, "t": 40, "b": 0},
title=title or "Toronto Housing Map",
height=500,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
)
fig.add_annotation(
text="No geometry data available. Complete QGIS digitization to enable map.",
xref="paper",
yref="paper",
x=0.5,
y=0.5,
showarrow=False,
font={"size": 14, "color": "#888888"},
)
return fig
# Create choropleth with data
import pandas as pd
df = pd.DataFrame(data)
# map_style was already switched to the dark-mode variant above
fig = px.choropleth_mapbox(
df,
geojson=geojson,
locations=location_key,
featureidkey=f"properties.{location_key}",
color=color_column,
color_continuous_scale=color_scale,
hover_data=hover_data,
mapbox_style=map_style,
center=center,
zoom=zoom,
opacity=0.7,
)
fig.update_layout(
margin={"l": 0, "r": 0, "t": 40, "b": 0},
title=title,
height=500,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
coloraxis_colorbar={
"title": {
"text": color_column.replace("_", " ").title(),
"font": {"color": "#c9c9c9"},
},
"thickness": 15,
"len": 0.7,
"tickfont": {"color": "#c9c9c9"},
},
)
return fig
def create_district_map(
districts_geojson: dict[str, Any] | None,
purchase_data: list[dict[str, Any]],
metric: str = "avg_price",
) -> go.Figure:
"""Create choropleth map for TRREB districts.
Args:
districts_geojson: GeoJSON for TRREB district boundaries.
purchase_data: Purchase statistics by district.
metric: Metric to display (avg_price, sales_count, etc.).
Returns:
Plotly Figure object.
"""
hover_columns = ["district_name", "sales_count", "avg_price", "median_price"]
return create_choropleth_figure(
geojson=districts_geojson,
data=purchase_data,
location_key="district_code",
color_column=metric,
hover_data=[c for c in hover_columns if c != metric],
color_scale="Blues" if "price" in metric else "Greens",
title="Toronto Purchase Market by District",
)
def create_zone_map(
zones_geojson: dict[str, Any] | None,
rental_data: list[dict[str, Any]],
metric: str = "avg_rent",
) -> go.Figure:
"""Create choropleth map for CMHC zones.
Args:
zones_geojson: GeoJSON for CMHC zone boundaries.
rental_data: Rental statistics by zone.
metric: Metric to display (avg_rent, vacancy_rate, etc.).
Returns:
Plotly Figure object.
"""
hover_columns = ["zone_name", "avg_rent", "vacancy_rate", "rental_universe"]
return create_choropleth_figure(
geojson=zones_geojson,
data=rental_data,
location_key="zone_code",
color_column=metric,
hover_data=[c for c in hover_columns if c != metric],
color_scale="Oranges" if "rent" in metric else "Purples",
title="Toronto Rental Market by Zone",
)
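Because `create_choropleth_figure` passes `featureidkey=f"properties.{location_key}"`, a record is only drawn when its `location_key` value equals that property on some GeoJSON feature. The sketch below is one way to spot mismatches before plotting; `unmatched_locations` is a hypothetical helper and the zone codes are invented sample values, not real CMHC data.

```python
from typing import Any


def unmatched_locations(
    geojson: dict[str, Any],
    data: list[dict[str, Any]],
    location_key: str,
) -> tuple[set[Any], set[Any]]:
    """Return (ids only in data, ids only in geojson) for the join key."""
    feature_ids = {
        feature.get("properties", {}).get(location_key)
        for feature in geojson.get("features", [])
    }
    data_ids = {row.get(location_key) for row in data}
    # Missing keys show up as None; they can never match, so drop them.
    feature_ids.discard(None)
    data_ids.discard(None)
    return data_ids - feature_ids, feature_ids - data_ids


# Invented sample inputs for illustration only.
sample_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"zone_code": "Z01"}, "geometry": None},
        {"type": "Feature", "properties": {"zone_code": "Z02"}, "geometry": None},
    ],
}
sample_data = [
    {"zone_code": "Z01", "avg_rent": 2100},
    {"zone_code": "Z03", "avg_rent": 1950},
]

only_in_data, only_in_geojson = unmatched_locations(
    sample_geojson, sample_data, "zone_code"
)
print(sorted(only_in_data), sorted(only_in_geojson))
```

Records listed in the first set would be silently omitted from the map, and features in the second set would render without colour, so a check like this can be useful while the boundary files are still being assembled.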

@@ -0,0 +1,107 @@
"""Summary card figure factories for KPI display."""
from typing import Any
import plotly.graph_objects as go
def create_metric_card_figure(
value: float | int | str,
title: str,
delta: float | None = None,
delta_suffix: str = "%",
prefix: str = "",
suffix: str = "",
format_spec: str = ",.0f",
positive_is_good: bool = True,
) -> go.Figure:
"""Create a KPI indicator figure.
Args:
value: The main metric value.
title: Card title.
delta: Optional change value (for delta indicator).
delta_suffix: Suffix for delta value (e.g., '%').
prefix: Prefix for main value (e.g., '$').
suffix: Suffix for main value.
format_spec: Python format specification for the value.
positive_is_good: Whether positive delta is good (green) or bad (red).
Returns:
Plotly Figure object.
"""
# Determine numeric value for indicator
if isinstance(value, int | float):
number_value: float | None = float(value)
else:
number_value = None
fig = go.Figure()
# Add indicator trace
indicator_config: dict[str, Any] = {
"mode": "number",
"value": number_value if number_value is not None else 0,
"title": {"text": title, "font": {"size": 14}},
"number": {
"font": {"size": 32},
"prefix": prefix,
"suffix": suffix,
"valueformat": format_spec,
},
}
# Add delta if provided
if delta is not None:
indicator_config["mode"] = "number+delta"
indicator_config["delta"] = {
"reference": number_value - delta if number_value is not None else 0,
"relative": False,
"valueformat": ".1f",
"suffix": delta_suffix,
"increasing": {"color": "green" if positive_is_good else "red"},
"decreasing": {"color": "red" if positive_is_good else "green"},
}
fig.add_trace(go.Indicator(**indicator_config))
fig.update_layout(
height=120,
margin={"l": 20, "r": 20, "t": 40, "b": 20},
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font={"family": "Inter, sans-serif", "color": "#c9c9c9"},
)
return fig
def create_summary_metrics(
metrics: dict[str, dict[str, Any]],
) -> list[go.Figure]:
"""Create multiple metric card figures.
Args:
metrics: Dictionary of metric configurations.
Key: metric name
Value: dict with 'value', 'title', 'delta' (optional), etc.
Returns:
List of Plotly Figure objects.
"""
figures = []
for metric_config in metrics.values():
fig = create_metric_card_figure(
value=metric_config.get("value", 0),
title=metric_config.get("title", ""),
delta=metric_config.get("delta"),
delta_suffix=metric_config.get("delta_suffix", "%"),
prefix=metric_config.get("prefix", ""),
suffix=metric_config.get("suffix", ""),
format_spec=metric_config.get("format_spec", ",.0f"),
positive_is_good=metric_config.get("positive_is_good", True),
)
figures.append(fig)
return figures
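The delta handling in `create_metric_card_figure` relies on simple arithmetic: a Plotly indicator displays `value - reference`, so setting the reference to `value - delta` makes the displayed change equal the supplied `delta`. A minimal sketch of that arithmetic, using hypothetical helper names and an explicit `None` check so that a value of zero is still treated as numeric:

```python
from __future__ import annotations


def delta_reference(value: float | None, delta: float) -> float:
    """Reference chosen so that value - reference == delta."""
    if value is None:
        # Non-numeric card values fall back to a zero reference.
        return 0.0
    return value - delta


def displayed_delta(value: float, reference: float) -> float:
    """What a non-relative indicator delta would show."""
    return value - reference


reference = delta_reference(18, 3)
print(reference, displayed_delta(18, reference))

# A value of zero is still numeric and keeps its delta.
zero_reference = delta_reference(0.0, 2.5)
print(zero_reference, displayed_delta(0.0, zero_reference))
```

Since `relative` is set to `False`, the number shown is an absolute difference; the `%` suffix is appended as text and does not change the computation.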

@@ -0,0 +1,386 @@
"""Time series figure factories for Toronto housing data."""
from typing import Any
import plotly.express as px
import plotly.graph_objects as go
def create_price_time_series(
data: list[dict[str, Any]],
date_column: str = "full_date",
price_column: str = "avg_price",
group_column: str | None = None,
title: str = "Average Price Over Time",
show_yoy: bool = True,
) -> go.Figure:
"""Create a time series chart for price data.
Args:
data: List of records with date and price columns.
date_column: Column name for dates.
price_column: Column name for price values.
group_column: Optional column for grouping (e.g., district_code).
title: Chart title.
show_yoy: Whether to show year-over-year change annotations (accepted but not yet used by this function).
Returns:
Plotly Figure object.
"""
import pandas as pd
if not data:
fig = go.Figure()
fig.add_annotation(
text="No data available",
xref="paper",
yref="paper",
x=0.5,
y=0.5,
showarrow=False,
font={"color": "#888888"},
)
fig.update_layout(
title=title,
height=350,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
)
return fig
df = pd.DataFrame(data)
df[date_column] = pd.to_datetime(df[date_column])
if group_column and group_column in df.columns:
fig = px.line(
df,
x=date_column,
y=price_column,
color=group_column,
title=title,
)
else:
fig = px.line(
df,
x=date_column,
y=price_column,
title=title,
)
fig.update_layout(
height=350,
margin={"l": 40, "r": 20, "t": 50, "b": 40},
xaxis_title="Date",
yaxis_title=price_column.replace("_", " ").title(),
yaxis_tickprefix="$",
yaxis_tickformat=",",
hovermode="x unified",
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={"gridcolor": "#333333", "linecolor": "#444444"},
yaxis={"gridcolor": "#333333", "linecolor": "#444444"},
)
return fig
def create_volume_time_series(
data: list[dict[str, Any]],
date_column: str = "full_date",
volume_column: str = "sales_count",
group_column: str | None = None,
title: str = "Sales Volume Over Time",
chart_type: str = "bar",
) -> go.Figure:
"""Create a time series chart for volume/count data.
Args:
data: List of records with date and volume columns.
date_column: Column name for dates.
volume_column: Column name for volume values.
group_column: Optional column for grouping.
title: Chart title.
chart_type: 'bar' or 'line'.
Returns:
Plotly Figure object.
"""
import pandas as pd
if not data:
fig = go.Figure()
fig.add_annotation(
text="No data available",
xref="paper",
yref="paper",
x=0.5,
y=0.5,
showarrow=False,
font={"color": "#888888"},
)
fig.update_layout(
title=title,
height=350,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
)
return fig
df = pd.DataFrame(data)
df[date_column] = pd.to_datetime(df[date_column])
if chart_type == "bar":
if group_column and group_column in df.columns:
fig = px.bar(
df,
x=date_column,
y=volume_column,
color=group_column,
title=title,
)
else:
fig = px.bar(
df,
x=date_column,
y=volume_column,
title=title,
)
else:
if group_column and group_column in df.columns:
fig = px.line(
df,
x=date_column,
y=volume_column,
color=group_column,
title=title,
)
else:
fig = px.line(
df,
x=date_column,
y=volume_column,
title=title,
)
fig.update_layout(
height=350,
margin={"l": 40, "r": 20, "t": 50, "b": 40},
xaxis_title="Date",
yaxis_title=volume_column.replace("_", " ").title(),
yaxis_tickformat=",",
hovermode="x unified",
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={"gridcolor": "#333333", "linecolor": "#444444"},
yaxis={"gridcolor": "#333333", "linecolor": "#444444"},
)
return fig
def create_market_comparison_chart(
data: list[dict[str, Any]],
date_column: str = "full_date",
metrics: list[str] | None = None,
title: str = "Market Indicators",
) -> go.Figure:
"""Create a multi-metric comparison chart.
Args:
data: List of records with date and metric columns.
date_column: Column name for dates.
metrics: List of metric columns to display.
title: Chart title.
Returns:
Plotly Figure object with secondary y-axis.
"""
import pandas as pd
from plotly.subplots import make_subplots
if not data:
fig = go.Figure()
fig.add_annotation(
text="No data available",
xref="paper",
yref="paper",
x=0.5,
y=0.5,
showarrow=False,
font={"color": "#888888"},
)
fig.update_layout(
title=title,
height=400,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
)
return fig
if metrics is None:
metrics = ["avg_price", "sales_count"]
df = pd.DataFrame(data)
df[date_column] = pd.to_datetime(df[date_column])
fig = make_subplots(specs=[[{"secondary_y": True}]])
colors = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]
for i, metric in enumerate(metrics[:4]):
if metric not in df.columns:
continue
secondary = i > 0
fig.add_trace(
go.Scatter(
x=df[date_column],
y=df[metric],
name=metric.replace("_", " ").title(),
line={"color": colors[i % len(colors)]},
),
secondary_y=secondary,
)
fig.update_layout(
title=title,
height=400,
margin={"l": 40, "r": 40, "t": 50, "b": 40},
hovermode="x unified",
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={"gridcolor": "#333333", "linecolor": "#444444"},
yaxis={"gridcolor": "#333333", "linecolor": "#444444"},
legend={
"orientation": "h",
"yanchor": "bottom",
"y": 1.02,
"xanchor": "right",
"x": 1,
"font": {"color": "#c9c9c9"},
},
)
return fig
def add_policy_markers(
fig: go.Figure,
policy_events: list[dict[str, Any]],
date_column: str = "event_date",
y_position: float | None = None,
) -> go.Figure:
"""Add policy event markers to an existing time series figure.
Args:
fig: Existing Plotly figure to add markers to.
policy_events: List of policy event dicts with date and metadata.
date_column: Column name for event dates.
y_position: Y position for markers. If None, no marker point is plotted and only the vertical line is drawn.
Returns:
Updated Plotly Figure object with policy markers.
"""
if not policy_events:
return fig
# Color mapping for policy categories
category_colors = {
"monetary": "#1f77b4", # Blue
"tax": "#2ca02c", # Green
"regulatory": "#ff7f0e", # Orange
"supply": "#9467bd", # Purple
"economic": "#d62728", # Red
}
# Symbol mapping for expected direction
direction_symbols = {
"bullish": "triangle-up",
"bearish": "triangle-down",
"neutral": "circle",
}
for event in policy_events:
event_date = event.get(date_column)
category = event.get("category", "economic")
direction = event.get("expected_direction", "neutral")
title = event.get("title", "Policy Event")
level = event.get("level", "federal")
color = category_colors.get(category, "#666666")
symbol = direction_symbols.get(direction, "circle")
# Add vertical line for the event
fig.add_vline(
x=event_date,
line_dash="dot",
line_color=color,
opacity=0.5,
annotation_text="",
)
# Add marker with hover info
fig.add_trace(
go.Scatter(
x=[event_date],
y=[y_position] if y_position is not None else [None], # type: ignore[list-item]
mode="markers",
marker={
"symbol": symbol,
"size": 12,
"color": color,
"line": {"width": 1, "color": "white"},
},
name=title,
hovertemplate=(
f"<b>{title}</b><br>"
f"Date: %{{x}}<br>"
f"Level: {level.title()}<br>"
f"Category: {category.title()}<br>"
f"<extra></extra>"
),
showlegend=False,
)
)
return fig
def create_time_series_with_events(
data: list[dict[str, Any]],
policy_events: list[dict[str, Any]],
date_column: str = "full_date",
value_column: str = "avg_price",
title: str = "Price Trend with Policy Events",
) -> go.Figure:
"""Create a time series chart with policy event markers.
Args:
data: Time series data.
policy_events: Policy events to overlay.
date_column: Column name for dates.
value_column: Column name for values.
title: Chart title.
Returns:
Plotly Figure with time series and policy markers.
"""
# Create base time series
fig = create_price_time_series(
data=data,
date_column=date_column,
price_column=value_column,
title=title,
)
# Overlay policy events (no y_position is passed, so only the vertical lines are drawn)
if policy_events:
fig = add_policy_markers(fig, policy_events)
return fig
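The axis assignment in `create_market_comparison_chart` can be reasoned about without Plotly: the position of a metric in the list, not its availability, decides whether it lands on the primary or secondary y-axis. The sketch below reproduces that loop in a hypothetical helper (`assign_axes` is not part of the source).

```python
def assign_axes(metrics: list[str], columns: set[str]) -> dict[str, str]:
    """Mirror the loop: index 0 goes to the primary axis, the rest to secondary."""
    axes: dict[str, str] = {}
    for i, metric in enumerate(metrics[:4]):
        if metric not in columns:
            continue  # missing columns are skipped but still consume an index
        axes[metric] = "secondary" if i > 0 else "primary"
    return axes


# Both default metrics present: one per axis.
print(assign_axes(["avg_price", "sales_count"], {"full_date", "avg_price", "sales_count"}))

# First metric missing: the remaining trace is still placed on the secondary axis.
print(assign_axes(["avg_price", "sales_count"], {"full_date", "sales_count"}))
```

In the second case the primary axis would stay empty, which may or may not be the intended behaviour; numbering only the metrics that are actually present would be one way to avoid it.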

@@ -0,0 +1,20 @@
"""Health check endpoint for deployment monitoring."""
import dash
from dash import html
dash.register_page(
__name__,
path="/health",
title="Health Check",
)
def layout() -> html.Div:
"""Return simple health check response."""
return html.Div(
[
html.Pre("status: ok"),
],
id="health-check",
)
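A deployment monitor could poll this page with a few lines of standard-library code. The sketch below is an assumption-laden illustration, not part of the project: `probe` and `interpret_status` are hypothetical names, and the check looks only at the HTTP status code, because Dash typically renders page layouts in the browser and a plain HTTP client may therefore see the application shell rather than the `status: ok` text.

```python
from urllib.request import urlopen


def interpret_status(status_code: int) -> bool:
    """Treat any 2xx response as healthy."""
    return 200 <= status_code < 300


def probe(url: str, timeout: float = 2.0) -> bool:
    """Return True when the URL answers with a 2xx status within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return interpret_status(response.status)
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


print(interpret_status(200), interpret_status(503))
```

A call such as `probe("http://localhost:8050/health")` would exercise the endpoint, assuming the app is served locally on Dash's default port; adjust the URL for the actual deployment.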

@@ -2,7 +2,6 @@
import dash
import dash_mantine_components as dmc
from dash_iconify import DashIconify
dash.register_page(__name__, path="/", name="Home")
@@ -52,19 +51,6 @@ PROJECTS = [
},
]
SOCIAL_LINKS = [
{
"platform": "LinkedIn",
"url": "https://linkedin.com/in/leobmiranda",
"icon": "mdi:linkedin",
},
{
"platform": "GitHub",
"url": "https://github.com/leomiranda",
"icon": "mdi:github",
},
]
AVAILABILITY = "Open to Senior Data Analyst, Analytics Engineer, and BI Developer opportunities in Toronto or remote."
@@ -160,27 +146,6 @@ def create_projects_section() -> dmc.Paper:
)
def create_social_links() -> dmc.Group:
"""Create social media links."""
return dmc.Group(
[
dmc.Anchor(
dmc.Button(
link["platform"],
leftSection=DashIconify(icon=link["icon"], width=20),
variant="outline",
size="md",
),
href=link["url"],
target="_blank",
)
for link in SOCIAL_LINKS
],
justify="center",
gap="md",
)
def create_availability_section() -> dmc.Text:
"""Create the availability statement."""
return dmc.Text(AVAILABILITY, size="sm", c="dimmed", ta="center", fs="italic")
@@ -193,7 +158,6 @@ layout = dmc.Container(
create_summary_section(),
create_tech_stack_section(),
create_projects_section(),
create_social_links(),
dmc.Divider(my="lg"),
create_availability_section(),
dmc.Space(h=40),

@@ -1 +1 @@
"""Toronto Housing Dashboard page."""
"""Toronto Housing Dashboard pages."""

File diff suppressed because it is too large.

@@ -0,0 +1,294 @@
"""Toronto Housing Dashboard page."""
import dash
import dash_mantine_components as dmc
from dash import dcc, html
from dash_iconify import DashIconify
from portfolio_app.components import (
create_map_controls,
create_metric_cards_row,
create_time_slider,
create_year_selector,
)
dash.register_page(__name__, path="/toronto", name="Toronto Housing")
# Metric options for the purchase market
PURCHASE_METRIC_OPTIONS = [
{"label": "Average Price", "value": "avg_price"},
{"label": "Median Price", "value": "median_price"},
{"label": "Sales Volume", "value": "sales_count"},
{"label": "Days on Market", "value": "avg_dom"},
]
# Metric options for the rental market
RENTAL_METRIC_OPTIONS = [
{"label": "Average Rent", "value": "avg_rent"},
{"label": "Vacancy Rate", "value": "vacancy_rate"},
{"label": "Rental Universe", "value": "rental_universe"},
]
# Sample metrics for KPI cards (will be populated by callbacks)
SAMPLE_METRICS = [
{
"title": "Avg. Price",
"value": 1125000,
"delta": 2.3,
"prefix": "$",
"format_spec": ",.0f",
},
{
"title": "Sales Volume",
"value": 4850,
"delta": -5.1,
"format_spec": ",",
},
{
"title": "Avg. DOM",
"value": 18,
"delta": 3,
"suffix": " days",
"positive_is_good": False,
},
{
"title": "Avg. Rent",
"value": 2450,
"delta": 4.2,
"prefix": "$",
"format_spec": ",.0f",
},
]
def create_header() -> dmc.Group:
"""Create the dashboard header with title and controls."""
return dmc.Group(
[
dmc.Stack(
[
dmc.Title("Toronto Housing Dashboard", order=1),
dmc.Text(
"Real estate market analysis for the Greater Toronto Area",
c="dimmed",
),
],
gap="xs",
),
dmc.Group(
[
dcc.Link(
dmc.Button(
"Methodology",
leftSection=DashIconify(
icon="tabler:info-circle", width=18
),
variant="subtle",
color="gray",
),
href="/toronto/methodology",
),
create_year_selector(
id_prefix="toronto",
min_year=2020,
default_year=2024,
label="Year",
),
],
gap="md",
),
],
justify="space-between",
align="flex-start",
)
def create_kpi_section() -> dmc.Box:
"""Create the KPI metrics row."""
return dmc.Box(
children=[
dmc.Title("Key Metrics", order=3, size="h4", mb="sm"),
html.Div(
id="toronto-kpi-cards",
children=[
create_metric_cards_row(SAMPLE_METRICS, id_prefix="toronto-kpi")
],
),
],
)
def create_purchase_map_section() -> dmc.Grid:
"""Create the purchase market choropleth section."""
return dmc.Grid(
[
dmc.GridCol(
create_map_controls(
id_prefix="purchase-map",
metric_options=PURCHASE_METRIC_OPTIONS,
default_metric="avg_price",
),
span={"base": 12, "md": 3},
),
dmc.GridCol(
dmc.Paper(
children=[
dcc.Graph(
id="purchase-choropleth",
config={"scrollZoom": True},
style={"height": "500px"},
),
],
p="xs",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 9},
),
],
gutter="md",
)
def create_rental_map_section() -> dmc.Grid:
"""Create the rental market choropleth section."""
return dmc.Grid(
[
dmc.GridCol(
create_map_controls(
id_prefix="rental-map",
metric_options=RENTAL_METRIC_OPTIONS,
default_metric="avg_rent",
),
span={"base": 12, "md": 3},
),
dmc.GridCol(
dmc.Paper(
children=[
dcc.Graph(
id="rental-choropleth",
config={"scrollZoom": True},
style={"height": "500px"},
),
],
p="xs",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 9},
),
],
gutter="md",
)
def create_time_series_section() -> dmc.Grid:
"""Create the time series charts section."""
return dmc.Grid(
[
dmc.GridCol(
dmc.Paper(
children=[
dmc.Title("Price Trends", order=4, size="h5", mb="sm"),
dcc.Graph(
id="price-time-series",
config={"displayModeBar": False},
style={"height": "350px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
dmc.GridCol(
dmc.Paper(
children=[
dmc.Title("Sales Volume", order=4, size="h5", mb="sm"),
dcc.Graph(
id="volume-time-series",
config={"displayModeBar": False},
style={"height": "350px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
],
gutter="md",
)
def create_market_comparison_section() -> dmc.Paper:
"""Create the market comparison chart section."""
return dmc.Paper(
children=[
dmc.Group(
[
dmc.Title("Market Indicators", order=4, size="h5"),
create_time_slider(
id_prefix="market-comparison",
min_year=2020,
label="",
),
],
justify="space-between",
align="center",
mb="md",
),
dcc.Graph(
id="market-comparison-chart",
config={"displayModeBar": False},
style={"height": "400px"},
),
],
p="md",
radius="sm",
withBorder=True,
)
def create_data_notice() -> dmc.Alert:
"""Create a notice about data availability."""
return dmc.Alert(
children=[
dmc.Text(
"This dashboard uses TRREB and CMHC data. "
"Geographic boundaries require QGIS digitization to enable choropleth maps. "
"Sample data is shown below.",
size="sm",
),
],
title="Data Notice",
color="blue",
variant="light",
)
# Register callbacks
from portfolio_app.pages.toronto import callbacks # noqa: E402, F401
layout = dmc.Container(
dmc.Stack(
[
create_header(),
create_data_notice(),
create_kpi_section(),
dmc.Divider(my="md", label="Purchase Market", labelPosition="center"),
create_purchase_map_section(),
dmc.Divider(my="md", label="Rental Market", labelPosition="center"),
create_rental_map_section(),
dmc.Divider(my="md", label="Trends", labelPosition="center"),
create_time_series_section(),
create_market_comparison_section(),
dmc.Space(h=40),
],
gap="lg",
),
size="xl",
py="xl",
)

@@ -0,0 +1,274 @@
"""Methodology page for Toronto Housing Dashboard."""
import dash
import dash_mantine_components as dmc
from dash import dcc, html
from dash_iconify import DashIconify
dash.register_page(
__name__,
path="/toronto/methodology",
title="Methodology | Toronto Housing Dashboard",
description="Data sources, methodology, and limitations for the Toronto Housing Dashboard",
)
def layout() -> dmc.Container:
"""Render the methodology page layout."""
return dmc.Container(
size="md",
py="xl",
children=[
# Back to Dashboard button
dcc.Link(
dmc.Button(
"Back to Dashboard",
leftSection=DashIconify(icon="tabler:arrow-left", width=18),
variant="subtle",
color="gray",
),
href="/toronto",
),
# Header
dmc.Title("Methodology", order=1, mb="lg", mt="md"),
dmc.Text(
"This page documents the data sources, processing methodology, "
"and known limitations of the Toronto Housing Dashboard.",
size="lg",
c="dimmed",
mb="xl",
),
# Data Sources Section
dmc.Paper(
p="lg",
radius="md",
withBorder=True,
mb="lg",
children=[
dmc.Title("Data Sources", order=2, mb="md"),
# TRREB
dmc.Title("Purchase Data: TRREB", order=3, size="h4", mb="sm"),
dmc.Text(
[
"The Toronto Regional Real Estate Board (TRREB) publishes monthly ",
html.Strong("Market Watch"),
" reports containing aggregate statistics for residential real estate "
"transactions across the Greater Toronto Area.",
],
mb="sm",
),
dmc.List(
[
dmc.ListItem("Source: TRREB Market Watch Reports (PDF)"),
dmc.ListItem("Geographic granularity: ~35 TRREB Districts"),
dmc.ListItem("Temporal granularity: Monthly"),
dmc.ListItem("Coverage: 2021-present"),
dmc.ListItem(
[
"Metrics: Sales count, average/median price, new listings, ",
"active listings, days on market, sale-to-list ratio",
]
),
],
mb="md",
),
dmc.Anchor(
"TRREB Market Watch Archive",
href="https://trreb.ca/market-data/market-watch/market-watch-archive/",
target="_blank",
mb="lg",
),
# CMHC
dmc.Title(
"Rental Data: CMHC", order=3, size="h4", mb="sm", mt="md"
),
dmc.Text(
[
"Canada Mortgage and Housing Corporation (CMHC) conducts the annual ",
html.Strong("Rental Market Survey"),
" providing rental market statistics for major urban centres.",
],
mb="sm",
),
dmc.List(
[
dmc.ListItem("Source: CMHC Rental Market Survey (Excel)"),
dmc.ListItem(
"Geographic granularity: ~20 CMHC Zones (Census Tract aligned)"
),
dmc.ListItem(
"Temporal granularity: Annual (October survey)"
),
dmc.ListItem("Coverage: 2021-present"),
dmc.ListItem(
[
"Metrics: Average/median rent, vacancy rate, universe count, ",
"turnover rate, year-over-year rent change",
]
),
],
mb="md",
),
dmc.Anchor(
"CMHC Housing Market Information Portal",
href="https://www.cmhc-schl.gc.ca/professionals/housing-markets-data-and-research/housing-data/data-tables/rental-market",
target="_blank",
),
],
),
# Geographic Considerations
dmc.Paper(
p="lg",
radius="md",
withBorder=True,
mb="lg",
children=[
dmc.Title("Geographic Considerations", order=2, mb="md"),
dmc.Alert(
title="Important: Non-Aligned Geographies",
color="yellow",
mb="md",
children=[
"TRREB Districts and CMHC Zones do ",
html.Strong("not"),
" align geographically. They are displayed as separate layers and "
"should not be directly compared at the sub-regional level.",
],
),
dmc.Text(
"The dashboard presents three geographic layers:",
mb="sm",
),
dmc.List(
[
dmc.ListItem(
[
html.Strong("TRREB Districts (~35): "),
"Used for purchase/sales data visualization. "
"Districts are defined by TRREB and labeled with codes like W01, C01, E01.",
]
),
dmc.ListItem(
[
html.Strong("CMHC Zones (~20): "),
"Used for rental data visualization. "
"Zones are aligned with Census Tract boundaries.",
]
),
dmc.ListItem(
[
html.Strong("City Neighbourhoods (158): "),
"Reference overlay only. "
"These are official City of Toronto neighbourhood boundaries.",
]
),
],
),
],
),
# Policy Events
dmc.Paper(
p="lg",
radius="md",
withBorder=True,
mb="lg",
children=[
dmc.Title("Policy Event Annotations", order=2, mb="md"),
dmc.Text(
"The time series charts include markers for significant policy events "
"that may have influenced housing market conditions. These annotations are "
"for contextual reference only.",
mb="md",
),
dmc.Alert(
title="No Causation Claims",
color="blue",
children=[
"The presence of a policy marker near a market trend change does ",
html.Strong("not"),
" imply causation. Housing markets are influenced by numerous factors "
"beyond policy interventions.",
],
),
],
),
# Limitations
dmc.Paper(
p="lg",
radius="md",
withBorder=True,
mb="lg",
children=[
dmc.Title("Limitations", order=2, mb="md"),
dmc.List(
[
dmc.ListItem(
[
html.Strong("Aggregate Data: "),
"All statistics are aggregates. Individual property characteristics, "
"condition, and micro-location are not reflected.",
]
),
dmc.ListItem(
[
html.Strong("Reporting Lag: "),
"TRREB data reflects closed transactions, which may lag market "
"conditions by 1-3 months. CMHC data is annual.",
]
),
dmc.ListItem(
[
html.Strong("Geographic Boundaries: "),
"TRREB district boundaries were manually digitized from reference maps "
"and may contain minor inaccuracies.",
]
),
dmc.ListItem(
[
html.Strong("Data Suppression: "),
"Some cells may be suppressed for confidentiality when transaction "
"counts are below thresholds.",
]
),
],
),
],
),
# Technical Implementation
dmc.Paper(
p="lg",
radius="md",
withBorder=True,
children=[
dmc.Title("Technical Implementation", order=2, mb="md"),
dmc.Text("This dashboard is built with:", mb="sm"),
dmc.List(
[
dmc.ListItem("Python 3.11+ with Dash and Plotly"),
dmc.ListItem("PostgreSQL with PostGIS for geospatial data"),
dmc.ListItem("dbt for data transformation"),
dmc.ListItem("Pydantic for data validation"),
dmc.ListItem("SQLAlchemy 2.0 for database operations"),
],
mb="md",
),
dmc.Anchor(
"View source code on GitHub",
href="https://github.com/lmiranda/personal-portfolio",
target="_blank",
),
],
),
# Back link
dmc.Group(
mt="xl",
children=[
dmc.Anchor(
"← Back to Dashboard",
href="/toronto",
size="lg",
),
],
),
],
)


@@ -0,0 +1,257 @@
"""Demo/sample data for testing the Toronto Housing Dashboard without full pipeline.
This module provides synthetic data for development and demonstration purposes.
Replace with real data from the database in production.
"""
from datetime import date
from typing import Any
def get_demo_districts() -> list[dict[str, Any]]:
"""Return sample TRREB district data."""
return [
{"district_code": "W01", "district_name": "Long Branch", "area_type": "West"},
{"district_code": "W02", "district_name": "Mimico", "area_type": "West"},
{
"district_code": "W03",
"district_name": "Kingsway South",
"area_type": "West",
},
{"district_code": "W04", "district_name": "Edenbridge", "area_type": "West"},
{"district_code": "W05", "district_name": "Islington", "area_type": "West"},
{"district_code": "W06", "district_name": "Rexdale", "area_type": "West"},
{"district_code": "W07", "district_name": "Willowdale", "area_type": "West"},
{"district_code": "W08", "district_name": "York", "area_type": "West"},
{
"district_code": "C01",
"district_name": "Downtown Core",
"area_type": "Central",
},
{"district_code": "C02", "district_name": "Annex", "area_type": "Central"},
{
"district_code": "C03",
"district_name": "Forest Hill",
"area_type": "Central",
},
{
"district_code": "C04",
"district_name": "Lawrence Park",
"area_type": "Central",
},
{
"district_code": "C06",
"district_name": "Willowdale East",
"area_type": "Central",
},
{"district_code": "C07", "district_name": "Thornhill", "area_type": "Central"},
{"district_code": "C08", "district_name": "Waterfront", "area_type": "Central"},
{"district_code": "E01", "district_name": "Leslieville", "area_type": "East"},
{"district_code": "E02", "district_name": "The Beaches", "area_type": "East"},
{"district_code": "E03", "district_name": "Danforth", "area_type": "East"},
{"district_code": "E04", "district_name": "Birch Cliff", "area_type": "East"},
{"district_code": "E05", "district_name": "Scarborough", "area_type": "East"},
]
def get_demo_purchase_data() -> list[dict[str, Any]]:
    """Return sample purchase data for time series visualization."""
    import random

    random.seed(42)
    data = []
    base_prices = {
        "W01": 850000,
        "C01": 1200000,
        "E01": 950000,
    }
    for year in [2024, 2025]:
        for month in range(1, 13):
            for district, base_price in base_prices.items():
                # Add some randomness and trend
                trend = (year - 2024) * 12 + month  # months since December 2023
                price_variation = random.uniform(-0.05, 0.05)
                trend_factor = 1 + (trend * 0.002)  # Slight upward trend
                avg_price = int(base_price * trend_factor * (1 + price_variation))
                sales = random.randint(50, 200)
                data.append(
                    {
                        "district_code": district,
                        "full_date": date(year, month, 1),
                        "year": year,
                        "month": month,
                        "avg_price": avg_price,
                        "median_price": int(avg_price * 0.95),
                        "sales_count": sales,
                        "new_listings": int(sales * random.uniform(1.2, 1.8)),
                        "active_listings": int(sales * random.uniform(2.0, 3.5)),
                        "days_on_market": random.randint(15, 45),
                        "sale_to_list_ratio": round(random.uniform(0.95, 1.05), 2),
                    }
                )
    return data
def get_demo_rental_data() -> list[dict[str, Any]]:
"""Return sample rental data for visualization."""
data = []
zones = [
("Zone01", "Downtown"),
("Zone02", "Midtown"),
("Zone03", "North York"),
("Zone04", "Scarborough"),
("Zone05", "Etobicoke"),
]
bedroom_types = ["bachelor", "1_bedroom", "2_bedroom", "3_bedroom"]
base_rents = {
"bachelor": 1800,
"1_bedroom": 2200,
"2_bedroom": 2800,
"3_bedroom": 3400,
}
for year in [2021, 2022, 2023, 2024, 2025]:
for zone_code, zone_name in zones:
for bedroom in bedroom_types:
# Rental trend: ~5% increase per year
year_factor = 1 + ((year - 2021) * 0.05)
base_rent = base_rents[bedroom]
data.append(
{
"zone_code": zone_code,
"zone_name": zone_name,
"survey_year": year,
"full_date": date(year, 10, 1),
"bedroom_type": bedroom,
"average_rent": int(base_rent * year_factor),
"median_rent": int(base_rent * year_factor * 0.98),
"vacancy_rate": round(
2.5 - (year - 2021) * 0.3, 1
), # Decreasing vacancy
"universe": 5000 + (year - 2021) * 200,
}
)
return data
def get_demo_policy_events() -> list[dict[str, Any]]:
"""Return sample policy events for annotation."""
return [
{
"event_date": date(2024, 6, 5),
"effective_date": date(2024, 6, 5),
"level": "federal",
"category": "monetary",
"title": "BoC Rate Cut (25bp)",
"description": "Bank of Canada cuts overnight rate by 25 basis points to 4.75%",
"expected_direction": "bullish",
},
{
"event_date": date(2024, 7, 24),
"effective_date": date(2024, 7, 24),
"level": "federal",
"category": "monetary",
"title": "BoC Rate Cut (25bp)",
"description": "Bank of Canada cuts overnight rate by 25 basis points to 4.50%",
"expected_direction": "bullish",
},
{
"event_date": date(2024, 9, 4),
"effective_date": date(2024, 9, 4),
"level": "federal",
"category": "monetary",
"title": "BoC Rate Cut (25bp)",
"description": "Bank of Canada cuts overnight rate by 25 basis points to 4.25%",
"expected_direction": "bullish",
},
{
"event_date": date(2024, 10, 23),
"effective_date": date(2024, 10, 23),
"level": "federal",
"category": "monetary",
"title": "BoC Rate Cut (50bp)",
"description": "Bank of Canada cuts overnight rate by 50 basis points to 3.75%",
"expected_direction": "bullish",
},
{
"event_date": date(2024, 12, 11),
"effective_date": date(2024, 12, 11),
"level": "federal",
"category": "monetary",
"title": "BoC Rate Cut (50bp)",
"description": "Bank of Canada cuts overnight rate by 50 basis points to 3.25%",
"expected_direction": "bullish",
},
{
"event_date": date(2024, 9, 16),
"effective_date": date(2024, 12, 15),
"level": "federal",
"category": "regulatory",
"title": "CMHC 30-Year Amortization",
"description": "30-year amortization extended to all first-time buyers and new builds",
"expected_direction": "bullish",
},
{
"event_date": date(2024, 9, 16),
"effective_date": date(2024, 12, 15),
"level": "federal",
"category": "regulatory",
"title": "Insured Mortgage Cap $1.5M",
"description": "Insured mortgage cap raised from $1M to $1.5M",
"expected_direction": "bullish",
},
]
def get_demo_summary_metrics() -> dict[str, dict[str, Any]]:
"""Return summary metrics for KPI cards."""
return {
"avg_price": {
"value": 1067968,
"title": "Avg. Price (2025)",
"delta": -4.7,
"delta_suffix": "%",
"prefix": "$",
"format_spec": ",.0f",
"positive_is_good": True,
},
"total_sales": {
"value": 67610,
"title": "Total Sales (2024)",
"delta": 2.6,
"delta_suffix": "%",
"format_spec": ",.0f",
"positive_is_good": True,
},
"avg_rent": {
"value": 2450,
"title": "Avg. Rent (2025)",
"delta": 3.2,
"delta_suffix": "%",
"prefix": "$",
"format_spec": ",.0f",
"positive_is_good": False,
},
"vacancy_rate": {
"value": 1.8,
"title": "Vacancy Rate",
"delta": -0.4,
"delta_suffix": "pp",
"suffix": "%",
"format_spec": ".1f",
"positive_is_good": False,
},
}
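Review note: get_demo_purchase_data() calls random.seed(42) on the module-level generator, which also resets the random sequence for any other code in the same process. A minimal sketch of the same price model using a private random.Random instance follows. The name demo_prices and its parameters are hypothetical and not part of this change.

```python
import random
from datetime import date


def demo_prices(base_price: int, months: int = 24, seed: int = 42) -> list[dict]:
    """Generate a synthetic monthly price series starting January 2024."""
    rng = random.Random(seed)  # private generator; global random state untouched
    rows = []
    for i in range(months):
        year, month = 2024 + i // 12, i % 12 + 1
        trend_factor = 1 + (i + 1) * 0.002  # same 0.2% per month drift as the demo module
        variation = rng.uniform(-0.05, 0.05)
        rows.append(
            {
                "full_date": date(year, month, 1),
                "avg_price": int(base_price * trend_factor * (1 + variation)),
            }
        )
    return rows
```

Calling demo_prices(850000) twice returns identical rows, and the state of the shared random module is left as it was.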


@@ -1 +1,32 @@
"""Database loaders for Toronto housing data."""
from .base import bulk_insert, get_session, upsert_by_key
from .cmhc import load_cmhc_record, load_cmhc_rentals
from .dimensions import (
generate_date_key,
load_cmhc_zones,
load_neighbourhoods,
load_policy_events,
load_time_dimension,
load_trreb_districts,
)
from .trreb import load_trreb_purchases, load_trreb_record
__all__ = [
# Base utilities
"get_session",
"bulk_insert",
"upsert_by_key",
# Dimension loaders
"generate_date_key",
"load_time_dimension",
"load_trreb_districts",
"load_cmhc_zones",
"load_neighbourhoods",
"load_policy_events",
# Fact loaders
"load_trreb_purchases",
"load_trreb_record",
"load_cmhc_rentals",
"load_cmhc_record",
]
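Review note: every loader exported here accepts an optional session, so a caller can run several loads inside one transaction and have them succeed or fail together. The sketch below illustrates that commit-or-rollback behaviour with the standard library's sqlite3 standing in for a SQLAlchemy session. The transaction helper and the table names are assumptions made for illustration only.

```python
import sqlite3
from collections.abc import Iterator
from contextlib import contextmanager


@contextmanager
def transaction(conn: sqlite3.Connection) -> Iterator[sqlite3.Connection]:
    """Commit on success, roll back on any exception."""
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_zone (zone_code TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE fact_rent (zone_code TEXT, avg_rent INTEGER)")
conn.commit()

try:
    with transaction(conn) as tx:
        tx.execute("INSERT INTO dim_zone VALUES ('Zone01')")
        tx.execute("INSERT INTO fact_rent VALUES ('Zone01', 2200)")
        raise RuntimeError("simulated failure in a later load step")
except RuntimeError:
    pass

# Both inserts belong to the same transaction, so both are undone.
zone_rows = conn.execute("SELECT COUNT(*) FROM dim_zone").fetchone()[0]
fact_rows = conn.execute("SELECT COUNT(*) FROM fact_rent").fetchone()[0]
```

With the real loaders, the equivalent would presumably be opening get_session() once and passing the yielded session to each load_* call.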


@@ -0,0 +1,85 @@
"""Base loader utilities for database operations."""

from collections.abc import Generator
from contextlib import contextmanager
from typing import Any, TypeVar

from sqlalchemy.orm import Session

from portfolio_app.toronto.models import get_session_factory

T = TypeVar("T")


@contextmanager
def get_session() -> Generator[Session, None, None]:
    """Get a database session with automatic cleanup.

    Yields:
        SQLAlchemy session that commits on success and rolls back on error.
    """
    session_factory = get_session_factory()
    session = session_factory()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()


def bulk_insert(session: Session, objects: list[T]) -> int:
    """Bulk insert objects into the database.

    Args:
        session: Active SQLAlchemy session.
        objects: List of ORM model instances to insert.

    Returns:
        Number of objects inserted.
    """
    session.add_all(objects)
    session.flush()
    return len(objects)


def upsert_by_key(
    session: Session,
    model_class: Any,
    objects: list[T],
    key_columns: list[str],
) -> tuple[int, int]:
    """Upsert objects based on unique key columns.

    Args:
        session: Active SQLAlchemy session.
        model_class: The ORM model class.
        objects: List of ORM model instances to upsert.
        key_columns: Column names that form the unique key.

    Returns:
        Tuple of (inserted_count, updated_count).
    """
    inserted = 0
    updated = 0
    for obj in objects:
        # Build filter for existing record
        filters = {col: getattr(obj, col) for col in key_columns}
        existing = session.query(model_class).filter_by(**filters).first()
        if existing:
            # Update every non-key column except the surrogate "id"
            for column in model_class.__table__.columns:
                if column.name not in key_columns and column.name != "id":
                    setattr(existing, column.name, getattr(obj, column.name))
            updated += 1
        else:
            # Insert new record
            session.add(obj)
            inserted += 1

    session.flush()
    return inserted, updated
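
Review note: the upsert semantics above (match on the key columns, overwrite the remaining columns, otherwise insert) can be illustrated without a database. This is a sketch under stated assumptions: the Rent dataclass and the dict-backed store are hypothetical stand-ins for an ORM model and its table, not code from this change.

```python
from dataclasses import dataclass, fields


@dataclass
class Rent:
    date_key: int
    zone_key: int
    avg_rent: int


def upsert(
    store: dict[tuple, Rent], rows: list[Rent], key_columns: list[str]
) -> tuple[int, int]:
    """Insert rows with new keys; overwrite non-key fields of existing ones."""
    inserted = updated = 0
    for row in rows:
        key = tuple(getattr(row, col) for col in key_columns)
        existing = store.get(key)
        if existing is None:
            store[key] = row
            inserted += 1
        else:
            for f in fields(row):
                if f.name not in key_columns:
                    setattr(existing, f.name, getattr(row, f.name))
            updated += 1
    return inserted, updated


store: dict[tuple, Rent] = {}
keys = ["date_key", "zone_key"]
first = upsert(store, [Rent(20241001, 1, 2200), Rent(20241001, 2, 2800)], keys)
second = upsert(store, [Rent(20241001, 1, 2300), Rent(20251001, 1, 2400)], keys)
```

Here the first call inserts two rows and the second call updates one and inserts one. The dict lookup is a single in-memory probe, whereas upsert_by_key issues one SELECT per object, which may matter for large batches.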


@@ -0,0 +1,137 @@
"""Loader for CMHC rental data into fact_rentals."""
from sqlalchemy.orm import Session
from portfolio_app.toronto.models import DimCMHCZone, DimTime, FactRentals
from portfolio_app.toronto.schemas import CMHCAnnualSurvey, CMHCRentalRecord
from .base import get_session, upsert_by_key
from .dimensions import generate_date_key
def load_cmhc_rentals(
survey: CMHCAnnualSurvey,
session: Session | None = None,
) -> int:
"""Load CMHC annual survey data into fact_rentals.
Args:
survey: Validated CMHC annual survey containing records.
session: Optional existing session.
Returns:
Number of records loaded.
"""
from datetime import date
def _load(sess: Session) -> int:
# Get zone key mapping
zones = sess.query(DimCMHCZone).all()
zone_map = {z.zone_code: z.zone_key for z in zones}
# CMHC surveys are annual - use October 1st as reference date
survey_date = date(survey.survey_year, 10, 1)
date_key = generate_date_key(survey_date)
# Verify time dimension exists
time_dim = sess.query(DimTime).filter_by(date_key=date_key).first()
if not time_dim:
raise ValueError(
f"Time dimension not found for date_key {date_key}. "
"Load time dimension first."
)
records = []
for record in survey.records:
zone_key = zone_map.get(record.zone_code)
if not zone_key:
# Skip records for unknown zones
continue
fact = FactRentals(
date_key=date_key,
zone_key=zone_key,
bedroom_type=record.bedroom_type.value,
universe=record.universe,
avg_rent=record.average_rent,
median_rent=record.median_rent,
vacancy_rate=record.vacancy_rate,
availability_rate=record.availability_rate,
turnover_rate=record.turnover_rate,
rent_change_pct=record.rent_change_pct,
reliability_code=record.average_rent_reliability.value
if record.average_rent_reliability
else None,
)
records.append(fact)
inserted, updated = upsert_by_key(
sess, FactRentals, records, ["date_key", "zone_key", "bedroom_type"]
)
return inserted + updated
if session:
return _load(session)
with get_session() as sess:
return _load(sess)
def load_cmhc_record(
record: CMHCRentalRecord,
survey_year: int,
session: Session | None = None,
) -> int:
"""Load a single CMHC record into fact_rentals.
Args:
record: Single validated CMHC rental record.
survey_year: Year of the survey.
session: Optional existing session.
Returns:
Number of records loaded (0 or 1).
"""
from datetime import date
def _load(sess: Session) -> int:
# Get zone key
zone = sess.query(DimCMHCZone).filter_by(zone_code=record.zone_code).first()
if not zone:
return 0
survey_date = date(survey_year, 10, 1)
date_key = generate_date_key(survey_date)
# Verify time dimension exists
time_dim = sess.query(DimTime).filter_by(date_key=date_key).first()
if not time_dim:
raise ValueError(
f"Time dimension not found for date_key {date_key}. "
"Load time dimension first."
)
fact = FactRentals(
date_key=date_key,
zone_key=zone.zone_key,
bedroom_type=record.bedroom_type.value,
universe=record.universe,
avg_rent=record.average_rent,
median_rent=record.median_rent,
vacancy_rate=record.vacancy_rate,
availability_rate=record.availability_rate,
turnover_rate=record.turnover_rate,
rent_change_pct=record.rent_change_pct,
reliability_code=record.average_rent_reliability.value
if record.average_rent_reliability
else None,
)
inserted, updated = upsert_by_key(
sess, FactRentals, [fact], ["date_key", "zone_key", "bedroom_type"]
)
return inserted + updated
if session:
return _load(session)
with get_session() as sess:
return _load(sess)


@@ -0,0 +1,251 @@
"""Loaders for dimension tables."""
from datetime import date
from sqlalchemy.orm import Session
from portfolio_app.toronto.models import (
DimCMHCZone,
DimNeighbourhood,
DimPolicyEvent,
DimTime,
DimTRREBDistrict,
)
from portfolio_app.toronto.schemas import (
CMHCZone,
Neighbourhood,
PolicyEvent,
TRREBDistrict,
)
from .base import get_session, upsert_by_key
def generate_date_key(d: date) -> int:
"""Generate integer date key from date (YYYYMMDD format).
Args:
d: Date to convert.
Returns:
Integer in YYYYMMDD format.
"""
return d.year * 10000 + d.month * 100 + d.day
def load_time_dimension(
start_date: date,
end_date: date,
session: Session | None = None,
) -> int:
"""Load time dimension with date range.
Args:
start_date: Start of date range.
end_date: End of date range (inclusive).
session: Optional existing session.
Returns:
Number of records loaded.
"""
month_names = [
"",
"January",
"February",
"March",
"April",
"May",
"June",
"July",
"August",
"September",
"October",
"November",
"December",
]
def _load(sess: Session) -> int:
records = []
current = start_date.replace(day=1) # Start at month beginning
while current <= end_date:
quarter = (current.month - 1) // 3 + 1
dim = DimTime(
date_key=generate_date_key(current),
full_date=current,
year=current.year,
month=current.month,
quarter=quarter,
month_name=month_names[current.month],
is_month_start=True,
)
records.append(dim)
# Move to next month
if current.month == 12:
current = current.replace(year=current.year + 1, month=1)
else:
current = current.replace(month=current.month + 1)
inserted, updated = upsert_by_key(sess, DimTime, records, ["date_key"])
return inserted + updated
if session:
return _load(session)
with get_session() as sess:
return _load(sess)
def load_trreb_districts(
districts: list[TRREBDistrict],
session: Session | None = None,
) -> int:
"""Load TRREB district dimension.
Args:
districts: List of validated district schemas.
session: Optional existing session.
Returns:
Number of records loaded.
"""
def _load(sess: Session) -> int:
records = []
for d in districts:
dim = DimTRREBDistrict(
district_code=d.district_code,
district_name=d.district_name,
area_type=d.area_type.value,
geometry=d.geometry_wkt,
)
records.append(dim)
inserted, updated = upsert_by_key(
sess, DimTRREBDistrict, records, ["district_code"]
)
return inserted + updated
if session:
return _load(session)
with get_session() as sess:
return _load(sess)
def load_cmhc_zones(
zones: list[CMHCZone],
session: Session | None = None,
) -> int:
"""Load CMHC zone dimension.
Args:
zones: List of validated zone schemas.
session: Optional existing session.
Returns:
Number of records loaded.
"""
def _load(sess: Session) -> int:
records = []
for z in zones:
dim = DimCMHCZone(
zone_code=z.zone_code,
zone_name=z.zone_name,
geometry=z.geometry_wkt,
)
records.append(dim)
inserted, updated = upsert_by_key(sess, DimCMHCZone, records, ["zone_code"])
return inserted + updated
if session:
return _load(session)
with get_session() as sess:
return _load(sess)
def load_neighbourhoods(
neighbourhoods: list[Neighbourhood],
session: Session | None = None,
) -> int:
"""Load neighbourhood dimension.
Args:
neighbourhoods: List of validated neighbourhood schemas.
session: Optional existing session.
Returns:
Number of records loaded.
"""
def _load(sess: Session) -> int:
records = []
for n in neighbourhoods:
dim = DimNeighbourhood(
neighbourhood_id=n.neighbourhood_id,
name=n.name,
geometry=n.geometry_wkt,
population=n.population,
land_area_sqkm=n.land_area_sqkm,
pop_density_per_sqkm=n.pop_density_per_sqkm,
pct_bachelors_or_higher=n.pct_bachelors_or_higher,
median_household_income=n.median_household_income,
pct_owner_occupied=n.pct_owner_occupied,
pct_renter_occupied=n.pct_renter_occupied,
census_year=n.census_year,
)
records.append(dim)
inserted, updated = upsert_by_key(
sess, DimNeighbourhood, records, ["neighbourhood_id"]
)
return inserted + updated
if session:
return _load(session)
with get_session() as sess:
return _load(sess)
def load_policy_events(
events: list[PolicyEvent],
session: Session | None = None,
) -> int:
"""Load policy event dimension.
Args:
events: List of validated policy event schemas.
session: Optional existing session.
Returns:
Number of records loaded.
"""
def _load(sess: Session) -> int:
records = []
for e in events:
dim = DimPolicyEvent(
event_date=e.event_date,
effective_date=e.effective_date,
level=e.level.value,
category=e.category.value,
title=e.title,
description=e.description,
expected_direction=e.expected_direction.value,
source_url=e.source_url,
confidence=e.confidence.value,
)
records.append(dim)
# For policy events, use event_date + title as unique key
inserted, updated = upsert_by_key(
sess, DimPolicyEvent, records, ["event_date", "title"]
)
return inserted + updated
if session:
return _load(session)
with get_session() as sess:
return _load(sess)
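
Review note: load_time_dimension only creates first-of-month rows, so the fact loaders can only resolve date keys that fall on day 1. The sketch below reproduces the key format and the month stepping in isolation. generate_date_key is copied from this file; month_starts is a hypothetical helper name.

```python
from datetime import date


def generate_date_key(d: date) -> int:
    """Integer key in YYYYMMDD format."""
    return d.year * 10000 + d.month * 100 + d.day


def month_starts(start: date, end: date) -> list[date]:
    """First-of-month dates from start's month through end (inclusive)."""
    out = []
    current = start.replace(day=1)
    while current <= end:
        out.append(current)
        if current.month == 12:
            current = current.replace(year=current.year + 1, month=1)
        else:
            current = current.replace(month=current.month + 1)
    return out


keys = [generate_date_key(d) for d in month_starts(date(2024, 11, 15), date(2025, 2, 1))]
# keys == [20241101, 20241201, 20250101, 20250201]
```

A mid-month date such as 2024-06-15 would map to 20240615, which is not in that list. If a TRREB report_date is not normalised to the first of the month, the "Time dimension not found" error in the fact loaders is the likely result.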


@@ -0,0 +1,129 @@
"""Loader for TRREB purchase data into fact_purchases."""
from sqlalchemy.orm import Session
from portfolio_app.toronto.models import DimTime, DimTRREBDistrict, FactPurchases
from portfolio_app.toronto.schemas import TRREBMonthlyRecord, TRREBMonthlyReport
from .base import get_session, upsert_by_key
from .dimensions import generate_date_key
def load_trreb_purchases(
report: TRREBMonthlyReport,
session: Session | None = None,
) -> int:
"""Load TRREB monthly report data into fact_purchases.
Args:
report: Validated TRREB monthly report containing records.
session: Optional existing session.
Returns:
Number of records loaded.
"""
def _load(sess: Session) -> int:
# Get district key mapping
districts = sess.query(DimTRREBDistrict).all()
district_map = {d.district_code: d.district_key for d in districts}
# Build date key from report date
date_key = generate_date_key(report.report_date)
# Verify time dimension exists
time_dim = sess.query(DimTime).filter_by(date_key=date_key).first()
if not time_dim:
raise ValueError(
f"Time dimension not found for date_key {date_key}. "
"Load time dimension first."
)
records = []
for record in report.records:
district_key = district_map.get(record.area_code)
if not district_key:
# Skip records for unknown districts (e.g., aggregate rows)
continue
fact = FactPurchases(
date_key=date_key,
district_key=district_key,
sales_count=record.sales,
dollar_volume=record.dollar_volume,
avg_price=record.avg_price,
median_price=record.median_price,
new_listings=record.new_listings,
active_listings=record.active_listings,
avg_dom=record.avg_dom,
avg_sp_lp=record.avg_sp_lp,
)
records.append(fact)
inserted, updated = upsert_by_key(
sess, FactPurchases, records, ["date_key", "district_key"]
)
return inserted + updated
if session:
return _load(session)
with get_session() as sess:
return _load(sess)
def load_trreb_record(
record: TRREBMonthlyRecord,
session: Session | None = None,
) -> int:
"""Load a single TRREB record into fact_purchases.
Args:
record: Single validated TRREB monthly record.
session: Optional existing session.
Returns:
Number of records loaded (0 or 1).
"""
def _load(sess: Session) -> int:
# Get district key
district = (
sess.query(DimTRREBDistrict)
.filter_by(district_code=record.area_code)
.first()
)
if not district:
return 0
date_key = generate_date_key(record.report_date)
# Verify time dimension exists
time_dim = sess.query(DimTime).filter_by(date_key=date_key).first()
if not time_dim:
raise ValueError(
f"Time dimension not found for date_key {date_key}. "
"Load time dimension first."
)
fact = FactPurchases(
date_key=date_key,
district_key=district.district_key,
sales_count=record.sales,
dollar_volume=record.dollar_volume,
avg_price=record.avg_price,
median_price=record.median_price,
new_listings=record.new_listings,
active_listings=record.active_listings,
avg_dom=record.avg_dom,
avg_sp_lp=record.avg_sp_lp,
)
inserted, updated = upsert_by_key(
sess, FactPurchases, [fact], ["date_key", "district_key"]
)
return inserted + updated
if session:
return _load(session)
with get_session() as sess:
return _load(sess)
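
Review note: every loader in this change ends with the same "use the caller's session, else open one" branch. One possible way to factor it out is sketched below. run_with_session, fake_scope and count_rows are hypothetical names, and fake_scope stands in for get_session() so the sketch runs without a database.

```python
from collections.abc import Callable, Iterator
from contextlib import AbstractContextManager, contextmanager
from typing import Any, Optional


def run_with_session(
    work: Callable[[Any], int],
    session: Optional[Any],
    session_scope: Callable[[], AbstractContextManager],
) -> int:
    """Run work in the caller's session if given, else in a fresh scope."""
    if session is not None:
        return work(session)
    with session_scope() as sess:
        return work(sess)


events: list[str] = []
seen: list[Any] = []


@contextmanager
def fake_scope() -> Iterator[str]:
    events.append("open")
    try:
        yield "fresh-session"
    finally:
        events.append("close")


def count_rows(sess: Any) -> int:
    seen.append(sess)  # record which session the work ran in
    return 3


with_caller = run_with_session(count_rows, "caller-session", fake_scope)
with_fresh = run_with_session(count_rows, None, fake_scope)
```

When a session is supplied, no scope is opened and commit or rollback stays with the caller. When it is omitted, the scope is opened and closed around the work.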


@@ -1,9 +1,20 @@
"""Parsers for Toronto housing data sources."""
from .cmhc import CMHCParser
from .geo import (
CMHCZoneParser,
NeighbourhoodParser,
TRREBDistrictParser,
load_geojson,
)
from .trreb import TRREBParser
__all__ = [
"TRREBParser",
"CMHCParser",
# GeoJSON parsers
"CMHCZoneParser",
"TRREBDistrictParser",
"NeighbourhoodParser",
"load_geojson",
]


@@ -0,0 +1,463 @@
"""GeoJSON parser for geographic boundary files.
This module provides parsers for loading geographic boundary files
(GeoJSON format) and converting them to Pydantic schemas for database
loading or direct use in Plotly choropleth maps.
"""
import json
from pathlib import Path
from typing import Any
from pyproj import Transformer
from shapely.geometry import mapping, shape
from shapely.ops import transform
from portfolio_app.toronto.schemas import CMHCZone, Neighbourhood, TRREBDistrict
from portfolio_app.toronto.schemas.dimensions import AreaType
# Transformer for reprojecting from Web Mercator to WGS84
_TRANSFORMER_3857_TO_4326 = Transformer.from_crs(
"EPSG:3857", "EPSG:4326", always_xy=True
)
def load_geojson(path: Path) -> dict[str, Any]:
"""Load a GeoJSON file and return as dictionary.
Args:
path: Path to the GeoJSON file.
Returns:
GeoJSON as dictionary (FeatureCollection).
Raises:
FileNotFoundError: If file does not exist.
ValueError: If file is not valid GeoJSON.
"""
if not path.exists():
raise FileNotFoundError(f"GeoJSON file not found: {path}")
if path.suffix.lower() not in (".geojson", ".json"):
raise ValueError(f"Expected GeoJSON file, got: {path.suffix}")
with open(path, encoding="utf-8") as f:
data = json.load(f)
if data.get("type") != "FeatureCollection":
raise ValueError("GeoJSON must be a FeatureCollection")
return dict(data)
def geometry_to_wkt(geometry: dict[str, Any]) -> str:
"""Convert GeoJSON geometry to WKT string.
Args:
geometry: GeoJSON geometry dictionary.
Returns:
WKT representation of the geometry.
"""
return str(shape(geometry).wkt)
def reproject_geometry(
geometry: dict[str, Any], source_crs: str = "EPSG:3857"
) -> dict[str, Any]:
"""Reproject a GeoJSON geometry to WGS84 (EPSG:4326).
Args:
geometry: GeoJSON geometry dictionary.
source_crs: Source CRS (default EPSG:3857 Web Mercator).
Returns:
GeoJSON geometry in WGS84 coordinates.
"""
if source_crs == "EPSG:3857":
transformer = _TRANSFORMER_3857_TO_4326
else:
transformer = Transformer.from_crs(source_crs, "EPSG:4326", always_xy=True)
geom = shape(geometry)
reprojected = transform(transformer.transform, geom)
return dict(mapping(reprojected))
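
Review note: for readers unfamiliar with the reprojection step, the conversion from Web Mercator metres to longitude/latitude degrees can be written in closed form. The sketch below uses the standard spherical Mercator formulas and is meant as an illustration only; pyproj remains the right tool in the parser, and the function names here are hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator


def mercator_to_lonlat(x: float, y: float) -> tuple[float, float]:
    """Inverse spherical Mercator: metres to degrees, in (lon, lat) order."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


def lonlat_to_mercator(lon: float, lat: float) -> tuple[float, float]:
    """Forward spherical Mercator: degrees to metres, in (x, y) order."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


# Round trip for a point near downtown Toronto.
x, y = lonlat_to_mercator(-79.3832, 43.6532)
lon, lat = mercator_to_lonlat(x, y)
```

The (lon, lat) ordering matches always_xy=True on the transformer above, which is also the coordinate order GeoJSON expects.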
class CMHCZoneParser:
"""Parser for CMHC zone boundary GeoJSON files.
CMHC zone boundaries are extracted from the R `cmhc` package using
`get_cmhc_geography(geography_type="ZONE", cma="Toronto")`.
Expected GeoJSON properties:
- zone_code or Zone_Code: Zone identifier
- zone_name or Zone_Name: Zone name
"""
# Property name mappings for different GeoJSON formats
CODE_PROPERTIES = ["zone_code", "Zone_Code", "ZONE_CODE", "zonecode", "code"]
NAME_PROPERTIES = [
"zone_name",
"Zone_Name",
"ZONE_NAME",
"ZONE_NAME_EN",
"NAME_EN",
"zonename",
"name",
"NAME",
]
def __init__(self, geojson_path: Path) -> None:
"""Initialize parser with path to GeoJSON file.
Args:
geojson_path: Path to the CMHC zones GeoJSON file.
"""
self.geojson_path = geojson_path
self._geojson: dict[str, Any] | None = None
@property
def geojson(self) -> dict[str, Any]:
"""Lazy-load and return raw GeoJSON data."""
if self._geojson is None:
self._geojson = load_geojson(self.geojson_path)
return self._geojson
def _find_property(
self, properties: dict[str, Any], candidates: list[str]
) -> str | None:
"""Find a property value by checking multiple candidate names."""
for name in candidates:
if name in properties and properties[name] is not None:
return str(properties[name])
return None
def parse(self) -> list[CMHCZone]:
"""Parse GeoJSON and return list of CMHCZone schemas.
Returns:
List of validated CMHCZone objects.
Raises:
ValueError: If required properties are missing.
"""
zones = []
for feature in self.geojson.get("features", []):
props = feature.get("properties", {})
geom = feature.get("geometry")
zone_code = self._find_property(props, self.CODE_PROPERTIES)
zone_name = self._find_property(props, self.NAME_PROPERTIES)
if not zone_code:
raise ValueError(
f"Zone code not found in properties: {list(props.keys())}"
)
if not zone_name:
zone_name = zone_code # Fallback to code if name missing
geometry_wkt = geometry_to_wkt(geom) if geom else None
zones.append(
CMHCZone(
zone_code=zone_code,
zone_name=zone_name,
geometry_wkt=geometry_wkt,
)
)
return zones
def _needs_reprojection(self) -> bool:
"""Check if GeoJSON needs reprojection to WGS84."""
crs = self.geojson.get("crs") or {}  # tolerate an explicit "crs": null
crs_name = (crs.get("properties") or {}).get("name", "")
# EPSG:3857 or Web Mercator needs reprojection
return "3857" in crs_name or "900913" in crs_name
def get_geojson_for_choropleth(
self, key_property: str = "zone_code"
) -> dict[str, Any]:
"""Get GeoJSON formatted for Plotly choropleth maps.
Ensures the feature properties include a standardized key for
joining with data. Automatically reprojects from EPSG:3857 to
WGS84 if needed.
Args:
key_property: Property name to use as feature identifier.
Returns:
GeoJSON FeatureCollection with standardized properties in WGS84.
"""
needs_reproject = self._needs_reprojection()
features = []
for feature in self.geojson.get("features", []):
props = feature.get("properties", {})
new_props = dict(props)
# Ensure standardized property names exist
zone_code = self._find_property(props, self.CODE_PROPERTIES)
zone_name = self._find_property(props, self.NAME_PROPERTIES)
new_props["zone_code"] = zone_code
new_props[key_property] = zone_code  # expose the code under the requested join key
new_props["zone_name"] = zone_name or zone_code
# Reproject geometry if needed
geometry = feature.get("geometry")
if needs_reproject and geometry:
geometry = reproject_geometry(geometry)
features.append(
{
"type": "Feature",
"properties": new_props,
"geometry": geometry,
}
)
return {"type": "FeatureCollection", "features": features}
class TRREBDistrictParser:
"""Parser for TRREB district boundary GeoJSON files.
TRREB district boundaries are manually digitized from the TRREB PDF map
using QGIS.
Expected GeoJSON properties:
- district_code: District code (W01, C01, E01, etc.)
- district_name: District name
- area_type: West, Central, East, or North
"""
CODE_PROPERTIES = [
"district_code",
"District_Code",
"DISTRICT_CODE",
"districtcode",
"code",
]
NAME_PROPERTIES = [
"district_name",
"District_Name",
"DISTRICT_NAME",
"districtname",
"name",
"NAME",
]
AREA_PROPERTIES = [
"area_type",
"Area_Type",
"AREA_TYPE",
"areatype",
"area",
"type",
]
def __init__(self, geojson_path: Path) -> None:
"""Initialize parser with path to GeoJSON file."""
self.geojson_path = geojson_path
self._geojson: dict[str, Any] | None = None
@property
def geojson(self) -> dict[str, Any]:
"""Lazy-load and return raw GeoJSON data."""
if self._geojson is None:
self._geojson = load_geojson(self.geojson_path)
return self._geojson
def _find_property(
self, properties: dict[str, Any], candidates: list[str]
) -> str | None:
"""Find a property value by checking multiple candidate names."""
for name in candidates:
if name in properties and properties[name] is not None:
return str(properties[name])
return None
def _infer_area_type(self, district_code: str) -> AreaType:
"""Infer area type from district code prefix."""
prefix = district_code[0].upper()
# Named prefix_to_area so it does not shadow shapely's imported mapping()
prefix_to_area = {"W": AreaType.WEST, "C": AreaType.CENTRAL, "E": AreaType.EAST}
return prefix_to_area.get(prefix, AreaType.NORTH)
def parse(self) -> list[TRREBDistrict]:
"""Parse GeoJSON and return list of TRREBDistrict schemas."""
districts = []
for feature in self.geojson.get("features", []):
props = feature.get("properties", {})
geom = feature.get("geometry")
district_code = self._find_property(props, self.CODE_PROPERTIES)
district_name = self._find_property(props, self.NAME_PROPERTIES)
area_type_str = self._find_property(props, self.AREA_PROPERTIES)
if not district_code:
raise ValueError(
f"District code not found in properties: {list(props.keys())}"
)
if not district_name:
district_name = district_code
# Infer or parse area type
if area_type_str:
try:
area_type = AreaType(area_type_str)
except ValueError:
area_type = self._infer_area_type(district_code)
else:
area_type = self._infer_area_type(district_code)
geometry_wkt = geometry_to_wkt(geom) if geom else None
districts.append(
TRREBDistrict(
district_code=district_code,
district_name=district_name,
area_type=area_type,
geometry_wkt=geometry_wkt,
)
)
return districts
def get_geojson_for_choropleth(
self, key_property: str = "district_code"
) -> dict[str, Any]:
"""Get GeoJSON formatted for Plotly choropleth maps."""
features = []
for feature in self.geojson.get("features", []):
props = feature.get("properties", {})
new_props = dict(props)
district_code = self._find_property(props, self.CODE_PROPERTIES)
district_name = self._find_property(props, self.NAME_PROPERTIES)
new_props["district_code"] = district_code
new_props[key_property] = district_code  # expose the code under the requested join key
new_props["district_name"] = district_name or district_code
features.append(
{
"type": "Feature",
"properties": new_props,
"geometry": feature.get("geometry"),
}
)
return {"type": "FeatureCollection", "features": features}
class NeighbourhoodParser:
    """Parser for City of Toronto neighbourhood boundary GeoJSON files.

    Neighbourhood boundaries are from the City of Toronto Open Data portal.

    Expected GeoJSON properties:
    - neighbourhood_id or AREA_ID: Neighbourhood ID (1-158)
    - name or AREA_NAME: Neighbourhood name
    """

    ID_PROPERTIES = [
        "neighbourhood_id",
        "AREA_SHORT_CODE",  # City of Toronto 158 neighbourhoods
        "AREA_LONG_CODE",
        "AREA_ID",
        "area_id",
        "id",
        "ID",
        "HOOD_ID",
    ]
    NAME_PROPERTIES = [
        "AREA_NAME",  # City of Toronto 158 neighbourhoods
        "name",
        "NAME",
        "area_name",
        "neighbourhood_name",
    ]

    def __init__(self, geojson_path: Path) -> None:
        """Initialize parser with path to GeoJSON file."""
        self.geojson_path = geojson_path
        self._geojson: dict[str, Any] | None = None

    @property
    def geojson(self) -> dict[str, Any]:
        """Lazy-load and return raw GeoJSON data."""
        if self._geojson is None:
            self._geojson = load_geojson(self.geojson_path)
        return self._geojson

    def _find_property(
        self, properties: dict[str, Any], candidates: list[str]
    ) -> str | None:
        """Find a property value by checking multiple candidate names."""
        for name in candidates:
            if name in properties and properties[name] is not None:
                return str(properties[name])
        return None

    def parse(self) -> list[Neighbourhood]:
        """Parse GeoJSON and return list of Neighbourhood schemas.

        Note: This parser only extracts ID, name, and geometry.
        Census enrichment data (population, income, etc.) should be
        loaded separately and merged.
        """
        neighbourhoods = []
        for feature in self.geojson.get("features", []):
            props = feature.get("properties", {})
            geom = feature.get("geometry")
            neighbourhood_id_str = self._find_property(props, self.ID_PROPERTIES)
            name = self._find_property(props, self.NAME_PROPERTIES)
            if not neighbourhood_id_str:
                raise ValueError(
                    f"Neighbourhood ID not found in properties: {list(props.keys())}"
                )
            neighbourhood_id = int(neighbourhood_id_str)
            if not name:
                name = f"Neighbourhood {neighbourhood_id}"
            geometry_wkt = geometry_to_wkt(geom) if geom else None
            neighbourhoods.append(
                Neighbourhood(
                    neighbourhood_id=neighbourhood_id,
                    name=name,
                    geometry_wkt=geometry_wkt,
                )
            )
        return neighbourhoods

    def get_geojson_for_choropleth(
        self, key_property: str = "neighbourhood_id"
    ) -> dict[str, Any]:
        """Get GeoJSON formatted for Plotly choropleth maps."""
        # NOTE: key_property is accepted but not used below; the normalized
        # values are always written under "neighbourhood_id" and "name".
        features = []
        for feature in self.geojson.get("features", []):
            props = feature.get("properties", {})
            new_props = dict(props)
            neighbourhood_id = self._find_property(props, self.ID_PROPERTIES)
            name = self._find_property(props, self.NAME_PROPERTIES)
            new_props["neighbourhood_id"] = (
                int(neighbourhood_id) if neighbourhood_id else None
            )
            new_props["name"] = name
            features.append(
                {
                    "type": "Feature",
                    "properties": new_props,
                    "geometry": feature.get("geometry"),
                }
            )
        return {"type": "FeatureCollection", "features": features}
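
The candidate-name lookup and ID normalization used by the parsers above can be tried in isolation. The following is a minimal sketch, not the project's code: `ID_CANDIDATES`, `find_property`, `normalize`, and the sample features are all hypothetical stand-ins chosen for illustration.

```python
from typing import Any, Optional

# Hypothetical candidate list, mirroring the ID_PROPERTIES idea (assumption).
ID_CANDIDATES = ["neighbourhood_id", "AREA_SHORT_CODE", "AREA_ID"]


def find_property(properties: dict[str, Any], candidates: list[str]) -> Optional[str]:
    """Return the first candidate property that is present and not None, as a string."""
    for name in candidates:
        if properties.get(name) is not None:
            return str(properties[name])
    return None


def normalize(geojson: dict[str, Any]) -> dict[str, Any]:
    """Copy each feature and add an integer ``neighbourhood_id`` property."""
    features = []
    for feature in geojson.get("features", []):
        props = dict(feature.get("properties", {}))
        raw_id = find_property(props, ID_CANDIDATES)
        props["neighbourhood_id"] = int(raw_id) if raw_id else None
        features.append(
            {
                "type": "Feature",
                "properties": props,
                "geometry": feature.get("geometry"),
            }
        )
    return {"type": "FeatureCollection", "features": features}


# Made-up sample input: two features that name their ID differently.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"AREA_SHORT_CODE": "001"}, "geometry": None},
        {
            "type": "Feature",
            "properties": {"AREA_SHORT_CODE": None, "AREA_ID": 42},
            "geometry": None,
        },
    ],
}

normalized = normalize(sample)
ids = [f["properties"]["neighbourhood_id"] for f in normalized["features"]]
print(ids)  # [1, 42]
```

Once every feature carries the same key, a Plotly choropleth would typically point at it with `featureidkey="properties.neighbourhood_id"`, so the data frame's ID column and the GeoJSON agree regardless of how the source file named the field.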

52
scripts/db/init_schema.py Normal file

@@ -0,0 +1,52 @@
#!/usr/bin/env python3
"""Initialize database schema.

Usage:
    python scripts/db/init_schema.py

This script creates all SQLAlchemy tables in the database.
Run this after docker-compose up to initialize the schema.
"""

import sys
from pathlib import Path

from sqlalchemy import inspect, text

# Add project root to path
sys.path.insert(0, str(Path(__file__).parent.parent.parent))

from portfolio_app.toronto.models import create_tables, get_engine  # noqa: E402


def main() -> int:
    """Initialize the database schema."""
    print("Initializing database schema...")
    try:
        engine = get_engine()

        # Test connection. Raw SQL is wrapped in text(): SQLAlchemy 2.x
        # rejects plain strings in Connection.execute().
        with engine.connect() as conn:
            result = conn.execute(text("SELECT 1"))
            result.fetchone()
        print("Database connection successful")

        # Create all tables
        create_tables()
        print("Schema created successfully")

        # List created tables
        inspector = inspect(engine)
        tables = inspector.get_table_names()
        print(f"Created tables: {', '.join(tables)}")
        return 0
    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        return 1


if __name__ == "__main__":
    sys.exit(main())
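
The script's flow (probe the connection, create tables, list them, return an exit code) can be exercised without a running database. The sketch below swaps the project's SQLAlchemy engine for the standard library's `sqlite3` with an in-memory database; the `neighbourhoods` table and the `init_schema` helper are hypothetical and exist only for this illustration.

```python
import sqlite3
import sys


def init_schema(conn: sqlite3.Connection) -> list[str]:
    """Probe the connection, create one table, and return the table names."""
    # Connectivity probe: the same idea as the SELECT 1 check in the script.
    conn.execute("SELECT 1").fetchone()
    # Hypothetical table standing in for the project's SQLAlchemy models.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS neighbourhoods (id INTEGER PRIMARY KEY, name TEXT)"
    )
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def main() -> int:
    """Return 0 on success and 1 on failure, suitable for sys.exit()."""
    try:
        conn = sqlite3.connect(":memory:")
        try:
            tables = init_schema(conn)
        finally:
            conn.close()
        print(f"Created tables: {', '.join(tables)}")
        return 0
    except sqlite3.Error as exc:
        print(f"Error: {exc}", file=sys.stderr)
        return 1


exit_code = main()
```

Returning an integer from `main()` rather than calling `sys.exit()` inside it keeps the function testable; a script entry point would pass the result to `sys.exit()` as the original does.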


@@ -0,0 +1,6 @@
"""Placeholder test to ensure pytest collection succeeds."""


def test_placeholder():
    """Remove this once real tests are added."""
    assert True
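
As one possible replacement for the placeholder, the prefix-based area inference from the TRREB parser lends itself to a small unit test. This is a sketch under stated assumptions: the `AreaType` enum below is a hypothetical stand-in (the project's actual enum values are not shown in this diff), and `infer_area_type` is a free-function copy of the method's logic rather than an import from the project.

```python
from enum import Enum


class AreaType(Enum):
    """Hypothetical stand-in for the project's AreaType enum (values assumed)."""

    WEST = "west"
    CENTRAL = "central"
    EAST = "east"
    NORTH = "north"


def infer_area_type(district_code: str) -> AreaType:
    """Map a district code prefix to an area type, defaulting to NORTH."""
    mapping = {"W": AreaType.WEST, "C": AreaType.CENTRAL, "E": AreaType.EAST}
    return mapping.get(district_code[0].upper(), AreaType.NORTH)


def test_infer_area_type_known_prefixes():
    """W, C, and E prefixes map to their areas, case-insensitively."""
    assert infer_area_type("W01") is AreaType.WEST
    assert infer_area_type("c08") is AreaType.CENTRAL
    assert infer_area_type("E11") is AreaType.EAST


def test_infer_area_type_falls_back_to_north():
    """Any other prefix falls back to NORTH."""
    assert infer_area_type("N05") is AreaType.NORTH
```

pytest would collect both functions by their `test_` prefix; in the real suite the helper would be imported from the parser module instead of being redefined.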