development #95

Merged
lmiranda merged 89 commits from development into staging 2026-02-01 21:32:42 +00:00
27 changed files with 4377 additions and 1770 deletions
Showing only changes of commit c9cf744d84


@@ -1,4 +1,4 @@
.PHONY: setup docker-up docker-down db-init run test dbt-run dbt-test lint format ci deploy clean help
.PHONY: setup docker-up docker-down db-init load-data run test dbt-run dbt-test lint format ci deploy clean help
# Default target
.DEFAULT_GOAL := help
@@ -71,6 +71,14 @@ db-reset: ## Drop and recreate database (DESTRUCTIVE)
@sleep 3
$(MAKE) db-init
load-data: ## Load Toronto data from APIs and run dbt
@echo "$(GREEN)Loading Toronto neighbourhood data...$(NC)"
$(PYTHON) scripts/data/load_toronto_data.py
load-data-only: ## Load Toronto data without running dbt
@echo "$(GREEN)Loading Toronto data (skip dbt)...$(NC)"
$(PYTHON) scripts/data/load_toronto_data.py --skip-dbt
# =============================================================================
# Application
# =============================================================================


@@ -10,6 +10,9 @@ This folder contains lessons learned from sprints and development work. These le
| Date | Sprint/Phase | Title | Tags |
|------|--------------|-------|------|
| 2026-01-17 | Sprint 9-10 | [Graceful Error Handling in Service Layers](./sprint-9-10-graceful-error-handling.md) | python, postgresql, error-handling, dash, graceful-degradation, arm64 |
| 2026-01-17 | Sprint 9-10 | [Modular Callback Structure](./sprint-9-10-modular-callback-structure.md) | dash, callbacks, architecture, python, code-organization |
| 2026-01-17 | Sprint 9-10 | [Figure Factory Pattern](./sprint-9-10-figure-factory-pattern.md) | plotly, dash, design-patterns, python, visualization |
| 2026-01-16 | Phase 4 | [dbt Test Syntax Deprecation](./phase-4-dbt-test-syntax.md) | dbt, testing, yaml, deprecation |
---


@@ -0,0 +1,53 @@
# Sprint 9-10 - Figure Factory Pattern for Reusable Charts
## Context
The dashboard needed multiple chart types across 5 tabs, with consistent styling and behavior across all visualizations.
## Problem
Without a standardized approach, each callback would create figures inline with:
- Duplicated styling code (colors, fonts, backgrounds)
- Inconsistent hover templates
- Hard-to-maintain figure creation logic
- No reuse between tabs
## Solution
Created a `figures/` module with factory functions:
```
figures/
├── __init__.py # Exports all factories
├── choropleth.py # Map visualizations
├── bar_charts.py # ranking_bar, stacked_bar, horizontal_bar
├── scatter.py # scatter_figure, bubble_chart
├── radar.py # radar_figure, comparison_radar
└── demographics.py # age_pyramid, donut_chart
```
Factory pattern benefits:
1. **Consistent styling** - dark theme applied once
2. **Typed interfaces** - type-hinted, clearly named parameters for each chart type
3. **Easy testing** - factories can be unit tested with sample data
4. **Reusability** - same factory used across multiple tabs
Example factory signature:
```python
def create_ranking_bar(
    data: list[dict[str, Any]],
    name_column: str,
    value_column: str,
    title: str | None = None,
    top_n: int = 10,
    bottom_n: int = 10,
    color_top: str = "#4CAF50",
    color_bottom: str = "#F44336",
    value_format: str = ",.0f",
) -> go.Figure:
    ...
```
## Prevention
- **Create factories early** - before implementing callbacks
- **Design generic interfaces** - factories should work with any data matching the schema
- **Apply styling in one place** - use constants for colors, fonts
- **Test factories independently** - with synthetic data before integration
## Tags
plotly, dash, design-patterns, python, visualization, reusability, code-organization
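
The "apply styling in one place" point can be sketched as a shared layout helper. This is only an illustration: `BASE_LAYOUT` and `themed_layout` are hypothetical names that do not appear in this commit, and the color values are copied from the layouts repeated across the factories.

```python
from typing import Any

# Values copied from the layouts repeated across the factories in this commit.
TRANSPARENT = "rgba(0,0,0,0)"
FONT_COLOR = "#c9c9c9"
GRID_COLOR = "rgba(128,128,128,0.2)"

BASE_LAYOUT: dict[str, Any] = {
    "paper_bgcolor": TRANSPARENT,
    "plot_bgcolor": TRANSPARENT,
    "font_color": FONT_COLOR,
}


def themed_layout(**overrides: Any) -> dict[str, Any]:
    """Return the shared dark-theme layout merged with per-chart overrides."""
    # Later keys win, so a chart can still override any base value.
    return {**BASE_LAYOUT, **overrides}


# Hypothetical usage inside a factory:
#   fig.update_layout(**themed_layout(title=title, xaxis={"gridcolor": GRID_COLOR}))
```

A helper like this could also host a single `_create_empty_figure`, which would avoid repeating it in each figure module.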


@@ -0,0 +1,34 @@
# Sprint 9-10 - Graceful Error Handling in Service Layers
## Context
Building the Toronto Neighbourhood Dashboard with a service layer that queries PostgreSQL/PostGIS dbt marts to provide data to Dash callbacks.
## Problem
The initial service layer implementation let database connection errors propagate as unhandled exceptions. When the PostGIS Docker container was unavailable (common on ARM64 systems, where the x86_64 image fails), the entire dashboard crashed instead of degrading gracefully.
## Solution
Wrapped database queries in try/except blocks to return empty DataFrames/lists/dicts when the database is unavailable:
```python
def _execute_query(sql: str, params: dict | None = None) -> pd.DataFrame:
    try:
        engine = get_engine()
        with engine.connect() as conn:
            return pd.read_sql(text(sql), conn, params=params)
    except Exception:
        return pd.DataFrame()
```
This allows:
1. Dashboard to load and display empty states
2. Development/testing without a running database
3. Graceful degradation in production
## Prevention
- **Always design service layers with graceful degradation** - assume external dependencies can fail
- **Return empty collections instead of raising exceptions** - let UI components handle empty states
- **Test without database** - verify the app doesn't crash when DB is unavailable
- **Consider ARM64 compatibility** - PostGIS images may not support all platforms
## Tags
python, postgresql, service-layer, error-handling, dash, graceful-degradation, arm64


@@ -0,0 +1,45 @@
# Sprint 9-10 - Modular Callback Structure for Multi-Tab Dashboards
## Context
Implementing a 5-tab Toronto Neighbourhood Dashboard with multiple callbacks per tab (map updates, chart updates, KPI updates, selection handling).
## Problem
The initial approach would have placed all callbacks in a single file, leading to:
- A monolithic file with 500+ lines
- Difficult-to-navigate code
- Callbacks for different tabs interleaved
- Testing difficulties
## Solution
Organized callbacks into three focused modules:
```
callbacks/
├── __init__.py # Imports all modules to register callbacks
├── map_callbacks.py # Choropleth updates, map click handling
├── chart_callbacks.py # Supporting chart updates (scatter, trend, donut)
└── selection_callbacks.py # Dropdown population, KPI updates
```
Key patterns:
1. **Group by responsibility**, not by tab - all map-related callbacks together
2. **Use noqa comments** for imports that register callbacks as side effects
3. **Share helper functions** (like `_empty_chart()`) within modules
```python
# callbacks/__init__.py
from . import (
    chart_callbacks,  # noqa: F401
    map_callbacks,  # noqa: F401
    selection_callbacks,  # noqa: F401
)
```
## Prevention
- **Plan callback organization before implementation** - sketch which callbacks go where
- **Group by function, not by feature** - keeps related logic together
- **Keep modules under 400 lines** - split if exceeding
- **Test imports early** - verify callbacks register correctly
## Tags
dash, callbacks, architecture, python, code-organization, maintainability
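
The "test imports early" point relies on decorators registering callbacks as a side effect of importing a module. The sketch below imitates that mechanism with a plain dictionary so it runs without Dash; `REGISTRY`, `register`, and `missing_outputs` are hypothetical stand-ins and are not Dash APIs.

```python
from collections.abc import Callable

# Stand-in for the framework's internal callback table.
REGISTRY: dict[str, Callable[..., object]] = {}


def register(output_id: str) -> Callable[[Callable[..., object]], Callable[..., object]]:
    """Register a function under an output id when its module is imported."""

    def decorator(func: Callable[..., object]) -> Callable[..., object]:
        if output_id in REGISTRY:
            raise ValueError(f"duplicate callback output: {output_id}")
        REGISTRY[output_id] = func
        return func

    return decorator


@register("overview-scatter-chart")
def update_overview_scatter(year: str) -> str:
    """Illustrative callback body; the real one returns a figure."""
    return f"scatter for {year}"


def missing_outputs(expected: set[str]) -> set[str]:
    """Return the expected output ids that no imported module registered."""
    return expected - REGISTRY.keys()
```

A test can import the callbacks package and assert that `missing_outputs` is empty for the ids the layout expects, which catches a module that was never imported.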


@@ -1,9 +1,27 @@
"""Plotly figure factories for data visualization."""
from .bar_charts import (
create_horizontal_bar,
create_ranking_bar,
create_stacked_bar,
)
from .choropleth import (
create_choropleth_figure,
create_zone_map,
)
from .demographics import (
create_age_pyramid,
create_donut_chart,
create_income_distribution,
)
from .radar import (
create_comparison_radar,
create_radar_figure,
)
from .scatter import (
create_bubble_chart,
create_scatter_figure,
)
from .summary_cards import create_metric_card_figure, create_summary_metrics
from .time_series import (
add_policy_markers,
@@ -26,4 +44,18 @@ __all__ = [
# Summary
"create_metric_card_figure",
"create_summary_metrics",
# Bar charts
"create_ranking_bar",
"create_stacked_bar",
"create_horizontal_bar",
# Scatter plots
"create_scatter_figure",
"create_bubble_chart",
# Radar charts
"create_radar_figure",
"create_comparison_radar",
# Demographics
"create_age_pyramid",
"create_donut_chart",
"create_income_distribution",
]


@@ -0,0 +1,238 @@
"""Bar chart figure factories for dashboard visualizations."""
from typing import Any
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
def create_ranking_bar(
data: list[dict[str, Any]],
name_column: str,
value_column: str,
title: str | None = None,
top_n: int = 10,
bottom_n: int = 10,
color_top: str = "#4CAF50",
color_bottom: str = "#F44336",
value_format: str = ",.0f",
) -> go.Figure:
"""Create horizontal bar chart showing top and bottom rankings.
Args:
data: List of data records.
name_column: Column name for labels.
value_column: Column name for values.
title: Optional chart title.
top_n: Number of top items to show.
bottom_n: Number of bottom items to show.
color_top: Color for top performers.
color_bottom: Color for bottom performers.
value_format: Number format string for values.
Returns:
Plotly Figure object.
"""
if not data:
return _create_empty_figure(title or "Rankings")
df = pd.DataFrame(data).sort_values(value_column, ascending=False)
# Get top and bottom
top_df = df.head(top_n).copy()
bottom_df = df.tail(bottom_n).copy()
top_df["group"] = "Top"
bottom_df["group"] = "Bottom"
# Combine with gap in the middle
combined = pd.concat([top_df, bottom_df])
combined["color"] = combined["group"].map(
{"Top": color_top, "Bottom": color_bottom}
)
fig = go.Figure()
# Add top bars
fig.add_trace(
go.Bar(
y=top_df[name_column],
x=top_df[value_column],
orientation="h",
marker_color=color_top,
name="Top",
text=top_df[value_column].apply(lambda x: f"{x:{value_format}}"),
textposition="auto",
hovertemplate=f"%{{y}}<br>{value_column}: %{{x:{value_format}}}<extra></extra>",
)
)
# Add bottom bars
fig.add_trace(
go.Bar(
y=bottom_df[name_column],
x=bottom_df[value_column],
orientation="h",
marker_color=color_bottom,
name="Bottom",
text=bottom_df[value_column].apply(lambda x: f"{x:{value_format}}"),
textposition="auto",
hovertemplate=f"%{{y}}<br>{value_column}: %{{x:{value_format}}}<extra></extra>",
)
)
fig.update_layout(
title=title,
barmode="group",
showlegend=True,
legend={"orientation": "h", "yanchor": "bottom", "y": 1.02},
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": None},
yaxis={"autorange": "reversed", "title": None},
margin={"l": 10, "r": 10, "t": 40, "b": 10},
)
return fig
def create_stacked_bar(
data: list[dict[str, Any]],
x_column: str,
value_column: str,
category_column: str,
title: str | None = None,
color_map: dict[str, str] | None = None,
show_percentages: bool = False,
) -> go.Figure:
"""Create stacked bar chart for breakdown visualizations.
Args:
data: List of data records.
x_column: Column name for x-axis categories.
value_column: Column name for values.
category_column: Column name for stacking categories.
title: Optional chart title.
color_map: Mapping of category to color.
show_percentages: Whether to label bars as percentages. Values are assumed to already be percentages; no normalization is applied.
Returns:
Plotly Figure object.
"""
if not data:
return _create_empty_figure(title or "Breakdown")
df = pd.DataFrame(data)
# Default color scheme
if color_map is None:
categories = df[category_column].unique()
colors = px.colors.qualitative.Set2[: len(categories)]
color_map = dict(zip(categories, colors, strict=False))
fig = px.bar(
df,
x=x_column,
y=value_column,
color=category_column,
color_discrete_map=color_map,
barmode="stack",
text=value_column if not show_percentages else None,
)
if show_percentages:
fig.update_traces(texttemplate="%{y:.1f}%", textposition="inside")
fig.update_layout(
title=title,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": None},
yaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": None},
legend={"orientation": "h", "yanchor": "bottom", "y": 1.02},
margin={"l": 10, "r": 10, "t": 60, "b": 10},
)
return fig
def create_horizontal_bar(
data: list[dict[str, Any]],
name_column: str,
value_column: str,
title: str | None = None,
color: str = "#2196F3",
value_format: str = ",.0f",
sort: bool = True,
) -> go.Figure:
"""Create simple horizontal bar chart.
Args:
data: List of data records.
name_column: Column name for labels.
value_column: Column name for values.
title: Optional chart title.
color: Bar color.
value_format: Number format string.
sort: Whether to sort by value descending.
Returns:
Plotly Figure object.
"""
if not data:
return _create_empty_figure(title or "Bar Chart")
df = pd.DataFrame(data)
if sort:
df = df.sort_values(value_column, ascending=True)
fig = go.Figure(
go.Bar(
y=df[name_column],
x=df[value_column],
orientation="h",
marker_color=color,
text=df[value_column].apply(lambda x: f"{x:{value_format}}"),
textposition="outside",
hovertemplate=f"%{{y}}<br>Value: %{{x:{value_format}}}<extra></extra>",
)
)
fig.update_layout(
title=title,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": None},
yaxis={"title": None},
margin={"l": 10, "r": 10, "t": 40, "b": 10},
)
return fig
def _create_empty_figure(title: str) -> go.Figure:
"""Create an empty figure with a message."""
fig = go.Figure()
fig.add_annotation(
text="No data available",
xref="paper",
yref="paper",
x=0.5,
y=0.5,
showarrow=False,
font={"size": 14, "color": "#888888"},
)
fig.update_layout(
title=title,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={"visible": False},
yaxis={"visible": False},
)
return fig
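
In `create_ranking_bar`, `df.head(top_n)` and `df.tail(bottom_n)` select overlapping rows whenever the data has fewer than `top_n + bottom_n` records, so the same item can appear in both groups. A minimal sketch of a non-overlapping split, written on plain lists; `split_top_bottom` is a hypothetical helper, not part of this commit.

```python
from typing import Any


def split_top_bottom(
    records: list[dict[str, Any]],
    value_column: str,
    top_n: int = 10,
    bottom_n: int = 10,
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split records into top and bottom groups without duplicates."""
    ranked = sorted(records, key=lambda r: r[value_column], reverse=True)
    top = ranked[:top_n]
    # Only rows not already in the top group are eligible for the bottom group.
    remaining = ranked[len(top):]
    # Guard bottom_n == 0, because remaining[-0:] would return everything.
    bottom = remaining[-bottom_n:] if bottom_n > 0 else []
    return top, bottom
```

With three records and the default sizes, all three land in the top group and the bottom group is empty.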


@@ -0,0 +1,240 @@
"""Demographics-specific chart factories."""
from typing import Any
import pandas as pd
import plotly.graph_objects as go
def create_age_pyramid(
data: list[dict[str, Any]],
age_groups: list[str],
male_column: str = "male",
female_column: str = "female",
title: str | None = None,
) -> go.Figure:
"""Create population pyramid by age and gender.
Args:
data: List with one record per age group containing male/female counts.
age_groups: List of age group labels in order (youngest to oldest).
male_column: Column name for male population.
female_column: Column name for female population.
title: Optional chart title.
Returns:
Plotly Figure object.
"""
if not data or not age_groups:
return _create_empty_figure(title or "Age Distribution")
df = pd.DataFrame(data)
# Ensure data is ordered by age groups
if "age_group" in df.columns:
df["age_order"] = df["age_group"].apply(
lambda x: age_groups.index(x) if x in age_groups else -1
)
df = df.sort_values("age_order")
male_values = df[male_column].tolist() if male_column in df.columns else []
female_values = df[female_column].tolist() if female_column in df.columns else []
# Make male values negative for pyramid effect
male_values_neg = [-v for v in male_values]
fig = go.Figure()
# Male bars (left side, negative values)
fig.add_trace(
go.Bar(
y=age_groups,
x=male_values_neg,
orientation="h",
name="Male",
marker_color="#2196F3",
hovertemplate="%{y}<br>Male: %{customdata:,}<extra></extra>",
customdata=male_values,
)
)
# Female bars (right side, positive values)
fig.add_trace(
go.Bar(
y=age_groups,
x=female_values,
orientation="h",
name="Female",
marker_color="#E91E63",
hovertemplate="%{y}<br>Female: %{x:,}<extra></extra>",
)
)
# Calculate max for symmetric axis
max_val = max(max(male_values, default=0), max(female_values, default=0))
fig.update_layout(
title=title,
barmode="overlay",
bargap=0.1,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={
"title": "Population",
"gridcolor": "rgba(128,128,128,0.2)",
"range": [-max_val * 1.1, max_val * 1.1],
"tickvals": [-max_val, -max_val / 2, 0, max_val / 2, max_val],
"ticktext": [
f"{max_val:,.0f}",
f"{max_val / 2:,.0f}",
"0",
f"{max_val / 2:,.0f}",
f"{max_val:,.0f}",
],
},
yaxis={"title": None, "gridcolor": "rgba(128,128,128,0.2)"},
legend={"orientation": "h", "yanchor": "bottom", "y": 1.02},
margin={"l": 10, "r": 10, "t": 60, "b": 10},
)
return fig
def create_donut_chart(
data: list[dict[str, Any]],
name_column: str,
value_column: str,
title: str | None = None,
colors: list[str] | None = None,
hole_size: float = 0.4,
) -> go.Figure:
"""Create donut chart for percentage breakdowns.
Args:
data: List of data records with name and value.
name_column: Column name for labels.
value_column: Column name for values.
title: Optional chart title.
colors: List of colors for segments.
hole_size: Size of center hole (0-1).
Returns:
Plotly Figure object.
"""
if not data:
return _create_empty_figure(title or "Distribution")
df = pd.DataFrame(data)
if colors is None:
colors = [
"#2196F3",
"#4CAF50",
"#FF9800",
"#E91E63",
"#9C27B0",
"#00BCD4",
"#FFC107",
"#795548",
]
fig = go.Figure(
go.Pie(
labels=df[name_column],
values=df[value_column],
hole=hole_size,
marker_colors=colors[: len(df)],
textinfo="percent+label",
textposition="outside",
hovertemplate="%{label}<br>%{value:,} (%{percent})<extra></extra>",
)
)
fig.update_layout(
title=title,
paper_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
showlegend=False,
margin={"l": 10, "r": 10, "t": 60, "b": 10},
)
return fig
def create_income_distribution(
data: list[dict[str, Any]],
bracket_column: str,
count_column: str,
title: str | None = None,
color: str = "#4CAF50",
) -> go.Figure:
"""Create histogram-style bar chart for income distribution.
Args:
data: List of data records with income brackets and counts.
bracket_column: Column name for income brackets.
count_column: Column name for household counts.
title: Optional chart title.
color: Bar color.
Returns:
Plotly Figure object.
"""
if not data:
return _create_empty_figure(title or "Income Distribution")
df = pd.DataFrame(data)
fig = go.Figure(
go.Bar(
x=df[bracket_column],
y=df[count_column],
marker_color=color,
text=df[count_column].apply(lambda x: f"{x:,}"),
textposition="outside",
hovertemplate="%{x}<br>Households: %{y:,}<extra></extra>",
)
)
fig.update_layout(
title=title,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={
"title": "Income Bracket",
"gridcolor": "rgba(128,128,128,0.2)",
"tickangle": -45,
},
yaxis={
"title": "Households",
"gridcolor": "rgba(128,128,128,0.2)",
},
margin={"l": 10, "r": 10, "t": 60, "b": 80},
)
return fig
def _create_empty_figure(title: str) -> go.Figure:
"""Create an empty figure with a message."""
fig = go.Figure()
fig.add_annotation(
text="No data available",
xref="paper",
yref="paper",
x=0.5,
y=0.5,
showarrow=False,
font={"size": 14, "color": "#888888"},
)
fig.update_layout(
title=title,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={"visible": False},
yaxis={"visible": False},
)
return fig
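
`create_age_pyramid` plots `age_groups` on the y-axis but takes the bar values from the sorted DataFrame rows, so the two can fall out of step if a group is missing from the data or the data contains a group that is not in `age_groups`. A minimal sketch of aligning values to the label list by lookup; `align_to_age_groups` is a hypothetical helper, and filling missing groups with 0 is an assumption.

```python
from typing import Any


def align_to_age_groups(
    records: list[dict[str, Any]],
    age_groups: list[str],
    column: str,
    key: str = "age_group",
) -> list[float]:
    """Return one value per label in age_groups, in that order."""
    # Index records by their group label; treat None or absent values as 0.
    by_group = {r[key]: r.get(column, 0) or 0 for r in records if key in r}
    # Groups absent from the data are filled with 0 (an assumption).
    return [by_group.get(group, 0) for group in age_groups]
```

The result always has the same length and order as `age_groups`, so it can be passed as `x` alongside `y=age_groups`.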


@@ -0,0 +1,166 @@
"""Radar/spider chart figure factory for multi-metric comparison."""
from typing import Any
import plotly.graph_objects as go
def create_radar_figure(
data: list[dict[str, Any]],
metrics: list[str],
name_column: str | None = None,
title: str | None = None,
fill: bool = True,
colors: list[str] | None = None,
) -> go.Figure:
"""Create radar/spider chart for multi-axis comparison.
Each record in data represents one entity (e.g., a neighbourhood)
with values for each metric that will be plotted on a separate axis.
Args:
data: List of data records, each with values for the metrics.
metrics: List of metric column names to display on radar axes.
name_column: Column name for entity labels.
title: Optional chart title.
fill: Whether to fill the radar polygons.
colors: List of colors for each data series.
Returns:
Plotly Figure object.
"""
if not data or not metrics:
return _create_empty_figure(title or "Radar Chart")
# Default colors
if colors is None:
colors = [
"#2196F3",
"#4CAF50",
"#FF9800",
"#E91E63",
"#9C27B0",
"#00BCD4",
]
fig = go.Figure()
# Format axis labels
axis_labels = [m.replace("_", " ").title() for m in metrics]
for i, record in enumerate(data):
values = [record.get(m, 0) or 0 for m in metrics]
# Close the radar polygon
values_closed = values + [values[0]]
labels_closed = axis_labels + [axis_labels[0]]
name = (
record.get(name_column, f"Series {i + 1}")
if name_column
else f"Series {i + 1}"
)
color = colors[i % len(colors)]
fig.add_trace(
go.Scatterpolar(
r=values_closed,
theta=labels_closed,
name=name,
line={"color": color, "width": 2},
fill="toself" if fill else None,
fillcolor=f"rgba{_hex_to_rgba(color, 0.2)}" if fill else None,
hovertemplate="%{theta}: %{r:.1f}<extra></extra>",
)
)
fig.update_layout(
title=title,
polar={
"radialaxis": {
"visible": True,
"gridcolor": "rgba(128,128,128,0.3)",
"linecolor": "rgba(128,128,128,0.3)",
"tickfont": {"color": "#c9c9c9"},
},
"angularaxis": {
"gridcolor": "rgba(128,128,128,0.3)",
"linecolor": "rgba(128,128,128,0.3)",
"tickfont": {"color": "#c9c9c9"},
},
"bgcolor": "rgba(0,0,0,0)",
},
paper_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
showlegend=len(data) > 1,
legend={"orientation": "h", "yanchor": "bottom", "y": -0.2},
margin={"l": 40, "r": 40, "t": 60, "b": 40},
)
return fig
def create_comparison_radar(
selected_data: dict[str, Any],
average_data: dict[str, Any],
metrics: list[str],
selected_name: str = "Selected",
average_name: str = "City Average",
title: str | None = None,
) -> go.Figure:
"""Create radar chart comparing a selection to city average.
Args:
selected_data: Data for the selected entity.
average_data: Data for the city average.
metrics: List of metric column names.
selected_name: Label for selected entity.
average_name: Label for average.
title: Optional chart title.
Returns:
Plotly Figure object.
"""
if not selected_data or not average_data:
return _create_empty_figure(title or "Comparison")
data = [
{**selected_data, "__name__": selected_name},
{**average_data, "__name__": average_name},
]
return create_radar_figure(
data=data,
metrics=metrics,
name_column="__name__",
title=title,
colors=["#4CAF50", "#9E9E9E"],
)
def _hex_to_rgba(hex_color: str, alpha: float) -> tuple[int, int, int, float]:
"""Convert hex color to RGBA tuple."""
hex_color = hex_color.lstrip("#")
r = int(hex_color[0:2], 16)
g = int(hex_color[2:4], 16)
b = int(hex_color[4:6], 16)
return (r, g, b, alpha)
def _create_empty_figure(title: str) -> go.Figure:
"""Create an empty figure with a message."""
fig = go.Figure()
fig.add_annotation(
text="No data available",
xref="paper",
yref="paper",
x=0.5,
y=0.5,
showarrow=False,
font={"size": 14, "color": "#888888"},
)
fig.update_layout(
title=title,
paper_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
)
return fig
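
`create_radar_figure` builds its fill color by formatting the tuple from `_hex_to_rgba` into an `rgba(...)` string. A minimal sketch of a variant that returns the string directly and validates its input; `hex_to_rgba_string` is a hypothetical name, and the support for 3-digit hex is an addition not present in the commit.

```python
def hex_to_rgba_string(hex_color: str, alpha: float) -> str:
    """Convert '#RRGGBB' or '#RGB' plus an alpha value to an 'rgba(r,g,b,a)' string."""
    value = hex_color.lstrip("#")
    if len(value) == 3:
        # Expand shorthand such as 'fff' to 'ffffff'.
        value = "".join(ch * 2 for ch in value)
    if len(value) != 6:
        raise ValueError(f"expected a 3- or 6-digit hex color, got {hex_color!r}")
    r, g, b = (int(value[i : i + 2], 16) for i in (0, 2, 4))
    return f"rgba({r},{g},{b},{alpha})"
```

For example, `hex_to_rgba_string("#2196F3", 0.2)` returns `"rgba(33,150,243,0.2)"`.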


@@ -0,0 +1,184 @@
"""Scatter plot figure factory for correlation views."""
from typing import Any
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
def create_scatter_figure(
data: list[dict[str, Any]],
x_column: str,
y_column: str,
name_column: str | None = None,
size_column: str | None = None,
color_column: str | None = None,
title: str | None = None,
x_title: str | None = None,
y_title: str | None = None,
trendline: bool = False,
color_scale: str = "Blues",
) -> go.Figure:
"""Create scatter plot for correlation visualization.
Args:
data: List of data records.
x_column: Column name for x-axis values.
y_column: Column name for y-axis values.
name_column: Column name for point labels (hover).
size_column: Column name for point sizes.
color_column: Column name for color encoding.
title: Optional chart title.
x_title: X-axis title.
y_title: Y-axis title.
trendline: Whether to add OLS trendline.
color_scale: Plotly color scale for continuous colors.
Returns:
Plotly Figure object.
"""
if not data:
return _create_empty_figure(title or "Scatter Plot")
df = pd.DataFrame(data)
# Build hover_data
hover_data = {}
if name_column and name_column in df.columns:
hover_data[name_column] = True
# Create scatter plot
fig = px.scatter(
df,
x=x_column,
y=y_column,
size=size_column if size_column and size_column in df.columns else None,
color=color_column if color_column and color_column in df.columns else None,
color_continuous_scale=color_scale,
hover_name=name_column,
trendline="ols" if trendline else None,
opacity=0.7,
)
# Style the markers
fig.update_traces(
marker={
"line": {"width": 1, "color": "rgba(255,255,255,0.3)"},
},
)
# Trendline styling
if trendline:
fig.update_traces(
selector={"mode": "lines"},
line={"color": "#FF9800", "dash": "dash", "width": 2},
)
fig.update_layout(
title=title,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={
"gridcolor": "rgba(128,128,128,0.2)",
"title": x_title or x_column.replace("_", " ").title(),
"zeroline": False,
},
yaxis={
"gridcolor": "rgba(128,128,128,0.2)",
"title": y_title or y_column.replace("_", " ").title(),
"zeroline": False,
},
margin={"l": 10, "r": 10, "t": 40, "b": 10},
showlegend=color_column is not None,
)
return fig
def create_bubble_chart(
data: list[dict[str, Any]],
x_column: str,
y_column: str,
size_column: str,
name_column: str | None = None,
color_column: str | None = None,
title: str | None = None,
x_title: str | None = None,
y_title: str | None = None,
size_max: int = 50,
) -> go.Figure:
"""Create bubble chart with sized markers.
Args:
data: List of data records.
x_column: Column name for x-axis values.
y_column: Column name for y-axis values.
size_column: Column name for bubble sizes.
name_column: Column name for labels.
color_column: Column name for colors.
title: Optional chart title.
x_title: X-axis title.
y_title: Y-axis title.
size_max: Maximum marker size in pixels.
Returns:
Plotly Figure object.
"""
if not data:
return _create_empty_figure(title or "Bubble Chart")
df = pd.DataFrame(data)
fig = px.scatter(
df,
x=x_column,
y=y_column,
size=size_column,
color=color_column,
hover_name=name_column,
size_max=size_max,
opacity=0.7,
)
fig.update_layout(
title=title,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={
"gridcolor": "rgba(128,128,128,0.2)",
"title": x_title or x_column.replace("_", " ").title(),
},
yaxis={
"gridcolor": "rgba(128,128,128,0.2)",
"title": y_title or y_column.replace("_", " ").title(),
},
margin={"l": 10, "r": 10, "t": 40, "b": 10},
)
return fig
def _create_empty_figure(title: str) -> go.Figure:
"""Create an empty figure with a message."""
fig = go.Figure()
fig.add_annotation(
text="No data available",
xref="paper",
yref="paper",
x=0.5,
y=0.5,
showarrow=False,
font={"size": 14, "color": "#888888"},
)
fig.update_layout(
title=title,
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={"visible": False},
yaxis={"visible": False},
)
return fig

File diff suppressed because it is too large


@@ -0,0 +1,385 @@
"""Chart callbacks for supporting visualizations."""
# mypy: disable-error-code="misc,no-untyped-def,arg-type"
import plotly.graph_objects as go
from dash import Input, Output, callback
from portfolio_app.figures import (
create_donut_chart,
create_horizontal_bar,
create_radar_figure,
create_scatter_figure,
)
from portfolio_app.toronto.services import (
get_amenities_data,
get_city_averages,
get_demographics_data,
get_housing_data,
get_neighbourhood_details,
get_safety_data,
)
@callback(
Output("overview-scatter-chart", "figure"),
Input("toronto-year-select", "value"),
)
def update_overview_scatter(year: str) -> go.Figure:
"""Update income vs safety scatter plot."""
year_int = int(year) if year else 2021
df = get_demographics_data(year_int)
safety_df = get_safety_data(year_int)
if df.empty or safety_df.empty:
return _empty_chart("No data available")
# Merge demographics with safety
merged = df.merge(
safety_df[["neighbourhood_id", "total_crime_rate"]],
on="neighbourhood_id",
how="left",
)
# Compute safety score (inverse of crime rate)
if "total_crime_rate" in merged.columns:
max_crime = merged["total_crime_rate"].max()
merged["safety_score"] = 100 - (merged["total_crime_rate"] / max_crime * 100)
data = merged.to_dict("records")
return create_scatter_figure(
data=data,
x_column="median_household_income",
y_column="safety_score",
name_column="neighbourhood_name",
size_column="population",
title="Income vs Safety",
x_title="Median Household Income ($)",
y_title="Safety Score",
trendline=True,
)
@callback(
Output("housing-trend-chart", "figure"),
Input("toronto-year-select", "value"),
Input("toronto-selected-neighbourhood", "data"),
)
def update_housing_trend(year: str, neighbourhood_id: int | None) -> go.Figure:
"""Update housing rent trend chart."""
# For now, show city averages as we don't have multi-year data
# This would be a time series if we had historical data
year_int = int(year) if year else 2021
averages = get_city_averages(year_int)
if not averages:
return _empty_chart("No trend data available")
# Placeholder for trend data - would be historical
data = [
{"year": "2019", "avg_rent": averages.get("avg_rent_2bed", 2000) * 0.85},
{"year": "2020", "avg_rent": averages.get("avg_rent_2bed", 2000) * 0.88},
{"year": "2021", "avg_rent": averages.get("avg_rent_2bed", 2000) * 0.92},
{"year": "2022", "avg_rent": averages.get("avg_rent_2bed", 2000) * 0.96},
{"year": "2023", "avg_rent": averages.get("avg_rent_2bed", 2000)},
]
fig = go.Figure()
fig.add_trace(
go.Scatter(
x=[d["year"] for d in data],
y=[d["avg_rent"] for d in data],
mode="lines+markers",
line={"color": "#2196F3", "width": 2},
marker={"size": 8},
name="City Average",
)
)
fig.update_layout(
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={"gridcolor": "rgba(128,128,128,0.2)"},
yaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": "Avg Rent (2BR)"},
showlegend=False,
margin={"l": 40, "r": 10, "t": 10, "b": 30},
)
return fig
@callback(
Output("housing-types-chart", "figure"),
Input("toronto-year-select", "value"),
)
def update_housing_types(year: str) -> go.Figure:
"""Update housing tenure (owner vs. renter) breakdown chart."""
year_int = int(year) if year else 2021
df = get_housing_data(year_int)
if df.empty:
return _empty_chart("No data available")
# Aggregate tenure types across city
owner_pct = df["pct_owner_occupied"].mean()
renter_pct = df["pct_renter_occupied"].mean()
data = [
{"type": "Owner Occupied", "percentage": owner_pct},
{"type": "Renter Occupied", "percentage": renter_pct},
]
return create_donut_chart(
data=data,
name_column="type",
value_column="percentage",
colors=["#4CAF50", "#2196F3"],
)
@callback(
Output("safety-trend-chart", "figure"),
Input("toronto-year-select", "value"),
)
def update_safety_trend(year: str) -> go.Figure:
"""Update crime trend chart."""
# Placeholder for trend - would need historical data
data = [
{"year": "2019", "crime_rate": 4500},
{"year": "2020", "crime_rate": 4200},
{"year": "2021", "crime_rate": 4100},
{"year": "2022", "crime_rate": 4300},
{"year": "2023", "crime_rate": 4250},
]
fig = go.Figure()
fig.add_trace(
go.Scatter(
x=[d["year"] for d in data],
y=[d["crime_rate"] for d in data],
mode="lines+markers",
line={"color": "#FF5722", "width": 2},
marker={"size": 8},
fill="tozeroy",
fillcolor="rgba(255,87,34,0.1)",
)
)
fig.update_layout(
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={"gridcolor": "rgba(128,128,128,0.2)"},
yaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": "Crime Rate per 100K"},
showlegend=False,
margin={"l": 40, "r": 10, "t": 10, "b": 30},
)
return fig
@callback(
Output("safety-types-chart", "figure"),
Input("toronto-year-select", "value"),
)
def update_safety_types(year: str) -> go.Figure:
"""Update crime by category chart."""
year_int = int(year) if year else 2021
df = get_safety_data(year_int)
if df.empty:
return _empty_chart("No data available")
# Aggregate crime types across city
violent = df["violent_crimes"].sum() if "violent_crimes" in df.columns else 0
property_crimes = (
df["property_crimes"].sum() if "property_crimes" in df.columns else 0
)
theft = df["theft_crimes"].sum() if "theft_crimes" in df.columns else 0
other = (
df["total_crimes"].sum() - violent - property_crimes - theft
if "total_crimes" in df.columns
else 0
)
data = [
{"category": "Violent", "count": int(violent)},
{"category": "Property", "count": int(property_crimes)},
{"category": "Theft", "count": int(theft)},
{"category": "Other", "count": int(max(0, other))},
]
return create_horizontal_bar(
data=data,
name_column="category",
value_column="count",
color="#FF5722",
)
@callback(
Output("demographics-age-chart", "figure"),
Input("toronto-year-select", "value"),
)
def update_demographics_age(year: str) -> go.Figure:
"""Update age distribution chart."""
year_int = int(year) if year else 2021
df = get_demographics_data(year_int)
if df.empty:
return _empty_chart("No data available")
# Calculate average age distribution
under_18 = df["pct_under_18"].mean() if "pct_under_18" in df.columns else 20
age_18_64 = df["pct_18_to_64"].mean() if "pct_18_to_64" in df.columns else 65
over_65 = df["pct_65_plus"].mean() if "pct_65_plus" in df.columns else 15
data = [
{"age_group": "Under 18", "percentage": under_18},
{"age_group": "18-64", "percentage": age_18_64},
{"age_group": "65+", "percentage": over_65},
]
return create_donut_chart(
data=data,
name_column="age_group",
value_column="percentage",
colors=["#9C27B0", "#673AB7", "#3F51B5"],
)
@callback(
Output("demographics-income-chart", "figure"),
Input("toronto-year-select", "value"),
)
def update_demographics_income(year: str) -> go.Figure:
"""Update income distribution chart."""
year_int = int(year) if year else 2021
df = get_demographics_data(year_int)
if df.empty:
return _empty_chart("No data available")
# Create income quintile distribution
if "income_quintile" in df.columns:
quintile_counts = df["income_quintile"].value_counts().sort_index()
data = [
{"bracket": f"Q{q}", "count": int(count)}
for q, count in quintile_counts.items()
]
else:
# Fallback to placeholder
data = [
{"bracket": "Q1 (Low)", "count": 32},
{"bracket": "Q2", "count": 32},
{"bracket": "Q3 (Mid)", "count": 32},
{"bracket": "Q4", "count": 31},
{"bracket": "Q5 (High)", "count": 31},
]
return create_horizontal_bar(
data=data,
name_column="bracket",
value_column="count",
color="#4CAF50",
sort=False,
)
@callback(
Output("amenities-breakdown-chart", "figure"),
Input("toronto-year-select", "value"),
)
def update_amenities_breakdown(year: str) -> go.Figure:
"""Update amenity breakdown chart."""
year_int = int(year) if year else 2021
df = get_amenities_data(year_int)
if df.empty:
return _empty_chart("No data available")
# Aggregate amenity counts
parks = df["park_count"].sum() if "park_count" in df.columns else 0
schools = df["school_count"].sum() if "school_count" in df.columns else 0
childcare = df["childcare_count"].sum() if "childcare_count" in df.columns else 0
data = [
{"type": "Parks", "count": int(parks)},
{"type": "Schools", "count": int(schools)},
{"type": "Childcare", "count": int(childcare)},
]
return create_horizontal_bar(
data=data,
name_column="type",
value_column="count",
color="#4CAF50",
)
@callback(
Output("amenities-radar-chart", "figure"),
Input("toronto-year-select", "value"),
Input("toronto-selected-neighbourhood", "data"),
)
def update_amenities_radar(year: str, neighbourhood_id: int | None) -> go.Figure:
"""Update amenity comparison radar chart."""
year_int = int(year) if year else 2021
# Get city averages
averages = get_city_averages(year_int)
city_data = {
"parks_per_1000": averages.get("avg_amenity_score", 50) / 100 * 10,
"schools_per_1000": averages.get("avg_amenity_score", 50) / 100 * 5,
"childcare_per_1000": averages.get("avg_amenity_score", 50) / 100 * 3,
"transit_access": 70,
}
data = [city_data]
# Add selected neighbourhood if available
if neighbourhood_id:
details = get_neighbourhood_details(neighbourhood_id, year_int)
if details:
selected_data = {
"parks_per_1000": details.get("park_count", 0) / 10,
"schools_per_1000": details.get("school_count", 0) / 5,
"childcare_per_1000": 3,
"transit_access": 70,
}
data.insert(0, selected_data)
return create_radar_figure(
data=data,
metrics=[
"parks_per_1000",
"schools_per_1000",
"childcare_per_1000",
"transit_access",
],
fill=True,
)
def _empty_chart(message: str) -> go.Figure:
"""Create an empty chart with a message."""
fig = go.Figure()
fig.update_layout(
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={"visible": False},
yaxis={"visible": False},
)
fig.add_annotation(
text=message,
xref="paper",
yref="paper",
x=0.5,
y=0.5,
showarrow=False,
font={"size": 14, "color": "#888888"},
)
return fig
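
Every callback in this file repeats the expression `int(year) if year else 2021`. A minimal sketch of one way to centralise that parsing follows. `parse_year` and `DEFAULT_YEAR` are hypothetical names that do not appear in the commit, and falling back to the default on non-numeric input is an assumption about the desired behaviour, not something the commit does.

```python
from typing import Optional

DEFAULT_YEAR = 2021  # assumed default, mirroring the literal used in the callbacks


def parse_year(year: Optional[str], default: int = DEFAULT_YEAR) -> int:
    """Convert a dropdown value to an int, falling back to ``default``."""
    if not year:
        # None or an empty string: same fallback as the original expression
        return default
    try:
        return int(year)
    except (TypeError, ValueError):
        # Non-numeric input falls back instead of raising inside a callback
        return default
```

With a helper like this, each callback body could start with `year_int = parse_year(year)` and the default year would live in one place.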


@@ -0,0 +1,304 @@
"""Map callbacks for choropleth interactions."""
# mypy: disable-error-code="misc,no-untyped-def,arg-type,no-any-return"
import plotly.graph_objects as go
from dash import Input, Output, State, callback, no_update
from portfolio_app.figures import create_choropleth_figure, create_ranking_bar
from portfolio_app.toronto.services import (
get_amenities_data,
get_demographics_data,
get_housing_data,
get_neighbourhoods_geojson,
get_overview_data,
get_safety_data,
)
@callback(
Output("overview-choropleth", "figure"),
Input("overview-metric-select", "value"),
Input("toronto-year-select", "value"),
)
def update_overview_choropleth(metric: str, year: str) -> go.Figure:
"""Update the overview tab choropleth map."""
year_int = int(year) if year else 2021
df = get_overview_data(year_int)
geojson = get_neighbourhoods_geojson(year_int)
if df.empty:
return _empty_map("No data available")
data = df.to_dict("records")
# Color scales based on metric
color_scale = {
"livability_score": "Viridis",
"safety_score": "Greens",
"affordability_score": "Blues",
"amenity_score": "Purples",
}.get(metric, "Viridis")
return create_choropleth_figure(
geojson=geojson,
data=data,
location_key="neighbourhood_id",
color_column=metric or "livability_score",
hover_data=["neighbourhood_name", "population"],
color_scale=color_scale,
)
@callback(
Output("housing-choropleth", "figure"),
Input("housing-metric-select", "value"),
Input("toronto-year-select", "value"),
)
def update_housing_choropleth(metric: str, year: str) -> go.Figure:
"""Update the housing tab choropleth map."""
year_int = int(year) if year else 2021
df = get_housing_data(year_int)
geojson = get_neighbourhoods_geojson(year_int)
if df.empty:
return _empty_map("No housing data available")
data = df.to_dict("records")
color_scale = {
"affordability_index": "RdYlGn_r",
"avg_rent_2bed": "Oranges",
"rent_to_income_pct": "Reds",
"vacancy_rate": "Blues",
}.get(metric, "Oranges")
return create_choropleth_figure(
geojson=geojson,
data=data,
location_key="neighbourhood_id",
color_column=metric or "affordability_index",
hover_data=["neighbourhood_name", "avg_rent_2bed", "vacancy_rate"],
color_scale=color_scale,
)
@callback(
Output("safety-choropleth", "figure"),
Input("safety-metric-select", "value"),
Input("toronto-year-select", "value"),
)
def update_safety_choropleth(metric: str, year: str) -> go.Figure:
"""Update the safety tab choropleth map."""
year_int = int(year) if year else 2021
df = get_safety_data(year_int)
geojson = get_neighbourhoods_geojson(year_int)
if df.empty:
return _empty_map("No safety data available")
data = df.to_dict("records")
return create_choropleth_figure(
geojson=geojson,
data=data,
location_key="neighbourhood_id",
color_column=metric or "total_crime_rate",
hover_data=["neighbourhood_name", "total_crimes"],
color_scale="Reds",
)
@callback(
Output("demographics-choropleth", "figure"),
Input("demographics-metric-select", "value"),
Input("toronto-year-select", "value"),
)
def update_demographics_choropleth(metric: str, year: str) -> go.Figure:
"""Update the demographics tab choropleth map."""
year_int = int(year) if year else 2021
df = get_demographics_data(year_int)
geojson = get_neighbourhoods_geojson(year_int)
if df.empty:
return _empty_map("No demographics data available")
data = df.to_dict("records")
color_scale = {
"population": "YlOrBr",
"median_income": "Greens",
"median_age": "Blues",
"diversity_index": "Purples",
}.get(metric, "YlOrBr")
# Map frontend metric names to column names
column_map = {
"population": "population",
"median_income": "median_household_income",
"median_age": "median_age",
"diversity_index": "diversity_index",
}
column = column_map.get(metric, "population")
return create_choropleth_figure(
geojson=geojson,
data=data,
location_key="neighbourhood_id",
color_column=column,
hover_data=["neighbourhood_name"],
color_scale=color_scale,
)
@callback(
Output("amenities-choropleth", "figure"),
Input("amenities-metric-select", "value"),
Input("toronto-year-select", "value"),
)
def update_amenities_choropleth(metric: str, year: str) -> go.Figure:
"""Update the amenities tab choropleth map."""
year_int = int(year) if year else 2021
df = get_amenities_data(year_int)
geojson = get_neighbourhoods_geojson(year_int)
if df.empty:
return _empty_map("No amenities data available")
data = df.to_dict("records")
# Map frontend metric names to column names
column_map = {
"amenity_score": "amenity_score",
"parks_per_capita": "parks_per_1000",
"schools_per_capita": "schools_per_1000",
"transit_score": "total_amenities_per_1000",
}
column = column_map.get(metric, "amenity_score")
return create_choropleth_figure(
geojson=geojson,
data=data,
location_key="neighbourhood_id",
color_column=column,
hover_data=["neighbourhood_name", "park_count", "school_count"],
color_scale="Greens",
)
@callback(
    Output("toronto-selected-neighbourhood", "data"),
    Input("overview-choropleth", "clickData"),
    Input("housing-choropleth", "clickData"),
    Input("safety-choropleth", "clickData"),
    Input("demographics-choropleth", "clickData"),
    Input("amenities-choropleth", "clickData"),
    State("toronto-tabs", "value"),
    prevent_initial_call=True,
)
def handle_map_click(
    overview_click,
    housing_click,
    safety_click,
    demographics_click,
    amenities_click,
    active_tab: str,
) -> int | None:
    """Extract neighbourhood ID from map click."""
    # Get the click data for the active tab
    click_map = {
        "overview": overview_click,
        "housing": housing_click,
        "safety": safety_click,
        "demographics": demographics_click,
        "amenities": amenities_click,
    }
    click_data = click_map.get(active_tab)
    if not click_data:
        return no_update
    try:
        # Extract neighbourhood_id from click data
        point = click_data["points"][0]
        location = point.get("location") or point.get("customdata", [None])[0]
        if location:
            return int(location)
    except (KeyError, IndexError, TypeError, ValueError):
        # Malformed payloads or non-numeric IDs leave the selection unchanged
        pass
    return no_update
@callback(
Output("overview-rankings-chart", "figure"),
Input("overview-metric-select", "value"),
Input("toronto-year-select", "value"),
)
def update_rankings_chart(metric: str, year: str) -> go.Figure:
"""Update the top/bottom rankings bar chart."""
year_int = int(year) if year else 2021
df = get_overview_data(year_int)
if df.empty:
return _empty_chart("No data available")
# Use the selected metric for ranking
metric = metric or "livability_score"
data = df.to_dict("records")
return create_ranking_bar(
data=data,
name_column="neighbourhood_name",
value_column=metric,
title=f"Top & Bottom 10 by {metric.replace('_', ' ').title()}",
top_n=10,
bottom_n=10,
)
def _empty_map(message: str) -> go.Figure:
"""Create an empty map with a message."""
fig = go.Figure()
fig.update_layout(
mapbox={
"style": "carto-darkmatter",
"center": {"lat": 43.7, "lon": -79.4},
"zoom": 9.5,
},
margin={"l": 0, "r": 0, "t": 0, "b": 0},
paper_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
)
fig.add_annotation(
text=message,
xref="paper",
yref="paper",
x=0.5,
y=0.5,
showarrow=False,
font={"size": 14, "color": "#888888"},
)
return fig
def _empty_chart(message: str) -> go.Figure:
"""Create an empty chart with a message."""
fig = go.Figure()
fig.update_layout(
paper_bgcolor="rgba(0,0,0,0)",
plot_bgcolor="rgba(0,0,0,0)",
font_color="#c9c9c9",
xaxis={"visible": False},
yaxis={"visible": False},
)
fig.add_annotation(
text=message,
xref="paper",
yref="paper",
x=0.5,
y=0.5,
showarrow=False,
font={"size": 14, "color": "#888888"},
)
return fig
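
The ID extraction inside `handle_map_click` can be checked without a running Dash app if it is pulled into a pure function. The sketch below is one way to do that. `extract_neighbourhood_id` is a hypothetical name, and it deliberately differs from the callback in two ways: it returns `None` rather than `no_update`, and it accepts an ID of `0` as valid. The shape of `click_data` is assumed to be the usual `{"points": [...]}` payload that the callback already indexes.

```python
from typing import Any, Optional


def extract_neighbourhood_id(click_data: Optional[dict[str, Any]]) -> Optional[int]:
    """Return the clicked neighbourhood ID, or None if it cannot be read."""
    if not click_data:
        return None
    try:
        point = click_data["points"][0]
        location = point.get("location")
        if location is None:
            # Fall back to the first customdata entry, as handle_map_click does
            location = point.get("customdata", [None])[0]
        if location is None:
            return None
        return int(location)
    except (KeyError, IndexError, TypeError, ValueError, AttributeError):
        # Any malformed payload is treated as "no selection"
        return None
```

A callback using this helper would map a `None` result to `no_update` so that a bad click leaves the stored selection untouched.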


@@ -0,0 +1,309 @@
"""Selection callbacks for dropdowns and neighbourhood details."""
# mypy: disable-error-code="misc,no-untyped-def,type-arg"
import dash_mantine_components as dmc
from dash import Input, Output, callback
from portfolio_app.toronto.services import (
get_city_averages,
get_neighbourhood_details,
get_neighbourhood_list,
)
@callback(
Output("toronto-neighbourhood-select", "data"),
Input("toronto-year-select", "value"),
)
def populate_neighbourhood_dropdown(year: str) -> list[dict]:
"""Populate the neighbourhood search dropdown."""
year_int = int(year) if year else 2021
neighbourhoods = get_neighbourhood_list(year_int)
return [
{"value": str(n["neighbourhood_id"]), "label": n["neighbourhood_name"]}
for n in neighbourhoods
]
@callback(
Output("toronto-selected-neighbourhood", "data", allow_duplicate=True),
Input("toronto-neighbourhood-select", "value"),
prevent_initial_call=True,
)
def select_from_dropdown(value: str | None) -> int | None:
"""Update selected neighbourhood from dropdown."""
if value:
return int(value)
return None
@callback(
Output("toronto-compare-btn", "disabled"),
Input("toronto-selected-neighbourhood", "data"),
)
def toggle_compare_button(neighbourhood_id: int | None) -> bool:
"""Enable compare button when a neighbourhood is selected."""
return neighbourhood_id is None
# Overview tab KPIs
@callback(
Output("overview-city-avg", "children"),
Input("toronto-year-select", "value"),
)
def update_overview_city_avg(year: str) -> str:
"""Update the city average livability score."""
year_int = int(year) if year else 2021
averages = get_city_averages(year_int)
score = averages.get("avg_livability_score", 72)
return f"{score:.0f}" if score else ""
@callback(
Output("overview-selected-name", "children"),
Output("overview-selected-scores", "children"),
Input("toronto-selected-neighbourhood", "data"),
Input("toronto-year-select", "value"),
)
def update_overview_selected(neighbourhood_id: int | None, year: str):
"""Update the selected neighbourhood details in overview tab."""
if not neighbourhood_id:
return "Click map to select", [dmc.Text("", c="dimmed")]
year_int = int(year) if year else 2021
details = get_neighbourhood_details(neighbourhood_id, year_int)
if not details:
return "Unknown", [dmc.Text("No data", c="dimmed")]
name = details.get("neighbourhood_name", "Unknown")
scores = [
dmc.Group(
[
dmc.Text("Livability:", size="sm"),
dmc.Text(
f"{details.get('livability_score', 0):.0f}", size="sm", fw=700
),
],
justify="space-between",
),
dmc.Group(
[
dmc.Text("Safety:", size="sm"),
dmc.Text(f"{details.get('safety_score', 0):.0f}", size="sm", fw=700),
],
justify="space-between",
),
dmc.Group(
[
dmc.Text("Affordability:", size="sm"),
dmc.Text(
f"{details.get('affordability_score', 0):.0f}", size="sm", fw=700
),
],
justify="space-between",
),
]
return name, scores
# Housing tab KPIs
@callback(
Output("housing-city-rent", "children"),
Output("housing-rent-change", "children"),
Input("toronto-year-select", "value"),
)
def update_housing_kpis(year: str):
"""Update housing tab KPI cards."""
year_int = int(year) if year else 2021
averages = get_city_averages(year_int)
rent = averages.get("avg_rent_2bed", 2450)
rent_str = f"${rent:,.0f}" if rent else ""
# Placeholder change - would come from historical data
change = "+4.2% YoY"
return rent_str, change
@callback(
Output("housing-selected-name", "children"),
Output("housing-selected-details", "children"),
Input("toronto-selected-neighbourhood", "data"),
Input("toronto-year-select", "value"),
)
def update_housing_selected(neighbourhood_id: int | None, year: str):
"""Update selected neighbourhood details in housing tab."""
if not neighbourhood_id:
return "Click map to select", [dmc.Text("", c="dimmed")]
year_int = int(year) if year else 2021
details = get_neighbourhood_details(neighbourhood_id, year_int)
if not details:
return "Unknown", [dmc.Text("No data", c="dimmed")]
name = details.get("neighbourhood_name", "Unknown")
rent = details.get("avg_rent_2bed")
vacancy = details.get("vacancy_rate")
info = [
dmc.Text(f"2BR Rent: ${rent:,.0f}" if rent else "2BR Rent: —", size="sm"),
dmc.Text(f"Vacancy: {vacancy:.1f}%" if vacancy else "Vacancy: —", size="sm"),
]
return name, info
# Safety tab KPIs
@callback(
Output("safety-city-rate", "children"),
Output("safety-rate-change", "children"),
Input("toronto-year-select", "value"),
)
def update_safety_kpis(year: str):
"""Update safety tab KPI cards."""
year_int = int(year) if year else 2021
averages = get_city_averages(year_int)
rate = averages.get("avg_crime_rate", 4250)
rate_str = f"{rate:,.0f}" if rate else ""
# Placeholder change
change = "-2.1% YoY"
return rate_str, change
@callback(
Output("safety-selected-name", "children"),
Output("safety-selected-details", "children"),
Input("toronto-selected-neighbourhood", "data"),
Input("toronto-year-select", "value"),
)
def update_safety_selected(neighbourhood_id: int | None, year: str):
"""Update selected neighbourhood details in safety tab."""
if not neighbourhood_id:
return "Click map to select", [dmc.Text("", c="dimmed")]
year_int = int(year) if year else 2021
details = get_neighbourhood_details(neighbourhood_id, year_int)
if not details:
return "Unknown", [dmc.Text("No data", c="dimmed")]
name = details.get("neighbourhood_name", "Unknown")
crime_rate = details.get("crime_rate_per_100k")
info = [
dmc.Text(
f"Crime Rate: {crime_rate:,.0f}/100K" if crime_rate else "Crime Rate: —",
size="sm",
),
]
return name, info
# Demographics tab KPIs
@callback(
Output("demographics-city-pop", "children"),
Output("demographics-pop-change", "children"),
Input("toronto-year-select", "value"),
)
def update_demographics_kpis(year: str):
"""Update demographics tab KPI cards."""
year_int = int(year) if year else 2021
averages = get_city_averages(year_int)
pop = averages.get("total_population", 2790000)
if pop and pop >= 1000000:
pop_str = f"{pop / 1000000:.2f}M"
elif pop:
pop_str = f"{pop:,.0f}"
else:
pop_str = ""
change = "+2.3% since 2016"
return pop_str, change
@callback(
Output("demographics-selected-name", "children"),
Output("demographics-selected-details", "children"),
Input("toronto-selected-neighbourhood", "data"),
Input("toronto-year-select", "value"),
)
def update_demographics_selected(neighbourhood_id: int | None, year: str):
"""Update selected neighbourhood details in demographics tab."""
if not neighbourhood_id:
return "Click map to select", [dmc.Text("", c="dimmed")]
year_int = int(year) if year else 2021
details = get_neighbourhood_details(neighbourhood_id, year_int)
if not details:
return "Unknown", [dmc.Text("No data", c="dimmed")]
name = details.get("neighbourhood_name", "Unknown")
pop = details.get("population")
income = details.get("median_household_income")
info = [
dmc.Text(f"Population: {pop:,}" if pop else "Population: —", size="sm"),
dmc.Text(
f"Median Income: ${income:,.0f}" if income else "Median Income: —",
size="sm",
),
]
return name, info
# Amenities tab KPIs
@callback(
Output("amenities-city-score", "children"),
Input("toronto-year-select", "value"),
)
def update_amenities_kpis(year: str) -> str:
"""Update amenities tab KPI cards."""
year_int = int(year) if year else 2021
averages = get_city_averages(year_int)
score = averages.get("avg_amenity_score", 68)
return f"{score:.0f}" if score else ""
@callback(
Output("amenities-selected-name", "children"),
Output("amenities-selected-details", "children"),
Input("toronto-selected-neighbourhood", "data"),
Input("toronto-year-select", "value"),
)
def update_amenities_selected(neighbourhood_id: int | None, year: str):
"""Update selected neighbourhood details in amenities tab."""
if not neighbourhood_id:
return "Click map to select", [dmc.Text("", c="dimmed")]
year_int = int(year) if year else 2021
details = get_neighbourhood_details(neighbourhood_id, year_int)
if not details:
return "Unknown", [dmc.Text("No data", c="dimmed")]
name = details.get("neighbourhood_name", "Unknown")
parks = details.get("park_count")
schools = details.get("school_count")
info = [
dmc.Text(f"Parks: {parks}" if parks is not None else "Parks: —", size="sm"),
dmc.Text(
f"Schools: {schools}" if schools is not None else "Schools: —", size="sm"
),
]
return name, info
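
The detail callbacks above format each value inline and mostly test truthiness, so a legitimate `0` (for example a vacancy rate of `0.0`) is rendered as missing. The sketch below shows one possible shared formatter that treats only `None` as missing. `format_metric` and all of its parameters are hypothetical, and the `n/a` placeholder is an arbitrary stand-in for the dash the callbacks use.

```python
from typing import Optional


def format_metric(
    label: str,
    value: Optional[float],
    spec: str = ",.0f",
    prefix: str = "",
    suffix: str = "",
    placeholder: str = "n/a",
) -> str:
    """Build a 'Label: value' string; only None counts as missing."""
    if value is None:
        return f"{label}: {placeholder}"
    # format() applies the same format spec an f-string would
    return f"{label}: {prefix}{format(value, spec)}{suffix}"
```

For example, `format_metric("Vacancy", 0.0, spec=".1f", suffix="%")` keeps the zero and yields `Vacancy: 0.0%`, while `format_metric("Parks", None)` yields the placeholder text.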


@@ -1,62 +1,56 @@
"""Toronto Housing Dashboard page."""
"""Toronto Neighbourhood Dashboard page.
Displays neighbourhood-level data across 5 tabs: Overview, Housing, Safety,
Demographics, and Amenities. Each tab provides interactive choropleth maps,
KPI cards, and supporting charts.
"""
import dash
import dash_mantine_components as dmc
from dash import dcc, html
from dash import dcc
from dash_iconify import DashIconify
from portfolio_app.components import (
create_map_controls,
create_metric_cards_row,
create_time_slider,
create_year_selector,
from portfolio_app.pages.toronto.tabs import (
create_amenities_tab,
create_demographics_tab,
create_housing_tab,
create_overview_tab,
create_safety_tab,
)
dash.register_page(__name__, path="/toronto", name="Toronto Housing")
dash.register_page(__name__, path="/toronto", name="Toronto Neighbourhoods")
# Metric options for the purchase market
PURCHASE_METRIC_OPTIONS = [
{"label": "Average Price", "value": "avg_price"},
{"label": "Median Price", "value": "median_price"},
{"label": "Sales Volume", "value": "sales_count"},
{"label": "Days on Market", "value": "avg_dom"},
]
# Metric options for the rental market
RENTAL_METRIC_OPTIONS = [
{"label": "Average Rent", "value": "avg_rent"},
{"label": "Vacancy Rate", "value": "vacancy_rate"},
{"label": "Rental Universe", "value": "rental_universe"},
]
# Sample metrics for KPI cards (will be populated by callbacks)
SAMPLE_METRICS = [
# Tab configuration
TAB_CONFIG = [
{
"title": "Avg. Price",
"value": 1125000,
"delta": 2.3,
"prefix": "$",
"format_spec": ",.0f",
"value": "overview",
"label": "Overview",
"icon": "tabler:chart-pie",
"color": "blue",
},
{
"title": "Sales Volume",
"value": 4850,
"delta": -5.1,
"format_spec": ",",
"value": "housing",
"label": "Housing",
"icon": "tabler:home",
"color": "teal",
},
{
"title": "Avg. DOM",
"value": 18,
"delta": 3,
"suffix": " days",
"positive_is_good": False,
"value": "safety",
"label": "Safety",
"icon": "tabler:shield-check",
"color": "orange",
},
{
"title": "Avg. Rent",
"value": 2450,
"delta": 4.2,
"prefix": "$",
"format_spec": ",.0f",
"value": "demographics",
"label": "Demographics",
"icon": "tabler:users",
"color": "violet",
},
{
"value": "amenities",
"label": "Amenities",
"icon": "tabler:trees",
"color": "green",
},
]
@@ -67,9 +61,9 @@ def create_header() -> dmc.Group:
[
dmc.Stack(
[
dmc.Title("Toronto Housing Dashboard", order=1),
dmc.Title("Toronto Neighbourhood Dashboard", order=1),
dmc.Text(
"Real estate market analysis for the Greater Toronto Area",
"Explore livability across 158 Toronto neighbourhoods",
c="dimmed",
),
],
@@ -88,11 +82,17 @@ def create_header() -> dmc.Group:
),
href="/toronto/methodology",
),
create_year_selector(
id_prefix="toronto",
min_year=2020,
default_year=2024,
label="Year",
dmc.Select(
id="toronto-year-select",
data=[
{"value": "2021", "label": "2021"},
{"value": "2022", "label": "2022"},
{"value": "2023", "label": "2023"},
],
value="2021",
label="Census Year",
size="sm",
w=120,
),
],
gap="md",
@@ -103,187 +103,100 @@ def create_header() -> dmc.Group:
)
def create_kpi_section() -> dmc.Box:
"""Create the KPI metrics row."""
return dmc.Box(
children=[
dmc.Title("Key Metrics", order=3, size="h4", mb="sm"),
html.Div(
id="toronto-kpi-cards",
children=[
create_metric_cards_row(SAMPLE_METRICS, id_prefix="toronto-kpi")
],
),
],
)
def create_purchase_map_section() -> dmc.Grid:
"""Create the purchase market choropleth section."""
return dmc.Grid(
[
dmc.GridCol(
create_map_controls(
id_prefix="purchase-map",
metric_options=PURCHASE_METRIC_OPTIONS,
default_metric="avg_price",
),
span={"base": 12, "md": 3},
),
dmc.GridCol(
dmc.Paper(
children=[
dcc.Graph(
id="purchase-choropleth",
config={"scrollZoom": True},
style={"height": "500px"},
),
],
p="xs",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 9},
),
],
gutter="md",
)
def create_rental_map_section() -> dmc.Grid:
"""Create the rental market choropleth section."""
return dmc.Grid(
[
dmc.GridCol(
create_map_controls(
id_prefix="rental-map",
metric_options=RENTAL_METRIC_OPTIONS,
default_metric="avg_rent",
),
span={"base": 12, "md": 3},
),
dmc.GridCol(
dmc.Paper(
children=[
dcc.Graph(
id="rental-choropleth",
config={"scrollZoom": True},
style={"height": "500px"},
),
],
p="xs",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 9},
),
],
gutter="md",
)
def create_time_series_section() -> dmc.Grid:
"""Create the time series charts section."""
return dmc.Grid(
[
dmc.GridCol(
dmc.Paper(
children=[
dmc.Title("Price Trends", order=4, size="h5", mb="sm"),
dcc.Graph(
id="price-time-series",
config={"displayModeBar": False},
style={"height": "350px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
dmc.GridCol(
dmc.Paper(
children=[
dmc.Title("Sales Volume", order=4, size="h5", mb="sm"),
dcc.Graph(
id="volume-time-series",
config={"displayModeBar": False},
style={"height": "350px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
],
gutter="md",
)
def create_market_comparison_section() -> dmc.Paper:
"""Create the market comparison chart section."""
def create_neighbourhood_selector() -> dmc.Paper:
"""Create the neighbourhood search/select component."""
return dmc.Paper(
children=[
dmc.Group(
[
dmc.Title("Market Indicators", order=4, size="h5"),
create_time_slider(
id_prefix="market-comparison",
min_year=2020,
label="",
),
],
justify="space-between",
align="center",
mb="md",
),
dcc.Graph(
id="market-comparison-chart",
config={"displayModeBar": False},
style={"height": "400px"},
),
],
p="md",
dmc.Group(
[
DashIconify(icon="tabler:search", width=20, color="gray"),
dmc.Select(
id="toronto-neighbourhood-select",
placeholder="Search neighbourhoods...",
searchable=True,
clearable=True,
data=[], # Populated by callback
style={"flex": 1},
),
dmc.Button(
"Compare",
id="toronto-compare-btn",
leftSection=DashIconify(icon="tabler:git-compare", width=16),
variant="light",
disabled=True,
),
],
gap="sm",
),
p="sm",
radius="sm",
withBorder=True,
)
def create_tab_navigation() -> dmc.Tabs:
"""Create the tab navigation with icons."""
return dmc.Tabs(
[
dmc.TabsList(
[
dmc.TabsTab(
dmc.Group(
[
DashIconify(icon=tab["icon"], width=18),
dmc.Text(tab["label"], size="sm"),
],
gap="xs",
),
value=tab["value"],
)
for tab in TAB_CONFIG
],
grow=True,
),
# Tab panels
dmc.TabsPanel(create_overview_tab(), value="overview", pt="md"),
dmc.TabsPanel(create_housing_tab(), value="housing", pt="md"),
dmc.TabsPanel(create_safety_tab(), value="safety", pt="md"),
dmc.TabsPanel(create_demographics_tab(), value="demographics", pt="md"),
dmc.TabsPanel(create_amenities_tab(), value="amenities", pt="md"),
],
id="toronto-tabs",
value="overview",
variant="default",
)
def create_data_notice() -> dmc.Alert:
"""Create a notice about data availability."""
"""Create a notice about data sources."""
return dmc.Alert(
children=[
dmc.Text(
"This dashboard displays Toronto neighbourhood and CMHC rental data. "
"Sample data is shown for demonstration purposes.",
"Data from Toronto Open Data (Census 2021, Crime Statistics) and "
"CMHC Rental Market Reports. Click neighbourhoods on the map for details.",
size="sm",
),
],
title="Data Notice",
title="Data Sources",
color="blue",
variant="light",
icon=DashIconify(icon="tabler:info-circle", width=20),
)
# Store for selected neighbourhood
neighbourhood_store = dcc.Store(id="toronto-selected-neighbourhood", data=None)
# Register callbacks
from portfolio_app.pages.toronto import callbacks # noqa: E402, F401
layout = dmc.Container(
dmc.Stack(
[
neighbourhood_store,
create_header(),
create_data_notice(),
create_kpi_section(),
dmc.Divider(my="md", label="Purchase Market", labelPosition="center"),
create_purchase_map_section(),
dmc.Divider(my="md", label="Rental Market", labelPosition="center"),
create_rental_map_section(),
dmc.Divider(my="md", label="Trends", labelPosition="center"),
create_time_series_section(),
create_market_comparison_section(),
create_neighbourhood_selector(),
create_tab_navigation(),
dmc.Space(h=40),
],
gap="lg",


@@ -0,0 +1,15 @@
"""Tab modules for Toronto Neighbourhood Dashboard."""
from .amenities import create_amenities_tab
from .demographics import create_demographics_tab
from .housing import create_housing_tab
from .overview import create_overview_tab
from .safety import create_safety_tab
__all__ = [
"create_overview_tab",
"create_housing_tab",
"create_safety_tab",
"create_demographics_tab",
"create_amenities_tab",
]


@@ -0,0 +1,207 @@
"""Amenities tab for Toronto Neighbourhood Dashboard.
Displays parks, schools, transit, and other amenity metrics.
"""
import dash_mantine_components as dmc
from dash import dcc
def create_amenities_tab() -> dmc.Stack:
"""Create the Amenities tab layout.
Layout:
- Choropleth map (amenity score) | KPI cards
- Amenity breakdown chart | Amenity comparison radar
Returns:
Tab content as a Mantine Stack component.
"""
return dmc.Stack(
[
# Main content: Map + KPIs
dmc.Grid(
[
# Choropleth map
dmc.GridCol(
dmc.Paper(
[
dmc.Group(
[
dmc.Title(
"Neighbourhood Amenities",
order=4,
size="h5",
),
dmc.Select(
id="amenities-metric-select",
data=[
{
"value": "amenity_score",
"label": "Amenity Score",
},
{
"value": "parks_per_capita",
"label": "Parks per 1K",
},
{
"value": "schools_per_capita",
"label": "Schools per 1K",
},
{
"value": "transit_score",
"label": "Transit Score",
},
],
value="amenity_score",
size="sm",
w=180,
),
],
justify="space-between",
mb="sm",
),
dcc.Graph(
id="amenities-choropleth",
config={
"scrollZoom": True,
"displayModeBar": False,
},
style={"height": "450px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "lg": 8},
),
# KPI cards
dmc.GridCol(
dmc.Stack(
[
dmc.Paper(
[
dmc.Text(
"City Amenity Score", size="xs", c="dimmed"
),
dmc.Title(
id="amenities-city-score",
children="68",
order=2,
),
dmc.Text(
"Out of 100",
size="sm",
c="dimmed",
),
],
p="md",
radius="sm",
withBorder=True,
),
dmc.Paper(
[
dmc.Text("Total Parks", size="xs", c="dimmed"),
dmc.Title(
id="amenities-total-parks",
children="1,500+",
order=2,
),
dmc.Text(
id="amenities-park-area",
children="8,000+ hectares",
size="sm",
c="green",
),
],
p="md",
radius="sm",
withBorder=True,
),
dmc.Paper(
[
dmc.Text(
"Selected Neighbourhood",
size="xs",
c="dimmed",
),
dmc.Title(
id="amenities-selected-name",
children="Click map to select",
order=4,
size="h5",
),
dmc.Stack(
id="amenities-selected-details",
children=[
dmc.Text("", c="dimmed"),
],
gap="xs",
),
],
p="md",
radius="sm",
withBorder=True,
),
],
gap="md",
),
span={"base": 12, "lg": 4},
),
],
gutter="md",
),
# Supporting charts
dmc.Grid(
[
# Amenity breakdown
dmc.GridCol(
dmc.Paper(
[
dmc.Title(
"Amenity Breakdown",
order=4,
size="h5",
mb="sm",
),
dcc.Graph(
id="amenities-breakdown-chart",
config={"displayModeBar": False},
style={"height": "300px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
# Amenity comparison radar
dmc.GridCol(
dmc.Paper(
[
dmc.Title(
"Amenity Comparison",
order=4,
size="h5",
mb="sm",
),
dcc.Graph(
id="amenities-radar-chart",
config={"displayModeBar": False},
style={"height": "300px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
],
gutter="md",
),
],
gap="md",
)


@@ -0,0 +1,211 @@
"""Demographics tab for Toronto Neighbourhood Dashboard.
Displays population, income, age, and diversity metrics.
"""
import dash_mantine_components as dmc
from dash import dcc
def create_demographics_tab() -> dmc.Stack:
"""Create the Demographics tab layout.
Layout:
- Choropleth map (demographic metric) | KPI cards
- Age distribution chart | Income distribution chart
Returns:
Tab content as a Mantine Stack component.
"""
return dmc.Stack(
[
# Main content: Map + KPIs
dmc.Grid(
[
# Choropleth map
dmc.GridCol(
dmc.Paper(
[
dmc.Group(
[
dmc.Title(
"Neighbourhood Demographics",
order=4,
size="h5",
),
dmc.Select(
id="demographics-metric-select",
data=[
{
"value": "population",
"label": "Population",
},
{
"value": "median_income",
"label": "Median Income",
},
{
"value": "median_age",
"label": "Median Age",
},
{
"value": "diversity_index",
"label": "Diversity Index",
},
],
value="population",
size="sm",
w=180,
),
],
justify="space-between",
mb="sm",
),
dcc.Graph(
id="demographics-choropleth",
config={
"scrollZoom": True,
"displayModeBar": False,
},
style={"height": "450px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "lg": 8},
),
# KPI cards
dmc.GridCol(
dmc.Stack(
[
dmc.Paper(
[
dmc.Text(
"City Population", size="xs", c="dimmed"
),
dmc.Title(
id="demographics-city-pop",
children="2.79M",
order=2,
),
dmc.Text(
id="demographics-pop-change",
children="+2.3% since 2016",
size="sm",
c="green",
),
],
p="md",
radius="sm",
withBorder=True,
),
dmc.Paper(
[
dmc.Text(
"Median Household Income",
size="xs",
c="dimmed",
),
dmc.Title(
id="demographics-city-income",
children="$84,000",
order=2,
),
dmc.Text(
"City average",
size="sm",
c="dimmed",
),
],
p="md",
radius="sm",
withBorder=True,
),
dmc.Paper(
[
dmc.Text(
"Selected Neighbourhood",
size="xs",
c="dimmed",
),
dmc.Title(
id="demographics-selected-name",
children="Click map to select",
order=4,
size="h5",
),
dmc.Stack(
id="demographics-selected-details",
children=[
dmc.Text("", c="dimmed"),
],
gap="xs",
),
],
p="md",
radius="sm",
withBorder=True,
),
],
gap="md",
),
span={"base": 12, "lg": 4},
),
],
gutter="md",
),
# Supporting charts
dmc.Grid(
[
# Age distribution
dmc.GridCol(
dmc.Paper(
[
dmc.Title(
"Age Distribution",
order=4,
size="h5",
mb="sm",
),
dcc.Graph(
id="demographics-age-chart",
config={"displayModeBar": False},
style={"height": "300px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
# Income distribution
dmc.GridCol(
dmc.Paper(
[
dmc.Title(
"Income Distribution",
order=4,
size="h5",
mb="sm",
),
dcc.Graph(
id="demographics-income-chart",
config={"displayModeBar": False},
style={"height": "300px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
],
gutter="md",
),
],
gap="md",
)


@@ -0,0 +1,209 @@
"""Housing tab for Toronto Neighbourhood Dashboard.
Displays affordability metrics, rent trends, and housing indicators.
"""
import dash_mantine_components as dmc
from dash import dcc
def create_housing_tab() -> dmc.Stack:
"""Create the Housing tab layout.
Layout:
- Choropleth map (affordability index) | KPI cards
- Rent trend line chart | Dwelling types breakdown
Returns:
Tab content as a Mantine Stack component.
"""
return dmc.Stack(
[
# Main content: Map + KPIs
dmc.Grid(
[
# Choropleth map
dmc.GridCol(
dmc.Paper(
[
dmc.Group(
[
dmc.Title(
"Housing Affordability",
order=4,
size="h5",
),
dmc.Select(
id="housing-metric-select",
data=[
{
"value": "affordability_index",
"label": "Affordability Index",
},
{
"value": "avg_rent_2bed",
"label": "Avg Rent (2BR)",
},
{
"value": "rent_to_income_pct",
"label": "Rent-to-Income %",
},
{
"value": "vacancy_rate",
"label": "Vacancy Rate",
},
],
value="affordability_index",
size="sm",
w=180,
),
],
justify="space-between",
mb="sm",
),
dcc.Graph(
id="housing-choropleth",
config={
"scrollZoom": True,
"displayModeBar": False,
},
style={"height": "450px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "lg": 8},
),
# KPI cards
dmc.GridCol(
dmc.Stack(
[
dmc.Paper(
[
dmc.Text(
"City Avg 2BR Rent", size="xs", c="dimmed"
),
dmc.Title(
id="housing-city-rent",
children="$2,450",
order=2,
),
dmc.Text(
id="housing-rent-change",
children="+4.2% YoY",
size="sm",
c="red",
),
],
p="md",
radius="sm",
withBorder=True,
),
dmc.Paper(
[
dmc.Text(
"City Avg Vacancy", size="xs", c="dimmed"
),
dmc.Title(
id="housing-city-vacancy",
children="1.8%",
order=2,
),
dmc.Text(
"Below healthy rate (3%)",
size="sm",
c="orange",
),
],
p="md",
radius="sm",
withBorder=True,
),
dmc.Paper(
[
dmc.Text(
"Selected Neighbourhood",
size="xs",
c="dimmed",
),
dmc.Title(
id="housing-selected-name",
children="Click map to select",
order=4,
size="h5",
),
dmc.Stack(
id="housing-selected-details",
children=[
dmc.Text("", c="dimmed"),
],
gap="xs",
),
],
p="md",
radius="sm",
withBorder=True,
),
],
gap="md",
),
span={"base": 12, "lg": 4},
),
],
gutter="md",
),
# Supporting charts
dmc.Grid(
[
# Rent trend
dmc.GridCol(
dmc.Paper(
[
dmc.Title(
"Rent Trends (5 Year)",
order=4,
size="h5",
mb="sm",
),
dcc.Graph(
id="housing-trend-chart",
config={"displayModeBar": False},
style={"height": "300px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
# Dwelling types
dmc.GridCol(
dmc.Paper(
[
dmc.Title(
"Dwelling Types",
order=4,
size="h5",
mb="sm",
),
dcc.Graph(
id="housing-types-chart",
config={"displayModeBar": False},
style={"height": "300px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
],
gutter="md",
),
],
gap="md",
)
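A small worked example can relate the Housing tab's `rent_to_income_pct` metric to the placeholder KPI values. The sketch below assumes the metric is annual rent divided by annual household income; that definition and the function name are illustrative assumptions, since the dbt model that computes the column is not part of this diff. The inputs are the hard-coded defaults from the Housing and Demographics layouts.

```python
def rent_to_income_pct(monthly_rent: float, annual_income: float) -> float:
    """Share of annual household income spent on rent, as a percentage.

    Assumed definition (not confirmed by the diff): 12 x monthly rent / income.
    """
    if annual_income <= 0:
        raise ValueError("annual_income must be positive")
    return round(monthly_rent * 12 / annual_income * 100, 1)


# Placeholder KPIs from the layouts: $2,450 average 2BR rent, $84,000 median income
print(rent_to_income_pct(2450, 84_000))  # 35.0
```

Under this assumed definition the placeholder figures work out to 35%, which sits above the 30% level often used as an affordability rule of thumb.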


@@ -0,0 +1,233 @@
"""Overview tab for Toronto Neighbourhood Dashboard.
Displays composite livability score with safety, affordability, and amenity components.
"""
import dash_mantine_components as dmc
from dash import dcc, html
def create_overview_tab() -> dmc.Stack:
"""Create the Overview tab layout.
Layout:
- Choropleth map (livability score) | KPI cards
- Top/Bottom 10 bar chart | Income vs Safety scatter
Returns:
Tab content as a Mantine Stack component.
"""
return dmc.Stack(
[
# Main content: Map + KPIs
dmc.Grid(
[
# Choropleth map
dmc.GridCol(
dmc.Paper(
[
dmc.Group(
[
dmc.Title(
"Neighbourhood Livability",
order=4,
size="h5",
),
dmc.Select(
id="overview-metric-select",
data=[
{
"value": "livability_score",
"label": "Livability Score",
},
{
"value": "safety_score",
"label": "Safety Score",
},
{
"value": "affordability_score",
"label": "Affordability Score",
},
{
"value": "amenity_score",
"label": "Amenity Score",
},
],
value="livability_score",
size="sm",
w=180,
),
],
justify="space-between",
mb="sm",
),
dcc.Graph(
id="overview-choropleth",
config={
"scrollZoom": True,
"displayModeBar": False,
},
style={"height": "450px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "lg": 8},
),
# KPI cards
dmc.GridCol(
dmc.Stack(
[
dmc.Paper(
[
dmc.Text("City Average", size="xs", c="dimmed"),
dmc.Title(
id="overview-city-avg",
children="72",
order=2,
),
dmc.Text("Livability Score", size="sm", fw=500),
],
p="md",
radius="sm",
withBorder=True,
),
dmc.Paper(
[
dmc.Text(
"Selected Neighbourhood",
size="xs",
c="dimmed",
),
dmc.Title(
id="overview-selected-name",
children="Click map to select",
order=4,
size="h5",
),
html.Div(
id="overview-selected-scores",
children=[
dmc.Text("", c="dimmed"),
],
),
],
p="md",
radius="sm",
withBorder=True,
),
dmc.Paper(
[
dmc.Text(
"Score Components", size="xs", c="dimmed"
),
dmc.Stack(
[
dmc.Group(
[
dmc.Text("Safety", size="sm"),
dmc.Text(
"30%",
size="sm",
c="dimmed",
),
],
justify="space-between",
),
dmc.Group(
[
dmc.Text(
"Affordability", size="sm"
),
dmc.Text(
"40%",
size="sm",
c="dimmed",
),
],
justify="space-between",
),
dmc.Group(
[
dmc.Text(
"Amenities", size="sm"
),
dmc.Text(
"30%",
size="sm",
c="dimmed",
),
],
justify="space-between",
),
],
gap="xs",
),
],
p="md",
radius="sm",
withBorder=True,
),
],
gap="md",
),
span={"base": 12, "lg": 4},
),
],
gutter="md",
),
# Supporting charts
dmc.Grid(
[
# Top/Bottom rankings
dmc.GridCol(
dmc.Paper(
[
dmc.Title(
"Top & Bottom Neighbourhoods",
order=4,
size="h5",
mb="sm",
),
dcc.Graph(
id="overview-rankings-chart",
config={"displayModeBar": False},
style={"height": "300px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
# Scatter plot
dmc.GridCol(
dmc.Paper(
[
dmc.Title(
"Income vs Safety",
order=4,
size="h5",
mb="sm",
),
dcc.Graph(
id="overview-scatter-chart",
config={"displayModeBar": False},
style={"height": "300px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
],
gutter="md",
),
],
gap="md",
)
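The "Score Components" card lists weights of 30% safety, 40% affordability, and 30% amenities. One plausible reading is a weighted average of the three component scores. The sketch below assumes exactly that, with each component on a 0-100 scale; the real calculation lives in the dbt marts and is not shown in this diff, so the function and its name are hypothetical.

```python
# Weights shown in the "Score Components" card, as whole percentages
WEIGHTS = {"safety": 30, "affordability": 40, "amenity": 30}


def livability_score(safety: float, affordability: float, amenity: float) -> float:
    """Weighted average of the three component scores (assumed 0-100 each)."""
    components = {"safety": safety, "affordability": affordability, "amenity": amenity}
    total = sum(WEIGHTS[name] * value for name, value in components.items())
    return total / sum(WEIGHTS.values())


print(livability_score(safety=80, affordability=60, amenity=70))  # 69.0
```

Dividing by the sum of the weights keeps the result on the same scale as the inputs even if the weights are later changed and no longer add up to 100.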


@@ -0,0 +1,211 @@
"""Safety tab for Toronto Neighbourhood Dashboard.
Displays crime statistics, trends, and safety indicators.
"""
import dash_mantine_components as dmc
from dash import dcc
def create_safety_tab() -> dmc.Stack:
"""Create the Safety tab layout.
Layout:
- Choropleth map (crime rate) | KPI cards
- Crime trend line chart | Crime by type breakdown
Returns:
Tab content as a Mantine Stack component.
"""
return dmc.Stack(
[
# Main content: Map + KPIs
dmc.Grid(
[
# Choropleth map
dmc.GridCol(
dmc.Paper(
[
dmc.Group(
[
dmc.Title(
"Crime Rate by Neighbourhood",
order=4,
size="h5",
),
dmc.Select(
id="safety-metric-select",
data=[
{
"value": "total_crime_rate",
"label": "Total Crime Rate",
},
{
"value": "violent_crime_rate",
"label": "Violent Crime",
},
{
"value": "property_crime_rate",
"label": "Property Crime",
},
{
"value": "theft_rate",
"label": "Theft",
},
],
value="total_crime_rate",
size="sm",
w=180,
),
],
justify="space-between",
mb="sm",
),
dcc.Graph(
id="safety-choropleth",
config={
"scrollZoom": True,
"displayModeBar": False,
},
style={"height": "450px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "lg": 8},
),
# KPI cards
dmc.GridCol(
dmc.Stack(
[
dmc.Paper(
[
dmc.Text(
"City Crime Rate", size="xs", c="dimmed"
),
dmc.Title(
id="safety-city-rate",
children="4,250",
order=2,
),
dmc.Text(
id="safety-rate-change",
children="-2.1% YoY",
size="sm",
c="green",
),
],
p="md",
radius="sm",
withBorder=True,
),
dmc.Paper(
[
dmc.Text(
"Total Incidents (2023)",
size="xs",
c="dimmed",
),
dmc.Title(
id="safety-total-incidents",
children="125,430",
order=2,
),
dmc.Text(
"Reported incidents, city-wide",
size="sm",
c="dimmed",
),
],
p="md",
radius="sm",
withBorder=True,
),
dmc.Paper(
[
dmc.Text(
"Selected Neighbourhood",
size="xs",
c="dimmed",
),
dmc.Title(
id="safety-selected-name",
children="Click map to select",
order=4,
size="h5",
),
dmc.Stack(
id="safety-selected-details",
children=[
dmc.Text("", c="dimmed"),
],
gap="xs",
),
],
p="md",
radius="sm",
withBorder=True,
),
],
gap="md",
),
span={"base": 12, "lg": 4},
),
],
gutter="md",
),
# Supporting charts
dmc.Grid(
[
# Crime trend
dmc.GridCol(
dmc.Paper(
[
dmc.Title(
"Crime Trends (5 Year)",
order=4,
size="h5",
mb="sm",
),
dcc.Graph(
id="safety-trend-chart",
config={"displayModeBar": False},
style={"height": "300px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
# Crime by type
dmc.GridCol(
dmc.Paper(
[
dmc.Title(
"Crime by Category",
order=4,
size="h5",
mb="sm",
),
dcc.Graph(
id="safety-types-chart",
config={"displayModeBar": False},
style={"height": "300px"},
),
],
p="md",
radius="sm",
withBorder=True,
),
span={"base": 12, "md": 6},
),
],
gutter="md",
),
],
gap="md",
)
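The Safety tab mixes a rate ("City Crime Rate") with a raw count ("Total Incidents"). A short sketch of the usual per-100,000 normalisation shows how the two relate. The function name is illustrative, and the inputs are the hard-coded placeholders from the Safety and Demographics layouts rather than real data.

```python
def rate_per_100k(incidents: int, population: int) -> float:
    """Incidents per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return incidents / population * 100_000


# Placeholder values from the layouts: 125,430 incidents, 2.79M residents
rate = rate_per_100k(incidents=125_430, population=2_790_000)
print(round(rate))  # 4496
```

The result does not reproduce the 4,250 default in the "City Crime Rate" card. That is harmless if callbacks overwrite these components by `id`, but it suggests the defaults should not be read as a consistent data set.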


@@ -57,6 +57,7 @@ class TorontoOpenDataParser:
self._cache_dir = cache_dir
self._timeout = timeout
self._client: httpx.Client | None = None
self._neighbourhood_name_map: dict[str, int] | None = None
@property
def client(self) -> httpx.Client:
@@ -75,6 +76,63 @@ class TorontoOpenDataParser:
self._client.close()
self._client = None
def _get_neighbourhood_name_map(self) -> dict[str, int]:
"""Build and cache a mapping of neighbourhood names to IDs.
Returns:
Dictionary mapping normalized neighbourhood names to area_id.
"""
if self._neighbourhood_name_map is not None:
return self._neighbourhood_name_map
neighbourhoods = self.get_neighbourhoods()
self._neighbourhood_name_map = {}
for n in neighbourhoods:
# Add multiple variations of the name for flexible matching
name_lower = n.area_name.lower().strip()
self._neighbourhood_name_map[name_lower] = n.area_id
# Also add without common suffixes/prefixes
for suffix in [" neighbourhood", " area", "-"]:
if suffix in name_lower:
alt_name = name_lower.replace(suffix, "").strip()
self._neighbourhood_name_map[alt_name] = n.area_id
logger.debug(
f"Built neighbourhood name map with {len(self._neighbourhood_name_map)} entries"
)
return self._neighbourhood_name_map
    def _match_neighbourhood_id(self, name: str) -> int | None:
        """Match a neighbourhood name to its ID.

        Args:
            name: Neighbourhood name from census data.

        Returns:
            Neighbourhood ID or None if not found.
        """
        name_map = self._get_neighbourhood_name_map()
        name_lower = name.lower().strip()

        # An empty name would prefix-match every key below, so bail out early
        if not name_lower:
            return None

        # Direct match
        if name_lower in name_map:
            return name_map[name_lower]

        # Try removing parenthetical content
        if "(" in name_lower:
            base_name = name_lower.split("(")[0].strip()
            if base_name in name_map:
                return name_map[base_name]

        # Fall back to a loose match on the first 10 characters
        for key, area_id in name_map.items():
            if key.startswith(name_lower[:10]) or name_lower.startswith(key[:10]):
                return area_id

        return None
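The 10-character prefix fallback in `_match_neighbourhood_id` can pick the wrong neighbourhood when two names share a long prefix. The sketch below reproduces the rule on a two-entry map and compares it with `difflib.get_close_matches` from the standard library as one possible stricter alternative. The map, the IDs, the helper names, and the 0.8 cutoff are all illustrative assumptions, not part of the parser.

```python
from __future__ import annotations

from difflib import get_close_matches

# Illustrative name map; the IDs are made up for the example
NAME_MAP = {
    "agincourt south-malvern west": 128,
    "agincourt north": 129,
}


def prefix_match(name: str, name_map: dict[str, int]) -> int | None:
    """Same 10-character prefix rule as the fallback loop in the parser."""
    name_lower = name.lower().strip()
    for key, area_id in name_map.items():
        if key.startswith(name_lower[:10]) or name_lower.startswith(key[:10]):
            return area_id
    return None


def close_match(name: str, name_map: dict[str, int], cutoff: float = 0.8) -> int | None:
    """Alternative: pick the most similar key, or None if nothing is close."""
    hits = get_close_matches(name.lower().strip(), list(name_map), n=1, cutoff=cutoff)
    return name_map[hits[0]] if hits else None


print(prefix_match("Agincourt Nort", NAME_MAP))  # 128 (first key sharing "agincourt ")
print(close_match("Agincourt Nort", NAME_MAP))   # 129
```

The prefix rule returns whichever key comes first in dictionary order, so its answer depends on insertion order. A similarity cutoff trades that for an explicit threshold, and names that fall below it end up in the `unmatched` list instead of being silently misassigned.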
def __enter__(self) -> "TorontoOpenDataParser":
return self
@@ -254,11 +312,30 @@ class TorontoOpenDataParser:
logger.info(f"Parsed {len(records)} neighbourhoods")
return records
# Mapping of indicator names to CensusRecord fields
# Keys are partial matches (case-insensitive) found in the "Characteristic" column
CENSUS_INDICATOR_MAPPING: dict[str, str] = {
"population, 2021": "population",
"population, 2016": "population",
"population density per square kilometre": "population_density",
"median total income of household": "median_household_income",
"average total income of household": "average_household_income",
"unemployment rate": "unemployment_rate",
"bachelor's degree or higher": "pct_bachelors_or_higher",
"owner": "pct_owner_occupied",
"renter": "pct_renter_occupied",
"median age": "median_age",
"average value of dwellings": "average_dwelling_value",
}
def get_census_profiles(self, year: int = 2021) -> list[CensusRecord]:
"""Fetch neighbourhood census profiles.
Note: Census profile data structure varies by year. This method
extracts key demographic indicators where available.
The Toronto Open Data neighbourhood profiles dataset is pivoted:
- Rows are demographic indicators (e.g., "Population", "Median Income")
- Columns are neighbourhoods (e.g., "Agincourt North", "Alderwood")
This method transposes the data to create one CensusRecord per neighbourhood.
Args:
year: Census year (2016 or 2021).
@@ -266,7 +343,6 @@ class TorontoOpenDataParser:
Returns:
List of validated CensusRecord objects.
"""
# Census profiles are typically in CSV/datastore format
try:
raw_records = self._fetch_csv_as_json(
self.DATASETS["neighbourhood_profiles"]
@@ -275,13 +351,119 @@ class TorontoOpenDataParser:
logger.warning(f"Could not fetch census profiles: {e}")
return []
# Census profiles are pivoted - rows are indicators, columns are neighbourhoods
# This requires special handling based on the actual data structure
if not raw_records:
logger.warning("Census profiles dataset is empty")
return []
logger.info(f"Fetched {len(raw_records)} census profile rows")
# For now, return empty list - actual implementation depends on data structure
# TODO: Implement census profile parsing based on actual data format
return []
# Find the characteristic/indicator column name
sample_row = raw_records[0]
char_col = None
for col in sample_row:
col_lower = col.lower()
if "characteristic" in col_lower or "category" in col_lower:
char_col = col
break
if not char_col:
# Try common column names
for candidate in ["Characteristic", "Category", "Topic", "_id"]:
if candidate in sample_row:
char_col = candidate
break
if not char_col:
logger.warning("Could not find characteristic column in census data")
return []
# Identify neighbourhood columns (exclude metadata columns)
exclude_cols = {
char_col,
"_id",
"Topic",
"Data Source",
"Characteristic",
"Category",
}
neighbourhood_cols = [col for col in sample_row if col not in exclude_cols]
logger.info(f"Found {len(neighbourhood_cols)} neighbourhood columns")
# Build a lookup: neighbourhood_name -> {field: value}
neighbourhood_data: dict[str, dict[str, Decimal | int | None]] = {
col: {} for col in neighbourhood_cols
}
# Process each row to extract indicator values
for row in raw_records:
characteristic = str(row.get(char_col, "")).lower().strip()
# Check if this row matches any indicator we care about
for indicator_pattern, field_name in self.CENSUS_INDICATOR_MAPPING.items():
if indicator_pattern in characteristic:
# Extract values for each neighbourhood
for col in neighbourhood_cols:
value = row.get(col)
if value is not None and value != "":
try:
# Clean and convert value
str_val = str(value).replace(",", "").replace("$", "")
str_val = str_val.replace("%", "").strip()
if str_val and str_val not in ("x", "X", "F", ".."):
numeric_val = Decimal(str_val)
# Only store if not already set (first match wins)
if field_name not in neighbourhood_data[col]:
neighbourhood_data[col][
field_name
] = numeric_val
# Decimal() signals bad input with decimal.InvalidOperation, an ArithmeticError
except (ArithmeticError, ValueError, TypeError):
pass
break # Move to next row after matching
# Convert to CensusRecord objects
records = []
unmatched = []
for neighbourhood_name, data in neighbourhood_data.items():
if not data:
continue
# Match neighbourhood name to ID
neighbourhood_id = self._match_neighbourhood_id(neighbourhood_name)
if neighbourhood_id is None:
unmatched.append(neighbourhood_name)
continue
try:
pop_val = data.get("population")
population = int(pop_val) if pop_val is not None else None
record = CensusRecord(
neighbourhood_id=neighbourhood_id,
census_year=year,
population=population,
population_density=data.get("population_density"),
median_household_income=data.get("median_household_income"),
average_household_income=data.get("average_household_income"),
unemployment_rate=data.get("unemployment_rate"),
pct_bachelors_or_higher=data.get("pct_bachelors_or_higher"),
pct_owner_occupied=data.get("pct_owner_occupied"),
pct_renter_occupied=data.get("pct_renter_occupied"),
median_age=data.get("median_age"),
average_dwelling_value=data.get("average_dwelling_value"),
)
records.append(record)
except Exception as e:
logger.debug(f"Skipping neighbourhood {neighbourhood_name}: {e}")
if unmatched:
logger.warning(
f"Could not match {len(unmatched)} neighbourhoods: {unmatched[:5]}..."
)
logger.info(f"Parsed {len(records)} census records for year {year}")
return records
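The transposition in `get_census_profiles` is easier to follow on a toy input. The sketch below is a simplified, standalone version of the same idea: rows are indicators, columns are neighbourhoods, and the first matching row wins for each field. The helper names and the sample figures are made up for illustration. It also shows that `Decimal` reports unparseable text with `InvalidOperation` rather than `ValueError`.

```python
from __future__ import annotations

from decimal import Decimal, InvalidOperation

SUPPRESSED = {"x", "X", "F", ".."}  # markers the parser treats as "no value"


def parse_value(raw: object) -> Decimal | None:
    """Strip formatting and convert to Decimal; None if not numeric."""
    text = str(raw).replace(",", "").replace("$", "").replace("%", "").strip()
    if not text or text in SUPPRESSED:
        return None
    try:
        return Decimal(text)
    except InvalidOperation:  # Decimal does not raise ValueError here
        return None


def transpose(
    rows: list[dict[str, str]], char_col: str, mapping: dict[str, str]
) -> dict[str, dict[str, Decimal]]:
    """Turn indicator rows into one {field: value} dict per neighbourhood."""
    out: dict[str, dict[str, Decimal]] = {}
    for row in rows:
        characteristic = row[char_col].lower().strip()
        field = next((f for pat, f in mapping.items() if pat in characteristic), None)
        if field is None:
            continue
        for col, raw in row.items():
            if col == char_col:
                continue
            value = parse_value(raw)
            if value is not None:
                out.setdefault(col, {}).setdefault(field, value)  # first match wins
    return out


# Hypothetical rows in the pivoted shape described in the docstring
rows = [
    {"Characteristic": "Population, 2021", "Agincourt North": "30,280", "Alderwood": "12,170"},
    {"Characteristic": "Median age", "Agincourt North": "n/a", "Alderwood": "42.4"},
]
mapping = {"population, 2021": "population", "median age": "median_age"}
result = transpose(rows, "Characteristic", mapping)
print(result["Alderwood"])
```

Because matching is by substring, short patterns such as `"owner"` or `"renter"` in `CENSUS_INDICATOR_MAPPING` may match more rows than intended; the first-match-wins rule then decides which row supplies the value.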
def get_parks(self) -> list[AmenityRecord]:
"""Fetch park locations.


@@ -0,0 +1,33 @@
"""Data service layer for Toronto neighbourhood dashboard."""
from .geometry_service import (
get_cmhc_zones_geojson,
get_neighbourhoods_geojson,
)
from .neighbourhood_service import (
get_amenities_data,
get_city_averages,
get_demographics_data,
get_housing_data,
get_neighbourhood_details,
get_neighbourhood_list,
get_overview_data,
get_rankings,
get_safety_data,
)
__all__ = [
# Neighbourhood data
"get_overview_data",
"get_housing_data",
"get_safety_data",
"get_demographics_data",
"get_amenities_data",
"get_neighbourhood_details",
"get_neighbourhood_list",
"get_rankings",
"get_city_averages",
# Geometry
"get_neighbourhoods_geojson",
"get_cmhc_zones_geojson",
]


@@ -0,0 +1,176 @@
"""Service layer for generating GeoJSON from PostGIS geometry."""
import json
from functools import lru_cache
from typing import Any
import pandas as pd
from sqlalchemy import text
from portfolio_app.toronto.models import get_engine
def _execute_query(sql: str, params: dict[str, Any] | None = None) -> pd.DataFrame:
"""Execute SQL query and return DataFrame."""
engine = get_engine()
with engine.connect() as conn:
return pd.read_sql(text(sql), conn, params=params)
@lru_cache(maxsize=8)
def get_neighbourhoods_geojson(year: int = 2021) -> dict[str, Any]:
"""Get GeoJSON FeatureCollection for all neighbourhoods.
Queries mart_neighbourhood_overview for geometries and basic properties.
Args:
year: Year to query for joining properties.
Returns:
GeoJSON FeatureCollection dictionary.
"""
# Query geometries with ST_AsGeoJSON
sql = """
SELECT
neighbourhood_id,
neighbourhood_name,
ST_AsGeoJSON(geometry)::json as geom,
population,
livability_score
FROM mart_neighbourhood_overview
WHERE year = :year
AND geometry IS NOT NULL
"""
try:
df = _execute_query(sql, {"year": year})
except Exception:
# Table might not exist or have data yet
return _empty_geojson()
if df.empty:
return _empty_geojson()
# Build GeoJSON features
features = []
for _, row in df.iterrows():
geom = row["geom"]
if geom is None:
continue
# Handle geometry that might be a string or dict
if isinstance(geom, str):
geom = json.loads(geom)
feature = {
"type": "Feature",
"id": int(row["neighbourhood_id"]),
"properties": {
"neighbourhood_id": int(row["neighbourhood_id"]),
"neighbourhood_name": row["neighbourhood_name"],
"population": int(row["population"])
if pd.notna(row["population"])
else None,
"livability_score": float(row["livability_score"])
if pd.notna(row["livability_score"])
else None,
},
"geometry": geom,
}
features.append(feature)
return {
"type": "FeatureCollection",
"features": features,
}
@lru_cache(maxsize=4)
def get_cmhc_zones_geojson() -> dict[str, Any]:
"""Get GeoJSON FeatureCollection for CMHC zones.
Queries dim_cmhc_zone for zone geometries.
Returns:
GeoJSON FeatureCollection dictionary.
"""
sql = """
SELECT
zone_code,
zone_name,
ST_AsGeoJSON(geometry)::json as geom
FROM dim_cmhc_zone
WHERE geometry IS NOT NULL
"""
try:
df = _execute_query(sql, {})
except Exception:
return _empty_geojson()
if df.empty:
return _empty_geojson()
features = []
for _, row in df.iterrows():
geom = row["geom"]
if geom is None:
continue
if isinstance(geom, str):
geom = json.loads(geom)
feature = {
"type": "Feature",
"id": row["zone_code"],
"properties": {
"zone_code": row["zone_code"],
"zone_name": row["zone_name"],
},
"geometry": geom,
}
features.append(feature)
return {
"type": "FeatureCollection",
"features": features,
}
def get_neighbourhood_geometry(neighbourhood_id: int) -> dict[str, Any] | None:
"""Get GeoJSON geometry for a single neighbourhood.
Args:
neighbourhood_id: The neighbourhood ID.
Returns:
GeoJSON geometry dict, or None if not found.
"""
sql = """
SELECT ST_AsGeoJSON(geometry)::json as geom
FROM dim_neighbourhood
WHERE neighbourhood_id = :neighbourhood_id
AND geometry IS NOT NULL
"""
try:
df = _execute_query(sql, {"neighbourhood_id": neighbourhood_id})
except Exception:
return None
if df.empty:
return None
geom = df.iloc[0]["geom"]
if isinstance(geom, str):
result: dict[str, Any] = json.loads(geom)
return result
return dict(geom) if geom is not None else None
def _empty_geojson() -> dict[str, Any]:
"""Return an empty GeoJSON FeatureCollection."""
return {
"type": "FeatureCollection",
"features": [],
}
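One side effect of combining `@lru_cache` with a fallback return is worth illustrating: if the first call fails, the empty result is what gets cached. The sketch below uses a stand-in loader that fails once, then succeeds. All names are hypothetical, and the success-only cache is just one possible alternative, not the project's approach.

```python
from functools import lru_cache

calls = {"count": 0}


def flaky_load() -> dict:
    """Stand-in for the database query: fails on the first call only."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("database not ready")
    return {"type": "FeatureCollection", "features": [{"id": 1}]}


@lru_cache(maxsize=8)
def cached_with_fallback() -> tuple:
    """Mirrors the pattern in the diff: the except branch's result is cached."""
    try:
        return ("ok", len(flaky_load()["features"]))
    except ConnectionError:
        return ("empty", 0)


_cache = {}


def cached_on_success_only() -> dict:
    """Alternative: store only successful results, retry after failures."""
    if "geojson" not in _cache:
        try:
            _cache["geojson"] = flaky_load()
        except ConnectionError:
            return {"type": "FeatureCollection", "features": []}
    return _cache["geojson"]


print(cached_with_fallback())  # ('empty', 0)
print(cached_with_fallback())  # ('empty', 0) again: the fallback was cached

calls["count"] = 0  # reset the stand-in so it fails once more
print(len(cached_on_success_only()["features"]))  # 0: failure, nothing cached
print(len(cached_on_success_only()["features"]))  # 1: retried and succeeded
```

`lru_cache` also hands every caller the same dictionary object, so code that mutates the returned GeoJSON would change it for later callers as well.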


@@ -0,0 +1,392 @@
"""Service layer for querying neighbourhood data from dbt marts."""
from functools import lru_cache
from typing import Any
import pandas as pd
from sqlalchemy import text
from portfolio_app.toronto.models import get_engine
def _execute_query(sql: str, params: dict[str, Any] | None = None) -> pd.DataFrame:
"""Execute SQL query and return DataFrame.
Args:
sql: SQL query string.
params: Query parameters.
Returns:
pandas DataFrame with results, or empty DataFrame on error.
"""
try:
engine = get_engine()
with engine.connect() as conn:
return pd.read_sql(text(sql), conn, params=params)
except Exception:
# Return empty DataFrame on connection or query error
return pd.DataFrame()
def get_overview_data(year: int = 2021) -> pd.DataFrame:
"""Get overview data for all neighbourhoods.
Queries mart_neighbourhood_overview for livability scores and components.
Args:
year: Census year to query.
Returns:
DataFrame with columns: neighbourhood_id, neighbourhood_name,
livability_score, safety_score, affordability_score, amenity_score,
population, median_household_income, etc.
"""
sql = """
SELECT
neighbourhood_id,
neighbourhood_name,
year,
population,
median_household_income,
livability_score,
safety_score,
affordability_score,
amenity_score,
crime_rate_per_100k,
rent_to_income_pct,
avg_rent_2bed,
total_amenities_per_1000
FROM mart_neighbourhood_overview
WHERE year = :year
ORDER BY livability_score DESC NULLS LAST
"""
return _execute_query(sql, {"year": year})
def get_housing_data(year: int = 2021) -> pd.DataFrame:
"""Get housing data for all neighbourhoods.
Queries mart_neighbourhood_housing for affordability metrics.
Args:
year: Year to query.
Returns:
DataFrame with columns: neighbourhood_id, neighbourhood_name,
avg_rent_2bed, vacancy_rate, rent_to_income_pct, affordability_index, etc.
"""
sql = """
SELECT
neighbourhood_id,
neighbourhood_name,
year,
pct_owner_occupied,
pct_renter_occupied,
average_dwelling_value,
median_household_income,
avg_rent_bachelor,
avg_rent_1bed,
avg_rent_2bed,
avg_rent_3bed,
vacancy_rate,
total_rental_units,
rent_to_income_pct,
is_affordable,
affordability_index,
rent_yoy_change_pct,
income_quintile
FROM mart_neighbourhood_housing
WHERE year = :year
ORDER BY affordability_index ASC NULLS LAST
"""
return _execute_query(sql, {"year": year})
def get_safety_data(year: int = 2021) -> pd.DataFrame:
"""Get safety/crime data for all neighbourhoods.
Queries mart_neighbourhood_safety for crime statistics.
Args:
year: Year to query.
Returns:
DataFrame with columns: neighbourhood_id, neighbourhood_name,
total_crime_rate, violent_crime_rate, property_crime_rate, etc.
"""
sql = """
SELECT
neighbourhood_id,
neighbourhood_name,
year,
total_crimes,
crime_rate_per_100k as total_crime_rate,
violent_crimes,
violent_crime_rate,
property_crimes,
property_crime_rate,
theft_crimes,
theft_rate,
crime_yoy_change_pct,
crime_trend
FROM mart_neighbourhood_safety
WHERE year = :year
ORDER BY total_crime_rate ASC NULLS LAST
"""
return _execute_query(sql, {"year": year})
def get_demographics_data(year: int = 2021) -> pd.DataFrame:
"""Get demographic data for all neighbourhoods.
Queries mart_neighbourhood_demographics for population/income metrics.
Args:
year: Census year to query.
Returns:
DataFrame with columns: neighbourhood_id, neighbourhood_name,
population, median_age, median_household_income, diversity_index, etc.
"""
sql = """
SELECT
neighbourhood_id,
neighbourhood_name,
census_year as year,
population,
population_density,
population_change_pct,
median_household_income,
average_household_income,
income_quintile,
median_age,
pct_under_18,
pct_18_to_64,
pct_65_plus,
pct_bachelors_or_higher,
unemployment_rate,
diversity_index
FROM mart_neighbourhood_demographics
WHERE census_year = :year
ORDER BY population DESC NULLS LAST
"""
return _execute_query(sql, {"year": year})
def get_amenities_data(year: int = 2021) -> pd.DataFrame:
"""Get amenities data for all neighbourhoods.
Queries mart_neighbourhood_amenities for parks, schools, and childcare.
Args:
year: Year to query.
Returns:
DataFrame with columns: neighbourhood_id, neighbourhood_name,
amenity_score, parks_per_1000, schools_per_1000, childcare_per_1000, etc.
"""
sql = """
SELECT
neighbourhood_id,
neighbourhood_name,
year,
park_count,
parks_per_1000,
school_count,
schools_per_1000,
childcare_count,
childcare_per_1000,
total_amenities,
total_amenities_per_1000,
amenity_score,
amenity_rank
FROM mart_neighbourhood_amenities
WHERE year = :year
ORDER BY amenity_score DESC NULLS LAST
"""
return _execute_query(sql, {"year": year})
def get_neighbourhood_details(
neighbourhood_id: int, year: int = 2021
) -> dict[str, Any]:
"""Get detailed data for a single neighbourhood.
Combines data from all mart tables for a complete neighbourhood profile.
Args:
neighbourhood_id: The neighbourhood ID.
year: Year to query.
Returns:
Dictionary with all metrics for the neighbourhood.
"""
sql = """
SELECT
o.neighbourhood_id,
o.neighbourhood_name,
o.year,
o.population,
o.median_household_income,
o.livability_score,
o.safety_score,
o.affordability_score,
o.amenity_score,
s.total_crimes,
s.crime_rate_per_100k,
s.violent_crime_rate,
s.property_crime_rate,
h.avg_rent_2bed,
h.vacancy_rate,
h.rent_to_income_pct,
h.affordability_index,
h.pct_owner_occupied,
h.pct_renter_occupied,
d.median_age,
d.diversity_index,
d.unemployment_rate,
d.pct_bachelors_or_higher,
a.park_count,
a.school_count,
a.total_amenities
FROM mart_neighbourhood_overview o
LEFT JOIN mart_neighbourhood_safety s
ON o.neighbourhood_id = s.neighbourhood_id
AND o.year = s.year
LEFT JOIN mart_neighbourhood_housing h
ON o.neighbourhood_id = h.neighbourhood_id
AND o.year = h.year
LEFT JOIN mart_neighbourhood_demographics d
ON o.neighbourhood_id = d.neighbourhood_id
AND o.year = d.census_year
LEFT JOIN mart_neighbourhood_amenities a
ON o.neighbourhood_id = a.neighbourhood_id
AND o.year = a.year
WHERE o.neighbourhood_id = :neighbourhood_id
AND o.year = :year
"""
df = _execute_query(sql, {"neighbourhood_id": neighbourhood_id, "year": year})
if df.empty:
return {}
return {str(k): v for k, v in df.iloc[0].to_dict().items()}
@lru_cache(maxsize=32)
def get_neighbourhood_list(year: int = 2021) -> list[dict[str, Any]]:
"""Get list of all neighbourhoods for dropdown selectors.
Args:
year: Year to query.
Returns:
List of dicts with neighbourhood_id, neighbourhood_name, and population.
"""
sql = """
SELECT DISTINCT
neighbourhood_id,
neighbourhood_name,
population
FROM mart_neighbourhood_overview
WHERE year = :year
ORDER BY neighbourhood_name
"""
df = _execute_query(sql, {"year": year})
if df.empty:
return []
return list(df.to_dict("records")) # type: ignore[arg-type]
def get_rankings(
    metric: str,
    year: int = 2021,
    top_n: int = 10,
    ascending: bool = True,
) -> pd.DataFrame:
    """Get top/bottom neighbourhoods for a specific metric.

    Args:
        metric: Column name to rank by.
        year: Year to query.
        top_n: Number of records in each of the top and bottom groups.
        ascending: If True, the lowest values are labelled 'bottom' and the
            highest 'top' (suits scores, where higher is better). Pass False
            for metrics where lower is better, such as crime or rent.

    Returns:
        DataFrame with top and bottom neighbourhoods.

    Raises:
        ValueError: If metric is not a plain identifier.
    """
    # The metric name is interpolated into the SQL below, so reject anything
    # that is not a plain identifier before building the query
    if not metric.isidentifier():
        raise ValueError(f"Invalid metric name: {metric!r}")

    # Map metrics to their source tables
    table_map = {
        "livability_score": "mart_neighbourhood_overview",
        "safety_score": "mart_neighbourhood_overview",
        "affordability_score": "mart_neighbourhood_overview",
        "amenity_score": "mart_neighbourhood_overview",
        "crime_rate_per_100k": "mart_neighbourhood_safety",
        "total_crime_rate": "mart_neighbourhood_safety",
        "avg_rent_2bed": "mart_neighbourhood_housing",
        "affordability_index": "mart_neighbourhood_housing",
        "population": "mart_neighbourhood_demographics",
        "median_household_income": "mart_neighbourhood_demographics",
    }
    table = table_map.get(metric, "mart_neighbourhood_overview")
    year_col = "census_year" if "demographics" in table else "year"
    order = "ASC" if ascending else "DESC"
    reverse_order = "DESC" if ascending else "ASC"

    sql = f"""
        (
            SELECT neighbourhood_id, neighbourhood_name, {metric}, 'bottom' as rank_group
            FROM {table}
            WHERE {year_col} = :year AND {metric} IS NOT NULL
            ORDER BY {metric} {order}
            LIMIT :top_n
        )
        UNION ALL
        (
            SELECT neighbourhood_id, neighbourhood_name, {metric}, 'top' as rank_group
            FROM {table}
            WHERE {year_col} = :year AND {metric} IS NOT NULL
            ORDER BY {metric} {reverse_order}
            LIMIT :top_n
        )
    """
    return _execute_query(sql, {"year": year, "top_n": top_n})
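Because `get_rankings` interpolates the metric and table names into the SQL text, the set of accepted metric names is effectively part of the query's safety. The sketch below shows one way to make that explicit: identifiers come only from an allow-list, while values stay bound parameters. It runs against an in-memory SQLite table purely for illustration (the project targets PostgreSQL), and the allow-list, function name, and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical allow-list: metric name -> table that holds it
ALLOWED_METRICS = {
    "livability_score": "mart_neighbourhood_overview",
    "safety_score": "mart_neighbourhood_overview",
}


def rankings_sql(metric: str, ascending: bool = True) -> str:
    """Build the ranking query only for metrics on the allow-list."""
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"Unsupported metric: {metric!r}")
    table = ALLOWED_METRICS[metric]
    order = "ASC" if ascending else "DESC"
    # Identifiers come from the allow-list; values stay bound parameters
    return (
        f"SELECT neighbourhood_name, {metric} FROM {table} "
        f"WHERE year = :year AND {metric} IS NOT NULL "
        f"ORDER BY {metric} {order} LIMIT :top_n"
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mart_neighbourhood_overview "
    "(neighbourhood_name TEXT, year INTEGER, livability_score REAL, safety_score REAL)"
)
conn.executemany(
    "INSERT INTO mart_neighbourhood_overview VALUES (?, ?, ?, ?)",
    [("A", 2021, 60.0, 70.0), ("B", 2021, 80.0, 50.0), ("C", 2021, 75.0, 65.0)],
)
rows = conn.execute(
    rankings_sql("livability_score", ascending=False), {"year": 2021, "top_n": 2}
).fetchall()
print(rows)  # [('B', 80.0), ('C', 75.0)]
```

An allow-list also catches metrics that exist only as aliases in other queries, since a name has to be mapped deliberately to a table that really has that column.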
def get_city_averages(year: int = 2021) -> dict[str, Any]:
"""Get city-wide average metrics.
Args:
year: Year to query.
Returns:
Dictionary with city averages for key metrics.
"""
sql = """
SELECT
AVG(livability_score) as avg_livability_score,
AVG(safety_score) as avg_safety_score,
AVG(affordability_score) as avg_affordability_score,
AVG(amenity_score) as avg_amenity_score,
SUM(population) as total_population,
AVG(median_household_income) as avg_median_income,
AVG(crime_rate_per_100k) as avg_crime_rate,
AVG(avg_rent_2bed) as avg_rent_2bed,
AVG(rent_to_income_pct) as avg_rent_to_income
FROM mart_neighbourhood_overview
WHERE year = :year
"""
df = _execute_query(sql, {"year": year})
if df.empty:
return {}
result: dict[str, Any] = {str(k): v for k, v in df.iloc[0].to_dict().items()}
# Round numeric values
for key, value in result.items():
if pd.notna(value) and isinstance(value, float):
result[key] = round(value, 2)
return result

scripts/data/__init__.py

@@ -0,0 +1 @@
"""Data loading scripts for the portfolio app."""


@@ -0,0 +1,367 @@
#!/usr/bin/env python3
"""Load Toronto neighbourhood data into the database.
Usage:
python scripts/data/load_toronto_data.py [OPTIONS]
Options:
--skip-fetch Skip API fetching, only run dbt
--skip-dbt Skip dbt run, only load data
--dry-run Show what would be done without executing
-v, --verbose Enable verbose logging
This script orchestrates:
1. Fetching data from Toronto Open Data and CMHC APIs
2. Loading data into PostgreSQL fact tables
3. Running dbt to transform staging -> intermediate -> marts
Exit codes:
0 = Success
1 = Error
"""
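The four options in the module docstring map directly onto `argparse` flags. The script's actual `main()` is outside this part of the diff, so the following is only a minimal sketch of how such a parser could look; `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for the four options listed in the module docstring."""
    parser = argparse.ArgumentParser(
        description="Load Toronto neighbourhood data into the database."
    )
    parser.add_argument("--skip-fetch", action="store_true",
                        help="Skip API fetching, only run dbt")
    parser.add_argument("--skip-dbt", action="store_true",
                        help="Skip dbt run, only load data")
    parser.add_argument("--dry-run", action="store_true",
                        help="Show what would be done without executing")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable verbose logging")
    return parser


args = build_parser().parse_args(["--skip-dbt", "-v"])
print(args.skip_dbt, args.skip_fetch, args.verbose)  # True False True
```

With `store_true`, each flag defaults to `False`, which matches the `load-data` and `load-data-only` Makefile targets differing only by `--skip-dbt`.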
import argparse
import logging
import subprocess
import sys
from datetime import date
from pathlib import Path
from typing import Any
# Add project root to path
PROJECT_ROOT = Path(__file__).parent.parent.parent
sys.path.insert(0, str(PROJECT_ROOT))
from portfolio_app.toronto.loaders import ( # noqa: E402
get_session,
load_amenities,
load_census_data,
load_crime_data,
load_neighbourhoods,
load_time_dimension,
)
from portfolio_app.toronto.parsers import ( # noqa: E402
TorontoOpenDataParser,
TorontoPoliceParser,
)
from portfolio_app.toronto.schemas import Neighbourhood # noqa: E402
# Configure logging
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s - %(levelname)s - %(message)s",
datefmt="%H:%M:%S",
)
logger = logging.getLogger(__name__)
class DataPipeline:
"""Orchestrates data loading from APIs to database to dbt."""
def __init__(self, dry_run: bool = False, verbose: bool = False):
self.dry_run = dry_run
self.verbose = verbose
self.stats: dict[str, int] = {}
if verbose:
logging.getLogger().setLevel(logging.DEBUG)
def fetch_and_load(self) -> bool:
"""Fetch data from APIs and load into database.
Returns:
True if successful, False otherwise.
"""
logger.info("Starting data fetch and load pipeline...")
try:
with get_session() as session:
# 1. Load time dimension first (for date keys)
self._load_time_dimension(session)
# 2. Load neighbourhoods (required for foreign keys)
self._load_neighbourhoods(session)
# 3. Load census data
self._load_census(session)
# 4. Load crime data
self._load_crime(session)
# 5. Load amenities
self._load_amenities(session)
session.commit()
logger.info("All data committed to database")
self._print_stats()
return True
except Exception as e:
# exc_info attaches the traceback to the log record only in verbose mode
logger.error(f"Pipeline failed: {e}", exc_info=self.verbose)
return False
def _load_time_dimension(self, session: Any) -> None:
"""Load time dimension with date range for dashboard."""
logger.info("Loading time dimension...")
if self.dry_run:
logger.info(
" [DRY RUN] Would load time dimension 2019-01-01 to 2025-12-01"
)
return
count = load_time_dimension(
start_date=date(2019, 1, 1),
end_date=date(2025, 12, 1),
session=session,
)
self.stats["time_dimension"] = count
logger.info(f" Loaded {count} time dimension records")
def _load_neighbourhoods(self, session: Any) -> None:
"""Fetch and load neighbourhood boundaries."""
logger.info("Fetching neighbourhoods from Toronto Open Data...")
if self.dry_run:
logger.info(" [DRY RUN] Would fetch and load neighbourhoods")
return
import json
parser = TorontoOpenDataParser()
raw_neighbourhoods = parser.get_neighbourhoods()
# Convert NeighbourhoodRecord to Neighbourhood schema
neighbourhoods = []
for n in raw_neighbourhoods:
# Serialize the GeoJSON geometry dict if present; despite the field name,
# geometry_wkt holds GeoJSON text here, not WKT
geometry_wkt = None
if n.geometry:
# Store as GeoJSON string for PostGIS ST_GeomFromGeoJSON
geometry_wkt = json.dumps(n.geometry)
neighbourhood = Neighbourhood(
neighbourhood_id=n.area_id,
name=n.area_name,
geometry_wkt=geometry_wkt,
population=None, # Will be filled from census data
land_area_sqkm=None,
pop_density_per_sqkm=None,
census_year=2021,
)
neighbourhoods.append(neighbourhood)
count = load_neighbourhoods(neighbourhoods, session)
self.stats["neighbourhoods"] = count
logger.info(f" Loaded {count} neighbourhoods")
def _load_census(self, session: Any) -> None:
"""Fetch and load census profile data."""
logger.info("Fetching census profiles from Toronto Open Data...")
if self.dry_run:
logger.info(" [DRY RUN] Would fetch and load census data")
return
parser = TorontoOpenDataParser()
census_records = parser.get_census_profiles(year=2021)
if not census_records:
logger.warning(" No census records fetched")
return
count = load_census_data(census_records, session)
self.stats["census"] = count
logger.info(f" Loaded {count} census records")
def _load_crime(self, session: Any) -> None:
"""Fetch and load crime statistics."""
logger.info("Fetching crime data from Toronto Police Service...")
if self.dry_run:
logger.info(" [DRY RUN] Would fetch and load crime data")
return
parser = TorontoPoliceParser()
crime_records = parser.get_crime_rates()
if not crime_records:
logger.warning(" No crime records fetched")
return
count = load_crime_data(crime_records, session)
self.stats["crime"] = count
logger.info(f" Loaded {count} crime records")
def _load_amenities(self, session: Any) -> None:
"""Fetch and load amenity data (parks, schools, childcare)."""
logger.info("Fetching amenities from Toronto Open Data...")
if self.dry_run:
logger.info(" [DRY RUN] Would fetch and load amenity data")
return
parser = TorontoOpenDataParser()
total_count = 0
# Fetch parks
try:
parks = parser.get_parks()
if parks:
count = load_amenities(parks, year=2024, session=session)
total_count += count
logger.info(f" Loaded {count} park amenities")
except Exception as e:
logger.warning(f" Failed to load parks: {e}")
# Fetch schools
try:
schools = parser.get_schools()
if schools:
count = load_amenities(schools, year=2024, session=session)
total_count += count
logger.info(f" Loaded {count} school amenities")
except Exception as e:
logger.warning(f" Failed to load schools: {e}")
# Fetch childcare centres
try:
childcare = parser.get_childcare_centres()
if childcare:
count = load_amenities(childcare, year=2024, session=session)
total_count += count
logger.info(f" Loaded {count} childcare amenities")
except Exception as e:
logger.warning(f" Failed to load childcare: {e}")
self.stats["amenities"] = total_count
def run_dbt(self) -> bool:
"""Run dbt to transform data.
Returns:
True if successful, False otherwise.
"""
logger.info("Running dbt transformations...")
dbt_project_dir = PROJECT_ROOT / "dbt"
if not dbt_project_dir.exists():
logger.error(f"dbt project directory not found: {dbt_project_dir}")
return False
if self.dry_run:
logger.info(" [DRY RUN] Would run: dbt run")
logger.info(" [DRY RUN] Would run: dbt test")
return True
try:
# Run dbt models
logger.info(" Running dbt run...")
result = subprocess.run(
["dbt", "run"],
cwd=dbt_project_dir,
capture_output=True,
text=True,
)
if result.returncode != 0:
# dbt writes most of its log output to stdout, so report both streams
logger.error(f"dbt run failed:\n{result.stdout}\n{result.stderr}")
return False
logger.info(" dbt run completed successfully")
# Run dbt tests
logger.info(" Running dbt test...")
result = subprocess.run(
["dbt", "test"],
cwd=dbt_project_dir,
capture_output=True,
text=True,
)
if result.returncode != 0:
logger.warning(f"dbt test had failures:\n{result.stdout}\n{result.stderr}")
# Don't fail on test failures, just warn
else:
logger.info(" dbt test completed successfully")
return True
except FileNotFoundError:
logger.error(
"dbt not found in PATH. Install with: pip install dbt-postgres"
)
return False
except Exception as e:
logger.error(f"dbt execution failed: {e}")
return False
def _print_stats(self) -> None:
"""Print loading statistics."""
if not self.stats:
return
logger.info("Loading statistics:")
for key, count in self.stats.items():
logger.info(f" {key}: {count} records")
def main() -> int:
"""Main entry point for the data loading script."""
parser = argparse.ArgumentParser(
description="Load Toronto neighbourhood data into the database",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=__doc__,
)
parser.add_argument(
"--skip-fetch",
action="store_true",
help="Skip API fetching, only run dbt",
)
parser.add_argument(
"--skip-dbt",
action="store_true",
help="Skip dbt run, only load data",
)
parser.add_argument(
"--dry-run",
action="store_true",
help="Show what would be done without executing",
)
parser.add_argument(
"-v",
"--verbose",
action="store_true",
help="Enable verbose logging",
)
args = parser.parse_args()
if args.skip_fetch and args.skip_dbt:
logger.error("Cannot skip both fetch and dbt - nothing to do")
return 1
pipeline = DataPipeline(dry_run=args.dry_run, verbose=args.verbose)
# Execute pipeline stages
if not args.skip_fetch and not pipeline.fetch_and_load():
return 1
if not args.skip_dbt and not pipeline.run_dbt():
return 1
logger.info("Pipeline completed successfully!")
return 0
if __name__ == "__main__":
sys.exit(main())
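
Review note: `run_dbt` repeats the same run-and-check block for `dbt run` and `dbt test`. A minimal sketch of how that block could be factored out is given below. The `run_step` helper is hypothetical and not part of this commit, and the Python interpreter stands in for the `dbt` executable so the example runs without dbt installed.

```python
import subprocess
import sys
from pathlib import Path
from typing import Optional


def run_step(cmd: list[str], cwd: Optional[Path] = None) -> tuple[bool, str]:
    """Run one external command and return (succeeded, combined output)."""
    try:
        result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    except FileNotFoundError:
        # Raised when the executable itself cannot be found on PATH
        return False, f"{cmd[0]} not found in PATH"
    # Keep both streams so the caller can log whichever one has the details
    output = (result.stdout + result.stderr).strip()
    return result.returncode == 0, output


# Stand-in commands: the interpreter plays the role of the dbt CLI here
ok, output = run_step([sys.executable, "-c", "print('models built')"])
print(ok, output)

failed, _ = run_step([sys.executable, "-c", "import sys; sys.exit(2)"])
print(failed)
```

With a helper of this shape, the caller decides per step whether a non-zero exit code is fatal (as for `dbt run`) or only a warning (as for `dbt test`).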