diff --git a/Makefile b/Makefile
index ddadfdf..fb33118 100644
--- a/Makefile
+++ b/Makefile
@@ -1,4 +1,4 @@
-.PHONY: setup docker-up docker-down db-init run test dbt-run dbt-test lint format ci deploy clean help
+.PHONY: setup docker-up docker-down db-init load-data run test dbt-run dbt-test lint format ci deploy clean help
 
 # Default target
 .DEFAULT_GOAL := help
@@ -71,6 +71,14 @@ db-reset: ## Drop and recreate database (DESTRUCTIVE)
 	@sleep 3
 	$(MAKE) db-init
 
+load-data: ## Load Toronto data from APIs and run dbt
+	@echo "$(GREEN)Loading Toronto neighbourhood data...$(NC)"
+	$(PYTHON) scripts/data/load_toronto_data.py
+
+load-data-only: ## Load Toronto data without running dbt
+	@echo "$(GREEN)Loading Toronto data (skip dbt)...$(NC)"
+	$(PYTHON) scripts/data/load_toronto_data.py --skip-dbt
+
 # =============================================================================
 # Application
 # =============================================================================
diff --git a/docs/project-lessons-learned/INDEX.md b/docs/project-lessons-learned/INDEX.md
index 813111c..9c9f479 100644
--- a/docs/project-lessons-learned/INDEX.md
+++ b/docs/project-lessons-learned/INDEX.md
@@ -10,6 +10,9 @@ This folder contains lessons learned from sprints and development work. These le
 
 | Date | Sprint/Phase | Title | Tags |
 |------|--------------|-------|------|
+| 2026-01-17 | Sprint 9-10 | [Graceful Error Handling in Service Layers](./sprint-9-10-graceful-error-handling.md) | python, postgresql, error-handling, dash, graceful-degradation, arm64 |
+| 2026-01-17 | Sprint 9-10 | [Modular Callback Structure](./sprint-9-10-modular-callback-structure.md) | dash, callbacks, architecture, python, code-organization |
+| 2026-01-17 | Sprint 9-10 | [Figure Factory Pattern](./sprint-9-10-figure-factory-pattern.md) | plotly, dash, design-patterns, python, visualization |
 | 2026-01-16 | Phase 4 | [dbt Test Syntax Deprecation](./phase-4-dbt-test-syntax.md) | dbt, testing, yaml, deprecation |
 
 ---
diff --git a/docs/project-lessons-learned/sprint-9-10-figure-factory-pattern.md b/docs/project-lessons-learned/sprint-9-10-figure-factory-pattern.md
new file mode 100644
index 0000000..b1b501c
--- /dev/null
+++ b/docs/project-lessons-learned/sprint-9-10-figure-factory-pattern.md
@@ -0,0 +1,53 @@
+# Sprint 9-10 - Figure Factory Pattern for Reusable Charts
+
+## Context
+Creating multiple chart types across 5 dashboard tabs, with consistent styling and behavior needed across all visualizations.
+
+## Problem
+Without a standardized approach, each callback would create figures inline with:
+- Duplicated styling code (colors, fonts, backgrounds)
+- Inconsistent hover templates
+- Hard-to-maintain figure creation logic
+- No reuse between tabs
+
+## Solution
+Created a `figures/` module with factory functions:
+
+```
+figures/
+├── __init__.py          # Exports all factories
+├── choropleth.py        # Map visualizations
+├── bar_charts.py        # ranking_bar, stacked_bar, horizontal_bar
+├── scatter.py           # scatter_figure, bubble_chart
+├── radar.py             # radar_figure, comparison_radar
+└── demographics.py      # age_pyramid, donut_chart
+```
+
+Factory pattern benefits:
+1. **Consistent styling** - dark theme applied once
+2. **Type-safe interfaces** - clear parameters for each chart type
+3. **Easy testing** - factories can be unit tested with sample data
+4. **Reusability** - same factory used across multiple tabs
+
+Example factory signature:
+```python
+def create_ranking_bar(
+    data: list[dict],
+    name_column: str,
+    value_column: str,
+    title: str | None = None,
+    top_n: int = 10,
+    bottom_n: int = 10,
+    color_top: str = "#4CAF50",
+    color_bottom: str = "#F44336",
+) -> go.Figure:
+```
+
+## Prevention
+- **Create factories early** - before implementing callbacks
+- **Design generic interfaces** - factories should work with any data matching the schema
+- **Apply styling in one place** - use constants for colors, fonts
+- **Test factories independently** - with synthetic data before integration
+
+## Tags
+plotly, dash, design-patterns, python, visualization, reusability, code-organization
diff --git a/docs/project-lessons-learned/sprint-9-10-graceful-error-handling.md b/docs/project-lessons-learned/sprint-9-10-graceful-error-handling.md
new file mode 100644
index 0000000..0191a56
--- /dev/null
+++ b/docs/project-lessons-learned/sprint-9-10-graceful-error-handling.md
@@ -0,0 +1,34 @@
+# Sprint 9-10 - Graceful Error Handling in Service Layers
+
+## Context
+Building the Toronto Neighbourhood Dashboard with a service layer that queries PostgreSQL/PostGIS dbt marts to provide data to Dash callbacks.
+
+## Problem
+Initial service layer implementation let database connection errors propagate as unhandled exceptions. When the PostGIS Docker container was unavailable (common on ARM64 systems where the x86_64 image fails), the entire dashboard would crash instead of gracefully degrading.
+
+## Solution
+Wrapped database queries in try/except blocks to return empty DataFrames/lists/dicts when the database is unavailable:
+
+```python
+def _execute_query(sql: str, params: dict | None = None) -> pd.DataFrame:
+    try:
+        engine = get_engine()
+        with engine.connect() as conn:
+            return pd.read_sql(text(sql), conn, params=params)
+    except Exception:
+        return pd.DataFrame()
+```
+
+This allows:
+1. Dashboard to load and display empty states
+2. Development/testing without running database
+3. Graceful degradation in production
+
+## Prevention
+- **Always design service layers with graceful degradation** - assume external dependencies can fail
+- **Return empty collections, not exceptions** - let UI components handle empty states
+- **Test without database** - verify the app doesn't crash when DB is unavailable
+- **Consider ARM64 compatibility** - PostGIS images may not support all platforms
+
+## Tags
+python, postgresql, service-layer, error-handling, dash, graceful-degradation, arm64
diff --git a/docs/project-lessons-learned/sprint-9-10-modular-callback-structure.md b/docs/project-lessons-learned/sprint-9-10-modular-callback-structure.md
new file mode 100644
index 0000000..fd8814a
--- /dev/null
+++ b/docs/project-lessons-learned/sprint-9-10-modular-callback-structure.md
@@ -0,0 +1,45 @@
+# Sprint 9-10 - Modular Callback Structure for Multi-Tab Dashboards
+
+## Context
+Implementing a 5-tab Toronto Neighbourhood Dashboard with multiple callbacks per tab (map updates, chart updates, KPI updates, selection handling).
+
+## Problem
+Initial callback implementation approach would have placed all callbacks in a single file, leading to:
+- A monolithic file with 500+ lines
+- Difficult-to-navigate code
+- Callbacks for different tabs interleaved
+- Testing difficulties
+
+## Solution
+Organized callbacks into three focused modules:
+
+```
+callbacks/
+├── __init__.py              # Imports all modules to register callbacks
+├── map_callbacks.py         # Choropleth updates, map click handling
+├── chart_callbacks.py       # Supporting chart updates (scatter, trend, donut)
+└── selection_callbacks.py   # Dropdown population, KPI updates
+```
+
+Key patterns:
+1. **Group by responsibility**, not by tab - all map-related callbacks together
+2. **Use noqa comments** for imports that register callbacks as side effects
+3. **Share helper functions** (like `_empty_chart()`) within modules
+
+```python
+# callbacks/__init__.py
+from . import (
+    chart_callbacks,  # noqa: F401
+    map_callbacks,  # noqa: F401
+    selection_callbacks,  # noqa: F401
+)
+```
+
+## Prevention
+- **Plan callback organization before implementation** - sketch which callbacks go where
+- **Group by function, not by feature** - keeps related logic together
+- **Keep modules under 400 lines** - split if exceeding
+- **Test imports early** - verify callbacks register correctly
+
+## Tags
+dash, callbacks, architecture, python, code-organization, maintainability
diff --git a/portfolio_app/figures/__init__.py b/portfolio_app/figures/__init__.py
index bf9dabc..fe22939 100644
--- a/portfolio_app/figures/__init__.py
+++ b/portfolio_app/figures/__init__.py
@@ -1,9 +1,27 @@
 """Plotly figure factories for data visualization."""
 
+from .bar_charts import (
+    create_horizontal_bar,
+    create_ranking_bar,
+    create_stacked_bar,
+)
 from .choropleth import (
     create_choropleth_figure,
     create_zone_map,
 )
+from .demographics import (
+    create_age_pyramid,
+    create_donut_chart,
+    create_income_distribution,
+)
+from .radar import (
+    create_comparison_radar,
+    create_radar_figure,
+)
+from .scatter import (
+    create_bubble_chart,
+    create_scatter_figure,
+)
 from .summary_cards import create_metric_card_figure, create_summary_metrics
 from .time_series import (
     add_policy_markers,
@@ -26,4 +44,18 @@ __all__ = [
     # Summary
     "create_metric_card_figure",
     "create_summary_metrics",
+    # Bar charts
+    "create_ranking_bar",
+    "create_stacked_bar",
+    "create_horizontal_bar",
+    # Scatter plots
+    "create_scatter_figure",
+    "create_bubble_chart",
+    # Radar charts
+    "create_radar_figure",
+    "create_comparison_radar",
+    # Demographics
+    "create_age_pyramid",
+    "create_donut_chart",
+    "create_income_distribution",
 ]
diff --git a/portfolio_app/figures/bar_charts.py b/portfolio_app/figures/bar_charts.py
new file mode 100644
index 0000000..692ab45
--- /dev/null
+++ b/portfolio_app/figures/bar_charts.py
@@ -0,0 +1,238 @@
+"""Bar chart figure factories for dashboard visualizations."""
+
+from typing import Any
+
+import pandas as pd
+import plotly.express as px
+import plotly.graph_objects as go
+
+
+def create_ranking_bar(
+    data: list[dict[str, Any]],
+    name_column: str,
+    value_column: str,
+    title: str | None = None,
+    top_n: int = 10,
+    bottom_n: int = 10,
+    color_top: str = "#4CAF50",
+    color_bottom: str = "#F44336",
+    value_format: str = ",.0f",
+) -> go.Figure:
+    """Create horizontal bar chart showing top and bottom rankings.
+
+    Args:
+        data: List of data records.
+        name_column: Column name for labels.
+        value_column: Column name for values.
+        title: Optional chart title.
+        top_n: Number of top items to show.
+        bottom_n: Number of bottom items to show.
+        color_top: Color for top performers.
+        color_bottom: Color for bottom performers.
+        value_format: Number format string for values.
+
+    Returns:
+        Plotly Figure object.
+    """
+    if not data:
+        return _create_empty_figure(title or "Rankings")
+
+    df = pd.DataFrame(data).sort_values(value_column, ascending=False)
+
+    # Get top and bottom
+    top_df = df.head(top_n).copy()
+    bottom_df = df.tail(bottom_n).copy()
+
+    top_df["group"] = "Top"
+    bottom_df["group"] = "Bottom"
+
+    # Combine with gap in the middle
+    combined = pd.concat([top_df, bottom_df])
+    combined["color"] = combined["group"].map(
+        {"Top": color_top, "Bottom": color_bottom}
+    )
+
+    fig = go.Figure()
+
+    # Add top bars
+    fig.add_trace(
+        go.Bar(
+            y=top_df[name_column],
+            x=top_df[value_column],
+            orientation="h",
+            marker_color=color_top,
+            name="Top",
+            text=top_df[value_column].apply(lambda x: f"{x:{value_format}}"),
+            textposition="auto",
+            hovertemplate=f"%{{y}}<br>{value_column}: %{{x:{value_format}}}",
+        )
+    )
+
+    # Add bottom bars
+    fig.add_trace(
+        go.Bar(
+            y=bottom_df[name_column],
+            x=bottom_df[value_column],
+            orientation="h",
+            marker_color=color_bottom,
+            name="Bottom",
+            text=bottom_df[value_column].apply(lambda x: f"{x:{value_format}}"),
+            textposition="auto",
+            hovertemplate=f"%{{y}}<br>{value_column}: %{{x:{value_format}}}",
+        )
+    )
+
+    fig.update_layout(
+        title=title,
+        barmode="group",
+        showlegend=True,
+        legend={"orientation": "h", "yanchor": "bottom", "y": 1.02},
+        paper_bgcolor="rgba(0,0,0,0)",
+        plot_bgcolor="rgba(0,0,0,0)",
+        font_color="#c9c9c9",
+        xaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": None},
+        yaxis={"autorange": "reversed", "title": None},
+        margin={"l": 10, "r": 10, "t": 40, "b": 10},
+    )
+
+    return fig
+
+
+def create_stacked_bar(
+    data: list[dict[str, Any]],
+    x_column: str,
+    value_column: str,
+    category_column: str,
+    title: str | None = None,
+    color_map: dict[str, str] | None = None,
+    show_percentages: bool = False,
+) -> go.Figure:
+    """Create stacked bar chart for breakdown visualizations.
+
+    Args:
+        data: List of data records.
+        x_column: Column name for x-axis categories.
+        value_column: Column name for values.
+        category_column: Column name for stacking categories.
+        title: Optional chart title.
+        color_map: Mapping of category to color.
+        show_percentages: Whether to normalize to 100%.
+
+    Returns:
+        Plotly Figure object.
+    """
+    if not data:
+        return _create_empty_figure(title or "Breakdown")
+
+    df = pd.DataFrame(data)
+
+    # Default color scheme
+    if color_map is None:
+        categories = df[category_column].unique()
+        colors = px.colors.qualitative.Set2[: len(categories)]
+        color_map = dict(zip(categories, colors, strict=False))
+
+    fig = px.bar(
+        df,
+        x=x_column,
+        y=value_column,
+        color=category_column,
+        color_discrete_map=color_map,
+        barmode="stack",
+        text=value_column if not show_percentages else None,
+    )
+
+    if show_percentages:
+        fig.update_traces(texttemplate="%{y:.1f}%", textposition="inside")
+
+    fig.update_layout(
+        title=title,
+        paper_bgcolor="rgba(0,0,0,0)",
+        plot_bgcolor="rgba(0,0,0,0)",
+        font_color="#c9c9c9",
+        xaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": None},
+        yaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": None},
+        legend={"orientation": "h", "yanchor": "bottom", "y": 1.02},
+        margin={"l": 10, "r": 10, "t": 60, "b": 10},
+    )
+
+    return fig
+
+
+def create_horizontal_bar(
+    data: list[dict[str, Any]],
+    name_column: str,
+    value_column: str,
+    title: str | None = None,
+    color: str = "#2196F3",
+    value_format: str = ",.0f",
+    sort: bool = True,
+) -> go.Figure:
+    """Create simple horizontal bar chart.
+
+    Args:
+        data: List of data records.
+        name_column: Column name for labels.
+        value_column: Column name for values.
+        title: Optional chart title.
+        color: Bar color.
+        value_format: Number format string.
+        sort: Whether to sort by value descending.
+
+    Returns:
+        Plotly Figure object.
+    """
+    if not data:
+        return _create_empty_figure(title or "Bar Chart")
+
+    df = pd.DataFrame(data)
+
+    if sort:
+        df = df.sort_values(value_column, ascending=True)
+
+    fig = go.Figure(
+        go.Bar(
+            y=df[name_column],
+            x=df[value_column],
+            orientation="h",
+            marker_color=color,
+            text=df[value_column].apply(lambda x: f"{x:{value_format}}"),
+            textposition="outside",
+            hovertemplate=f"%{{y}}<br>Value: %{{x:{value_format}}}",
+        )
+    )
+
+    fig.update_layout(
+        title=title,
+        paper_bgcolor="rgba(0,0,0,0)",
+        plot_bgcolor="rgba(0,0,0,0)",
+        font_color="#c9c9c9",
+        xaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": None},
+        yaxis={"title": None},
+        margin={"l": 10, "r": 10, "t": 40, "b": 10},
+    )
+
+    return fig
+
+
+def _create_empty_figure(title: str) -> go.Figure:
+    """Create an empty figure with a message."""
+    fig = go.Figure()
+    fig.add_annotation(
+        text="No data available",
+        xref="paper",
+        yref="paper",
+        x=0.5,
+        y=0.5,
+        showarrow=False,
+        font={"size": 14, "color": "#888888"},
+    )
+    fig.update_layout(
+        title=title,
+        paper_bgcolor="rgba(0,0,0,0)",
+        plot_bgcolor="rgba(0,0,0,0)",
+        font_color="#c9c9c9",
+        xaxis={"visible": False},
+        yaxis={"visible": False},
+    )
+    return fig
diff --git a/portfolio_app/figures/demographics.py b/portfolio_app/figures/demographics.py
new file mode 100644
index 0000000..ceced5b
--- /dev/null
+++ b/portfolio_app/figures/demographics.py
@@ -0,0 +1,240 @@
+"""Demographics-specific chart factories."""
+
+from typing import Any
+
+import pandas as pd
+import plotly.graph_objects as go
+
+
+def create_age_pyramid(
+    data: list[dict[str, Any]],
+    age_groups: list[str],
+    male_column: str = "male",
+    female_column: str = "female",
+    title: str | None = None,
+) -> go.Figure:
+    """Create population pyramid by age and gender.
+
+    Args:
+        data: List with one record per age group containing male/female counts.
+        age_groups: List of age group labels in order (youngest to oldest).
+        male_column: Column name for male population.
+        female_column: Column name for female population.
+        title: Optional chart title.
+
+    Returns:
+        Plotly Figure object.
+    """
+    if not data or not age_groups:
+        return _create_empty_figure(title or "Age Distribution")
+
+    df = pd.DataFrame(data)
+
+    # Ensure data is ordered by age groups
+    if "age_group" in df.columns:
+        df["age_order"] = df["age_group"].apply(
+            lambda x: age_groups.index(x) if x in age_groups else -1
+        )
+        df = df.sort_values("age_order")
+
+    male_values = df[male_column].tolist() if male_column in df.columns else []
+    female_values = df[female_column].tolist() if female_column in df.columns else []
+
+    # Make male values negative for pyramid effect
+    male_values_neg = [-v for v in male_values]
+
+    fig = go.Figure()
+
+    # Male bars (left side, negative values)
+    fig.add_trace(
+        go.Bar(
+            y=age_groups,
+            x=male_values_neg,
+            orientation="h",
+            name="Male",
+            marker_color="#2196F3",
+            hovertemplate="%{y}<br>Male: %{customdata:,}",
+            customdata=male_values,
+        )
+    )
+
+    # Female bars (right side, positive values)
+    fig.add_trace(
+        go.Bar(
+            y=age_groups,
+            x=female_values,
+            orientation="h",
+            name="Female",
+            marker_color="#E91E63",
+            hovertemplate="%{y}<br>Female: %{x:,}",
+        )
+    )
+
+    # Calculate max for symmetric axis
+    max_val = max(max(male_values, default=0), max(female_values, default=0))
+
+    fig.update_layout(
+        title=title,
+        barmode="overlay",
+        bargap=0.1,
+        paper_bgcolor="rgba(0,0,0,0)",
+        plot_bgcolor="rgba(0,0,0,0)",
+        font_color="#c9c9c9",
+        xaxis={
+            "title": "Population",
+            "gridcolor": "rgba(128,128,128,0.2)",
+            "range": [-max_val * 1.1, max_val * 1.1],
+            "tickvals": [-max_val, -max_val / 2, 0, max_val / 2, max_val],
+            "ticktext": [
+                f"{max_val:,.0f}",
+                f"{max_val / 2:,.0f}",
+                "0",
+                f"{max_val / 2:,.0f}",
+                f"{max_val:,.0f}",
+            ],
+        },
+        yaxis={"title": None, "gridcolor": "rgba(128,128,128,0.2)"},
+        legend={"orientation": "h", "yanchor": "bottom", "y": 1.02},
+        margin={"l": 10, "r": 10, "t": 60, "b": 10},
+    )
+
+    return fig
+
+
+def create_donut_chart(
+    data: list[dict[str, Any]],
+    name_column: str,
+    value_column: str,
+    title: str | None = None,
+    colors: list[str] | None = None,
+    hole_size: float = 0.4,
+) -> go.Figure:
+    """Create donut chart for percentage breakdowns.
+
+    Args:
+        data: List of data records with name and value.
+        name_column: Column name for labels.
+        value_column: Column name for values.
+        title: Optional chart title.
+        colors: List of colors for segments.
+        hole_size: Size of center hole (0-1).
+
+    Returns:
+        Plotly Figure object.
+    """
+    if not data:
+        return _create_empty_figure(title or "Distribution")
+
+    df = pd.DataFrame(data)
+
+    if colors is None:
+        colors = [
+            "#2196F3",
+            "#4CAF50",
+            "#FF9800",
+            "#E91E63",
+            "#9C27B0",
+            "#00BCD4",
+            "#FFC107",
+            "#795548",
+        ]
+
+    fig = go.Figure(
+        go.Pie(
+            labels=df[name_column],
+            values=df[value_column],
+            hole=hole_size,
+            marker_colors=colors[: len(df)],
+            textinfo="percent+label",
+            textposition="outside",
+            hovertemplate="%{label}<br>%{value:,} (%{percent})",
+        )
+    )
+
+    fig.update_layout(
+        title=title,
+        paper_bgcolor="rgba(0,0,0,0)",
+        font_color="#c9c9c9",
+        showlegend=False,
+        margin={"l": 10, "r": 10, "t": 60, "b": 10},
+    )
+
+    return fig
+
+
+def create_income_distribution(
+    data: list[dict[str, Any]],
+    bracket_column: str,
+    count_column: str,
+    title: str | None = None,
+    color: str = "#4CAF50",
+) -> go.Figure:
+    """Create histogram-style bar chart for income distribution.
+
+    Args:
+        data: List of data records with income brackets and counts.
+        bracket_column: Column name for income brackets.
+        count_column: Column name for household counts.
+        title: Optional chart title.
+        color: Bar color.
+
+    Returns:
+        Plotly Figure object.
+    """
+    if not data:
+        return _create_empty_figure(title or "Income Distribution")
+
+    df = pd.DataFrame(data)
+
+    fig = go.Figure(
+        go.Bar(
+            x=df[bracket_column],
+            y=df[count_column],
+            marker_color=color,
+            text=df[count_column].apply(lambda x: f"{x:,}"),
+            textposition="outside",
+            hovertemplate="%{x}<br>Households: %{y:,}",
+        )
+    )
+
+    fig.update_layout(
+        title=title,
+        paper_bgcolor="rgba(0,0,0,0)",
+        plot_bgcolor="rgba(0,0,0,0)",
+        font_color="#c9c9c9",
+        xaxis={
+            "title": "Income Bracket",
+            "gridcolor": "rgba(128,128,128,0.2)",
+            "tickangle": -45,
+        },
+        yaxis={
+            "title": "Households",
+            "gridcolor": "rgba(128,128,128,0.2)",
+        },
+        margin={"l": 10, "r": 10, "t": 60, "b": 80},
+    )
+
+    return fig
+
+
+def _create_empty_figure(title: str) -> go.Figure:
+    """Create an empty figure with a message."""
+    fig = go.Figure()
+    fig.add_annotation(
+        text="No data available",
+        xref="paper",
+        yref="paper",
+        x=0.5,
+        y=0.5,
+        showarrow=False,
+        font={"size": 14, "color": "#888888"},
+    )
+    fig.update_layout(
+        title=title,
+        paper_bgcolor="rgba(0,0,0,0)",
+        plot_bgcolor="rgba(0,0,0,0)",
+        font_color="#c9c9c9",
+        xaxis={"visible": False},
+        yaxis={"visible": False},
+    )
+    return fig
diff --git a/portfolio_app/figures/radar.py b/portfolio_app/figures/radar.py
new file mode 100644
index 0000000..35f8e75
--- /dev/null
+++ b/portfolio_app/figures/radar.py
@@ -0,0 +1,166 @@
+"""Radar/spider chart figure factory for multi-metric comparison."""
+
+from typing import Any
+
+import plotly.graph_objects as go
+
+
+def create_radar_figure(
+    data: list[dict[str, Any]],
+    metrics: list[str],
+    name_column: str | None = None,
+    title: str | None = None,
+    fill: bool = True,
+    colors: list[str] | None = None,
+) -> go.Figure:
+    """Create radar/spider chart for multi-axis comparison.
+
+    Each record in data represents one entity (e.g., a neighbourhood)
+    with values for each metric that will be plotted on a separate axis.
+
+    Args:
+        data: List of data records, each with values for the metrics.
+        metrics: List of metric column names to display on radar axes.
+        name_column: Column name for entity labels.
+        title: Optional chart title.
+        fill: Whether to fill the radar polygons.
+        colors: List of colors for each data series.
+
+    Returns:
+        Plotly Figure object.
+    """
+    if not data or not metrics:
+        return _create_empty_figure(title or "Radar Chart")
+
+    # Default colors
+    if colors is None:
+        colors = [
+            "#2196F3",
+            "#4CAF50",
+            "#FF9800",
+            "#E91E63",
+            "#9C27B0",
+            "#00BCD4",
+        ]
+
+    fig = go.Figure()
+
+    # Format axis labels
+    axis_labels = [m.replace("_", " ").title() for m in metrics]
+
+    for i, record in enumerate(data):
+        values = [record.get(m, 0) or 0 for m in metrics]
+        # Close the radar polygon
+        values_closed = values + [values[0]]
+        labels_closed = axis_labels + [axis_labels[0]]
+
+        name = (
+            record.get(name_column, f"Series {i + 1}")
+            if name_column
+            else f"Series {i + 1}"
+        )
+        color = colors[i % len(colors)]
+
+        fig.add_trace(
+            go.Scatterpolar(
+                r=values_closed,
+                theta=labels_closed,
+                name=name,
+                line={"color": color, "width": 2},
+                fill="toself" if fill else None,
+                fillcolor=f"rgba{_hex_to_rgba(color, 0.2)}" if fill else None,
+                hovertemplate="%{theta}: %{r:.1f}",
+            )
+        )
+
+    fig.update_layout(
+        title=title,
+        polar={
+            "radialaxis": {
+                "visible": True,
+                "gridcolor": "rgba(128,128,128,0.3)",
+                "linecolor": "rgba(128,128,128,0.3)",
+                "tickfont": {"color": "#c9c9c9"},
+            },
+            "angularaxis": {
+                "gridcolor": "rgba(128,128,128,0.3)",
+                "linecolor": "rgba(128,128,128,0.3)",
+                "tickfont": {"color": "#c9c9c9"},
+            },
+            "bgcolor": "rgba(0,0,0,0)",
+        },
+        paper_bgcolor="rgba(0,0,0,0)",
+        font_color="#c9c9c9",
+        showlegend=len(data) > 1,
+        legend={"orientation": "h", "yanchor": "bottom", "y": -0.2},
+        margin={"l": 40, "r": 40, "t": 60, "b": 40},
+    )
+
+    return fig
+
+
+def create_comparison_radar(
+    selected_data: dict[str, Any],
+    average_data: dict[str, Any],
+    metrics: list[str],
+    selected_name: str = "Selected",
+    average_name: str = "City Average",
+    title: str | None = None,
+) -> go.Figure:
+    """Create radar chart comparing a selection to city average.
+
+    Args:
+        selected_data: Data for the selected entity.
+        average_data: Data for the city average.
+        metrics: List of metric column names.
+        selected_name: Label for selected entity.
+        average_name: Label for average.
+        title: Optional chart title.
+
+    Returns:
+        Plotly Figure object.
+    """
+    if not selected_data or not average_data:
+        return _create_empty_figure(title or "Comparison")
+
+    data = [
+        {**selected_data, "__name__": selected_name},
+        {**average_data, "__name__": average_name},
+    ]
+
+    return create_radar_figure(
+        data=data,
+        metrics=metrics,
+        name_column="__name__",
+        title=title,
+        colors=["#4CAF50", "#9E9E9E"],
+    )
+
+
+def _hex_to_rgba(hex_color: str, alpha: float) -> tuple[int, int, int, float]:
+    """Convert hex color to RGBA tuple."""
+    hex_color = hex_color.lstrip("#")
+    r = int(hex_color[0:2], 16)
+    g = int(hex_color[2:4], 16)
+    b = int(hex_color[4:6], 16)
+    return (r, g, b, alpha)
+
+
+def _create_empty_figure(title: str) -> go.Figure:
+    """Create an empty figure with a message."""
+    fig = go.Figure()
+    fig.add_annotation(
+        text="No data available",
+        xref="paper",
+        yref="paper",
+        x=0.5,
+        y=0.5,
+        showarrow=False,
+        font={"size": 14, "color": "#888888"},
+    )
+    fig.update_layout(
+        title=title,
+        paper_bgcolor="rgba(0,0,0,0)",
+        font_color="#c9c9c9",
+    )
+    return fig
diff --git a/portfolio_app/figures/scatter.py b/portfolio_app/figures/scatter.py
new file mode 100644
index 0000000..1e1c6ef
--- /dev/null
+++ b/portfolio_app/figures/scatter.py
@@ -0,0 +1,184 @@
+"""Scatter plot figure factory for correlation views."""
+
+from typing import Any
+
+import pandas as pd
+import plotly.express as px
+import plotly.graph_objects as go
+
+
+def create_scatter_figure(
+    data: list[dict[str, Any]],
+    x_column: str,
+    y_column: str,
+    name_column: str | None = None,
+    size_column: str | None = None,
+    color_column: str | None = None,
+    title: str | None = None,
+    x_title: str | None = None,
+    y_title: str | None = None,
+    trendline: bool = False,
+    color_scale: str = "Blues",
+) -> go.Figure:
+    """Create scatter plot for correlation visualization.
+
+    Args:
+        data: List of data records.
+        x_column: Column name for x-axis values.
+        y_column: Column name for y-axis values.
+        name_column: Column name for point labels (hover).
+        size_column: Column name for point sizes.
+        color_column: Column name for color encoding.
+        title: Optional chart title.
+        x_title: X-axis title.
+        y_title: Y-axis title.
+        trendline: Whether to add OLS trendline.
+        color_scale: Plotly color scale for continuous colors.
+
+    Returns:
+        Plotly Figure object.
+    """
+    if not data:
+        return _create_empty_figure(title or "Scatter Plot")
+
+    df = pd.DataFrame(data)
+
+    # Build hover_data
+    hover_data = {}
+    if name_column and name_column in df.columns:
+        hover_data[name_column] = True
+
+    # Create scatter plot
+    fig = px.scatter(
+        df,
+        x=x_column,
+        y=y_column,
+        size=size_column if size_column and size_column in df.columns else None,
+        color=color_column if color_column and color_column in df.columns else None,
+        color_continuous_scale=color_scale,
+        hover_name=name_column,
+        trendline="ols" if trendline else None,
+        opacity=0.7,
+    )
+
+    # Style the markers
+    fig.update_traces(
+        marker={
+            "line": {"width": 1, "color": "rgba(255,255,255,0.3)"},
+        },
+    )
+
+    # Trendline styling
+    if trendline:
+        fig.update_traces(
+            selector={"mode": "lines"},
+            line={"color": "#FF9800", "dash": "dash", "width": 2},
+        )
+
+    fig.update_layout(
+        title=title,
+        paper_bgcolor="rgba(0,0,0,0)",
+        plot_bgcolor="rgba(0,0,0,0)",
+        font_color="#c9c9c9",
+        xaxis={
+            "gridcolor": "rgba(128,128,128,0.2)",
+            "title": x_title or x_column.replace("_", " ").title(),
+            "zeroline": False,
+        },
+        yaxis={
+            "gridcolor": "rgba(128,128,128,0.2)",
+            "title": y_title or y_column.replace("_", " ").title(),
+            "zeroline": False,
+        },
+        margin={"l": 10, "r": 10, "t": 40, "b": 10},
+        showlegend=color_column is not None,
+    )
+
+    return fig
+
+
+def create_bubble_chart(
+    data: list[dict[str, Any]],
+    x_column: str,
+    y_column: str,
+    size_column: str,
+    name_column: str | None = None,
+    color_column: str | None = None,
+    title: str | None = None,
+    x_title: str | None = None,
+    y_title: str | None = None,
+    size_max: int = 50,
+) -> go.Figure:
+    """Create bubble chart with sized markers.
+
+    Args:
+        data: List of data records.
+        x_column: Column name for x-axis values.
+        y_column: Column name for y-axis values.
+        size_column: Column name for bubble sizes.
+        name_column: Column name for labels.
+        color_column: Column name for colors.
+        title: Optional chart title.
+        x_title: X-axis title.
+        y_title: Y-axis title.
+        size_max: Maximum marker size in pixels.
+
+    Returns:
+        Plotly Figure object.
+    """
+    if not data:
+        return _create_empty_figure(title or "Bubble Chart")
+
+    df = pd.DataFrame(data)
+
+    fig = px.scatter(
+        df,
+        x=x_column,
+        y=y_column,
+        size=size_column,
+        color=color_column,
+        hover_name=name_column,
+        size_max=size_max,
+        opacity=0.7,
+    )
+
+    fig.update_layout(
+        title=title,
+        paper_bgcolor="rgba(0,0,0,0)",
+        plot_bgcolor="rgba(0,0,0,0)",
+        font_color="#c9c9c9",
+        xaxis={
+            "gridcolor": "rgba(128,128,128,0.2)",
+            "title": x_title or x_column.replace("_", " ").title(),
+        },
+        yaxis={
+            "gridcolor": "rgba(128,128,128,0.2)",
+            "title": y_title or y_column.replace("_", " ").title(),
+        },
+        margin={"l": 10, "r": 10, "t": 40, "b": 10},
+    )
+
+    return fig
+
+
+def _create_empty_figure(title: str) -> go.Figure:
+    """Create an empty figure with a message."""
+    fig = go.Figure()
+    fig.add_annotation(
+        text="No data available",
+        xref="paper",
+        yref="paper",
+        x=0.5,
+        y=0.5,
+        showarrow=False,
+        font={"size": 14, "color": "#888888"},
+    )
+    fig.update_layout(
+        title=title,
+        paper_bgcolor="rgba(0,0,0,0)",
+        plot_bgcolor="rgba(0,0,0,0)",
+        font_color="#c9c9c9",
+        xaxis={"visible": False},
+        yaxis={"visible": False},
+    )
+    return fig
diff --git a/portfolio_app/pages/toronto/callbacks/__init__.py b/portfolio_app/pages/toronto/callbacks/__init__.py
index 20e7941..0fe06c7 100644
--- a/portfolio_app/pages/toronto/callbacks/__init__.py
+++ 
b/portfolio_app/pages/toronto/callbacks/__init__.py @@ -1,1558 +1,14 @@ -"""Toronto dashboard callbacks.""" +"""Toronto dashboard callbacks. -from pathlib import Path +Registers all callbacks for the neighbourhood dashboard including: +- Map interactions (choropleth click, metric selection) +- Chart updates (supporting visualizations) +- Selection handling (neighbourhood dropdown, details panels) +""" -import plotly.graph_objects as go -from dash import Input, Output, callback - -from portfolio_app.figures.choropleth import create_choropleth_figure -from portfolio_app.figures.time_series import ( - create_market_comparison_chart, - create_price_time_series, - create_volume_time_series, +# Import all callback modules to register them with Dash +from . import ( + chart_callbacks, # noqa: F401 + map_callbacks, # noqa: F401 + selection_callbacks, # noqa: F401 ) -from portfolio_app.toronto.parsers.geo import CMHCZoneParser, NeighbourhoodParser - -# Load CMHC zones GeoJSON for rental choropleth maps -_CMHC_ZONES_PATH = Path("data/toronto/raw/geo/cmhc_zones.geojson") -_cmhc_parser = CMHCZoneParser(_CMHC_ZONES_PATH) if _CMHC_ZONES_PATH.exists() else None -CMHC_ZONES_GEOJSON = _cmhc_parser.get_geojson_for_choropleth() if _cmhc_parser else None - -# Load Toronto neighbourhoods GeoJSON for choropleth maps -_NEIGHBOURHOODS_PATH = Path("data/toronto/raw/geo/toronto_neighbourhoods.geojson") -_neighbourhood_parser = ( - NeighbourhoodParser(_NEIGHBOURHOODS_PATH) if _NEIGHBOURHOODS_PATH.exists() else None -) -NEIGHBOURHOODS_GEOJSON = ( - _neighbourhood_parser.get_geojson_for_choropleth() - if _neighbourhood_parser - else None -) - -# Sample data for all 158 City of Toronto neighbourhoods -SAMPLE_PURCHASE_DATA = [ - { - "neighbourhood_id": 1, - "name": "West Humber-Clairville", - "avg_price": 972000, - "median_price": 894000, - "sales_count": 147, - "avg_dom": 28, - }, - { - "neighbourhood_id": 2, - "name": "Mount Olive-Silverstone-Jamestown", - "avg_price": 1488000, - 
"median_price": 1369000, - "sales_count": 117, - "avg_dom": 27, - }, - { - "neighbourhood_id": 3, - "name": "Thistletown-Beaumond Heights", - "avg_price": 1504000, - "median_price": 1384000, - "sales_count": 303, - "avg_dom": 18, - }, - { - "neighbourhood_id": 4, - "name": "Rexdale-Kipling", - "avg_price": 1597000, - "median_price": 1469000, - "sales_count": 283, - "avg_dom": 18, - }, - { - "neighbourhood_id": 5, - "name": "Elms-Old Rexdale", - "avg_price": 2088000, - "median_price": 1921000, - "sales_count": 274, - "avg_dom": 23, - }, - { - "neighbourhood_id": 6, - "name": "Kingsview Village-The Westway", - "avg_price": 1166000, - "median_price": 1072000, - "sales_count": 204, - "avg_dom": 20, - }, - { - "neighbourhood_id": 7, - "name": "Willowridge-Martingrove-Richview", - "avg_price": 885000, - "median_price": 814000, - "sales_count": 406, - "avg_dom": 20, - }, - { - "neighbourhood_id": 8, - "name": "Humber Heights-Westmount", - "avg_price": 982000, - "median_price": 903000, - "sales_count": 109, - "avg_dom": 14, - }, - { - "neighbourhood_id": 9, - "name": "Edenbridge-Humber Valley", - "avg_price": 2039000, - "median_price": 1876000, - "sales_count": 317, - "avg_dom": 23, - }, - { - "neighbourhood_id": 10, - "name": "Princess-Rosethorn", - "avg_price": 2052000, - "median_price": 1887000, - "sales_count": 84, - "avg_dom": 17, - }, - { - "neighbourhood_id": 11, - "name": "Eringate-Centennial-West Deane", - "avg_price": 1002000, - "median_price": 922000, - "sales_count": 306, - "avg_dom": 15, - }, - { - "neighbourhood_id": 12, - "name": "Markland Wood", - "avg_price": 1099000, - "median_price": 1012000, - "sales_count": 444, - "avg_dom": 21, - }, - { - "neighbourhood_id": 13, - "name": "Etobicoke West Mall", - "avg_price": 1042000, - "median_price": 958000, - "sales_count": 403, - "avg_dom": 18, - }, - { - "neighbourhood_id": 15, - "name": "Kingsway South", - "avg_price": 817000, - "median_price": 751000, - "sales_count": 368, - "avg_dom": 22, - }, - { - 
"neighbourhood_id": 16, - "name": "Stonegate-Queensway", - "avg_price": 760000, - "median_price": 699000, - "sales_count": 445, - "avg_dom": 16, - }, - { - "neighbourhood_id": 18, - "name": "New Toronto", - "avg_price": 1471000, - "median_price": 1354000, - "sales_count": 127, - "avg_dom": 19, - }, - { - "neighbourhood_id": 19, - "name": "Long Branch", - "avg_price": 735000, - "median_price": 677000, - "sales_count": 161, - "avg_dom": 24, - }, - { - "neighbourhood_id": 20, - "name": "Alderwood", - "avg_price": 1808000, - "median_price": 1663000, - "sales_count": 81, - "avg_dom": 19, - }, - { - "neighbourhood_id": 21, - "name": "Humber Summit", - "avg_price": 1885000, - "median_price": 1734000, - "sales_count": 194, - "avg_dom": 27, - }, - { - "neighbourhood_id": 22, - "name": "Humbermede", - "avg_price": 1552000, - "median_price": 1428000, - "sales_count": 385, - "avg_dom": 22, - }, - { - "neighbourhood_id": 23, - "name": "Pelmo Park-Humberlea", - "avg_price": 1956000, - "median_price": 1800000, - "sales_count": 352, - "avg_dom": 12, - }, - { - "neighbourhood_id": 24, - "name": "Black Creek", - "avg_price": 1018000, - "median_price": 937000, - "sales_count": 437, - "avg_dom": 19, - }, - { - "neighbourhood_id": 25, - "name": "Glenfield-Jane Heights", - "avg_price": 753000, - "median_price": 693000, - "sales_count": 197, - "avg_dom": 22, - }, - { - "neighbourhood_id": 27, - "name": "York University Heights", - "avg_price": 1910000, - "median_price": 1757000, - "sales_count": 330, - "avg_dom": 22, - }, - { - "neighbourhood_id": 28, - "name": "Rustic", - "avg_price": 1818000, - "median_price": 1673000, - "sales_count": 193, - "avg_dom": 27, - }, - { - "neighbourhood_id": 29, - "name": "Maple Leaf", - "avg_price": 1092000, - "median_price": 1005000, - "sales_count": 231, - "avg_dom": 20, - }, - { - "neighbourhood_id": 30, - "name": "Brookhaven-Amesbury", - "avg_price": 1971000, - "median_price": 1813000, - "sales_count": 280, - "avg_dom": 17, - }, - { - 
"neighbourhood_id": 31, - "name": "Yorkdale-Glen Park", - "avg_price": 2173000, - "median_price": 1999000, - "sales_count": 319, - "avg_dom": 24, - }, - { - "neighbourhood_id": 32, - "name": "Englemount-Lawrence", - "avg_price": 810000, - "median_price": 745000, - "sales_count": 324, - "avg_dom": 12, - }, - { - "neighbourhood_id": 33, - "name": "Clanton Park", - "avg_price": 2130000, - "median_price": 1960000, - "sales_count": 126, - "avg_dom": 28, - }, - { - "neighbourhood_id": 34, - "name": "Bathurst Manor", - "avg_price": 1608000, - "median_price": 1479000, - "sales_count": 436, - "avg_dom": 25, - }, - { - "neighbourhood_id": 35, - "name": "Westminster-Branson", - "avg_price": 1358000, - "median_price": 1249000, - "sales_count": 203, - "avg_dom": 23, - }, - { - "neighbourhood_id": 36, - "name": "Newtonbrook West", - "avg_price": 945000, - "median_price": 869000, - "sales_count": 225, - "avg_dom": 22, - }, - { - "neighbourhood_id": 37, - "name": "Willowdale West", - "avg_price": 718000, - "median_price": 661000, - "sales_count": 370, - "avg_dom": 25, - }, - { - "neighbourhood_id": 38, - "name": "Lansing-Westgate", - "avg_price": 1309000, - "median_price": 1204000, - "sales_count": 294, - "avg_dom": 19, - }, - { - "neighbourhood_id": 39, - "name": "Bedford Park-Nortown", - "avg_price": 1852000, - "median_price": 1704000, - "sales_count": 286, - "avg_dom": 24, - }, - { - "neighbourhood_id": 40, - "name": "St.Andrew-Windfields", - "avg_price": 1597000, - "median_price": 1469000, - "sales_count": 339, - "avg_dom": 28, - }, - { - "neighbourhood_id": 41, - "name": "Bridle Path-Sunnybrook-York Mills", - "avg_price": 1720000, - "median_price": 1583000, - "sales_count": 226, - "avg_dom": 19, - }, - { - "neighbourhood_id": 42, - "name": "Banbury-Don Mills", - "avg_price": 1382000, - "median_price": 1271000, - "sales_count": 412, - "avg_dom": 11, - }, - { - "neighbourhood_id": 43, - "name": "Victoria Village", - "avg_price": 1907000, - "median_price": 1755000, - 
"sales_count": 99, - "avg_dom": 14, - }, - { - "neighbourhood_id": 44, - "name": "Flemingdon Park", - "avg_price": 2141000, - "median_price": 1970000, - "sales_count": 265, - "avg_dom": 21, - }, - { - "neighbourhood_id": 46, - "name": "Pleasant View", - "avg_price": 1795000, - "median_price": 1651000, - "sales_count": 226, - "avg_dom": 10, - }, - { - "neighbourhood_id": 47, - "name": "Don Valley Village", - "avg_price": 758000, - "median_price": 698000, - "sales_count": 219, - "avg_dom": 11, - }, - { - "neighbourhood_id": 48, - "name": "Hillcrest Village", - "avg_price": 2045000, - "median_price": 1882000, - "sales_count": 245, - "avg_dom": 14, - }, - { - "neighbourhood_id": 49, - "name": "Bayview Woods-Steeles", - "avg_price": 992000, - "median_price": 912000, - "sales_count": 294, - "avg_dom": 11, - }, - { - "neighbourhood_id": 50, - "name": "Newtonbrook East", - "avg_price": 1052000, - "median_price": 968000, - "sales_count": 199, - "avg_dom": 26, - }, - { - "neighbourhood_id": 52, - "name": "Bayview Village", - "avg_price": 2155000, - "median_price": 1983000, - "sales_count": 80, - "avg_dom": 22, - }, - { - "neighbourhood_id": 53, - "name": "Henry Farm", - "avg_price": 2177000, - "median_price": 2003000, - "sales_count": 259, - "avg_dom": 15, - }, - { - "neighbourhood_id": 54, - "name": "O'Connor-Parkview", - "avg_price": 868000, - "median_price": 798000, - "sales_count": 104, - "avg_dom": 14, - }, - { - "neighbourhood_id": 55, - "name": "Thorncliffe Park", - "avg_price": 956000, - "median_price": 880000, - "sales_count": 383, - "avg_dom": 28, - }, - { - "neighbourhood_id": 56, - "name": "Leaside-Bennington", - "avg_price": 1973000, - "median_price": 1815000, - "sales_count": 88, - "avg_dom": 10, - }, - { - "neighbourhood_id": 57, - "name": "Broadview North", - "avg_price": 2191000, - "median_price": 2016000, - "sales_count": 334, - "avg_dom": 13, - }, - { - "neighbourhood_id": 58, - "name": "Old East York", - "avg_price": 971000, - "median_price": 893000, - 
"sales_count": 133, - "avg_dom": 13, - }, - { - "neighbourhood_id": 59, - "name": "Danforth East York", - "avg_price": 1552000, - "median_price": 1428000, - "sales_count": 304, - "avg_dom": 23, - }, - { - "neighbourhood_id": 60, - "name": "Woodbine-Lumsden", - "avg_price": 812000, - "median_price": 747000, - "sales_count": 274, - "avg_dom": 14, - }, - { - "neighbourhood_id": 61, - "name": "Taylor-Massey", - "avg_price": 1155000, - "median_price": 1062000, - "sales_count": 381, - "avg_dom": 28, - }, - { - "neighbourhood_id": 62, - "name": "East End-Danforth", - "avg_price": 1247000, - "median_price": 1147000, - "sales_count": 380, - "avg_dom": 13, - }, - { - "neighbourhood_id": 63, - "name": "The Beaches", - "avg_price": 1091000, - "median_price": 1003000, - "sales_count": 446, - "avg_dom": 14, - }, - { - "neighbourhood_id": 64, - "name": "Woodbine Corridor", - "avg_price": 1378000, - "median_price": 1268000, - "sales_count": 404, - "avg_dom": 28, - }, - { - "neighbourhood_id": 65, - "name": "Greenwood-Coxwell", - "avg_price": 1454000, - "median_price": 1337000, - "sales_count": 287, - "avg_dom": 27, - }, - { - "neighbourhood_id": 66, - "name": "Danforth", - "avg_price": 1652000, - "median_price": 1520000, - "sales_count": 162, - "avg_dom": 17, - }, - { - "neighbourhood_id": 67, - "name": "Playter Estates-Danforth", - "avg_price": 767000, - "median_price": 705000, - "sales_count": 446, - "avg_dom": 16, - }, - { - "neighbourhood_id": 68, - "name": "North Riverdale", - "avg_price": 1502000, - "median_price": 1382000, - "sales_count": 439, - "avg_dom": 14, - }, - { - "neighbourhood_id": 69, - "name": "Blake-Jones", - "avg_price": 919000, - "median_price": 845000, - "sales_count": 322, - "avg_dom": 18, - }, - { - "neighbourhood_id": 70, - "name": "South Riverdale", - "avg_price": 1218000, - "median_price": 1120000, - "sales_count": 363, - "avg_dom": 27, - }, - { - "neighbourhood_id": 71, - "name": "Cabbagetown-South St.James Town", - "avg_price": 1939000, - 
"median_price": 1784000, - "sales_count": 432, - "avg_dom": 23, - }, - { - "neighbourhood_id": 72, - "name": "Regent Park", - "avg_price": 1747000, - "median_price": 1607000, - "sales_count": 141, - "avg_dom": 25, - }, - { - "neighbourhood_id": 73, - "name": "Moss Park", - "avg_price": 769000, - "median_price": 708000, - "sales_count": 408, - "avg_dom": 28, - }, - { - "neighbourhood_id": 74, - "name": "North St.James Town", - "avg_price": 2121000, - "median_price": 1952000, - "sales_count": 149, - "avg_dom": 22, - }, - { - "neighbourhood_id": 78, - "name": "Kensington-Chinatown", - "avg_price": 2009000, - "median_price": 1848000, - "sales_count": 395, - "avg_dom": 12, - }, - { - "neighbourhood_id": 79, - "name": "University", - "avg_price": 1075000, - "median_price": 989000, - "sales_count": 109, - "avg_dom": 25, - }, - { - "neighbourhood_id": 80, - "name": "Palmerston-Little Italy", - "avg_price": 1958000, - "median_price": 1801000, - "sales_count": 142, - "avg_dom": 15, - }, - { - "neighbourhood_id": 81, - "name": "Trinity-Bellwoods", - "avg_price": 1070000, - "median_price": 984000, - "sales_count": 112, - "avg_dom": 27, - }, - { - "neighbourhood_id": 83, - "name": "Dufferin Grove", - "avg_price": 1703000, - "median_price": 1567000, - "sales_count": 292, - "avg_dom": 10, - }, - { - "neighbourhood_id": 84, - "name": "Little Portugal", - "avg_price": 1819000, - "median_price": 1674000, - "sales_count": 106, - "avg_dom": 18, - }, - { - "neighbourhood_id": 85, - "name": "South Parkdale", - "avg_price": 1971000, - "median_price": 1813000, - "sales_count": 125, - "avg_dom": 24, - }, - { - "neighbourhood_id": 86, - "name": "Roncesvalles", - "avg_price": 1224000, - "median_price": 1126000, - "sales_count": 366, - "avg_dom": 23, - }, - { - "neighbourhood_id": 87, - "name": "High Park-Swansea", - "avg_price": 1503000, - "median_price": 1383000, - "sales_count": 449, - "avg_dom": 24, - }, - { - "neighbourhood_id": 88, - "name": "High Park North", - "avg_price": 1555000, - 
"median_price": 1431000, - "sales_count": 235, - "avg_dom": 12, - }, - { - "neighbourhood_id": 89, - "name": "Runnymede-Bloor West Village", - "avg_price": 1207000, - "median_price": 1110000, - "sales_count": 163, - "avg_dom": 16, - }, - { - "neighbourhood_id": 90, - "name": "Junction Area", - "avg_price": 797000, - "median_price": 733000, - "sales_count": 400, - "avg_dom": 13, - }, - { - "neighbourhood_id": 91, - "name": "Weston-Pelham Park", - "avg_price": 1481000, - "median_price": 1362000, - "sales_count": 240, - "avg_dom": 19, - }, - { - "neighbourhood_id": 92, - "name": "Corso Italia-Davenport", - "avg_price": 1836000, - "median_price": 1689000, - "sales_count": 307, - "avg_dom": 16, - }, - { - "neighbourhood_id": 94, - "name": "Wychwood", - "avg_price": 1587000, - "median_price": 1460000, - "sales_count": 426, - "avg_dom": 10, - }, - { - "neighbourhood_id": 95, - "name": "Annex", - "avg_price": 1510000, - "median_price": 1389000, - "sales_count": 299, - "avg_dom": 17, - }, - { - "neighbourhood_id": 96, - "name": "Casa Loma", - "avg_price": 2019000, - "median_price": 1858000, - "sales_count": 250, - "avg_dom": 28, - }, - { - "neighbourhood_id": 97, - "name": "Yonge-St.Clair", - "avg_price": 1354000, - "median_price": 1246000, - "sales_count": 172, - "avg_dom": 19, - }, - { - "neighbourhood_id": 98, - "name": "Rosedale-Moore Park", - "avg_price": 1634000, - "median_price": 1503000, - "sales_count": 273, - "avg_dom": 21, - }, - { - "neighbourhood_id": 99, - "name": "Mount Pleasant East", - "avg_price": 2182000, - "median_price": 2007000, - "sales_count": 334, - "avg_dom": 15, - }, - { - "neighbourhood_id": 100, - "name": "Yonge-Eglinton", - "avg_price": 1999000, - "median_price": 1839000, - "sales_count": 109, - "avg_dom": 12, - }, - { - "neighbourhood_id": 101, - "name": "Forest Hill South", - "avg_price": 828000, - "median_price": 762000, - "sales_count": 421, - "avg_dom": 11, - }, - { - "neighbourhood_id": 102, - "name": "Forest Hill North", - "avg_price": 
719000, - "median_price": 662000, - "sales_count": 129, - "avg_dom": 26, - }, - { - "neighbourhood_id": 103, - "name": "Lawrence Park South", - "avg_price": 1540000, - "median_price": 1417000, - "sales_count": 403, - "avg_dom": 16, - }, - { - "neighbourhood_id": 105, - "name": "Lawrence Park North", - "avg_price": 734000, - "median_price": 675000, - "sales_count": 304, - "avg_dom": 22, - }, - { - "neighbourhood_id": 106, - "name": "Humewood-Cedarvale", - "avg_price": 869000, - "median_price": 799000, - "sales_count": 102, - "avg_dom": 23, - }, - { - "neighbourhood_id": 107, - "name": "Oakwood Village", - "avg_price": 1937000, - "median_price": 1782000, - "sales_count": 308, - "avg_dom": 22, - }, - { - "neighbourhood_id": 108, - "name": "Briar Hill-Belgravia", - "avg_price": 1422000, - "median_price": 1309000, - "sales_count": 444, - "avg_dom": 12, - }, - { - "neighbourhood_id": 109, - "name": "Caledonia-Fairbank", - "avg_price": 1535000, - "median_price": 1412000, - "sales_count": 384, - "avg_dom": 14, - }, - { - "neighbourhood_id": 110, - "name": "Keelesdale-Eglinton West", - "avg_price": 1680000, - "median_price": 1545000, - "sales_count": 132, - "avg_dom": 12, - }, - { - "neighbourhood_id": 111, - "name": "Rockcliffe-Smythe", - "avg_price": 1041000, - "median_price": 958000, - "sales_count": 440, - "avg_dom": 13, - }, - { - "neighbourhood_id": 112, - "name": "Beechborough-Greenbrook", - "avg_price": 2189000, - "median_price": 2014000, - "sales_count": 315, - "avg_dom": 21, - }, - { - "neighbourhood_id": 113, - "name": "Weston", - "avg_price": 2059000, - "median_price": 1894000, - "sales_count": 239, - "avg_dom": 28, - }, - { - "neighbourhood_id": 114, - "name": "Lambton Baby Point", - "avg_price": 2081000, - "median_price": 1915000, - "sales_count": 349, - "avg_dom": 26, - }, - { - "neighbourhood_id": 115, - "name": "Mount Dennis", - "avg_price": 1213000, - "median_price": 1116000, - "sales_count": 210, - "avg_dom": 17, - }, - { - "neighbourhood_id": 116, - 
"name": "Steeles", - "avg_price": 1020000, - "median_price": 938000, - "sales_count": 402, - "avg_dom": 28, - }, - { - "neighbourhood_id": 118, - "name": "Tam O'Shanter-Sullivan", - "avg_price": 1121000, - "median_price": 1031000, - "sales_count": 194, - "avg_dom": 24, - }, - { - "neighbourhood_id": 119, - "name": "Wexford/Maryvale", - "avg_price": 1208000, - "median_price": 1111000, - "sales_count": 117, - "avg_dom": 28, - }, - { - "neighbourhood_id": 120, - "name": "Clairlea-Birchmount", - "avg_price": 1771000, - "median_price": 1629000, - "sales_count": 355, - "avg_dom": 24, - }, - { - "neighbourhood_id": 121, - "name": "Oakridge", - "avg_price": 1489000, - "median_price": 1370000, - "sales_count": 231, - "avg_dom": 24, - }, - { - "neighbourhood_id": 122, - "name": "Birchcliffe-Cliffside", - "avg_price": 1854000, - "median_price": 1706000, - "sales_count": 223, - "avg_dom": 21, - }, - { - "neighbourhood_id": 123, - "name": "Cliffcrest", - "avg_price": 956000, - "median_price": 879000, - "sales_count": 290, - "avg_dom": 24, - }, - { - "neighbourhood_id": 124, - "name": "Kennedy Park", - "avg_price": 1980000, - "median_price": 1822000, - "sales_count": 211, - "avg_dom": 20, - }, - { - "neighbourhood_id": 125, - "name": "Ionview", - "avg_price": 1295000, - "median_price": 1192000, - "sales_count": 438, - "avg_dom": 19, - }, - { - "neighbourhood_id": 126, - "name": "Dorset Park", - "avg_price": 1348000, - "median_price": 1241000, - "sales_count": 357, - "avg_dom": 26, - }, - { - "neighbourhood_id": 128, - "name": "Agincourt South-Malvern West", - "avg_price": 787000, - "median_price": 724000, - "sales_count": 308, - "avg_dom": 13, - }, - { - "neighbourhood_id": 129, - "name": "Agincourt North", - "avg_price": 1272000, - "median_price": 1170000, - "sales_count": 229, - "avg_dom": 14, - }, - { - "neighbourhood_id": 130, - "name": "Milliken", - "avg_price": 1589000, - "median_price": 1461000, - "sales_count": 201, - "avg_dom": 14, - }, - { - "neighbourhood_id": 133, - 
"name": "Centennial Scarborough", - "avg_price": 1464000, - "median_price": 1347000, - "sales_count": 275, - "avg_dom": 11, - }, - { - "neighbourhood_id": 134, - "name": "Highland Creek", - "avg_price": 812000, - "median_price": 747000, - "sales_count": 366, - "avg_dom": 19, - }, - { - "neighbourhood_id": 135, - "name": "Morningside", - "avg_price": 1205000, - "median_price": 1109000, - "sales_count": 325, - "avg_dom": 23, - }, - { - "neighbourhood_id": 136, - "name": "West Hill", - "avg_price": 1961000, - "median_price": 1804000, - "sales_count": 300, - "avg_dom": 27, - }, - { - "neighbourhood_id": 138, - "name": "Eglinton East", - "avg_price": 2191000, - "median_price": 2016000, - "sales_count": 376, - "avg_dom": 23, - }, - { - "neighbourhood_id": 139, - "name": "Scarborough Village", - "avg_price": 1794000, - "median_price": 1650000, - "sales_count": 86, - "avg_dom": 13, - }, - { - "neighbourhood_id": 140, - "name": "Guildwood", - "avg_price": 783000, - "median_price": 720000, - "sales_count": 221, - "avg_dom": 18, - }, - { - "neighbourhood_id": 141, - "name": "Golfdale-Cedarbrae-Woburn", - "avg_price": 1980000, - "median_price": 1822000, - "sales_count": 141, - "avg_dom": 23, - }, - { - "neighbourhood_id": 142, - "name": "Woburn North", - "avg_price": 956000, - "median_price": 880000, - "sales_count": 89, - "avg_dom": 22, - }, - { - "neighbourhood_id": 143, - "name": "West Rouge", - "avg_price": 1481000, - "median_price": 1362000, - "sales_count": 395, - "avg_dom": 21, - }, - { - "neighbourhood_id": 144, - "name": "Morningside Heights", - "avg_price": 1923000, - "median_price": 1769000, - "sales_count": 130, - "avg_dom": 18, - }, - { - "neighbourhood_id": 145, - "name": "Malvern West", - "avg_price": 863000, - "median_price": 794000, - "sales_count": 284, - "avg_dom": 21, - }, - { - "neighbourhood_id": 146, - "name": "Malvern East", - "avg_price": 1954000, - "median_price": 1798000, - "sales_count": 437, - "avg_dom": 15, - }, - { - "neighbourhood_id": 147, - 
"name": "L'Amoreaux West", - "avg_price": 754000, - "median_price": 694000, - "sales_count": 351, - "avg_dom": 26, - }, - { - "neighbourhood_id": 148, - "name": "East L'Amoreaux", - "avg_price": 1691000, - "median_price": 1556000, - "sales_count": 320, - "avg_dom": 26, - }, - { - "neighbourhood_id": 149, - "name": "Parkwoods-O'Connor Hills", - "avg_price": 2013000, - "median_price": 1852000, - "sales_count": 436, - "avg_dom": 26, - }, - { - "neighbourhood_id": 150, - "name": "Fenside-Parkwoods", - "avg_price": 1474000, - "median_price": 1356000, - "sales_count": 216, - "avg_dom": 17, - }, - { - "neighbourhood_id": 151, - "name": "Yonge-Doris", - "avg_price": 970000, - "median_price": 893000, - "sales_count": 421, - "avg_dom": 27, - }, - { - "neighbourhood_id": 152, - "name": "East Willowdale", - "avg_price": 726000, - "median_price": 668000, - "sales_count": 208, - "avg_dom": 18, - }, - { - "neighbourhood_id": 153, - "name": "Avondale", - "avg_price": 1305000, - "median_price": 1201000, - "sales_count": 401, - "avg_dom": 15, - }, - { - "neighbourhood_id": 154, - "name": "Oakdale-Beverley Heights", - "avg_price": 1903000, - "median_price": 1751000, - "sales_count": 113, - "avg_dom": 13, - }, - { - "neighbourhood_id": 155, - "name": "Downsview", - "avg_price": 1334000, - "median_price": 1228000, - "sales_count": 121, - "avg_dom": 27, - }, - { - "neighbourhood_id": 156, - "name": "Bendale-Glen Andrew", - "avg_price": 1590000, - "median_price": 1463000, - "sales_count": 123, - "avg_dom": 24, - }, - { - "neighbourhood_id": 157, - "name": "Bendale South", - "avg_price": 2073000, - "median_price": 1908000, - "sales_count": 269, - "avg_dom": 26, - }, - { - "neighbourhood_id": 158, - "name": "Islington", - "avg_price": 796000, - "median_price": 732000, - "sales_count": 419, - "avg_dom": 27, - }, - { - "neighbourhood_id": 159, - "name": "Etobicoke City Centre", - "avg_price": 1124000, - "median_price": 1034000, - "sales_count": 178, - "avg_dom": 17, - }, - { - 
"neighbourhood_id": 160, - "name": "Mimico-Queensway", - "avg_price": 2067000, - "median_price": 1902000, - "sales_count": 132, - "avg_dom": 27, - }, - { - "neighbourhood_id": 161, - "name": "Humber Bay Shores", - "avg_price": 729000, - "median_price": 671000, - "sales_count": 433, - "avg_dom": 12, - }, - { - "neighbourhood_id": 162, - "name": "West Queen West", - "avg_price": 868000, - "median_price": 798000, - "sales_count": 448, - "avg_dom": 10, - }, - { - "neighbourhood_id": 163, - "name": "Fort York-Liberty Village", - "avg_price": 2009000, - "median_price": 1848000, - "sales_count": 372, - "avg_dom": 18, - }, - { - "neighbourhood_id": 164, - "name": "Wellington Place", - "avg_price": 952000, - "median_price": 876000, - "sales_count": 180, - "avg_dom": 26, - }, - { - "neighbourhood_id": 165, - "name": "Harbourfront-CityPlace", - "avg_price": 1879000, - "median_price": 1729000, - "sales_count": 407, - "avg_dom": 17, - }, - { - "neighbourhood_id": 166, - "name": "St Lawrence-East Bayfront-The Islands", - "avg_price": 1835000, - "median_price": 1688000, - "sales_count": 395, - "avg_dom": 15, - }, - { - "neighbourhood_id": 167, - "name": "Church-Wellesley", - "avg_price": 1861000, - "median_price": 1712000, - "sales_count": 136, - "avg_dom": 18, - }, - { - "neighbourhood_id": 168, - "name": "Downtown Yonge East", - "avg_price": 1871000, - "median_price": 1721000, - "sales_count": 320, - "avg_dom": 17, - }, - { - "neighbourhood_id": 169, - "name": "Bay-Cloverhill", - "avg_price": 1368000, - "median_price": 1259000, - "sales_count": 271, - "avg_dom": 18, - }, - { - "neighbourhood_id": 170, - "name": "Yonge-Bay Corridor", - "avg_price": 1700000, - "median_price": 1564000, - "sales_count": 204, - "avg_dom": 10, - }, - { - "neighbourhood_id": 171, - "name": "Junction-Wallace Emerson", - "avg_price": 1794000, - "median_price": 1651000, - "sales_count": 112, - "avg_dom": 27, - }, - { - "neighbourhood_id": 172, - "name": "Dovercourt Village", - "avg_price": 1789000, - 
"median_price": 1646000, - "sales_count": 262, - "avg_dom": 28, - }, - { - "neighbourhood_id": 173, - "name": "North Toronto", - "avg_price": 1972000, - "median_price": 1814000, - "sales_count": 109, - "avg_dom": 26, - }, - { - "neighbourhood_id": 174, - "name": "South Eglinton-Davisville", - "avg_price": 814000, - "median_price": 749000, - "sales_count": 100, - "avg_dom": 23, - }, -] - -# Real CMHC rental data from 2024 Rental Market Survey (October 2024) -# Source: data/raw/cmhc/rmr-toronto-2024-en.xlsx -CMHC_RENTAL_DATA_2024 = [ - { - "zone_code": "01", - "zone_name": "Toronto (Central)", - "avg_rent": 2092, - "vacancy_rate": 3.5, - }, - { - "zone_code": "02", - "zone_name": "Toronto (East)", - "avg_rent": 1670, - "vacancy_rate": 2.9, - }, - { - "zone_code": "03", - "zone_name": "Toronto (North)", - "avg_rent": 2033, - "vacancy_rate": 2.2, - }, - { - "zone_code": "04", - "zone_name": "Toronto (West)", - "avg_rent": 1747, - "vacancy_rate": 3.7, - }, - { - "zone_code": "05", - "zone_name": "Etobicoke (South)", - "avg_rent": 1675, - "vacancy_rate": 4.1, - }, - { - "zone_code": "06", - "zone_name": "Etobicoke (Central)", - "avg_rent": 1986, - "vacancy_rate": 1.6, - }, - { - "zone_code": "07", - "zone_name": "Etobicoke (North)", - "avg_rent": 1823, - "vacancy_rate": 1.1, - }, - {"zone_code": "08", "zone_name": "York", "avg_rent": 1843, "vacancy_rate": 3.4}, - { - "zone_code": "09", - "zone_name": "East York", - "avg_rent": 1622, - "vacancy_rate": 1.7, - }, - { - "zone_code": "10", - "zone_name": "Scarborough (Central)", - "avg_rent": 1702, - "vacancy_rate": 2.0, - }, - { - "zone_code": "11", - "zone_name": "Scarborough (North)", - "avg_rent": 1696, - "vacancy_rate": 1.0, - }, - { - "zone_code": "12", - "zone_name": "Scarborough (East)", - "avg_rent": 1699, - "vacancy_rate": 1.1, - }, - { - "zone_code": "13", - "zone_name": "North York (Southeast)", - "avg_rent": 1900, - "vacancy_rate": 2.1, - }, - { - "zone_code": "14", - "zone_name": "North York (Northeast)", - 
"avg_rent": 2101, - "vacancy_rate": 1.1, - }, - { - "zone_code": "15", - "zone_name": "North York (Southwest)", - "avg_rent": 1775, - "vacancy_rate": 2.9, - }, - { - "zone_code": "16", - "zone_name": "North York (N.Central)", - "avg_rent": 1950, - "vacancy_rate": 1.1, - }, - { - "zone_code": "17", - "zone_name": "North York (Northwest)", - "avg_rent": 1572, - "vacancy_rate": 1.1, - }, - { - "zone_code": "18", - "zone_name": "Mississauga (South)", - "avg_rent": 1898, - "vacancy_rate": 5.6, - }, - { - "zone_code": "19", - "zone_name": "Mississauga (Northwest)", - "avg_rent": 1926, - "vacancy_rate": 2.8, - }, - { - "zone_code": "20", - "zone_name": "Mississauga (Northeast)", - "avg_rent": 1801, - "vacancy_rate": 2.7, - }, - { - "zone_code": "21", - "zone_name": "Brampton (West)", - "avg_rent": 1858, - "vacancy_rate": 3.8, - }, - { - "zone_code": "22", - "zone_name": "Brampton (East)", - "avg_rent": 1823, - "vacancy_rate": 1.6, - }, - {"zone_code": "23", "zone_name": "Oakville", "avg_rent": 2130, "vacancy_rate": 3.3}, - { - "zone_code": "25", - "zone_name": "Richmond Hill/Vaughan/King", - "avg_rent": 1762, - "vacancy_rate": 2.4, - }, - { - "zone_code": "26", - "zone_name": "Aurora, Newmkt, Whit-St.", - "avg_rent": 1777, - "vacancy_rate": 2.4, - }, - {"zone_code": "27", "zone_name": "Markham", "avg_rent": 1831, "vacancy_rate": 2.0}, - { - "zone_code": "28", - "zone_name": "Pickering/Ajax/Uxbridge", - "avg_rent": 1675, - "vacancy_rate": 1.0, - }, - { - "zone_code": "29", - "zone_name": "Milton/Halton Hills", - "avg_rent": 1512, - "vacancy_rate": 2.1, - }, -] - -SAMPLE_TIME_SERIES_DATA = [ - {"full_date": "2023-01-01", "avg_price": 1050000, "sales_count": 3200}, - {"full_date": "2023-02-01", "avg_price": 1080000, "sales_count": 3400}, - {"full_date": "2023-03-01", "avg_price": 1120000, "sales_count": 4100}, - {"full_date": "2023-04-01", "avg_price": 1150000, "sales_count": 4500}, - {"full_date": "2023-05-01", "avg_price": 1180000, "sales_count": 4800}, - {"full_date": 
"2023-06-01", "avg_price": 1160000, "sales_count": 4600}, - {"full_date": "2023-07-01", "avg_price": 1140000, "sales_count": 4200}, - {"full_date": "2023-08-01", "avg_price": 1130000, "sales_count": 4000}, - {"full_date": "2023-09-01", "avg_price": 1125000, "sales_count": 3800}, - {"full_date": "2023-10-01", "avg_price": 1110000, "sales_count": 3600}, - {"full_date": "2023-11-01", "avg_price": 1100000, "sales_count": 3400}, - {"full_date": "2023-12-01", "avg_price": 1090000, "sales_count": 3000}, - {"full_date": "2024-01-01", "avg_price": 1095000, "sales_count": 3100}, - {"full_date": "2024-02-01", "avg_price": 1105000, "sales_count": 3300}, - {"full_date": "2024-03-01", "avg_price": 1125000, "sales_count": 4000}, - {"full_date": "2024-04-01", "avg_price": 1140000, "sales_count": 4400}, - {"full_date": "2024-05-01", "avg_price": 1155000, "sales_count": 4700}, - {"full_date": "2024-06-01", "avg_price": 1145000, "sales_count": 4500}, -] - - -@callback( # type: ignore[misc] - Output("purchase-choropleth", "figure"), - Input("purchase-map-metric-selector", "value"), - Input("toronto-year-selector", "value"), -) -def update_purchase_choropleth(metric: str, year: str) -> go.Figure: - """Update the neighbourhood choropleth map.""" - return create_choropleth_figure( - geojson=NEIGHBOURHOODS_GEOJSON, - data=SAMPLE_PURCHASE_DATA, - location_key="neighbourhood_id", - color_column=metric or "avg_price", - hover_data=["name", "sales_count"], - title=f"Purchase Market by Neighbourhood - {metric.replace('_', ' ').title() if metric else 'Average Price'}", - ) - - -@callback( # type: ignore[misc] - Output("rental-choropleth", "figure"), - Input("rental-map-metric-selector", "value"), - Input("toronto-year-selector", "value"), -) -def update_rental_choropleth(metric: str, year: str) -> go.Figure: - """Update the rental market choropleth map. - - Uses real CMHC Rental Market Survey data (October 2024). 
- """ - return create_choropleth_figure( - geojson=CMHC_ZONES_GEOJSON, - data=CMHC_RENTAL_DATA_2024, - location_key="zone_code", - color_column=metric or "avg_rent", - hover_data=["zone_name", "vacancy_rate"], - color_scale="Oranges", - title=f"Rental Market (CMHC Oct 2024) - {metric.replace('_', ' ').title() if metric else 'Average Rent'}", - ) - - -@callback( # type: ignore[misc] - Output("price-time-series", "figure"), - Input("toronto-year-selector", "value"), -) -def update_price_time_series(year: str) -> go.Figure: - """Update the price time series chart.""" - return create_price_time_series( - data=SAMPLE_TIME_SERIES_DATA, - date_column="full_date", - price_column="avg_price", - title="Average Price Trend", - ) - - -@callback( # type: ignore[misc] - Output("volume-time-series", "figure"), - Input("toronto-year-selector", "value"), -) -def update_volume_time_series(year: str) -> go.Figure: - """Update the volume time series chart.""" - return create_volume_time_series( - data=SAMPLE_TIME_SERIES_DATA, - date_column="full_date", - volume_column="sales_count", - title="Sales Volume Trend", - chart_type="bar", - ) - - -@callback( # type: ignore[misc] - Output("market-comparison-chart", "figure"), - Input("market-comparison-time-slider", "value"), -) -def update_market_comparison(time_range: list[int]) -> go.Figure: - """Update the market comparison chart.""" - return create_market_comparison_chart( - data=SAMPLE_TIME_SERIES_DATA, - date_column="full_date", - metrics=["avg_price", "sales_count"], - title="Market Indicators Comparison", - ) diff --git a/portfolio_app/pages/toronto/callbacks/chart_callbacks.py b/portfolio_app/pages/toronto/callbacks/chart_callbacks.py new file mode 100644 index 0000000..9cd0dc7 --- /dev/null +++ b/portfolio_app/pages/toronto/callbacks/chart_callbacks.py @@ -0,0 +1,385 @@ +"""Chart callbacks for supporting visualizations.""" +# mypy: disable-error-code="misc,no-untyped-def,arg-type" + +import plotly.graph_objects as go +from dash 
import Input, Output, callback + +from portfolio_app.figures import ( + create_donut_chart, + create_horizontal_bar, + create_radar_figure, + create_scatter_figure, +) +from portfolio_app.toronto.services import ( + get_amenities_data, + get_city_averages, + get_demographics_data, + get_housing_data, + get_neighbourhood_details, + get_safety_data, +) + + +@callback( + Output("overview-scatter-chart", "figure"), + Input("toronto-year-select", "value"), +) +def update_overview_scatter(year: str) -> go.Figure: + """Update income vs safety scatter plot.""" + year_int = int(year) if year else 2021 + df = get_demographics_data(year_int) + safety_df = get_safety_data(year_int) + + if df.empty or safety_df.empty: + return _empty_chart("No data available") + + # Merge demographics with safety + merged = df.merge( + safety_df[["neighbourhood_id", "total_crime_rate"]], + on="neighbourhood_id", + how="left", + ) + + # Compute safety score (inverse of crime rate) + if "total_crime_rate" in merged.columns: + max_crime = merged["total_crime_rate"].max() + merged["safety_score"] = 100 - (merged["total_crime_rate"] / max_crime * 100) + + data = merged.to_dict("records") + + return create_scatter_figure( + data=data, + x_column="median_household_income", + y_column="safety_score", + name_column="neighbourhood_name", + size_column="population", + title="Income vs Safety", + x_title="Median Household Income ($)", + y_title="Safety Score", + trendline=True, + ) + + +@callback( + Output("housing-trend-chart", "figure"), + Input("toronto-year-select", "value"), + Input("toronto-selected-neighbourhood", "data"), +) +def update_housing_trend(year: str, neighbourhood_id: int | None) -> go.Figure: + """Update housing rent trend chart.""" + # For now, show city averages as we don't have multi-year data + # This would be a time series if we had historical data + year_int = int(year) if year else 2021 + averages = get_city_averages(year_int) + + if not averages: + return _empty_chart("No trend 
data available") + + # Placeholder trend: scales the current city average down for earlier years (not historical data) + data = [ + {"year": "2019", "avg_rent": averages.get("avg_rent_2bed", 2000) * 0.85}, + {"year": "2020", "avg_rent": averages.get("avg_rent_2bed", 2000) * 0.88}, + {"year": "2021", "avg_rent": averages.get("avg_rent_2bed", 2000) * 0.92}, + {"year": "2022", "avg_rent": averages.get("avg_rent_2bed", 2000) * 0.96}, + {"year": "2023", "avg_rent": averages.get("avg_rent_2bed", 2000)}, + ] + + fig = go.Figure() + fig.add_trace( + go.Scatter( + x=[d["year"] for d in data], + y=[d["avg_rent"] for d in data], + mode="lines+markers", + line={"color": "#2196F3", "width": 2}, + marker={"size": 8}, + name="City Average", + ) + ) + + fig.update_layout( + paper_bgcolor="rgba(0,0,0,0)", + plot_bgcolor="rgba(0,0,0,0)", + font_color="#c9c9c9", + xaxis={"gridcolor": "rgba(128,128,128,0.2)"}, + yaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": "Avg Rent (2BR)"}, + showlegend=False, + margin={"l": 40, "r": 10, "t": 10, "b": 30}, + ) + + return fig + + +@callback( + Output("housing-types-chart", "figure"), + Input("toronto-year-select", "value"), +) +def update_housing_types(year: str) -> go.Figure: + """Update housing tenure (owner vs renter) breakdown chart.""" + year_int = int(year) if year else 2021 + df = get_housing_data(year_int) + + if df.empty: + return _empty_chart("No data available") + + # Aggregate tenure types across city + owner_pct = df["pct_owner_occupied"].mean() + renter_pct = df["pct_renter_occupied"].mean() + + data = [ + {"type": "Owner Occupied", "percentage": owner_pct}, + {"type": "Renter Occupied", "percentage": renter_pct}, + ] + + return create_donut_chart( + data=data, + name_column="type", + value_column="percentage", + colors=["#4CAF50", "#2196F3"], + ) + + +@callback( + Output("safety-trend-chart", "figure"), + Input("toronto-year-select", "value"), +) +def update_safety_trend(year: str) -> go.Figure: + """Update crime trend chart.""" + # Placeholder for trend - would need historical data + data =
[ + {"year": "2019", "crime_rate": 4500}, + {"year": "2020", "crime_rate": 4200}, + {"year": "2021", "crime_rate": 4100}, + {"year": "2022", "crime_rate": 4300}, + {"year": "2023", "crime_rate": 4250}, + ] + + fig = go.Figure() + fig.add_trace( + go.Scatter( + x=[d["year"] for d in data], + y=[d["crime_rate"] for d in data], + mode="lines+markers", + line={"color": "#FF5722", "width": 2}, + marker={"size": 8}, + fill="tozeroy", + fillcolor="rgba(255,87,34,0.1)", + ) + ) + + fig.update_layout( + paper_bgcolor="rgba(0,0,0,0)", + plot_bgcolor="rgba(0,0,0,0)", + font_color="#c9c9c9", + xaxis={"gridcolor": "rgba(128,128,128,0.2)"}, + yaxis={"gridcolor": "rgba(128,128,128,0.2)", "title": "Crime Rate per 100K"}, + showlegend=False, + margin={"l": 40, "r": 10, "t": 10, "b": 30}, + ) + + return fig + + +@callback( + Output("safety-types-chart", "figure"), + Input("toronto-year-select", "value"), +) +def update_safety_types(year: str) -> go.Figure: + """Update crime by category chart.""" + year_int = int(year) if year else 2021 + df = get_safety_data(year_int) + + if df.empty: + return _empty_chart("No data available") + + # Aggregate crime types across city + violent = df["violent_crimes"].sum() if "violent_crimes" in df.columns else 0 + property_crimes = ( + df["property_crimes"].sum() if "property_crimes" in df.columns else 0 + ) + theft = df["theft_crimes"].sum() if "theft_crimes" in df.columns else 0 + other = ( + df["total_crimes"].sum() - violent - property_crimes - theft + if "total_crimes" in df.columns + else 0 + ) + + data = [ + {"category": "Violent", "count": int(violent)}, + {"category": "Property", "count": int(property_crimes)}, + {"category": "Theft", "count": int(theft)}, + {"category": "Other", "count": int(max(0, other))}, + ] + + return create_horizontal_bar( + data=data, + name_column="category", + value_column="count", + color="#FF5722", + ) + + +@callback( + Output("demographics-age-chart", "figure"), + Input("toronto-year-select", "value"), +) +def 
update_demographics_age(year: str) -> go.Figure: + """Update age distribution chart.""" + year_int = int(year) if year else 2021 + df = get_demographics_data(year_int) + + if df.empty: + return _empty_chart("No data available") + + # Calculate average age distribution + under_18 = df["pct_under_18"].mean() if "pct_under_18" in df.columns else 20 + age_18_64 = df["pct_18_to_64"].mean() if "pct_18_to_64" in df.columns else 65 + over_65 = df["pct_65_plus"].mean() if "pct_65_plus" in df.columns else 15 + + data = [ + {"age_group": "Under 18", "percentage": under_18}, + {"age_group": "18-64", "percentage": age_18_64}, + {"age_group": "65+", "percentage": over_65}, + ] + + return create_donut_chart( + data=data, + name_column="age_group", + value_column="percentage", + colors=["#9C27B0", "#673AB7", "#3F51B5"], + ) + + +@callback( + Output("demographics-income-chart", "figure"), + Input("toronto-year-select", "value"), +) +def update_demographics_income(year: str) -> go.Figure: + """Update income distribution chart.""" + year_int = int(year) if year else 2021 + df = get_demographics_data(year_int) + + if df.empty: + return _empty_chart("No data available") + + # Create income quintile distribution + if "income_quintile" in df.columns: + quintile_counts = df["income_quintile"].value_counts().sort_index() + data = [ + {"bracket": f"Q{q}", "count": int(count)} + for q, count in quintile_counts.items() + ] + else: + # Fallback to placeholder + data = [ + {"bracket": "Q1 (Low)", "count": 32}, + {"bracket": "Q2", "count": 32}, + {"bracket": "Q3 (Mid)", "count": 32}, + {"bracket": "Q4", "count": 31}, + {"bracket": "Q5 (High)", "count": 31}, + ] + + return create_horizontal_bar( + data=data, + name_column="bracket", + value_column="count", + color="#4CAF50", + sort=False, + ) + + +@callback( + Output("amenities-breakdown-chart", "figure"), + Input("toronto-year-select", "value"), +) +def update_amenities_breakdown(year: str) -> go.Figure: + """Update amenity breakdown chart.""" + 
year_int = int(year) if year else 2021 + df = get_amenities_data(year_int) + + if df.empty: + return _empty_chart("No data available") + + # Aggregate amenity counts + parks = df["park_count"].sum() if "park_count" in df.columns else 0 + schools = df["school_count"].sum() if "school_count" in df.columns else 0 + childcare = df["childcare_count"].sum() if "childcare_count" in df.columns else 0 + + data = [ + {"type": "Parks", "count": int(parks)}, + {"type": "Schools", "count": int(schools)}, + {"type": "Childcare", "count": int(childcare)}, + ] + + return create_horizontal_bar( + data=data, + name_column="type", + value_column="count", + color="#4CAF50", + ) + + +@callback( + Output("amenities-radar-chart", "figure"), + Input("toronto-year-select", "value"), + Input("toronto-selected-neighbourhood", "data"), +) +def update_amenities_radar(year: str, neighbourhood_id: int | None) -> go.Figure: + """Update amenity comparison radar chart.""" + year_int = int(year) if year else 2021 + + # Get city averages + averages = get_city_averages(year_int) + + city_data = { + "parks_per_1000": averages.get("avg_amenity_score", 50) / 100 * 10, + "schools_per_1000": averages.get("avg_amenity_score", 50) / 100 * 5, + "childcare_per_1000": averages.get("avg_amenity_score", 50) / 100 * 3, + "transit_access": 70, + } + + data = [city_data] + + # Add selected neighbourhood if available + if neighbourhood_id: + details = get_neighbourhood_details(neighbourhood_id, year_int) + if details: + selected_data = { + "parks_per_1000": details.get("park_count", 0) / 10, + "schools_per_1000": details.get("school_count", 0) / 5, + "childcare_per_1000": 3, + "transit_access": 70, + } + data.insert(0, selected_data) + + return create_radar_figure( + data=data, + metrics=[ + "parks_per_1000", + "schools_per_1000", + "childcare_per_1000", + "transit_access", + ], + fill=True, + ) + + +def _empty_chart(message: str) -> go.Figure: + """Create an empty chart with a message.""" + fig = go.Figure() + 
fig.update_layout( + paper_bgcolor="rgba(0,0,0,0)", + plot_bgcolor="rgba(0,0,0,0)", + font_color="#c9c9c9", + xaxis={"visible": False}, + yaxis={"visible": False}, + ) + fig.add_annotation( + text=message, + xref="paper", + yref="paper", + x=0.5, + y=0.5, + showarrow=False, + font={"size": 14, "color": "#888888"}, + ) + return fig diff --git a/portfolio_app/pages/toronto/callbacks/map_callbacks.py b/portfolio_app/pages/toronto/callbacks/map_callbacks.py new file mode 100644 index 0000000..9aef1e8 --- /dev/null +++ b/portfolio_app/pages/toronto/callbacks/map_callbacks.py @@ -0,0 +1,304 @@ +"""Map callbacks for choropleth interactions.""" +# mypy: disable-error-code="misc,no-untyped-def,arg-type,no-any-return" + +import plotly.graph_objects as go +from dash import Input, Output, State, callback, no_update + +from portfolio_app.figures import create_choropleth_figure, create_ranking_bar +from portfolio_app.toronto.services import ( + get_amenities_data, + get_demographics_data, + get_housing_data, + get_neighbourhoods_geojson, + get_overview_data, + get_safety_data, +) + + +@callback( + Output("overview-choropleth", "figure"), + Input("overview-metric-select", "value"), + Input("toronto-year-select", "value"), +) +def update_overview_choropleth(metric: str, year: str) -> go.Figure: + """Update the overview tab choropleth map.""" + year_int = int(year) if year else 2021 + df = get_overview_data(year_int) + geojson = get_neighbourhoods_geojson(year_int) + + if df.empty: + return _empty_map("No data available") + + data = df.to_dict("records") + + # Color scales based on metric + color_scale = { + "livability_score": "Viridis", + "safety_score": "Greens", + "affordability_score": "Blues", + "amenity_score": "Purples", + }.get(metric, "Viridis") + + return create_choropleth_figure( + geojson=geojson, + data=data, + location_key="neighbourhood_id", + color_column=metric or "livability_score", + hover_data=["neighbourhood_name", "population"], + color_scale=color_scale, + ) 
+ + +@callback( + Output("housing-choropleth", "figure"), + Input("housing-metric-select", "value"), + Input("toronto-year-select", "value"), +) +def update_housing_choropleth(metric: str, year: str) -> go.Figure: + """Update the housing tab choropleth map.""" + year_int = int(year) if year else 2021 + df = get_housing_data(year_int) + geojson = get_neighbourhoods_geojson(year_int) + + if df.empty: + return _empty_map("No housing data available") + + data = df.to_dict("records") + + color_scale = { + "affordability_index": "RdYlGn_r", + "avg_rent_2bed": "Oranges", + "rent_to_income_pct": "Reds", + "vacancy_rate": "Blues", + }.get(metric, "Oranges") + + return create_choropleth_figure( + geojson=geojson, + data=data, + location_key="neighbourhood_id", + color_column=metric or "affordability_index", + hover_data=["neighbourhood_name", "avg_rent_2bed", "vacancy_rate"], + color_scale=color_scale, + ) + + +@callback( + Output("safety-choropleth", "figure"), + Input("safety-metric-select", "value"), + Input("toronto-year-select", "value"), +) +def update_safety_choropleth(metric: str, year: str) -> go.Figure: + """Update the safety tab choropleth map.""" + year_int = int(year) if year else 2021 + df = get_safety_data(year_int) + geojson = get_neighbourhoods_geojson(year_int) + + if df.empty: + return _empty_map("No safety data available") + + data = df.to_dict("records") + + return create_choropleth_figure( + geojson=geojson, + data=data, + location_key="neighbourhood_id", + color_column=metric or "total_crime_rate", + hover_data=["neighbourhood_name", "total_crimes"], + color_scale="Reds", + ) + + +@callback( + Output("demographics-choropleth", "figure"), + Input("demographics-metric-select", "value"), + Input("toronto-year-select", "value"), +) +def update_demographics_choropleth(metric: str, year: str) -> go.Figure: + """Update the demographics tab choropleth map.""" + year_int = int(year) if year else 2021 + df = get_demographics_data(year_int) + geojson = 
get_neighbourhoods_geojson(year_int) + + if df.empty: + return _empty_map("No demographics data available") + + data = df.to_dict("records") + + color_scale = { + "population": "YlOrBr", + "median_income": "Greens", + "median_age": "Blues", + "diversity_index": "Purples", + }.get(metric, "YlOrBr") + + # Map frontend metric names to column names + column_map = { + "population": "population", + "median_income": "median_household_income", + "median_age": "median_age", + "diversity_index": "diversity_index", + } + column = column_map.get(metric, "population") + + return create_choropleth_figure( + geojson=geojson, + data=data, + location_key="neighbourhood_id", + color_column=column, + hover_data=["neighbourhood_name"], + color_scale=color_scale, + ) + + +@callback( + Output("amenities-choropleth", "figure"), + Input("amenities-metric-select", "value"), + Input("toronto-year-select", "value"), +) +def update_amenities_choropleth(metric: str, year: str) -> go.Figure: + """Update the amenities tab choropleth map.""" + year_int = int(year) if year else 2021 + df = get_amenities_data(year_int) + geojson = get_neighbourhoods_geojson(year_int) + + if df.empty: + return _empty_map("No amenities data available") + + data = df.to_dict("records") + + # Map frontend metric names to column names + column_map = { + "amenity_score": "amenity_score", + "parks_per_capita": "parks_per_1000", + "schools_per_capita": "schools_per_1000", + "transit_score": "total_amenities_per_1000", + } + column = column_map.get(metric, "amenity_score") + + return create_choropleth_figure( + geojson=geojson, + data=data, + location_key="neighbourhood_id", + color_column=column, + hover_data=["neighbourhood_name", "park_count", "school_count"], + color_scale="Greens", + ) + + +@callback( + Output("toronto-selected-neighbourhood", "data"), + Input("overview-choropleth", "clickData"), + Input("housing-choropleth", "clickData"), + Input("safety-choropleth", "clickData"), + Input("demographics-choropleth", 
"clickData"), + Input("amenities-choropleth", "clickData"), + State("toronto-tabs", "value"), + prevent_initial_call=True, +) +def handle_map_click( + overview_click, + housing_click, + safety_click, + demographics_click, + amenities_click, + active_tab: str, +) -> int | None: + """Extract neighbourhood ID from map click.""" + # Get the click data for the active tab + click_map = { + "overview": overview_click, + "housing": housing_click, + "safety": safety_click, + "demographics": demographics_click, + "amenities": amenities_click, + } + + click_data = click_map.get(active_tab) + + if not click_data: + return no_update + + try: + # Extract neighbourhood_id from click data + point = click_data["points"][0] + location = point.get("location") or point.get("customdata", [None])[0] + if location: + return int(location) + except (KeyError, IndexError, TypeError): + pass + + return no_update + + +@callback( + Output("overview-rankings-chart", "figure"), + Input("overview-metric-select", "value"), + Input("toronto-year-select", "value"), +) +def update_rankings_chart(metric: str, year: str) -> go.Figure: + """Update the top/bottom rankings bar chart.""" + year_int = int(year) if year else 2021 + df = get_overview_data(year_int) + + if df.empty: + return _empty_chart("No data available") + + # Use the selected metric for ranking + metric = metric or "livability_score" + data = df.to_dict("records") + + return create_ranking_bar( + data=data, + name_column="neighbourhood_name", + value_column=metric, + title=f"Top & Bottom 10 by {metric.replace('_', ' ').title()}", + top_n=10, + bottom_n=10, + ) + + +def _empty_map(message: str) -> go.Figure: + """Create an empty map with a message.""" + fig = go.Figure() + fig.update_layout( + mapbox={ + "style": "carto-darkmatter", + "center": {"lat": 43.7, "lon": -79.4}, + "zoom": 9.5, + }, + margin={"l": 0, "r": 0, "t": 0, "b": 0}, + paper_bgcolor="rgba(0,0,0,0)", + font_color="#c9c9c9", + ) + fig.add_annotation( + text=message, + 
xref="paper", + yref="paper", + x=0.5, + y=0.5, + showarrow=False, + font={"size": 14, "color": "#888888"}, + ) + return fig + + +def _empty_chart(message: str) -> go.Figure: + """Create an empty chart with a message.""" + fig = go.Figure() + fig.update_layout( + paper_bgcolor="rgba(0,0,0,0)", + plot_bgcolor="rgba(0,0,0,0)", + font_color="#c9c9c9", + xaxis={"visible": False}, + yaxis={"visible": False}, + ) + fig.add_annotation( + text=message, + xref="paper", + yref="paper", + x=0.5, + y=0.5, + showarrow=False, + font={"size": 14, "color": "#888888"}, + ) + return fig diff --git a/portfolio_app/pages/toronto/callbacks/selection_callbacks.py b/portfolio_app/pages/toronto/callbacks/selection_callbacks.py new file mode 100644 index 0000000..ecb2655 --- /dev/null +++ b/portfolio_app/pages/toronto/callbacks/selection_callbacks.py @@ -0,0 +1,309 @@ +"""Selection callbacks for dropdowns and neighbourhood details.""" +# mypy: disable-error-code="misc,no-untyped-def,type-arg" + +import dash_mantine_components as dmc +from dash import Input, Output, callback + +from portfolio_app.toronto.services import ( + get_city_averages, + get_neighbourhood_details, + get_neighbourhood_list, +) + + +@callback( + Output("toronto-neighbourhood-select", "data"), + Input("toronto-year-select", "value"), +) +def populate_neighbourhood_dropdown(year: str) -> list[dict]: + """Populate the neighbourhood search dropdown.""" + year_int = int(year) if year else 2021 + neighbourhoods = get_neighbourhood_list(year_int) + + return [ + {"value": str(n["neighbourhood_id"]), "label": n["neighbourhood_name"]} + for n in neighbourhoods + ] + + +@callback( + Output("toronto-selected-neighbourhood", "data", allow_duplicate=True), + Input("toronto-neighbourhood-select", "value"), + prevent_initial_call=True, +) +def select_from_dropdown(value: str | None) -> int | None: + """Update selected neighbourhood from dropdown.""" + if value: + return int(value) + return None + + +@callback( + 
Output("toronto-compare-btn", "disabled"), + Input("toronto-selected-neighbourhood", "data"), +) +def toggle_compare_button(neighbourhood_id: int | None) -> bool: + """Enable compare button when a neighbourhood is selected.""" + return neighbourhood_id is None + + +# Overview tab KPIs +@callback( + Output("overview-city-avg", "children"), + Input("toronto-year-select", "value"), +) +def update_overview_city_avg(year: str) -> str: + """Update the city average livability score.""" + year_int = int(year) if year else 2021 + averages = get_city_averages(year_int) + score = averages.get("avg_livability_score", 72) + return f"{score:.0f}" if score else "—" + + +@callback( + Output("overview-selected-name", "children"), + Output("overview-selected-scores", "children"), + Input("toronto-selected-neighbourhood", "data"), + Input("toronto-year-select", "value"), +) +def update_overview_selected(neighbourhood_id: int | None, year: str): + """Update the selected neighbourhood details in overview tab.""" + if not neighbourhood_id: + return "Click map to select", [dmc.Text("—", c="dimmed")] + + year_int = int(year) if year else 2021 + details = get_neighbourhood_details(neighbourhood_id, year_int) + + if not details: + return "Unknown", [dmc.Text("No data", c="dimmed")] + + name = details.get("neighbourhood_name", "Unknown") + scores = [ + dmc.Group( + [ + dmc.Text("Livability:", size="sm"), + dmc.Text( + f"{details.get('livability_score', 0):.0f}", size="sm", fw=700 + ), + ], + justify="space-between", + ), + dmc.Group( + [ + dmc.Text("Safety:", size="sm"), + dmc.Text(f"{details.get('safety_score', 0):.0f}", size="sm", fw=700), + ], + justify="space-between", + ), + dmc.Group( + [ + dmc.Text("Affordability:", size="sm"), + dmc.Text( + f"{details.get('affordability_score', 0):.0f}", size="sm", fw=700 + ), + ], + justify="space-between", + ), + ] + + return name, scores + + +# Housing tab KPIs +@callback( + Output("housing-city-rent", "children"), + Output("housing-rent-change", 
"children"), + Input("toronto-year-select", "value"), +) +def update_housing_kpis(year: str): + """Update housing tab KPI cards.""" + year_int = int(year) if year else 2021 + averages = get_city_averages(year_int) + + rent = averages.get("avg_rent_2bed", 2450) + rent_str = f"${rent:,.0f}" if rent else "—" + + # Placeholder change - would come from historical data + change = "+4.2% YoY" + + return rent_str, change + + +@callback( + Output("housing-selected-name", "children"), + Output("housing-selected-details", "children"), + Input("toronto-selected-neighbourhood", "data"), + Input("toronto-year-select", "value"), +) +def update_housing_selected(neighbourhood_id: int | None, year: str): + """Update selected neighbourhood details in housing tab.""" + if not neighbourhood_id: + return "Click map to select", [dmc.Text("—", c="dimmed")] + + year_int = int(year) if year else 2021 + details = get_neighbourhood_details(neighbourhood_id, year_int) + + if not details: + return "Unknown", [dmc.Text("No data", c="dimmed")] + + name = details.get("neighbourhood_name", "Unknown") + rent = details.get("avg_rent_2bed") + vacancy = details.get("vacancy_rate") + + info = [ + dmc.Text(f"2BR Rent: ${rent:,.0f}" if rent else "2BR Rent: —", size="sm"), + dmc.Text(f"Vacancy: {vacancy:.1f}%" if vacancy else "Vacancy: —", size="sm"), + ] + + return name, info + + +# Safety tab KPIs +@callback( + Output("safety-city-rate", "children"), + Output("safety-rate-change", "children"), + Input("toronto-year-select", "value"), +) +def update_safety_kpis(year: str): + """Update safety tab KPI cards.""" + year_int = int(year) if year else 2021 + averages = get_city_averages(year_int) + + rate = averages.get("avg_crime_rate", 4250) + rate_str = f"{rate:,.0f}" if rate else "—" + + # Placeholder change + change = "-2.1% YoY" + + return rate_str, change + + +@callback( + Output("safety-selected-name", "children"), + Output("safety-selected-details", "children"), + Input("toronto-selected-neighbourhood", 
"data"), + Input("toronto-year-select", "value"), +) +def update_safety_selected(neighbourhood_id: int | None, year: str): + """Update selected neighbourhood details in safety tab.""" + if not neighbourhood_id: + return "Click map to select", [dmc.Text("—", c="dimmed")] + + year_int = int(year) if year else 2021 + details = get_neighbourhood_details(neighbourhood_id, year_int) + + if not details: + return "Unknown", [dmc.Text("No data", c="dimmed")] + + name = details.get("neighbourhood_name", "Unknown") + crime_rate = details.get("crime_rate_per_100k") + + info = [ + dmc.Text( + f"Crime Rate: {crime_rate:,.0f}/100K" if crime_rate else "Crime Rate: —", + size="sm", + ), + ] + + return name, info + + +# Demographics tab KPIs +@callback( + Output("demographics-city-pop", "children"), + Output("demographics-pop-change", "children"), + Input("toronto-year-select", "value"), +) +def update_demographics_kpis(year: str): + """Update demographics tab KPI cards.""" + year_int = int(year) if year else 2021 + averages = get_city_averages(year_int) + + pop = averages.get("total_population", 2790000) + if pop and pop >= 1000000: + pop_str = f"{pop / 1000000:.2f}M" + elif pop: + pop_str = f"{pop:,.0f}" + else: + pop_str = "—" + + change = "+2.3% since 2016" + + return pop_str, change + + +@callback( + Output("demographics-selected-name", "children"), + Output("demographics-selected-details", "children"), + Input("toronto-selected-neighbourhood", "data"), + Input("toronto-year-select", "value"), +) +def update_demographics_selected(neighbourhood_id: int | None, year: str): + """Update selected neighbourhood details in demographics tab.""" + if not neighbourhood_id: + return "Click map to select", [dmc.Text("—", c="dimmed")] + + year_int = int(year) if year else 2021 + details = get_neighbourhood_details(neighbourhood_id, year_int) + + if not details: + return "Unknown", [dmc.Text("No data", c="dimmed")] + + name = details.get("neighbourhood_name", "Unknown") + pop = 
details.get("population") + income = details.get("median_household_income") + + info = [ + dmc.Text(f"Population: {pop:,}" if pop else "Population: —", size="sm"), + dmc.Text( + f"Median Income: ${income:,.0f}" if income else "Median Income: —", + size="sm", + ), + ] + + return name, info + + +# Amenities tab KPIs +@callback( + Output("amenities-city-score", "children"), + Input("toronto-year-select", "value"), +) +def update_amenities_kpis(year: str) -> str: + """Update amenities tab KPI cards.""" + year_int = int(year) if year else 2021 + averages = get_city_averages(year_int) + + score = averages.get("avg_amenity_score", 68) + return f"{score:.0f}" if score else "—" + + +@callback( + Output("amenities-selected-name", "children"), + Output("amenities-selected-details", "children"), + Input("toronto-selected-neighbourhood", "data"), + Input("toronto-year-select", "value"), +) +def update_amenities_selected(neighbourhood_id: int | None, year: str): + """Update selected neighbourhood details in amenities tab.""" + if not neighbourhood_id: + return "Click map to select", [dmc.Text("—", c="dimmed")] + + year_int = int(year) if year else 2021 + details = get_neighbourhood_details(neighbourhood_id, year_int) + + if not details: + return "Unknown", [dmc.Text("No data", c="dimmed")] + + name = details.get("neighbourhood_name", "Unknown") + parks = details.get("park_count") + schools = details.get("school_count") + + info = [ + dmc.Text(f"Parks: {parks}" if parks is not None else "Parks: —", size="sm"), + dmc.Text( + f"Schools: {schools}" if schools is not None else "Schools: —", size="sm" + ), + ] + + return name, info diff --git a/portfolio_app/pages/toronto/dashboard.py b/portfolio_app/pages/toronto/dashboard.py index 85d1e45..7625bd5 100644 --- a/portfolio_app/pages/toronto/dashboard.py +++ b/portfolio_app/pages/toronto/dashboard.py @@ -1,62 +1,56 @@ -"""Toronto Housing Dashboard page.""" +"""Toronto Neighbourhood Dashboard page. 
+ +Displays neighbourhood-level data across 5 tabs: Overview, Housing, Safety, +Demographics, and Amenities. Each tab provides interactive choropleth maps, +KPI cards, and supporting charts. +""" import dash import dash_mantine_components as dmc -from dash import dcc, html +from dash import dcc from dash_iconify import DashIconify -from portfolio_app.components import ( - create_map_controls, - create_metric_cards_row, - create_time_slider, - create_year_selector, +from portfolio_app.pages.toronto.tabs import ( + create_amenities_tab, + create_demographics_tab, + create_housing_tab, + create_overview_tab, + create_safety_tab, ) -dash.register_page(__name__, path="/toronto", name="Toronto Housing") +dash.register_page(__name__, path="/toronto", name="Toronto Neighbourhoods") -# Metric options for the purchase market -PURCHASE_METRIC_OPTIONS = [ - {"label": "Average Price", "value": "avg_price"}, - {"label": "Median Price", "value": "median_price"}, - {"label": "Sales Volume", "value": "sales_count"}, - {"label": "Days on Market", "value": "avg_dom"}, -] - -# Metric options for the rental market -RENTAL_METRIC_OPTIONS = [ - {"label": "Average Rent", "value": "avg_rent"}, - {"label": "Vacancy Rate", "value": "vacancy_rate"}, - {"label": "Rental Universe", "value": "rental_universe"}, -] - -# Sample metrics for KPI cards (will be populated by callbacks) -SAMPLE_METRICS = [ +# Tab configuration +TAB_CONFIG = [ { - "title": "Avg. Price", - "value": 1125000, - "delta": 2.3, - "prefix": "$", - "format_spec": ",.0f", + "value": "overview", + "label": "Overview", + "icon": "tabler:chart-pie", + "color": "blue", }, { - "title": "Sales Volume", - "value": 4850, - "delta": -5.1, - "format_spec": ",", + "value": "housing", + "label": "Housing", + "icon": "tabler:home", + "color": "teal", }, { - "title": "Avg. 
DOM", - "value": 18, - "delta": 3, - "suffix": " days", - "positive_is_good": False, + "value": "safety", + "label": "Safety", + "icon": "tabler:shield-check", + "color": "orange", }, { - "title": "Avg. Rent", - "value": 2450, - "delta": 4.2, - "prefix": "$", - "format_spec": ",.0f", + "value": "demographics", + "label": "Demographics", + "icon": "tabler:users", + "color": "violet", + }, + { + "value": "amenities", + "label": "Amenities", + "icon": "tabler:trees", + "color": "green", }, ] @@ -67,9 +61,9 @@ def create_header() -> dmc.Group: [ dmc.Stack( [ - dmc.Title("Toronto Housing Dashboard", order=1), + dmc.Title("Toronto Neighbourhood Dashboard", order=1), dmc.Text( - "Real estate market analysis for the Greater Toronto Area", + "Explore livability across 158 Toronto neighbourhoods", c="dimmed", ), ], @@ -88,11 +82,17 @@ def create_header() -> dmc.Group: ), href="/toronto/methodology", ), - create_year_selector( - id_prefix="toronto", - min_year=2020, - default_year=2024, - label="Year", + dmc.Select( + id="toronto-year-select", + data=[ + {"value": "2021", "label": "2021"}, + {"value": "2022", "label": "2022"}, + {"value": "2023", "label": "2023"}, + ], + value="2021", + label="Census Year", + size="sm", + w=120, ), ], gap="md", @@ -103,187 +103,100 @@ def create_header() -> dmc.Group: ) -def create_kpi_section() -> dmc.Box: - """Create the KPI metrics row.""" - return dmc.Box( - children=[ - dmc.Title("Key Metrics", order=3, size="h4", mb="sm"), - html.Div( - id="toronto-kpi-cards", - children=[ - create_metric_cards_row(SAMPLE_METRICS, id_prefix="toronto-kpi") - ], - ), - ], - ) - - -def create_purchase_map_section() -> dmc.Grid: - """Create the purchase market choropleth section.""" - return dmc.Grid( - [ - dmc.GridCol( - create_map_controls( - id_prefix="purchase-map", - metric_options=PURCHASE_METRIC_OPTIONS, - default_metric="avg_price", - ), - span={"base": 12, "md": 3}, - ), - dmc.GridCol( - dmc.Paper( - children=[ - dcc.Graph( - 
id="purchase-choropleth", - config={"scrollZoom": True}, - style={"height": "500px"}, - ), - ], - p="xs", - radius="sm", - withBorder=True, - ), - span={"base": 12, "md": 9}, - ), - ], - gutter="md", - ) - - -def create_rental_map_section() -> dmc.Grid: - """Create the rental market choropleth section.""" - return dmc.Grid( - [ - dmc.GridCol( - create_map_controls( - id_prefix="rental-map", - metric_options=RENTAL_METRIC_OPTIONS, - default_metric="avg_rent", - ), - span={"base": 12, "md": 3}, - ), - dmc.GridCol( - dmc.Paper( - children=[ - dcc.Graph( - id="rental-choropleth", - config={"scrollZoom": True}, - style={"height": "500px"}, - ), - ], - p="xs", - radius="sm", - withBorder=True, - ), - span={"base": 12, "md": 9}, - ), - ], - gutter="md", - ) - - -def create_time_series_section() -> dmc.Grid: - """Create the time series charts section.""" - return dmc.Grid( - [ - dmc.GridCol( - dmc.Paper( - children=[ - dmc.Title("Price Trends", order=4, size="h5", mb="sm"), - dcc.Graph( - id="price-time-series", - config={"displayModeBar": False}, - style={"height": "350px"}, - ), - ], - p="md", - radius="sm", - withBorder=True, - ), - span={"base": 12, "md": 6}, - ), - dmc.GridCol( - dmc.Paper( - children=[ - dmc.Title("Sales Volume", order=4, size="h5", mb="sm"), - dcc.Graph( - id="volume-time-series", - config={"displayModeBar": False}, - style={"height": "350px"}, - ), - ], - p="md", - radius="sm", - withBorder=True, - ), - span={"base": 12, "md": 6}, - ), - ], - gutter="md", - ) - - -def create_market_comparison_section() -> dmc.Paper: - """Create the market comparison chart section.""" +def create_neighbourhood_selector() -> dmc.Paper: + """Create the neighbourhood search/select component.""" return dmc.Paper( - children=[ - dmc.Group( - [ - dmc.Title("Market Indicators", order=4, size="h5"), - create_time_slider( - id_prefix="market-comparison", - min_year=2020, - label="", - ), - ], - justify="space-between", - align="center", - mb="md", - ), - dcc.Graph( - 
id="market-comparison-chart", - config={"displayModeBar": False}, - style={"height": "400px"}, - ), - ], - p="md", + dmc.Group( + [ + DashIconify(icon="tabler:search", width=20, color="gray"), + dmc.Select( + id="toronto-neighbourhood-select", + placeholder="Search neighbourhoods...", + searchable=True, + clearable=True, + data=[], # Populated by callback + style={"flex": 1}, + ), + dmc.Button( + "Compare", + id="toronto-compare-btn", + leftSection=DashIconify(icon="tabler:git-compare", width=16), + variant="light", + disabled=True, + ), + ], + gap="sm", + ), + p="sm", radius="sm", withBorder=True, ) +def create_tab_navigation() -> dmc.Tabs: + """Create the tab navigation with icons.""" + return dmc.Tabs( + [ + dmc.TabsList( + [ + dmc.TabsTab( + dmc.Group( + [ + DashIconify(icon=tab["icon"], width=18), + dmc.Text(tab["label"], size="sm"), + ], + gap="xs", + ), + value=tab["value"], + ) + for tab in TAB_CONFIG + ], + grow=True, + ), + # Tab panels + dmc.TabsPanel(create_overview_tab(), value="overview", pt="md"), + dmc.TabsPanel(create_housing_tab(), value="housing", pt="md"), + dmc.TabsPanel(create_safety_tab(), value="safety", pt="md"), + dmc.TabsPanel(create_demographics_tab(), value="demographics", pt="md"), + dmc.TabsPanel(create_amenities_tab(), value="amenities", pt="md"), + ], + id="toronto-tabs", + value="overview", + variant="default", + ) + + def create_data_notice() -> dmc.Alert: - """Create a notice about data availability.""" + """Create a notice about data sources.""" return dmc.Alert( children=[ dmc.Text( - "This dashboard displays Toronto neighbourhood and CMHC rental data. " - "Sample data is shown for demonstration purposes.", + "Data from Toronto Open Data (Census 2021, Crime Statistics) and " + "CMHC Rental Market Reports. 
Click neighbourhoods on the map for details.", size="sm", ), ], - title="Data Notice", + title="Data Sources", color="blue", variant="light", + icon=DashIconify(icon="tabler:info-circle", width=20), ) +# Store for selected neighbourhood +neighbourhood_store = dcc.Store(id="toronto-selected-neighbourhood", data=None) + # Register callbacks from portfolio_app.pages.toronto import callbacks # noqa: E402, F401 layout = dmc.Container( dmc.Stack( [ + neighbourhood_store, create_header(), create_data_notice(), - create_kpi_section(), - dmc.Divider(my="md", label="Purchase Market", labelPosition="center"), - create_purchase_map_section(), - dmc.Divider(my="md", label="Rental Market", labelPosition="center"), - create_rental_map_section(), - dmc.Divider(my="md", label="Trends", labelPosition="center"), - create_time_series_section(), - create_market_comparison_section(), + create_neighbourhood_selector(), + create_tab_navigation(), dmc.Space(h=40), ], gap="lg", diff --git a/portfolio_app/pages/toronto/tabs/__init__.py b/portfolio_app/pages/toronto/tabs/__init__.py new file mode 100644 index 0000000..d68704e --- /dev/null +++ b/portfolio_app/pages/toronto/tabs/__init__.py @@ -0,0 +1,15 @@ +"""Tab modules for Toronto Neighbourhood Dashboard.""" + +from .amenities import create_amenities_tab +from .demographics import create_demographics_tab +from .housing import create_housing_tab +from .overview import create_overview_tab +from .safety import create_safety_tab + +__all__ = [ + "create_overview_tab", + "create_housing_tab", + "create_safety_tab", + "create_demographics_tab", + "create_amenities_tab", +] diff --git a/portfolio_app/pages/toronto/tabs/amenities.py b/portfolio_app/pages/toronto/tabs/amenities.py new file mode 100644 index 0000000..49b0188 --- /dev/null +++ b/portfolio_app/pages/toronto/tabs/amenities.py @@ -0,0 +1,207 @@ +"""Amenities tab for Toronto Neighbourhood Dashboard. + +Displays parks, schools, transit, and other amenity metrics. 
+""" + +import dash_mantine_components as dmc +from dash import dcc + + +def create_amenities_tab() -> dmc.Stack: + """Create the Amenities tab layout. + + Layout: + - Choropleth map (amenity score) | KPI cards + - Amenity breakdown chart | Amenity comparison radar + + Returns: + Tab content as a Mantine Stack component. + """ + return dmc.Stack( + [ + # Main content: Map + KPIs + dmc.Grid( + [ + # Choropleth map + dmc.GridCol( + dmc.Paper( + [ + dmc.Group( + [ + dmc.Title( + "Neighbourhood Amenities", + order=4, + size="h5", + ), + dmc.Select( + id="amenities-metric-select", + data=[ + { + "value": "amenity_score", + "label": "Amenity Score", + }, + { + "value": "parks_per_capita", + "label": "Parks per 1K", + }, + { + "value": "schools_per_capita", + "label": "Schools per 1K", + }, + { + "value": "transit_score", + "label": "Transit Score", + }, + ], + value="amenity_score", + size="sm", + w=180, + ), + ], + justify="space-between", + mb="sm", + ), + dcc.Graph( + id="amenities-choropleth", + config={ + "scrollZoom": True, + "displayModeBar": False, + }, + style={"height": "450px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "lg": 8}, + ), + # KPI cards + dmc.GridCol( + dmc.Stack( + [ + dmc.Paper( + [ + dmc.Text( + "City Amenity Score", size="xs", c="dimmed" + ), + dmc.Title( + id="amenities-city-score", + children="68", + order=2, + ), + dmc.Text( + "Out of 100", + size="sm", + c="dimmed", + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + dmc.Paper( + [ + dmc.Text("Total Parks", size="xs", c="dimmed"), + dmc.Title( + id="amenities-total-parks", + children="1,500+", + order=2, + ), + dmc.Text( + id="amenities-park-area", + children="8,000+ hectares", + size="sm", + c="green", + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + dmc.Paper( + [ + dmc.Text( + "Selected Neighbourhood", + size="xs", + c="dimmed", + ), + dmc.Title( + id="amenities-selected-name", + children="Click map to select", + order=4, + 
size="h5", + ), + dmc.Stack( + id="amenities-selected-details", + children=[ + dmc.Text("—", c="dimmed"), + ], + gap="xs", + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + ], + gap="md", + ), + span={"base": 12, "lg": 4}, + ), + ], + gutter="md", + ), + # Supporting charts + dmc.Grid( + [ + # Amenity breakdown + dmc.GridCol( + dmc.Paper( + [ + dmc.Title( + "Amenity Breakdown", + order=4, + size="h5", + mb="sm", + ), + dcc.Graph( + id="amenities-breakdown-chart", + config={"displayModeBar": False}, + style={"height": "300px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "md": 6}, + ), + # Amenity comparison radar + dmc.GridCol( + dmc.Paper( + [ + dmc.Title( + "Amenity Comparison", + order=4, + size="h5", + mb="sm", + ), + dcc.Graph( + id="amenities-radar-chart", + config={"displayModeBar": False}, + style={"height": "300px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "md": 6}, + ), + ], + gutter="md", + ), + ], + gap="md", + ) diff --git a/portfolio_app/pages/toronto/tabs/demographics.py b/portfolio_app/pages/toronto/tabs/demographics.py new file mode 100644 index 0000000..5f087c5 --- /dev/null +++ b/portfolio_app/pages/toronto/tabs/demographics.py @@ -0,0 +1,211 @@ +"""Demographics tab for Toronto Neighbourhood Dashboard. + +Displays population, income, age, and diversity metrics. +""" + +import dash_mantine_components as dmc +from dash import dcc + + +def create_demographics_tab() -> dmc.Stack: + """Create the Demographics tab layout. + + Layout: + - Choropleth map (demographic metric) | KPI cards + - Age distribution chart | Income distribution chart + + Returns: + Tab content as a Mantine Stack component. 
+ """ + return dmc.Stack( + [ + # Main content: Map + KPIs + dmc.Grid( + [ + # Choropleth map + dmc.GridCol( + dmc.Paper( + [ + dmc.Group( + [ + dmc.Title( + "Neighbourhood Demographics", + order=4, + size="h5", + ), + dmc.Select( + id="demographics-metric-select", + data=[ + { + "value": "population", + "label": "Population", + }, + { + "value": "median_income", + "label": "Median Income", + }, + { + "value": "median_age", + "label": "Median Age", + }, + { + "value": "diversity_index", + "label": "Diversity Index", + }, + ], + value="population", + size="sm", + w=180, + ), + ], + justify="space-between", + mb="sm", + ), + dcc.Graph( + id="demographics-choropleth", + config={ + "scrollZoom": True, + "displayModeBar": False, + }, + style={"height": "450px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "lg": 8}, + ), + # KPI cards + dmc.GridCol( + dmc.Stack( + [ + dmc.Paper( + [ + dmc.Text( + "City Population", size="xs", c="dimmed" + ), + dmc.Title( + id="demographics-city-pop", + children="2.79M", + order=2, + ), + dmc.Text( + id="demographics-pop-change", + children="+2.3% since 2016", + size="sm", + c="green", + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + dmc.Paper( + [ + dmc.Text( + "Median Household Income", + size="xs", + c="dimmed", + ), + dmc.Title( + id="demographics-city-income", + children="$84,000", + order=2, + ), + dmc.Text( + "City average", + size="sm", + c="dimmed", + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + dmc.Paper( + [ + dmc.Text( + "Selected Neighbourhood", + size="xs", + c="dimmed", + ), + dmc.Title( + id="demographics-selected-name", + children="Click map to select", + order=4, + size="h5", + ), + dmc.Stack( + id="demographics-selected-details", + children=[ + dmc.Text("—", c="dimmed"), + ], + gap="xs", + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + ], + gap="md", + ), + span={"base": 12, "lg": 4}, + ), + ], + gutter="md", + ), + # Supporting charts + dmc.Grid( 
+ [ + # Age distribution + dmc.GridCol( + dmc.Paper( + [ + dmc.Title( + "Age Distribution", + order=4, + size="h5", + mb="sm", + ), + dcc.Graph( + id="demographics-age-chart", + config={"displayModeBar": False}, + style={"height": "300px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "md": 6}, + ), + # Income distribution + dmc.GridCol( + dmc.Paper( + [ + dmc.Title( + "Income Distribution", + order=4, + size="h5", + mb="sm", + ), + dcc.Graph( + id="demographics-income-chart", + config={"displayModeBar": False}, + style={"height": "300px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "md": 6}, + ), + ], + gutter="md", + ), + ], + gap="md", + ) diff --git a/portfolio_app/pages/toronto/tabs/housing.py b/portfolio_app/pages/toronto/tabs/housing.py new file mode 100644 index 0000000..b695edb --- /dev/null +++ b/portfolio_app/pages/toronto/tabs/housing.py @@ -0,0 +1,209 @@ +"""Housing tab for Toronto Neighbourhood Dashboard. + +Displays affordability metrics, rent trends, and housing indicators. +""" + +import dash_mantine_components as dmc +from dash import dcc + + +def create_housing_tab() -> dmc.Stack: + """Create the Housing tab layout. + + Layout: + - Choropleth map (affordability index) | KPI cards + - Rent trend line chart | Dwelling types breakdown + + Returns: + Tab content as a Mantine Stack component. 
+ """ + return dmc.Stack( + [ + # Main content: Map + KPIs + dmc.Grid( + [ + # Choropleth map + dmc.GridCol( + dmc.Paper( + [ + dmc.Group( + [ + dmc.Title( + "Housing Affordability", + order=4, + size="h5", + ), + dmc.Select( + id="housing-metric-select", + data=[ + { + "value": "affordability_index", + "label": "Affordability Index", + }, + { + "value": "avg_rent_2bed", + "label": "Avg Rent (2BR)", + }, + { + "value": "rent_to_income_pct", + "label": "Rent-to-Income %", + }, + { + "value": "vacancy_rate", + "label": "Vacancy Rate", + }, + ], + value="affordability_index", + size="sm", + w=180, + ), + ], + justify="space-between", + mb="sm", + ), + dcc.Graph( + id="housing-choropleth", + config={ + "scrollZoom": True, + "displayModeBar": False, + }, + style={"height": "450px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "lg": 8}, + ), + # KPI cards + dmc.GridCol( + dmc.Stack( + [ + dmc.Paper( + [ + dmc.Text( + "City Avg 2BR Rent", size="xs", c="dimmed" + ), + dmc.Title( + id="housing-city-rent", + children="$2,450", + order=2, + ), + dmc.Text( + id="housing-rent-change", + children="+4.2% YoY", + size="sm", + c="red", + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + dmc.Paper( + [ + dmc.Text( + "City Avg Vacancy", size="xs", c="dimmed" + ), + dmc.Title( + id="housing-city-vacancy", + children="1.8%", + order=2, + ), + dmc.Text( + "Below healthy rate (3%)", + size="sm", + c="orange", + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + dmc.Paper( + [ + dmc.Text( + "Selected Neighbourhood", + size="xs", + c="dimmed", + ), + dmc.Title( + id="housing-selected-name", + children="Click map to select", + order=4, + size="h5", + ), + dmc.Stack( + id="housing-selected-details", + children=[ + dmc.Text("—", c="dimmed"), + ], + gap="xs", + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + ], + gap="md", + ), + span={"base": 12, "lg": 4}, + ), + ], + gutter="md", + ), + # Supporting charts + dmc.Grid( + [ + # 
Rent trend + dmc.GridCol( + dmc.Paper( + [ + dmc.Title( + "Rent Trends (5 Year)", + order=4, + size="h5", + mb="sm", + ), + dcc.Graph( + id="housing-trend-chart", + config={"displayModeBar": False}, + style={"height": "300px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "md": 6}, + ), + # Dwelling types + dmc.GridCol( + dmc.Paper( + [ + dmc.Title( + "Dwelling Types", + order=4, + size="h5", + mb="sm", + ), + dcc.Graph( + id="housing-types-chart", + config={"displayModeBar": False}, + style={"height": "300px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "md": 6}, + ), + ], + gutter="md", + ), + ], + gap="md", + ) diff --git a/portfolio_app/pages/toronto/tabs/overview.py b/portfolio_app/pages/toronto/tabs/overview.py new file mode 100644 index 0000000..2d188d7 --- /dev/null +++ b/portfolio_app/pages/toronto/tabs/overview.py @@ -0,0 +1,233 @@ +"""Overview tab for Toronto Neighbourhood Dashboard. + +Displays composite livability score with safety, affordability, and amenity components. +""" + +import dash_mantine_components as dmc +from dash import dcc, html + + +def create_overview_tab() -> dmc.Stack: + """Create the Overview tab layout. + + Layout: + - Choropleth map (livability score) | KPI cards + - Top/Bottom 10 bar chart | Income vs Crime scatter + + Returns: + Tab content as a Mantine Stack component. 
+ """ + return dmc.Stack( + [ + # Main content: Map + KPIs + dmc.Grid( + [ + # Choropleth map + dmc.GridCol( + dmc.Paper( + [ + dmc.Group( + [ + dmc.Title( + "Neighbourhood Livability", + order=4, + size="h5", + ), + dmc.Select( + id="overview-metric-select", + data=[ + { + "value": "livability_score", + "label": "Livability Score", + }, + { + "value": "safety_score", + "label": "Safety Score", + }, + { + "value": "affordability_score", + "label": "Affordability Score", + }, + { + "value": "amenity_score", + "label": "Amenity Score", + }, + ], + value="livability_score", + size="sm", + w=180, + ), + ], + justify="space-between", + mb="sm", + ), + dcc.Graph( + id="overview-choropleth", + config={ + "scrollZoom": True, + "displayModeBar": False, + }, + style={"height": "450px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "lg": 8}, + ), + # KPI cards + dmc.GridCol( + dmc.Stack( + [ + dmc.Paper( + [ + dmc.Text("City Average", size="xs", c="dimmed"), + dmc.Title( + id="overview-city-avg", + children="72", + order=2, + ), + dmc.Text("Livability Score", size="sm", fw=500), + ], + p="md", + radius="sm", + withBorder=True, + ), + dmc.Paper( + [ + dmc.Text( + "Selected Neighbourhood", + size="xs", + c="dimmed", + ), + dmc.Title( + id="overview-selected-name", + children="Click map to select", + order=4, + size="h5", + ), + html.Div( + id="overview-selected-scores", + children=[ + dmc.Text("—", c="dimmed"), + ], + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + dmc.Paper( + [ + dmc.Text( + "Score Components", size="xs", c="dimmed" + ), + dmc.Stack( + [ + dmc.Group( + [ + dmc.Text("Safety", size="sm"), + dmc.Text( + "30%", + size="sm", + c="dimmed", + ), + ], + justify="space-between", + ), + dmc.Group( + [ + dmc.Text( + "Affordability", size="sm" + ), + dmc.Text( + "40%", + size="sm", + c="dimmed", + ), + ], + justify="space-between", + ), + dmc.Group( + [ + dmc.Text( + "Amenities", size="sm" + ), + dmc.Text( + "30%", + 
size="sm", + c="dimmed", + ), + ], + justify="space-between", + ), + ], + gap="xs", + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + ], + gap="md", + ), + span={"base": 12, "lg": 4}, + ), + ], + gutter="md", + ), + # Supporting charts + dmc.Grid( + [ + # Top/Bottom rankings + dmc.GridCol( + dmc.Paper( + [ + dmc.Title( + "Top & Bottom Neighbourhoods", + order=4, + size="h5", + mb="sm", + ), + dcc.Graph( + id="overview-rankings-chart", + config={"displayModeBar": False}, + style={"height": "300px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "md": 6}, + ), + # Scatter plot + dmc.GridCol( + dmc.Paper( + [ + dmc.Title( + "Income vs Safety", + order=4, + size="h5", + mb="sm", + ), + dcc.Graph( + id="overview-scatter-chart", + config={"displayModeBar": False}, + style={"height": "300px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "md": 6}, + ), + ], + gutter="md", + ), + ], + gap="md", + ) diff --git a/portfolio_app/pages/toronto/tabs/safety.py b/portfolio_app/pages/toronto/tabs/safety.py new file mode 100644 index 0000000..a84a5f6 --- /dev/null +++ b/portfolio_app/pages/toronto/tabs/safety.py @@ -0,0 +1,211 @@ +"""Safety tab for Toronto Neighbourhood Dashboard. + +Displays crime statistics, trends, and safety indicators. +""" + +import dash_mantine_components as dmc +from dash import dcc + + +def create_safety_tab() -> dmc.Stack: + """Create the Safety tab layout. + + Layout: + - Choropleth map (crime rate) | KPI cards + - Crime trend line chart | Crime by type breakdown + + Returns: + Tab content as a Mantine Stack component. 
+ """ + return dmc.Stack( + [ + # Main content: Map + KPIs + dmc.Grid( + [ + # Choropleth map + dmc.GridCol( + dmc.Paper( + [ + dmc.Group( + [ + dmc.Title( + "Crime Rate by Neighbourhood", + order=4, + size="h5", + ), + dmc.Select( + id="safety-metric-select", + data=[ + { + "value": "total_crime_rate", + "label": "Total Crime Rate", + }, + { + "value": "violent_crime_rate", + "label": "Violent Crime", + }, + { + "value": "property_crime_rate", + "label": "Property Crime", + }, + { + "value": "theft_rate", + "label": "Theft", + }, + ], + value="total_crime_rate", + size="sm", + w=180, + ), + ], + justify="space-between", + mb="sm", + ), + dcc.Graph( + id="safety-choropleth", + config={ + "scrollZoom": True, + "displayModeBar": False, + }, + style={"height": "450px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "lg": 8}, + ), + # KPI cards + dmc.GridCol( + dmc.Stack( + [ + dmc.Paper( + [ + dmc.Text( + "City Crime Rate", size="xs", c="dimmed" + ), + dmc.Title( + id="safety-city-rate", + children="4,250", + order=2, + ), + dmc.Text( + id="safety-rate-change", + children="-2.1% YoY", + size="sm", + c="green", + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + dmc.Paper( + [ + dmc.Text( + "Total Incidents (2023)", + size="xs", + c="dimmed", + ), + dmc.Title( + id="safety-total-incidents", + children="125,430", + order=2, + ), + dmc.Text( + "Per 100,000 residents", + size="sm", + c="dimmed", + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + dmc.Paper( + [ + dmc.Text( + "Selected Neighbourhood", + size="xs", + c="dimmed", + ), + dmc.Title( + id="safety-selected-name", + children="Click map to select", + order=4, + size="h5", + ), + dmc.Stack( + id="safety-selected-details", + children=[ + dmc.Text("—", c="dimmed"), + ], + gap="xs", + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + ], + gap="md", + ), + span={"base": 12, "lg": 4}, + ), + ], + gutter="md", + ), + # Supporting charts + dmc.Grid( + [ + # Crime 
trend + dmc.GridCol( + dmc.Paper( + [ + dmc.Title( + "Crime Trends (5 Year)", + order=4, + size="h5", + mb="sm", + ), + dcc.Graph( + id="safety-trend-chart", + config={"displayModeBar": False}, + style={"height": "300px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "md": 6}, + ), + # Crime by type + dmc.GridCol( + dmc.Paper( + [ + dmc.Title( + "Crime by Category", + order=4, + size="h5", + mb="sm", + ), + dcc.Graph( + id="safety-types-chart", + config={"displayModeBar": False}, + style={"height": "300px"}, + ), + ], + p="md", + radius="sm", + withBorder=True, + ), + span={"base": 12, "md": 6}, + ), + ], + gutter="md", + ), + ], + gap="md", + ) diff --git a/portfolio_app/toronto/parsers/toronto_open_data.py b/portfolio_app/toronto/parsers/toronto_open_data.py index bbc58af..110add7 100644 --- a/portfolio_app/toronto/parsers/toronto_open_data.py +++ b/portfolio_app/toronto/parsers/toronto_open_data.py @@ -57,6 +57,7 @@ class TorontoOpenDataParser: self._cache_dir = cache_dir self._timeout = timeout self._client: httpx.Client | None = None + self._neighbourhood_name_map: dict[str, int] | None = None @property def client(self) -> httpx.Client: @@ -75,6 +76,63 @@ class TorontoOpenDataParser: self._client.close() self._client = None + def _get_neighbourhood_name_map(self) -> dict[str, int]: + """Build and cache a mapping of neighbourhood names to IDs. + + Returns: + Dictionary mapping normalized neighbourhood names to area_id. 
+ """ + if self._neighbourhood_name_map is not None: + return self._neighbourhood_name_map + + neighbourhoods = self.get_neighbourhoods() + self._neighbourhood_name_map = {} + + for n in neighbourhoods: + # Add multiple variations of the name for flexible matching + name_lower = n.area_name.lower().strip() + self._neighbourhood_name_map[name_lower] = n.area_id + + # Also add without common suffixes/prefixes + for suffix in [" neighbourhood", " area", "-"]: + if suffix in name_lower: + alt_name = name_lower.replace(suffix, "").strip() + self._neighbourhood_name_map[alt_name] = n.area_id + + logger.debug( + f"Built neighbourhood name map with {len(self._neighbourhood_name_map)} entries" + ) + return self._neighbourhood_name_map + + def _match_neighbourhood_id(self, name: str) -> int | None: + """Match a neighbourhood name to its ID. + + Args: + name: Neighbourhood name from census data. + + Returns: + Neighbourhood ID or None if not found. + """ + name_map = self._get_neighbourhood_name_map() + name_lower = name.lower().strip() + + # Direct match + if name_lower in name_map: + return name_map[name_lower] + + # Try removing parenthetical content + if "(" in name_lower: + base_name = name_lower.split("(")[0].strip() + if base_name in name_map: + return name_map[base_name] + + # Try fuzzy matching with first few chars + for key, area_id in name_map.items(): + if key.startswith(name_lower[:10]) or name_lower.startswith(key[:10]): + return area_id + + return None + def __enter__(self) -> "TorontoOpenDataParser": return self @@ -254,11 +312,30 @@ class TorontoOpenDataParser: logger.info(f"Parsed {len(records)} neighbourhoods") return records + # Mapping of indicator names to CensusRecord fields + # Keys are partial matches (case-insensitive) found in the "Characteristic" column + CENSUS_INDICATOR_MAPPING: dict[str, str] = { + "population, 2021": "population", + "population, 2016": "population", + "population density per square kilometre": "population_density", + "median 
total income of household": "median_household_income", + "average total income of household": "average_household_income", + "unemployment rate": "unemployment_rate", + "bachelor's degree or higher": "pct_bachelors_or_higher", + "owner": "pct_owner_occupied", + "renter": "pct_renter_occupied", + "median age": "median_age", + "average value of dwellings": "average_dwelling_value", + } + def get_census_profiles(self, year: int = 2021) -> list[CensusRecord]: """Fetch neighbourhood census profiles. - Note: Census profile data structure varies by year. This method - extracts key demographic indicators where available. + The Toronto Open Data neighbourhood profiles dataset is pivoted: + - Rows are demographic indicators (e.g., "Population", "Median Income") + - Columns are neighbourhoods (e.g., "Agincourt North", "Alderwood") + + This method transposes the data to create one CensusRecord per neighbourhood. Args: year: Census year (2016 or 2021). @@ -266,7 +343,6 @@ class TorontoOpenDataParser: Returns: List of validated CensusRecord objects. 
""" - # Census profiles are typically in CSV/datastore format try: raw_records = self._fetch_csv_as_json( self.DATASETS["neighbourhood_profiles"] @@ -275,13 +351,119 @@ class TorontoOpenDataParser: logger.warning(f"Could not fetch census profiles: {e}") return [] - # Census profiles are pivoted - rows are indicators, columns are neighbourhoods - # This requires special handling based on the actual data structure + if not raw_records: + logger.warning("Census profiles dataset is empty") + return [] + logger.info(f"Fetched {len(raw_records)} census profile rows") - # For now, return empty list - actual implementation depends on data structure - # TODO: Implement census profile parsing based on actual data format - return [] + # Find the characteristic/indicator column name + sample_row = raw_records[0] + char_col = None + for col in sample_row: + col_lower = col.lower() + if "characteristic" in col_lower or "category" in col_lower: + char_col = col + break + + if not char_col: + # Try common column names + for candidate in ["Characteristic", "Category", "Topic", "_id"]: + if candidate in sample_row: + char_col = candidate + break + + if not char_col: + logger.warning("Could not find characteristic column in census data") + return [] + + # Identify neighbourhood columns (exclude metadata columns) + exclude_cols = { + char_col, + "_id", + "Topic", + "Data Source", + "Characteristic", + "Category", + } + neighbourhood_cols = [col for col in sample_row if col not in exclude_cols] + + logger.info(f"Found {len(neighbourhood_cols)} neighbourhood columns") + + # Build a lookup: neighbourhood_name -> {field: value} + neighbourhood_data: dict[str, dict[str, Decimal | int | None]] = { + col: {} for col in neighbourhood_cols + } + + # Process each row to extract indicator values + for row in raw_records: + characteristic = str(row.get(char_col, "")).lower().strip() + + # Check if this row matches any indicator we care about + for indicator_pattern, field_name in 
self.CENSUS_INDICATOR_MAPPING.items(): + if indicator_pattern in characteristic: + # Extract values for each neighbourhood + for col in neighbourhood_cols: + value = row.get(col) + if value is not None and value != "": + try: + # Clean and convert value + str_val = str(value).replace(",", "").replace("$", "") + str_val = str_val.replace("%", "").strip() + if str_val and str_val not in ("x", "X", "F", ".."): + numeric_val = Decimal(str_val) + # Only store if not already set (first match wins) + if field_name not in neighbourhood_data[col]: + neighbourhood_data[col][ + field_name + ] = numeric_val + except (ArithmeticError, ValueError, TypeError): + pass + break # Move to next row after matching + + # Convert to CensusRecord objects + records = [] + unmatched = [] + + for neighbourhood_name, data in neighbourhood_data.items(): + if not data: + continue + + # Match neighbourhood name to ID + neighbourhood_id = self._match_neighbourhood_id(neighbourhood_name) + if neighbourhood_id is None: + unmatched.append(neighbourhood_name) + continue + + try: + pop_val = data.get("population") + population = int(pop_val) if pop_val is not None else None + + record = CensusRecord( + neighbourhood_id=neighbourhood_id, + census_year=year, + population=population, + population_density=data.get("population_density"), + median_household_income=data.get("median_household_income"), + average_household_income=data.get("average_household_income"), + unemployment_rate=data.get("unemployment_rate"), + pct_bachelors_or_higher=data.get("pct_bachelors_or_higher"), + pct_owner_occupied=data.get("pct_owner_occupied"), + pct_renter_occupied=data.get("pct_renter_occupied"), + median_age=data.get("median_age"), + average_dwelling_value=data.get("average_dwelling_value"), + ) + records.append(record) + except Exception as e: + logger.debug(f"Skipping neighbourhood {neighbourhood_name}: {e}") + + if unmatched: + logger.warning( + f"Could not match {len(unmatched)} neighbourhoods: {unmatched[:5]}..." 
+ ) + + logger.info(f"Parsed {len(records)} census records for year {year}") + return records def get_parks(self) -> list[AmenityRecord]: """Fetch park locations. diff --git a/portfolio_app/toronto/services/__init__.py b/portfolio_app/toronto/services/__init__.py new file mode 100644 index 0000000..4a1f0ae --- /dev/null +++ b/portfolio_app/toronto/services/__init__.py @@ -0,0 +1,33 @@ +"""Data service layer for Toronto neighbourhood dashboard.""" + +from .geometry_service import ( + get_cmhc_zones_geojson, + get_neighbourhoods_geojson, +) +from .neighbourhood_service import ( + get_amenities_data, + get_city_averages, + get_demographics_data, + get_housing_data, + get_neighbourhood_details, + get_neighbourhood_list, + get_overview_data, + get_rankings, + get_safety_data, +) + +__all__ = [ + # Neighbourhood data + "get_overview_data", + "get_housing_data", + "get_safety_data", + "get_demographics_data", + "get_amenities_data", + "get_neighbourhood_details", + "get_neighbourhood_list", + "get_rankings", + "get_city_averages", + # Geometry + "get_neighbourhoods_geojson", + "get_cmhc_zones_geojson", +] diff --git a/portfolio_app/toronto/services/geometry_service.py b/portfolio_app/toronto/services/geometry_service.py new file mode 100644 index 0000000..959ab7f --- /dev/null +++ b/portfolio_app/toronto/services/geometry_service.py @@ -0,0 +1,176 @@ +"""Service layer for generating GeoJSON from PostGIS geometry.""" + +import json +from functools import lru_cache +from typing import Any + +import pandas as pd +from sqlalchemy import text + +from portfolio_app.toronto.models import get_engine + + +def _execute_query(sql: str, params: dict[str, Any] | None = None) -> pd.DataFrame: + """Execute SQL query and return DataFrame.""" + engine = get_engine() + with engine.connect() as conn: + return pd.read_sql(text(sql), conn, params=params) + + +@lru_cache(maxsize=8) +def get_neighbourhoods_geojson(year: int = 2021) -> dict[str, Any]: + """Get GeoJSON FeatureCollection for all 
neighbourhoods. + + Queries mart_neighbourhood_overview for geometries and basic properties. + + Args: + year: Year to query for joining properties. + + Returns: + GeoJSON FeatureCollection dictionary. + """ + # Query geometries with ST_AsGeoJSON + sql = """ + SELECT + neighbourhood_id, + neighbourhood_name, + ST_AsGeoJSON(geometry)::json as geom, + population, + livability_score + FROM mart_neighbourhood_overview + WHERE year = :year + AND geometry IS NOT NULL + """ + + try: + df = _execute_query(sql, {"year": year}) + except Exception: + # Table might not exist or have data yet + return _empty_geojson() + + if df.empty: + return _empty_geojson() + + # Build GeoJSON features + features = [] + for _, row in df.iterrows(): + geom = row["geom"] + if geom is None: + continue + + # Handle geometry that might be a string or dict + if isinstance(geom, str): + geom = json.loads(geom) + + feature = { + "type": "Feature", + "id": row["neighbourhood_id"], + "properties": { + "neighbourhood_id": int(row["neighbourhood_id"]), + "neighbourhood_name": row["neighbourhood_name"], + "population": int(row["population"]) + if pd.notna(row["population"]) + else None, + "livability_score": float(row["livability_score"]) + if pd.notna(row["livability_score"]) + else None, + }, + "geometry": geom, + } + features.append(feature) + + return { + "type": "FeatureCollection", + "features": features, + } + + +@lru_cache(maxsize=4) +def get_cmhc_zones_geojson() -> dict[str, Any]: + """Get GeoJSON FeatureCollection for CMHC zones. + + Queries dim_cmhc_zone for zone geometries. + + Returns: + GeoJSON FeatureCollection dictionary. 
+ """ + sql = """ + SELECT + zone_code, + zone_name, + ST_AsGeoJSON(geometry)::json as geom + FROM dim_cmhc_zone + WHERE geometry IS NOT NULL + """ + + try: + df = _execute_query(sql, {}) + except Exception: + return _empty_geojson() + + if df.empty: + return _empty_geojson() + + features = [] + for _, row in df.iterrows(): + geom = row["geom"] + if geom is None: + continue + + if isinstance(geom, str): + geom = json.loads(geom) + + feature = { + "type": "Feature", + "id": row["zone_code"], + "properties": { + "zone_code": row["zone_code"], + "zone_name": row["zone_name"], + }, + "geometry": geom, + } + features.append(feature) + + return { + "type": "FeatureCollection", + "features": features, + } + + +def get_neighbourhood_geometry(neighbourhood_id: int) -> dict[str, Any] | None: + """Get GeoJSON geometry for a single neighbourhood. + + Args: + neighbourhood_id: The neighbourhood ID. + + Returns: + GeoJSON geometry dict, or None if not found. + """ + sql = """ + SELECT ST_AsGeoJSON(geometry)::json as geom + FROM dim_neighbourhood + WHERE neighbourhood_id = :neighbourhood_id + AND geometry IS NOT NULL + """ + + try: + df = _execute_query(sql, {"neighbourhood_id": neighbourhood_id}) + except Exception: + return None + + if df.empty: + return None + + geom = df.iloc[0]["geom"] + if isinstance(geom, str): + result: dict[str, Any] = json.loads(geom) + return result + return dict(geom) if geom is not None else None + + +def _empty_geojson() -> dict[str, Any]: + """Return an empty GeoJSON FeatureCollection.""" + return { + "type": "FeatureCollection", + "features": [], + } diff --git a/portfolio_app/toronto/services/neighbourhood_service.py b/portfolio_app/toronto/services/neighbourhood_service.py new file mode 100644 index 0000000..73f1221 --- /dev/null +++ b/portfolio_app/toronto/services/neighbourhood_service.py @@ -0,0 +1,392 @@ +"""Service layer for querying neighbourhood data from dbt marts.""" + +from functools import lru_cache +from typing import Any + +import 
pandas as pd +from sqlalchemy import text + +from portfolio_app.toronto.models import get_engine + + +def _execute_query(sql: str, params: dict[str, Any] | None = None) -> pd.DataFrame: + """Execute SQL query and return DataFrame. + + Args: + sql: SQL query string. + params: Query parameters. + + Returns: + pandas DataFrame with results, or empty DataFrame on error. + """ + try: + engine = get_engine() + with engine.connect() as conn: + return pd.read_sql(text(sql), conn, params=params) + except Exception: + # Return empty DataFrame on connection or query error + return pd.DataFrame() + + +def get_overview_data(year: int = 2021) -> pd.DataFrame: + """Get overview data for all neighbourhoods. + + Queries mart_neighbourhood_overview for livability scores and components. + + Args: + year: Census year to query. + + Returns: + DataFrame with columns: neighbourhood_id, neighbourhood_name, + livability_score, safety_score, affordability_score, amenity_score, + population, median_household_income, etc. + """ + sql = """ + SELECT + neighbourhood_id, + neighbourhood_name, + year, + population, + median_household_income, + livability_score, + safety_score, + affordability_score, + amenity_score, + crime_rate_per_100k, + rent_to_income_pct, + avg_rent_2bed, + total_amenities_per_1000 + FROM mart_neighbourhood_overview + WHERE year = :year + ORDER BY livability_score DESC NULLS LAST + """ + return _execute_query(sql, {"year": year}) + + +def get_housing_data(year: int = 2021) -> pd.DataFrame: + """Get housing data for all neighbourhoods. + + Queries mart_neighbourhood_housing for affordability metrics. + + Args: + year: Year to query. + + Returns: + DataFrame with columns: neighbourhood_id, neighbourhood_name, + avg_rent_2bed, vacancy_rate, rent_to_income_pct, affordability_index, etc. 
+ """ + sql = """ + SELECT + neighbourhood_id, + neighbourhood_name, + year, + pct_owner_occupied, + pct_renter_occupied, + average_dwelling_value, + median_household_income, + avg_rent_bachelor, + avg_rent_1bed, + avg_rent_2bed, + avg_rent_3bed, + vacancy_rate, + total_rental_units, + rent_to_income_pct, + is_affordable, + affordability_index, + rent_yoy_change_pct, + income_quintile + FROM mart_neighbourhood_housing + WHERE year = :year + ORDER BY affordability_index ASC NULLS LAST + """ + return _execute_query(sql, {"year": year}) + + +def get_safety_data(year: int = 2021) -> pd.DataFrame: + """Get safety/crime data for all neighbourhoods. + + Queries mart_neighbourhood_safety for crime statistics. + + Args: + year: Year to query. + + Returns: + DataFrame with columns: neighbourhood_id, neighbourhood_name, + total_crime_rate, violent_crime_rate, property_crime_rate, etc. + """ + sql = """ + SELECT + neighbourhood_id, + neighbourhood_name, + year, + total_crimes, + crime_rate_per_100k as total_crime_rate, + violent_crimes, + violent_crime_rate, + property_crimes, + property_crime_rate, + theft_crimes, + theft_rate, + crime_yoy_change_pct, + crime_trend + FROM mart_neighbourhood_safety + WHERE year = :year + ORDER BY total_crime_rate ASC NULLS LAST + """ + return _execute_query(sql, {"year": year}) + + +def get_demographics_data(year: int = 2021) -> pd.DataFrame: + """Get demographic data for all neighbourhoods. + + Queries mart_neighbourhood_demographics for population/income metrics. + + Args: + year: Census year to query. + + Returns: + DataFrame with columns: neighbourhood_id, neighbourhood_name, + population, median_age, median_household_income, diversity_index, etc.
+ """ + sql = """ + SELECT + neighbourhood_id, + neighbourhood_name, + census_year as year, + population, + population_density, + population_change_pct, + median_household_income, + average_household_income, + income_quintile, + median_age, + pct_under_18, + pct_18_to_64, + pct_65_plus, + pct_bachelors_or_higher, + unemployment_rate, + diversity_index + FROM mart_neighbourhood_demographics + WHERE census_year = :year + ORDER BY population DESC NULLS LAST + """ + return _execute_query(sql, {"year": year}) + + +def get_amenities_data(year: int = 2021) -> pd.DataFrame: + """Get amenities data for all neighbourhoods. + + Queries mart_neighbourhood_amenities for parks, schools, and childcare. + + Args: + year: Year to query. + + Returns: + DataFrame with columns: neighbourhood_id, neighbourhood_name, + amenity_score, parks_per_1000, schools_per_1000, childcare_per_1000, etc.
+ """ + sql = """ + SELECT + o.neighbourhood_id, + o.neighbourhood_name, + o.year, + o.population, + o.median_household_income, + o.livability_score, + o.safety_score, + o.affordability_score, + o.amenity_score, + s.total_crimes, + s.crime_rate_per_100k, + s.violent_crime_rate, + s.property_crime_rate, + h.avg_rent_2bed, + h.vacancy_rate, + h.rent_to_income_pct, + h.affordability_index, + h.pct_owner_occupied, + h.pct_renter_occupied, + d.median_age, + d.diversity_index, + d.unemployment_rate, + d.pct_bachelors_or_higher, + a.park_count, + a.school_count, + a.total_amenities + FROM mart_neighbourhood_overview o + LEFT JOIN mart_neighbourhood_safety s + ON o.neighbourhood_id = s.neighbourhood_id + AND o.year = s.year + LEFT JOIN mart_neighbourhood_housing h + ON o.neighbourhood_id = h.neighbourhood_id + AND o.year = h.year + LEFT JOIN mart_neighbourhood_demographics d + ON o.neighbourhood_id = d.neighbourhood_id + AND o.year = d.census_year + LEFT JOIN mart_neighbourhood_amenities a + ON o.neighbourhood_id = a.neighbourhood_id + AND o.year = a.year + WHERE o.neighbourhood_id = :neighbourhood_id + AND o.year = :year + """ + df = _execute_query(sql, {"neighbourhood_id": neighbourhood_id, "year": year}) + + if df.empty: + return {} + + return {str(k): v for k, v in df.iloc[0].to_dict().items()} + + +@lru_cache(maxsize=32) +def get_neighbourhood_list(year: int = 2021) -> list[dict[str, Any]]: + """Get list of all neighbourhoods for dropdown selectors. + + Args: + year: Year to query. + + Returns: + List of dicts with neighbourhood_id, name, and population. 
+ """ + sql = """ + SELECT DISTINCT + neighbourhood_id, + neighbourhood_name, + population + FROM mart_neighbourhood_overview + WHERE year = :year + ORDER BY neighbourhood_name + """ + df = _execute_query(sql, {"year": year}) + if df.empty: + return [] + return list(df.to_dict("records")) # type: ignore[arg-type] + + +def get_rankings( + metric: str, + year: int = 2021, + top_n: int = 10, + ascending: bool = True, +) -> pd.DataFrame: + """Get top/bottom neighbourhoods for a specific metric. + + Args: + metric: Column name to rank by. + year: Year to query. + top_n: Number of top and bottom records. + ascending: If True, rank from lowest to highest (good for crime, rent). + + Returns: + DataFrame with top and bottom neighbourhoods. + """ + # Map metrics to their source tables + table_map = { + "livability_score": "mart_neighbourhood_overview", + "safety_score": "mart_neighbourhood_overview", + "affordability_score": "mart_neighbourhood_overview", + "amenity_score": "mart_neighbourhood_overview", + "crime_rate_per_100k": "mart_neighbourhood_safety", + "total_crime_rate": "mart_neighbourhood_safety", + "avg_rent_2bed": "mart_neighbourhood_housing", + "affordability_index": "mart_neighbourhood_housing", + "population": "mart_neighbourhood_demographics", + "median_household_income": "mart_neighbourhood_demographics", + } + + # metric is interpolated into the SQL below, so reject anything that is + # not a plain identifier before building the query + if not metric.isidentifier(): + raise ValueError(f"Invalid metric name: {metric!r}") + + table = table_map.get(metric, "mart_neighbourhood_overview") + year_col = "census_year" if "demographics" in table else "year" + + order = "ASC" if ascending else "DESC" + reverse_order = "DESC" if ascending else "ASC" + + sql = f""" + ( + SELECT neighbourhood_id, neighbourhood_name, {metric}, 'bottom' as rank_group + FROM {table} + WHERE {year_col} = :year AND {metric} IS NOT NULL + ORDER BY {metric} {order} + LIMIT :top_n + ) + UNION ALL + ( + SELECT neighbourhood_id, neighbourhood_name, {metric}, 'top' as rank_group + FROM {table} + WHERE {year_col} = :year AND {metric} IS NOT NULL + ORDER BY {metric} {reverse_order} + LIMIT :top_n + ) + """ +
return _execute_query(sql, {"year": year, "top_n": top_n}) + + +def get_city_averages(year: int = 2021) -> dict[str, Any]: + """Get city-wide average metrics. + + Args: + year: Year to query. + + Returns: + Dictionary with city averages for key metrics. + """ + sql = """ + SELECT + AVG(livability_score) as avg_livability_score, + AVG(safety_score) as avg_safety_score, + AVG(affordability_score) as avg_affordability_score, + AVG(amenity_score) as avg_amenity_score, + SUM(population) as total_population, + AVG(median_household_income) as avg_median_income, + AVG(crime_rate_per_100k) as avg_crime_rate, + AVG(avg_rent_2bed) as avg_rent_2bed, + AVG(rent_to_income_pct) as avg_rent_to_income + FROM mart_neighbourhood_overview + WHERE year = :year + """ + df = _execute_query(sql, {"year": year}) + + if df.empty: + return {} + + result: dict[str, Any] = {str(k): v for k, v in df.iloc[0].to_dict().items()} + # Round numeric values + for key, value in result.items(): + if pd.notna(value) and isinstance(value, float): + result[key] = round(value, 2) + + return result diff --git a/scripts/data/__init__.py b/scripts/data/__init__.py new file mode 100644 index 0000000..ef9b2c8 --- /dev/null +++ b/scripts/data/__init__.py @@ -0,0 +1 @@ +"""Data loading scripts for the portfolio app.""" diff --git a/scripts/data/load_toronto_data.py b/scripts/data/load_toronto_data.py new file mode 100644 index 0000000..63d5353 --- /dev/null +++ b/scripts/data/load_toronto_data.py @@ -0,0 +1,367 @@ +#!/usr/bin/env python3 +"""Load Toronto neighbourhood data into the database. + +Usage: + python scripts/data/load_toronto_data.py [OPTIONS] + +Options: + --skip-fetch Skip API fetching, only run dbt + --skip-dbt Skip dbt run, only load data + --dry-run Show what would be done without executing + -v, --verbose Enable verbose logging + +This script orchestrates: +1. Fetching data from Toronto Open Data and CMHC APIs +2. Loading data into PostgreSQL fact tables +3. 
Running dbt to transform staging -> intermediate -> marts + +Exit codes: + 0 = Success + 1 = Error +""" + +import argparse +import logging +import subprocess +import sys +from datetime import date +from pathlib import Path +from typing import Any + +# Add project root to path +PROJECT_ROOT = Path(__file__).parent.parent.parent +sys.path.insert(0, str(PROJECT_ROOT)) + +from portfolio_app.toronto.loaders import ( # noqa: E402 + get_session, + load_amenities, + load_census_data, + load_crime_data, + load_neighbourhoods, + load_time_dimension, +) +from portfolio_app.toronto.parsers import ( # noqa: E402 + TorontoOpenDataParser, + TorontoPoliceParser, +) +from portfolio_app.toronto.schemas import Neighbourhood # noqa: E402 + +# Configure logging +logging.basicConfig( + level=logging.INFO, + format="%(asctime)s - %(levelname)s - %(message)s", + datefmt="%H:%M:%S", +) +logger = logging.getLogger(__name__) + + +class DataPipeline: + """Orchestrates data loading from APIs to database to dbt.""" + + def __init__(self, dry_run: bool = False, verbose: bool = False): + self.dry_run = dry_run + self.verbose = verbose + self.stats: dict[str, int] = {} + + if verbose: + logging.getLogger().setLevel(logging.DEBUG) + + def fetch_and_load(self) -> bool: + """Fetch data from APIs and load into database. + + Returns: + True if successful, False otherwise. + """ + logger.info("Starting data fetch and load pipeline...") + + try: + with get_session() as session: + # 1. Load time dimension first (for date keys) + self._load_time_dimension(session) + + # 2. Load neighbourhoods (required for foreign keys) + self._load_neighbourhoods(session) + + # 3. Load census data + self._load_census(session) + + # 4. Load crime data + self._load_crime(session) + + # 5. 
Load amenities + self._load_amenities(session) + + session.commit() + logger.info("All data committed to database") + + self._print_stats() + return True + + except Exception as e: + logger.error(f"Pipeline failed: {e}") + if self.verbose: + import traceback + + traceback.print_exc() + return False + + def _load_time_dimension(self, session: Any) -> None: + """Load time dimension with date range for dashboard.""" + logger.info("Loading time dimension...") + + if self.dry_run: + logger.info( + " [DRY RUN] Would load time dimension 2019-01-01 to 2025-12-01" + ) + return + + count = load_time_dimension( + start_date=date(2019, 1, 1), + end_date=date(2025, 12, 1), + session=session, + ) + self.stats["time_dimension"] = count + logger.info(f" Loaded {count} time dimension records") + + def _load_neighbourhoods(self, session: Any) -> None: + """Fetch and load neighbourhood boundaries.""" + logger.info("Fetching neighbourhoods from Toronto Open Data...") + + if self.dry_run: + logger.info(" [DRY RUN] Would fetch and load neighbourhoods") + return + + import json + + parser = TorontoOpenDataParser() + raw_neighbourhoods = parser.get_neighbourhoods() + + # Convert NeighbourhoodRecord to Neighbourhood schema + neighbourhoods = [] + for n in raw_neighbourhoods: + # Serialize GeoJSON geometry dict to a JSON string if present + geometry_wkt = None + if n.geometry: + # Store as GeoJSON string for PostGIS ST_GeomFromGeoJSON + geometry_wkt = json.dumps(n.geometry) + + neighbourhood = Neighbourhood( + neighbourhood_id=n.area_id, + name=n.area_name, + geometry_wkt=geometry_wkt, + population=None, # Will be filled from census data + land_area_sqkm=None, + pop_density_per_sqkm=None, + census_year=2021, + ) + neighbourhoods.append(neighbourhood) + + count = load_neighbourhoods(neighbourhoods, session) + self.stats["neighbourhoods"] = count + logger.info(f" Loaded {count} neighbourhoods") + + def _load_census(self, session: Any) -> None: + """Fetch and load census profile data.""" +
logger.info("Fetching census profiles from Toronto Open Data...") + + if self.dry_run: + logger.info(" [DRY RUN] Would fetch and load census data") + return + + parser = TorontoOpenDataParser() + census_records = parser.get_census_profiles(year=2021) + + if not census_records: + logger.warning(" No census records fetched") + return + + count = load_census_data(census_records, session) + self.stats["census"] = count + logger.info(f" Loaded {count} census records") + + def _load_crime(self, session: Any) -> None: + """Fetch and load crime statistics.""" + logger.info("Fetching crime data from Toronto Police Service...") + + if self.dry_run: + logger.info(" [DRY RUN] Would fetch and load crime data") + return + + parser = TorontoPoliceParser() + crime_records = parser.get_crime_rates() + + if not crime_records: + logger.warning(" No crime records fetched") + return + + count = load_crime_data(crime_records, session) + self.stats["crime"] = count + logger.info(f" Loaded {count} crime records") + + def _load_amenities(self, session: Any) -> None: + """Fetch and load amenity data (parks, schools, childcare).""" + logger.info("Fetching amenities from Toronto Open Data...") + + if self.dry_run: + logger.info(" [DRY RUN] Would fetch and load amenity data") + return + + parser = TorontoOpenDataParser() + total_count = 0 + + # Fetch parks + try: + parks = parser.get_parks() + if parks: + count = load_amenities(parks, year=2024, session=session) + total_count += count + logger.info(f" Loaded {count} park amenities") + except Exception as e: + logger.warning(f" Failed to load parks: {e}") + + # Fetch schools + try: + schools = parser.get_schools() + if schools: + count = load_amenities(schools, year=2024, session=session) + total_count += count + logger.info(f" Loaded {count} school amenities") + except Exception as e: + logger.warning(f" Failed to load schools: {e}") + + # Fetch childcare centres + try: + childcare = parser.get_childcare_centres() + if childcare: + count = 
load_amenities(childcare, year=2024, session=session) + total_count += count + logger.info(f" Loaded {count} childcare amenities") + except Exception as e: + logger.warning(f" Failed to load childcare: {e}") + + self.stats["amenities"] = total_count + + def run_dbt(self) -> bool: + """Run dbt to transform data. + + Returns: + True if successful, False otherwise. + """ + logger.info("Running dbt transformations...") + + dbt_project_dir = PROJECT_ROOT / "dbt" + + if not dbt_project_dir.exists(): + logger.error(f"dbt project directory not found: {dbt_project_dir}") + return False + + if self.dry_run: + logger.info(" [DRY RUN] Would run: dbt run") + logger.info(" [DRY RUN] Would run: dbt test") + return True + + try: + # Run dbt models + logger.info(" Running dbt run...") + result = subprocess.run( + ["dbt", "run"], + cwd=dbt_project_dir, + capture_output=True, + text=True, + ) + + if result.returncode != 0: + logger.error(f"dbt run failed:\n{result.stderr}") + if self.verbose: + logger.debug(f"dbt output:\n{result.stdout}") + return False + + logger.info(" dbt run completed successfully") + + # Run dbt tests + logger.info(" Running dbt test...") + result = subprocess.run( + ["dbt", "test"], + cwd=dbt_project_dir, + capture_output=True, + text=True, + ) + + if result.returncode != 0: + logger.warning(f"dbt test had failures:\n{result.stderr}") + # Don't fail on test failures, just warn + else: + logger.info(" dbt test completed successfully") + + return True + + except FileNotFoundError: + logger.error( + "dbt not found in PATH. 
Install with: pip install dbt-postgres" + ) + return False + except Exception as e: + logger.error(f"dbt execution failed: {e}") + return False + + def _print_stats(self) -> None: + """Print loading statistics.""" + if not self.stats: + return + + logger.info("Loading statistics:") + for key, count in self.stats.items(): + logger.info(f" {key}: {count} records") + + +def main() -> int: + """Main entry point for the data loading script.""" + parser = argparse.ArgumentParser( + description="Load Toronto neighbourhood data into the database", + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog=__doc__, + ) + + parser.add_argument( + "--skip-fetch", + action="store_true", + help="Skip API fetching, only run dbt", + ) + parser.add_argument( + "--skip-dbt", + action="store_true", + help="Skip dbt run, only load data", + ) + parser.add_argument( + "--dry-run", + action="store_true", + help="Show what would be done without executing", + ) + parser.add_argument( + "-v", + "--verbose", + action="store_true", + help="Enable verbose logging", + ) + + args = parser.parse_args() + + if args.skip_fetch and args.skip_dbt: + logger.error("Cannot skip both fetch and dbt - nothing to do") + return 1 + + pipeline = DataPipeline(dry_run=args.dry_run, verbose=args.verbose) + + # Execute pipeline stages + if not args.skip_fetch and not pipeline.fetch_and_load(): + return 1 + + if not args.skip_dbt and not pipeline.run_dbt(): + return 1 + + logger.info("Pipeline completed successfully!") + return 0 + + +if __name__ == "__main__": + sys.exit(main())