diff --git a/Change-V04.0.0%3A-Proposal.md b/unnamed.md similarity index 72% rename from Change-V04.0.0%3A-Proposal.md rename to unnamed.md index adec60a..30d8231 100644 --- a/Change-V04.0.0%3A-Proposal.md +++ b/unnamed.md @@ -1,655 +1,666 @@ -# MCP Data Platform — Architecture Reference - -*Plugin taxonomy, server responsibilities, and interaction patterns for Leo data marketplace* - ------ - -## Overview - -Two plugins serving distinct domains, designed for independent or combined use. - -|Plugin |Servers |Domain | -|-----------------|---------------------------------|-----------------------------------------| -|**data-platform**|pandas-mcp, postgres-mcp, dbt-mcp|Ingestion, storage, transformation | -|**viz-platform** |dmc-mcp, dash-mcp |Component validation, dashboards, theming| - -**Key principles:** - -- MCP servers are independent processes—they do not import each other -- Claude orchestrates cross-server data flow at runtime -- Plugins ship multiple servers; projects load only what they need -- Claude.md defines project-specific workflows spanning plugins - ------ - -## Component Definitions - -|Component Type|Definition |Runtime Context | -|--------------|--------------------------------------------------------------------------------------------------------------------|----------------------------------------------------| -|**MCP Server**|Standalone service exposing tools via Model Context Protocol. One server = one domain responsibility. |Long-running process, spawned by Claude Desktop/Code| -|**Tool** |Single callable function within an MCP server. Atomic operation with defined input schema and output. |Invoked per-request by LLM | -|**Resource** |Read-only data exposed by MCP server (files, schemas, configs). Discoverable but not executable. |Static or cached | -|**Agent** |Orchestration layer that chains multiple tool calls across servers. 
Lives in Claude reasoning, not in MCP servers.|LLM-driven, multi-step | -|**Command** |User-facing shortcut (e.g., `/ingest`) that triggers predefined tool sequences. |Chat interface trigger | - ------ - -## Plugin: data-platform - -### Server Loading - -Single plugin ships all three servers. Which servers load is determined by project config—not environment variables. - -|Server |Default|Optional| -|------------|-------|--------| -|pandas-mcp |✓ |— | -|postgres-mcp|✓ |— | -|dbt-mcp |— |✓ | - -**Example project configs:** - -```yaml -# Web app project (no dbt) -mcp_servers: - - pandas-mcp - - postgres-mcp -``` - -```yaml -# Data engineering project (full stack) -mcp_servers: - - pandas-mcp - - postgres-mcp - - dbt-mcp -``` - -Agents check server availability at runtime. If dbt-mcp is not loaded, dbt-related steps are skipped or surface “not available for this project.” - ------ - -### Server: pandas-mcp (Data Shaping Layer) - -**Responsibility:** File ingestion, data profiling, schema inference, and utility shaping operations. - -**Philosophy:** SQL-first for persistent transforms (use dbt). 
Pandas for: - -- Pre-database ingestion (profiling, validation, schema inference) -- Visualization prep (reshaping query results for chart formats) -- Ad-hoc operations (prototyping, merging with local files) - -#### Tool Categories - -|Category |Tools |Description | -|---------|-----------------------------------------------------------------|-----------------------------------| -|Ingestion|`read_file`, `write_file`, `detect_encoding` |File I/O with format auto-detection| -|Profiling|`profile`, `validate`, `sample` |Data quality assessment | -|Schema |`infer_schema` |Generate DDL from data structure | -|Shaping |`reshape`, `pivot`, `melt`, `merge`, `add_columns`, `filter_rows`|Transform any data reference | - -#### Data Reference Sources - -pandas-mcp accepts `data_ref` from multiple origins: - -|Source |How It Arrives | -|------------------|-------------------------| -|Local file |`read_file` tool | -|Query result |Passed from postgres-mcp | -|dbt model output |Passed from dbt-mcp | -|Previous transform|Chained from shaping tool| - -#### When to Use Shaping Tools - -|Scenario |Use pandas-mcp|Use SQL/dbt| -|--------------------------------------|--------------|-----------| -|Pivot for heatmap chart |✓ |— | -|Join query result with local CSV |✓ |— | -|Prototype transform before formalizing|✓ |— | -|Persistent aggregation in pipeline |— |✓ | -|Reusable business logic |— |✓ | -|Needs version control + testing |— |✓ | - ------ - -### Server: postgres-mcp (Database Layer) - -**Responsibility:** Data loading, querying, schema management, performance analysis, and geospatial operations. 
- -#### Tool Categories - -|Category|Tools |Description | -|--------|------------------------------------------------------------------------------------|--------------------------------------| -|Query |`list_schemas`, `list_tables`, `get_table_schema`, `execute_query`, `query_geometry`|Read operations | -|Analysis|`explain_query`, `recommend_indexes`, `health_check` |Performance insights | -|Write |`execute_write`, `load_dataframe` |Data modification | -|DDL |`execute_ddl`, `get_schema_snapshot` |Schema management with change tracking| - -#### DDL Change Tracking - -`execute_ddl` returns structured output for downstream automation: - -```json -{ - "success": true, - "operation": "CREATE TABLE", - "affected_objects": [ - { - "type": "table", - "schema": "public", - "name": "customer_orders", - "change": "created" - } - ], - "timestamp": "2025-01-22T14:30:00Z" -} -``` - -This enables documentation updates, ERD regeneration (via Mermaid Chart MCP), or other automated responses. - ------ - -### Server: dbt-mcp (Transform Layer) - -**Responsibility:** Model execution, lineage, documentation, and YAML generation for local dbt-core projects. - -**Note:** Official dbt-mcp is Cloud-only. This server wraps local dbt-core CLI. - -#### Tool Categories - -|Category |Tools |Description | -|-------------|-----------------------------------------------|------------------------| -|Discovery |`parse_manifest`, `list_models`, `list_sources`|Project exploration | -|Model |`get_model`, `get_lineage`, `compile_sql` |Model inspection | -|Execution |`run_model`, `test_model`, `get_run_results` |dbt CLI wrapper | -|Documentation|`generate_yaml` |Auto-generate schema.yml| - -#### Lineage Output - -`get_lineage` outputs Mermaid-formatted DAG, compatible with existing Mermaid Chart MCP for rendering. 
- ------ - -### Internal Dependency Flow (data-platform) - -``` -files → pandas-mcp → postgres-mcp ↔ dbt-mcp - ↑______________| - (query results for reshaping) -``` - -|Flow |Description | -|-----------------|------------------------------------| -|files → pandas |Entry point for raw data | -|pandas → postgres|Schema inference, bulk loading | -|postgres ↔ dbt |dbt queries marts, postgres executes| -|postgres → pandas|Query results for reshaping | -|dbt → pandas |Model outputs for visualization prep| - ------ - -### Agents (data-platform) - -|Agent |Trigger |Sequence | -|----------------|--------------------------|----------------------------------------------------------------------------| -|`data_ingestion`|User provides file |read_file → profile → infer_schema → execute_ddl → load_dataframe → validate| -|`model_analysis`|User asks about dbt model |get_model → get_lineage → explain_query → test_model → synthesize | -|`full_pipeline` |File to materialized model|data_ingestion → create dbt model → run_model | - -**Behavior when dbt-mcp absent:** - -|Agent |Behavior | -|----------------|-------------------------------------| -|`data_ingestion`|Runs fully (no dbt steps) | -|`model_analysis`|Skipped—surfaces “dbt not configured”| -|`full_pipeline` |Stops after load, prompts user | - ------ - -### Commands (data-platform) - -|Command |Maps To | -|--------------------------------|-------------------------------| -|`/ingest {file}` |`data_ingestion` agent | -|`/profile {file}` |`pandas-mcp.profile` | -|`/pivot {data} by {cols}` |`pandas-mcp.pivot` | -|`/merge {left} {right} on {key}`|`pandas-mcp.merge` | -|`/explain {query}` |`postgres-mcp.explain_query` | -|`/schema {table}` |`postgres-mcp.get_table_schema`| -|`/lineage {model}` |`dbt-mcp.get_lineage` | -|`/run {model}` |`dbt-mcp.run_model` | -|`/test {model}` |`dbt-mcp.test_model` | - -dbt commands return graceful “dbt-mcp not loaded” when unavailable. 
- ------ - -## Plugin: viz-platform - -### Servers - -|Server |Responsibility | -|--------|---------------------------------------------------------| -|dmc-mcp |Version-locked component registry, prop validation | -|dash-mcp|Charts, layouts, pages, theming—validates against dmc-mcp| - ------ - -### Server: dmc-mcp (Component Constraint Layer) - -**Responsibility:** Single source of truth for Dash Mantine Components API. Prevents Claude from hallucinating deprecated props or non-existent components. - -**Problem solved:** DMC versions introduce breaking changes. Claude training data mixes versions. Runtime errors from invalid props waste cycles. - -#### Tool Categories - -|Category |Tools |Description | -|-------------|---------------------|-------------------------------------| -|Discovery |`list_components` |What exists in installed version | -|Introspection|`get_component_props`|Valid props, types, defaults | -|Validation |`validate_component` |Check component definition before use| - -#### Usage Pattern - -Claude queries dmc-mcp first: - -1. “What props does `dmc.Select` accept?” → `get_component_props` -1. Build component with valid props -1. Pass to dash-mcp for rendering - -dash-mcp validates against dmc-mcp before rendering. Invalid components fail fast with actionable errors. - ------ - -### Server: dash-mcp (Visualization Layer) - -**Responsibility:** Chart generation, dashboard layouts, page structure, theming system, and export. - -**Philosophy:** Single server, multiple concerns. Tools are namespaced but share context (theme tokens flow to charts automatically). 
- -#### Tool Categories - -|Category |Tools |Description | -|----------|--------------------------------------------------------------------|-------------------------------| -|`chart_*` |`chart_create`, `chart_configure_interaction` |Data visualization (Plotly) | -|`layout_*`|`layout_create`, `layout_add_filter`, `layout_set_grid` |Dashboard composition | -|`page_*` |`page_create`, `page_add_navbar`, `page_set_auth` |App-level structure | -|`theme_*` |`theme_create`, `theme_extend`, `theme_validate`, `theme_export_css`|Design tokens, component styles| - -#### Design Token Structure - -Themes are built from design tokens—single source of truth for visual consistency: - -```yaml -tokens: - colors: - primary: "#228be6" - secondary: "#868e96" - background: - base: "#ffffff" - subtle: "#f8f9fa" - text: - primary: "#212529" - muted: "#868e96" - - spacing: - xs: "4px" - sm: "8px" - md: "16px" - lg: "24px" - - typography: - fontFamily: "Inter, sans-serif" - fontSize: - sm: "14px" - md: "16px" - - radii: - sm: "4px" - md: "8px" -``` - -#### Component Style Registry - -Per-component overrides ensuring consistency: - -|Component |Registered Style |Purpose | -|--------------|------------------------------|-------------------------------| -|`kpi_card` |Shadow, padding, border-radius|All KPIs look identical | -|`data_table` |Header bg, row hover, border |Tables share appearance | -|`filter_panel`|Background, spacing, alignment|Filters positioned consistently| -|`chart_card` |Title typography, padding |Chart containers unified | - ------ - -### Internal Dependency Flow (viz-platform) - -``` -dmc-mcp ← dash-mcp - ↑ | - └──────────┘ - (validation before render) -``` - -dash-mcp always validates component definitions against dmc-mcp. No direct data dependency—data comes from external sources. 
- ------ - -### Agents (viz-platform) - -|Agent |Trigger |Sequence | -|-----------------|----------------------------------|----------------------------------------------------------------------| -|`theme_setup` |New project or brand consistency |list_themes → create_theme → register_component_style → validate_theme| -|`layout_builder` |User wants dashboard structure |create_layout → add_filter → apply_theme → preview | -|`component_check`|Before rendering any DMC component|get_component_props → validate_component → proceed or error | - ------ - -### Commands (viz-platform) - -|Command |Maps To | -|-----------------------|--------------------------------------------| -|`/chart {type}` |`dash-mcp.chart_create` (expects data input)| -|`/dashboard {template}`|`layout_builder` agent | -|`/theme {name}` |`dash-mcp.theme_apply` | -|`/theme new {name}` |`dash-mcp.theme_create` | -|`/theme css {name}` |`dash-mcp.theme_export_css` | -|`/component {name}` |`dmc-mcp.get_component_props` | - ------ - -## Cross-Plugin Interactions - -### How It Works - -MCP servers do not call each other. Claude orchestrates: - -1. Server A returns output to Claude -1. Claude interprets and determines next step -1. 
Claude passes relevant data to Server B - -### Documentation Layers - -|Layer |Location |Purpose | -|------------------|-----------------------|------------------------------------| -|Plugin docs |Each plugin’s README.md|Declares inputs/outputs | -|Claude.md |Project root |Cross-plugin agents for this project| -|contract-validator|Separate plugin |Validates compatibility | -|doc-guardian |Separate plugin |Catches drift within each project | - -### Interface Contracts - -Each plugin declares what it produces and accepts: - -**data-platform outputs:** - -- `data_ref`: In-memory DataFrame reference -- `query_result`: Row set from postgres-mcp -- `model_output`: Materialized table reference from dbt-mcp -- `schema_snapshot`: Full schema state for documentation - -**viz-platform inputs:** - -- Accepts `data_ref`, `query_result`, or `model_output` as data source -- Validates all DMC components against dmc-mcp before rendering - -### Cross-Plugin Agents (defined in Claude.md) - -|Agent |Trigger |Sequence | -|--------------------|-----------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------| -|`dashboard_builder` |User requests visualization of database content|postgres-mcp.execute_query → pandas-mcp.pivot (if needed) → dmc-mcp.validate → dash-mcp.chart_create → dash-mcp.layout_create| -|`visualization_prep`|Query result needs reshaping |postgres-mcp.execute_query → pandas-mcp.reshape → dash-mcp.chart_create | - -### Validation: contract-validator - -Separate plugin for cross-plugin validation. See **Plugin: contract-validator** section for full specification. - -**Key distinction from doc-guardian:** - -- doc-guardian: “did code change break docs?” (within a project) -- contract-validator: “do plugins work together?” (across plugins) - ------ - -## Plugin: contract-validator - -### Purpose - -Validates cross-plugin compatibility and Claude.md agent definitions. 
Ensures plugins can actually work together before runtime failures occur. - -**Problem solved:** Plugins declare interfaces in README. Claude.md references tools across plugins. Without validation: - -- Agents reference tools that don’t exist -- viz-platform expects input format data-platform doesn’t produce -- Plugin updates break workflows silently - ------ - -### What It Reads - -|Source |Purpose | -|------------------|----------------------------------------------------| -|Plugin README.md |Extract declared inputs/outputs | -|Claude.md |Extract agent definitions and tool references | -|MCP server schemas|Verify tools actually exist with expected signatures| - ------ - -### Tool Categories - -|Category|Tools |Description | -|--------|---------------------------------------------------------------------|---------------------------------| -|Parse |`parse_plugin_interface`, `parse_claude_md_agents` |Extract structured data from docs| -|Validate|`validate_compatibility`, `validate_agent_refs`, `validate_data_flow`|Check contracts match | -|Report |`generate_compatibility_report`, `list_issues` |Output findings | - -#### Tool Details - -**`parse_plugin_interface`** - -- Input: Plugin path or README content -- Output: Structured interface (inputs accepted, outputs produced, tool names) - -**`parse_claude_md_agents`** - -- Input: Claude.md path or content -- Output: List of agents with their tool sequences - -**`validate_compatibility`** - -- Input: Two plugin interfaces -- Output: Compatibility report (what A produces that B accepts, gaps) - -**`validate_agent_refs`** - -- Input: Agent definition, list of available plugins -- Output: Missing tools, invalid sequences - -**`validate_data_flow`** - -- Input: Agent sequence -- Output: Verification that each step output matches next step expected input - ------ - -### Agents (contract-validator) - -|Agent |Trigger |Sequence | 
-|-----------------|-------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------| -|`full_validation`|User runs `/validate-contracts`|parse all plugin interfaces → parse Claude.md → validate_compatibility for each pair → validate_agent_refs for each agent → generate_compatibility_report| -|`agent_check` |User runs `/check-agent {name}`|parse_claude_md_agents → find agent → validate_agent_refs → validate_data_flow → report issues | - ------ - -### Commands - -|Command |Maps To |Description | -|---------------------|----------------------------------------|--------------------------------------| -|`/validate-contracts`|`full_validation` agent |Full project validation | -|`/check-agent {name}`|`agent_check` agent |Validate single agent definition | -|`/list-interfaces` |`parse_plugin_interface` for all plugins|Show what each plugin produces/accepts| - ------ - -### Output Format - -**Compatibility Report:** - -``` -## Contract Validation Report - -### Plugin Interfaces -- data-platform: produces [data_ref, query_result, model_output, schema_snapshot] -- viz-platform: accepts [data_ref, query_result, model_output] - -### Compatibility Matrix -| Producer | Consumer | Status | -|----------|----------|--------| -| data-platform → viz-platform | ✓ Compatible | All outputs accepted | - -### Agent Validation -| Agent | Status | Issues | -|-------|--------|--------| -| dashboard_builder | ✓ Valid | — | -| model_analysis | ⚠ Warning | dbt-mcp optional; agent fails if not loaded | - -### Issues Found -- None - -### Warnings -- Agent `model_analysis` depends on optional server `dbt-mcp` -``` - -**Issue Types:** - -|Type |Severity|Example | -|-------------------|--------|------------------------------------------------------------------------| -|Missing tool |Error |Agent references `pandas-mcp.transform` but tool is `pandas-mcp.reshape`| -|Interface 
mismatch |Error |viz-platform expects `chart_data` but data-platform produces `data_ref` | -|Optional dependency|Warning |Agent uses dbt-mcp which may not be loaded | -|Undeclared output |Warning |Plugin produces output not listed in README | - ------ - -### Integration with doc-guardian - -**Separation of concerns:** - -|Plugin |Scope |Trigger | -|------------------|----------------------------------|------------------------| -|doc-guardian |Code ↔ docs drift within a project|PostToolUse (Write/Edit)| -|contract-validator|Plugin ↔ plugin compatibility |On-demand or CI hook | - -contract-validator does NOT watch for file changes. It runs on-demand or as CI step. - -**Potential future integration:** doc-guardian could trigger contract-validator when Claude.md or plugin README changes. Not required for v1. - ------ - -## Diagramming Approach - -No diagram-mcp server. Use existing Mermaid Chart MCP. - -**For ERDs:** - -- postgres-mcp exposes schema metadata via `get_schema_snapshot` -- Claude generates Mermaid syntax -- Mermaid Chart MCP renders - -**For dbt lineage:** - -- dbt-mcp.get_lineage outputs Mermaid-formatted DAG -- Mermaid Chart MCP renders - -This avoids the complexity of draw.io XML generation while maintaining documentation capability. 
- ------ - -## Implementation Order - -|Phase|Plugin |Server |Rationale | -|-----|------------------|------------|-----------------------------------------------| -|1 |data-platform |pandas-mcp |Entry point, no dependencies | -|2 |data-platform |postgres-mcp|Load from Phase 1, query capabilities | -|3 |data-platform |dbt-mcp |Transform layer, requires postgres-mcp | -|4 |viz-platform |dmc-mcp |Constraint layer, no dependencies | -|5 |viz-platform |dash-mcp |Visualization, validates against dmc-mcp | -|6 |contract-validator|— |Validates all above, requires stable interfaces| - -**Notes:** - -- Phases 1-3 (data-platform) and 4-5 (viz-platform) can proceed in parallel -- contract-validator (Phase 6) should wait until plugin interfaces stabilize -- doc-guardian already exists; update scope documentation only - ------ - -## Open Questions - -### Data Reference Passing - -How do servers share `data_ref` objects? Options: - -- **Temporary files with URIs**: Portable but I/O overhead -- **Arrow IPC**: Efficient but requires both servers to support -- **Recommendation**: Arrow IPC for efficiency, file fallback for compatibility - -### Authentication - -Should postgres-mcp handle connection strings directly, or use a secrets manager pattern? - -### Theme Storage - -Where do custom themes persist? - -- Local config file (`~/.dash-mcp/themes/`) -- Project-level (alongside dbt_project.yml) -- Database table (for shared team themes) - -### dbt Project Discovery - -Auto-detect `dbt_project.yml` in common locations, or require explicit path? 
- ------ - -## Technology Stack - -|Layer |Technology |Notes | -|---------------|-----------------------|---------------------------| -|MCP Framework |FastMCP |Or manual MCP SDK | -|Python |3.11+ |Type hints, async support | -|Data Processing|pandas |Core DataFrame ops | -|Arrow |pyarrow |Parquet, efficient memory | -|Database |psycopg |Async-ready Postgres driver| -|Geospatial |geoalchemy2 |PostGIS integration | -|dbt |dbt-core |CLI wrapper | -|Visualization |plotly |Figure generation | -|UI Components |dash-mantine-components|Version-locked via dmc-mcp | - ------ - -## Summary - -### Core Plugins - -|Plugin |Servers/Scope |Key Characteristic | -|------------------|---------------------------------------|------------------------------------------------------| -|data-platform |pandas-mcp, postgres-mcp, dbt-mcp |Optional server loading per project | -|viz-platform |dmc-mcp, dash-mcp |dmc-mcp validates before dash-mcp renders | -|contract-validator|Interface parsing, compatibility checks|Validates cross-plugin contracts and agent definitions| - -### Supporting Plugins (Existing) - -|Plugin |Purpose | -|-----------------|------------------------------------| -|doc-guardian |Code-to-docs drift (unchanged scope)| -|Mermaid Chart MCP|Diagram rendering | - -### Interaction Model - -``` -Plugin READMEs → declare inputs/outputs -Claude.md → define cross-plugin agents -contract-validator → validate compatibility -doc-guardian → catch drift within projects -``` - +> **Type:** Change Proposal +> **Version:** V04.0.0 +> **Status:** Implemented +> **Date:** 2026-01-25 + +## Implementations + +- [Change V04.0.0: Proposal (Implementation 1)](Change-V04.0.0:-Proposal-(Implementation-1)) - data-platform plugin + +--- + +# MCP Data Platform — Architecture Reference + +*Plugin taxonomy, server responsibilities, and interaction patterns for Leo data marketplace* + +----- + +## Overview + +Two plugins serving distinct domains, designed for independent or combined use. 
+ +|Plugin |Servers |Domain | +|-----------------|---------------------------------|-----------------------------------------| +|**data-platform**|pandas-mcp, postgres-mcp, dbt-mcp|Ingestion, storage, transformation | +|**viz-platform** |dmc-mcp, dash-mcp |Component validation, dashboards, theming| + +**Key principles:** + +- MCP servers are independent processes—they do not import each other +- Claude orchestrates cross-server data flow at runtime +- Plugins ship multiple servers; projects load only what they need +- Claude.md defines project-specific workflows spanning plugins + +----- + +## Component Definitions + +|Component Type|Definition |Runtime Context | +|--------------|--------------------------------------------------------------------------------------------------------------------|----------------------------------------------------| +|**MCP Server**|Standalone service exposing tools via Model Context Protocol. One server = one domain responsibility. |Long-running process, spawned by Claude Desktop/Code| +|**Tool** |Single callable function within an MCP server. Atomic operation with defined input schema and output. |Invoked per-request by LLM | +|**Resource** |Read-only data exposed by MCP server (files, schemas, configs). Discoverable but not executable. |Static or cached | +|**Agent** |Orchestration layer that chains multiple tool calls across servers. Lives in Claude reasoning, not in MCP servers.|LLM-driven, multi-step | +|**Command** |User-facing shortcut (e.g., `/ingest`) that triggers predefined tool sequences. |Chat interface trigger | + +----- + +## Plugin: data-platform + +### Server Loading + +Single plugin ships all three servers. Which servers load is determined by project config—not environment variables. 
+ +|Server |Default|Optional| +|------------|-------|--------| +|pandas-mcp |✓ |— | +|postgres-mcp|✓ |— | +|dbt-mcp |— |✓ | + +**Example project configs:** + +```yaml +# Web app project (no dbt) +mcp_servers: + - pandas-mcp + - postgres-mcp +``` + +```yaml +# Data engineering project (full stack) +mcp_servers: + - pandas-mcp + - postgres-mcp + - dbt-mcp +``` + +Agents check server availability at runtime. If dbt-mcp is not loaded, dbt-related steps are skipped or surface "not available for this project." + +----- + +### Server: pandas-mcp (Data Shaping Layer) + +**Responsibility:** File ingestion, data profiling, schema inference, and utility shaping operations. + +**Philosophy:** SQL-first for persistent transforms (use dbt). Pandas for: + +- Pre-database ingestion (profiling, validation, schema inference) +- Visualization prep (reshaping query results for chart formats) +- Ad-hoc operations (prototyping, merging with local files) + +#### Tool Categories + +|Category |Tools |Description | +|---------|------------------------------------------------------------------|---------------------------------| +|Ingestion|`read_file`, `write_file`, `detect_encoding` |File I/O with format auto-detection| +|Profiling|`profile`, `validate`, `sample` |Data quality assessment | +|Schema |`infer_schema` |Generate DDL from data structure | +|Shaping |`reshape`, `pivot`, `melt`, `merge`, `add_columns`, `filter_rows` |Transform any data reference | + +#### Data Reference Sources + +pandas-mcp accepts `data_ref` from multiple origins: + +|Source |How It Arrives | +|------------------|-------------------------| +|Local file |`read_file` tool | +|Query result |Passed from postgres-mcp | +|dbt model output |Passed from dbt-mcp | +|Previous transform|Chained from shaping tool| + +#### When to Use Shaping Tools + +|Scenario |Use pandas-mcp|Use SQL/dbt| +|--------------------------------------|--------------|-----------| +|Pivot for heatmap chart |✓ |— | +|Join query result with local CSV 
|✓ |— | +|Prototype transform before formalizing|✓ |— | +|Persistent aggregation in pipeline |— |✓ | +|Reusable business logic |— |✓ | +|Needs version control + testing |— |✓ | + +----- + +### Server: postgres-mcp (Database Layer) + +**Responsibility:** Data loading, querying, schema management, performance analysis, and geospatial operations. + +#### Tool Categories + +|Category|Tools |Description | +|--------|---------------------------------------------------------------------------------------------------|-------------------------------------| +|Query |`list_schemas`, `list_tables`, `get_table_schema`, `execute_query`, `query_geometry` |Read operations | +|Analysis|`explain_query`, `recommend_indexes`, `health_check` |Performance insights | +|Write |`execute_write`, `load_dataframe` |Data modification | +|DDL |`execute_ddl`, `get_schema_snapshot` |Schema management with change tracking| + +#### DDL Change Tracking + +`execute_ddl` returns structured output for downstream automation: + +```json +{ + "success": true, + "operation": "CREATE TABLE", + "affected_objects": [ + { + "type": "table", + "schema": "public", + "name": "customer_orders", + "change": "created" + } + ], + "timestamp": "2025-01-22T14:30:00Z" +} +``` + +This enables documentation updates, ERD regeneration (via Mermaid Chart MCP), or other automated responses. + +----- + +### Server: dbt-mcp (Transform Layer) + +**Responsibility:** Model execution, lineage, documentation, and YAML generation for local dbt-core projects. + +**Note:** Official dbt-mcp is Cloud-only. This server wraps local dbt-core CLI. 
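As a rough illustration of what "wraps local dbt-core CLI" could look like, the sketch below builds a `dbt` command line and runs it as a subprocess. The helper names `build_dbt_command` and `run_dbt` and the shape of the returned dictionary are assumptions made for this example; the proposal only names the tools (`run_model`, `test_model`, `compile_sql`) and does not specify how they are implemented. It also assumes the `dbt` executable is on `PATH`.

```python
import shlex
import subprocess

# Hypothetical helpers: the proposal does not define these names or signatures.
ALLOWED_ACTIONS = {"run", "test", "compile"}


def build_dbt_command(action: str, model: str, project_dir: str) -> list[str]:
    """Build the argument list for a dbt-core CLI call scoped to one model."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported dbt action: {action}")
    # --select and --project-dir are standard dbt-core CLI flags.
    return ["dbt", action, "--select", model, "--project-dir", project_dir]


def run_dbt(action: str, model: str, project_dir: str) -> dict:
    """Run dbt as a subprocess and return a small structured result.

    Raises FileNotFoundError if the dbt executable is not installed.
    """
    cmd = build_dbt_command(action, model, project_dir)
    proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return {
        "success": proc.returncode == 0,
        "command": shlex.join(cmd),
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

Passing the arguments as a list rather than a shell string keeps model names from being interpreted by a shell, which matters when the model name originates from an LLM tool call.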
+ +#### Tool Categories + +|Category |Tools |Description | +|-------------|-------------------------------------------------|------------------------| +|Discovery |`parse_manifest`, `list_models`, `list_sources` |Project exploration | +|Model |`get_model`, `get_lineage`, `compile_sql` |Model inspection | +|Execution |`run_model`, `test_model`, `get_run_results` |dbt CLI wrapper | +|Documentation|`generate_yaml` |Auto-generate schema.yml| + +#### Lineage Output + +`get_lineage` outputs Mermaid-formatted DAG, compatible with existing Mermaid Chart MCP for rendering. + +----- + +### Internal Dependency Flow (data-platform) + +``` +files → pandas-mcp → postgres-mcp ↔ dbt-mcp + ↑______________| + (query results for reshaping) +``` + +|Flow |Description | +|-----------------|-------------------------------------| +|files → pandas |Entry point for raw data | +|pandas → postgres|Schema inference, bulk loading | +|postgres ↔ dbt |dbt queries marts, postgres executes | +|postgres → pandas|Query results for reshaping | +|dbt → pandas |Model outputs for visualization prep | + +----- + +### Agents (data-platform) + +|Agent |Trigger |Sequence | +|----------------|--------------------------|----------------------------------------------------------------------------------| +|`data_ingestion`|User provides file |read_file → profile → infer_schema → execute_ddl → load_dataframe → validate | +|`model_analysis`|User asks about dbt model |get_model → get_lineage → explain_query → test_model → synthesize | +|`full_pipeline` |File to materialized model|data_ingestion → create dbt model → run_model | + +**Behavior when dbt-mcp absent:** + +|Agent |Behavior | +|----------------|-----------------------------------| +|`data_ingestion`|Runs fully (no dbt steps) | +|`model_analysis`|Skipped—surfaces "dbt not configured"| +|`full_pipeline` |Stops after load, prompts user | + +----- + +### Commands (data-platform) + +|Command |Maps To | 
+|--------------------------------|-------------------------------| +|`/ingest {file}` |`data_ingestion` agent | +|`/profile {file}` |`pandas-mcp.profile` | +|`/pivot {data} by {cols}` |`pandas-mcp.pivot` | +|`/merge {left} {right} on {key}`|`pandas-mcp.merge` | +|`/explain {query}` |`postgres-mcp.explain_query` | +|`/schema {table}` |`postgres-mcp.get_table_schema`| +|`/lineage {model}` |`dbt-mcp.get_lineage` | +|`/run {model}` |`dbt-mcp.run_model` | +|`/test {model}` |`dbt-mcp.test_model` | + +dbt commands return graceful "dbt-mcp not loaded" when unavailable. + +----- + +## Plugin: viz-platform + +### Servers + +|Server |Responsibility | +|--------|-----------------------------------------------------------| +|dmc-mcp |Version-locked component registry, prop validation | +|dash-mcp|Charts, layouts, pages, theming—validates against dmc-mcp | + +----- + +### Server: dmc-mcp (Component Constraint Layer) + +**Responsibility:** Single source of truth for Dash Mantine Components API. Prevents Claude from hallucinating deprecated props or non-existent components. + +**Problem solved:** DMC versions introduce breaking changes. Claude training data mixes versions. Runtime errors from invalid props waste cycles. + +#### Tool Categories + +|Category |Tools |Description | +|-------------|----------------------|-----------------------------------------| +|Discovery |`list_components` |What exists in installed version | +|Introspection|`get_component_props` |Valid props, types, defaults | +|Validation |`validate_component` |Check component definition before use | + +#### Usage Pattern + +Claude queries dmc-mcp first: + +1. "What props does `dmc.Select` accept?" → `get_component_props` +1. Build component with valid props +1. Pass to dash-mcp for rendering + +dash-mcp validates against dmc-mcp before rendering. Invalid components fail fast with actionable errors. 
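The fail-fast validation described above can be sketched as a lookup against a version-locked registry. Everything here is an assumption for illustration: the `REGISTRY` contents are placeholder prop names rather than the real Dash Mantine Components API for any version, and the `ValidationResult` shape is hypothetical. Only the tool name `validate_component` comes from the proposal.

```python
from dataclasses import dataclass, field

# Placeholder registry: a real server would presumably build this by
# introspecting the installed dash-mantine-components version.
REGISTRY: dict[str, set[str]] = {
    "Select": {"id", "data", "value", "label", "placeholder"},
    "Button": {"id", "children", "variant", "color"},
}


@dataclass
class ValidationResult:
    valid: bool
    errors: list[str] = field(default_factory=list)


def validate_component(name: str, props: dict) -> ValidationResult:
    """Check a component definition against the registry before rendering."""
    known = REGISTRY.get(name)
    if known is None:
        return ValidationResult(False, [f"unknown component: {name}"])
    # Report every unknown prop at once so the caller can fix them in one pass.
    unknown = sorted(set(props) - known)
    errors = [f"{name}: unknown prop '{p}'" for p in unknown]
    return ValidationResult(not errors, errors)
```

Returning all unknown props in a single result, instead of stopping at the first, is one way to make the errors actionable for the calling model.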
+ +----- + +### Server: dash-mcp (Visualization Layer) + +**Responsibility:** Chart generation, dashboard layouts, page structure, theming system, and export. + +**Philosophy:** Single server, multiple concerns. Tools are namespaced but share context (theme tokens flow to charts automatically). + +#### Tool Categories + +|Category |Tools |Description | +|----------|--------------------------------------------------------------------|---------------------------------| +|`chart_*` |`chart_create`, `chart_configure_interaction` |Data visualization (Plotly) | +|`layout_*`|`layout_create`, `layout_add_filter`, `layout_set_grid` |Dashboard composition | +|`page_*` |`page_create`, `page_add_navbar`, `page_set_auth` |App-level structure | +|`theme_*` |`theme_create`, `theme_extend`, `theme_validate`, `theme_export_css`|Design tokens, component styles | + +#### Design Token Structure + +Themes are built from design tokens—single source of truth for visual consistency: + +```yaml +tokens: + colors: + primary: "#228be6" + secondary: "#868e96" + background: + base: "#ffffff" + subtle: "#f8f9fa" + text: + primary: "#212529" + muted: "#868e96" + + spacing: + xs: "4px" + sm: "8px" + md: "16px" + lg: "24px" + + typography: + fontFamily: "Inter, sans-serif" + fontSize: + sm: "14px" + md: "16px" + + radii: + sm: "4px" + md: "8px" +``` + +#### Component Style Registry + +Per-component overrides ensuring consistency: + +|Component |Registered Style |Purpose | +|--------------|------------------------------|-------------------------------| +|`kpi_card` |Shadow, padding, border-radius|All KPIs look identical | +|`data_table` |Header bg, row hover, border |Tables share appearance | +|`filter_panel`|Background, spacing, alignment|Filters positioned consistently| +|`chart_card` |Title typography, padding |Chart containers unified | + +----- + +### Internal Dependency Flow (viz-platform) + +``` +dmc-mcp ← dash-mcp + ↑ | + └──────────┘ + (validation before render) +``` + +dash-mcp always 
validates component definitions against dmc-mcp. No direct data dependency—data comes from external sources. + +----- + +### Agents (viz-platform) + +|Agent |Trigger |Sequence | +|-----------------|-----------------------------------|------------------------------------------------------------------------------------| +|`theme_setup` |New project or brand consistency |list_themes → create_theme → register_component_style → validate_theme | +|`layout_builder` |User wants dashboard structure |create_layout → add_filter → apply_theme → preview | +|`component_check`|Before rendering any DMC component |get_component_props → validate_component → proceed or error | + +----- + +### Commands (viz-platform) + +|Command |Maps To | +|-----------------------|--------------------------------------------| +|`/chart {type}` |`dash-mcp.chart_create` (expects data input)| +|`/dashboard {template}`|`layout_builder` agent | +|`/theme {name}` |`dash-mcp.theme_apply` | +|`/theme new {name}` |`dash-mcp.theme_create` | +|`/theme css {name}` |`dash-mcp.theme_export_css` | +|`/component {name}` |`dmc-mcp.get_component_props` | + +----- + +## Cross-Plugin Interactions + +### How It Works + +MCP servers do not call each other. Claude orchestrates: + +1. Server A returns output to Claude +1. Claude interprets and determines next step +1. 
Claude passes relevant data to Server B + +### Documentation Layers + +|Layer |Location |Purpose | +|------------------|-------------------------|---------------------------------------| +|Plugin docs |Each plugin's README.md |Declares inputs/outputs | +|Claude.md |Project root |Cross-plugin agents for this project | +|contract-validator|Separate plugin |Validates compatibility | +|doc-guardian |Separate plugin |Catches drift within each project | + +### Interface Contracts + +Each plugin declares what it produces and accepts: + +**data-platform outputs:** + +- `data_ref`: In-memory DataFrame reference +- `query_result`: Row set from postgres-mcp +- `model_output`: Materialized table reference from dbt-mcp +- `schema_snapshot`: Full schema state for documentation + +**viz-platform inputs:** + +- Accepts `data_ref`, `query_result`, or `model_output` as data source +- Validates all DMC components against dmc-mcp before rendering + +### Cross-Plugin Agents (defined in Claude.md) + +|Agent |Trigger |Sequence | +|--------------------|---------------------------------------------------|----------------------------------------------------------------------------------------------------------------------| +|`dashboard_builder` |User requests visualization of database content |postgres-mcp.execute_query → pandas-mcp.pivot (if needed) → dmc-mcp.validate_component → dash-mcp.chart_create → dash-mcp.layout_create| +|`visualization_prep`|Query result needs reshaping |postgres-mcp.execute_query → pandas-mcp.reshape → dash-mcp.chart_create | + +### Validation: contract-validator + +Separate plugin for cross-plugin validation. See the **Plugin: contract-validator** section for full specification. + +**Key distinction from doc-guardian:** + +- doc-guardian: "did code change break docs?" (within a project) +- contract-validator: "do plugins work together?" 
(across plugins) + +----- + +## Plugin: contract-validator + +### Purpose + +Validates cross-plugin compatibility and Claude.md agent definitions. Ensures plugins can actually work together before runtime failures occur. + +**Problem solved:** Plugins declare interfaces in README. Claude.md references tools across plugins. Without validation: + +- Agents reference tools that don't exist +- viz-platform expects input format data-platform doesn't produce +- Plugin updates break workflows silently + +----- + +### What It Reads + +|Source |Purpose | +|------------------|-------------------------------------------------| +|Plugin README.md |Extract declared inputs/outputs | +|Claude.md |Extract agent definitions and tool references | +|MCP server schemas|Verify tools actually exist with expected signatures| + +----- + +### Tool Categories + +|Category|Tools |Description | +|--------|---------------------------------------------------------------------|---------------------------------| +|Parse |`parse_plugin_interface`, `parse_claude_md_agents` |Extract structured data from docs| +|Validate|`validate_compatibility`, `validate_agent_refs`, `validate_data_flow`|Check contracts match | +|Report |`generate_compatibility_report`, `list_issues` |Output findings | + +#### Tool Details + +**`parse_plugin_interface`** + +- Input: Plugin path or README content +- Output: Structured interface (inputs accepted, outputs produced, tool names) + +**`parse_claude_md_agents`** + +- Input: Claude.md path or content +- Output: List of agents with their tool sequences + +**`validate_compatibility`** + +- Input: Two plugin interfaces +- Output: Compatibility report (what A produces that B accepts, gaps) + +**`validate_agent_refs`** + +- Input: Agent definition, list of available plugins +- Output: Missing tools, invalid sequences + +**`validate_data_flow`** + +- Input: Agent sequence +- Output: Verification that each step output matches next step expected input + +----- + +### Agents 
(contract-validator) + +|Agent |Trigger |Sequence | +|-----------------|---------------------------------|---------------------------------------------------------------------------------------------------------------------------------------| +|`full_validation`|User runs `/validate-contracts` |parse all plugin interfaces → parse Claude.md → validate_compatibility for each pair → validate_agent_refs for each agent → generate_compatibility_report| +|`agent_check` |User runs `/check-agent {name}` |parse_claude_md_agents → find agent → validate_agent_refs → validate_data_flow → report issues | + +----- + +### Commands (contract-validator) + +|Command |Maps To |Description | +|---------------------|---------------------------------------|-----------------------------------| +|`/validate-contracts`|`full_validation` agent |Full project validation | +|`/check-agent {name}`|`agent_check` agent |Validate single agent definition | +|`/list-interfaces` |`parse_plugin_interface` for all plugins|Show what each plugin produces/accepts| + +----- + +### Output Format + +**Compatibility Report:** + +``` +## Contract Validation Report + +### Plugin Interfaces +- data-platform: produces [data_ref, query_result, model_output, schema_snapshot] +- viz-platform: accepts [data_ref, query_result, model_output] + +### Compatibility Matrix +| Producer | Consumer | Status | Notes | +|----------|----------|--------|-------| +| data-platform | viz-platform | ✓ Compatible | All accepted inputs are produced | + +### Agent Validation +| Agent | Status | Issues | +|-------|--------|--------| +| dashboard_builder | ✓ Valid | - | +| model_analysis | ⚠ Warning | dbt-mcp optional; agent is skipped if not loaded | + +### Issues Found +- None + +### Warnings +- Agent `model_analysis` depends on optional server `dbt-mcp` +``` + +**Issue Types:** + +|Type |Severity|Example | +|-------------------|--------|--------------------------------------------------------------------------------| +|Missing tool |Error |Agent references `pandas-mcp.transform` but tool 
is `pandas-mcp.reshape` | +|Interface mismatch |Error |viz-platform expects `chart_data` but data-platform produces `data_ref` | +|Optional dependency|Warning |Agent uses dbt-mcp which may not be loaded | +|Undeclared output |Warning |Plugin produces output not listed in README | + +----- + +### Integration with doc-guardian + +**Separation of concerns:** + +|Plugin |Scope |Trigger | +|------------------|----------------------------------|----------------------| +|doc-guardian |Code → docs drift within a project|PostToolUse (Write/Edit)| +|contract-validator|Plugin → plugin compatibility |On-demand or CI hook | + +contract-validator does NOT watch for file changes. It runs on-demand or as CI step. + +**Potential future integration:** doc-guardian could trigger contract-validator when Claude.md or plugin README changes. Not required for v1. + +----- + +## Diagramming Approach + +No diagram-mcp server. Use existing Mermaid Chart MCP. + +**For ERDs:** + +- postgres-mcp exposes schema metadata via `get_schema_snapshot` +- Claude generates Mermaid syntax +- Mermaid Chart MCP renders + +**For dbt lineage:** + +- dbt-mcp.get_lineage outputs Mermaid-formatted DAG +- Mermaid Chart MCP renders + +This avoids the complexity of draw.io XML generation while maintaining documentation capability. 
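The ERD path above (schema metadata in, Mermaid syntax out) can be sketched as a small pure function. The shape of the snapshot dictionary is an assumption made for illustration; the document does not specify what `get_schema_snapshot` returns. The sketch also assumes column types are single tokens, since Mermaid attribute types cannot contain spaces.

```python
def snapshot_to_mermaid_erd(snapshot: dict) -> str:
    """Render a schema snapshot as Mermaid erDiagram text.

    Assumed (hypothetical) snapshot shape:
      {"tables": {name: {"columns": [{"name", "type", "primary_key"?}, ...]}},
       "foreign_keys": [{"from_table", "to_table", "column"}, ...]}
    """
    lines = ["erDiagram"]

    # One entity block per table, one attribute line per column.
    for table, spec in snapshot["tables"].items():
        lines.append(f"    {table} {{")
        for col in spec["columns"]:
            key = " PK" if col.get("primary_key") else ""
            lines.append(f"        {col['type']} {col['name']}{key}")
        lines.append("    }")

    # "}o--||" reads: zero or more child rows reference exactly one parent row.
    for fk in snapshot.get("foreign_keys", []):
        label = fk["column"]
        lines.append(f'    {fk["from_table"]} }}o--|| {fk["to_table"]} : "{label}"')

    return "\n".join(lines)
```

The returned string is what would be handed to Mermaid Chart MCP for rendering; no diagram-specific server is involved.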
+ +----- + +## Implementation Order + +|Phase|Plugin |Server |Rationale | +|-----|------------------|------------|------------------------------------------------| +|1 |data-platform |pandas-mcp |Entry point, no dependencies | +|2 |data-platform |postgres-mcp|Load from Phase 1, query capabilities | +|3 |data-platform |dbt-mcp |Transform layer, requires postgres-mcp | +|4 |viz-platform |dmc-mcp |Constraint layer, no dependencies | +|5 |viz-platform |dash-mcp |Visualization, validates against dmc-mcp | +|6 |contract-validator|— |Validates all above, requires stable interfaces | + +**Notes:** + +- Phases 1-3 (data-platform) and 4-5 (viz-platform) can proceed in parallel +- contract-validator (Phase 6) should wait until plugin interfaces stabilize +- doc-guardian already exists; update scope documentation only + +----- + +## Open Questions + +### Data Reference Passing + +How do servers share `data_ref` objects? Options: + +- **Temporary files with URIs**: Portable but I/O overhead +- **Arrow IPC**: Efficient but requires both servers to support +- **Recommendation**: Arrow IPC for efficiency, file fallback for compatibility + +### Authentication + +Should postgres-mcp handle connection strings directly, or use a secrets manager pattern? + +### Theme Storage + +Where do custom themes persist? + +- Local config file (`~/.dash-mcp/themes/`) +- Project-level (alongside dbt_project.yml) +- Database table (for shared team themes) + +### dbt Project Discovery + +Auto-detect `dbt_project.yml` in common locations, or require explicit path? 
+ +----- + +## Technology Stack + +|Layer |Technology |Notes | +|---------------|-----------------------|---------------------------| +|MCP Framework |FastMCP |Or manual MCP SDK | +|Python |3.11+ |Type hints, async support | +|Data Processing|pandas |Core DataFrame ops | +|Arrow |pyarrow |Parquet, efficient memory | +|Database |psycopg |Async-ready Postgres driver| +|Geospatial |geoalchemy2 |PostGIS integration | +|dbt |dbt-core |CLI wrapper | +|Visualization |plotly |Figure generation | +|UI Components |dash-mantine-components|Version-locked via dmc-mcp | + +----- + +## Summary + +### Core Plugins + +|Plugin |Servers/Scope |Key Characteristic | +|------------------|-------------------------------------------|---------------------------------------------------| +|data-platform |pandas-mcp, postgres-mcp, dbt-mcp |Optional server loading per project | +|viz-platform |dmc-mcp, dash-mcp |dmc-mcp validates before dash-mcp renders | +|contract-validator|Interface parsing, compatibility checks |Validates cross-plugin contracts and agent definitions| + +### Supporting Plugins (Existing) + +|Plugin |Purpose | +|-----------------|------------------------------------| +|doc-guardian |Code-to-docs drift (unchanged scope)| +|Mermaid Chart MCP|Diagram rendering | + +### Interaction Model + +``` +Plugin READMEs → declare inputs/outputs +Claude.md → define cross-plugin agents +contract-validator → validate compatibility +doc-guardian → catch drift within projects +``` + **Flow:** Plugins declare interfaces. Claude.md defines workflows. contract-validator enforces compatibility. doc-guardian handles internal drift. \ No newline at end of file
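
As an illustration of the "contract-validator enforces compatibility" step, the sketch below checks an agent's tool sequence against the tools each server declares and emits the "Missing tool" and "Optional dependency" issue types defined earlier. The argument shapes and the issue dictionary format are assumptions for this example, not the plugin's specified interface.

```python
def validate_agent_refs(
    sequence: list[str],
    available: dict[str, set[str]],
    optional: frozenset[str] = frozenset(),
) -> list[dict]:
    """Check each 'server.tool' step of an agent against declared tools.

    available: tools declared by each server (assumed to come from MCP schemas).
    optional:  servers that a project may choose not to load.
    """
    issues = []
    for step in sequence:
        server, _, tool = step.partition(".")
        if tool not in available.get(server, set()):
            # The reference cannot be resolved at all: hard error.
            issues.append({"type": "Missing tool", "severity": "Error", "step": step})
        elif server in optional:
            # The tool exists, but its server may be absent at runtime.
            issues.append({"type": "Optional dependency", "severity": "Warning", "step": step})
    return issues
```

For example, a sequence containing `pandas-mcp.transform` would yield a "Missing tool" error when the server only declares `reshape` and `pivot`, while any step on `dbt-mcp` would yield a warning if that server is listed as optional.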