Add Change V04.0.0: Proposal

2026-01-25 15:31:30 +00:00
parent 6dff0e7a2b
commit 09bc26c471
1 changed files with 655 additions and 0 deletions
--- a/Change-V04.0.0%3A-Proposal.md
+++ b/Change-V04.0.0%3A-Proposal.md
@@ -0,0 +1,655 @@
+# MCP Data Platform — Architecture Reference
+
+*Plugin taxonomy, server responsibilities, and interaction patterns for Leo’s data marketplace*
+
+-----
+
+## Overview
+
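+Two plugins serve distinct domains and can be used independently or together. A third plugin, contract-validator, checks that they are compatible and is specified later in this document.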
+
+|Plugin           |Servers                          |Domain                                   |
+|-----------------|---------------------------------|-----------------------------------------|
+|**data-platform**|pandas-mcp, postgres-mcp, dbt-mcp|Ingestion, storage, transformation       |
+|**viz-platform** |dmc-mcp, dash-mcp                |Component validation, dashboards, theming|
+
+**Key principles:**
+
+- MCP servers are independent processes—they don’t import each other
+- Claude orchestrates cross-server data flow at runtime
+- Plugins ship multiple servers; projects load only what they need
+- Claude.md defines project-specific workflows spanning plugins
+
+-----
+
+## Component Definitions
+
+|Component Type|Definition                                                                                                          |Runtime Context                                     |
+|--------------|--------------------------------------------------------------------------------------------------------------------|----------------------------------------------------|
+|**MCP Server**|Standalone service exposing tools via Model Context Protocol. One server = one domain responsibility.               |Long-running process, spawned by Claude Desktop/Code|
+|**Tool**      |Single callable function within an MCP server. Atomic operation with defined input schema and output.               |Invoked per-request by LLM                          |
+|**Resource**  |Read-only data exposed by MCP server (files, schemas, configs). Discoverable but not executable.                    |Static or cached                                    |
+|**Agent**     |Orchestration layer that chains multiple tool calls across servers. Lives in Claude’s reasoning, not in MCP servers.|LLM-driven, multi-step                              |
+|**Command**   |User-facing shortcut (e.g., `/ingest`) that triggers predefined tool sequences.                                     |Chat interface trigger                              |
+
+-----
+
+## Plugin: data-platform
+
+### Server Loading
+
+A single plugin ships all three servers. Project config, not environment variables, determines which servers load.
+
+|Server      |Default|Optional|
+|------------|-------|--------|
+|pandas-mcp  |✓      |—       |
+|postgres-mcp|✓      |—       |
+|dbt-mcp     |—      |✓       |
+
+**Example project configs:**
+
+```yaml
+# Web app project (no dbt)
+mcp_servers:
+  - pandas-mcp
+  - postgres-mcp
+```
+
+```yaml
+# Data engineering project (full stack)
+mcp_servers:
+  - pandas-mcp
+  - postgres-mcp
+  - dbt-mcp
+```
+
+Agents check server availability at runtime. If dbt-mcp isn’t loaded, dbt-related steps are skipped or surface “not available for this project.”
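
The runtime availability check could be sketched roughly as follows. This is only an illustration: `FULL_PIPELINE`, `plan_agent`, and the `server.tool` step format are assumptions made for this example and are not defined by the proposal.

```python
# Hypothetical sketch: decide which agent steps can run given the servers
# loaded for this project. The step list and its "server.tool" format are
# assumptions for illustration only.

FULL_PIPELINE = [
    "pandas-mcp.read_file",
    "pandas-mcp.profile",
    "pandas-mcp.infer_schema",
    "postgres-mcp.execute_ddl",
    "postgres-mcp.load_dataframe",
    "dbt-mcp.run_model",
]


def plan_agent(steps: list[str], loaded_servers: set[str]) -> tuple[list[str], list[str]]:
    """Split steps into runnable ones and user-facing notices for skipped ones."""
    runnable: list[str] = []
    notices: list[str] = []
    for step in steps:
        # "dbt-mcp.run_model" -> server "dbt-mcp", tool "run_model"
        server, _, tool = step.partition(".")
        if server in loaded_servers:
            runnable.append(step)
        else:
            notices.append(f"{tool}: {server} not available for this project")
    return runnable, notices


if __name__ == "__main__":
    # Mirrors the web app project config above, which omits dbt-mcp.
    runnable, notices = plan_agent(FULL_PIPELINE, {"pandas-mcp", "postgres-mcp"})
    print(runnable)
    print(notices)
```

Returning notices instead of raising keeps the non-dbt steps usable, which matches the "skipped or surfaced" behaviour described above.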
+
+-----
+
+### Server: pandas-mcp (Data Shaping Layer)
+
+**Responsibility:** File ingestion, data profiling, schema inference, and utility shaping operations.
+
+**Philosophy:** SQL-first for persistent transforms (use dbt). Pandas for:
+
+- Pre-database ingestion (profiling, validation, schema inference)
+- Visualization prep (reshaping query results for chart formats)
+- Ad-hoc operations (prototyping, merging with local files)
+
+#### Tool Categories
+
+|Category |Tools                                                            |Description                        |
+|---------|-----------------------------------------------------------------|-----------------------------------|
+|Ingestion|`read_file`, `write_file`, `detect_encoding`                     |File I/O with format auto-detection|
+|Profiling|`profile`, `validate`, `sample`                                  |Data quality assessment            |
+|Schema   |`infer_schema`                                                   |Generate DDL from data structure   |
+|Shaping  |`reshape`, `pivot`, `melt`, `merge`, `add_columns`, `filter_rows`|Transform any data reference       |
+
+#### Data Reference Sources
+
+pandas-mcp accepts `data_ref` from multiple origins:
+
+|Source            |How It Arrives           |
+|------------------|-------------------------|
+|Local file        |`read_file` tool         |
+|Query result      |Passed from postgres-mcp |
+|dbt model output  |Passed from dbt-mcp      |
+|Previous transform|Chained from shaping tool|
+
+#### When to Use Shaping Tools
+
+|Scenario                              |Use pandas-mcp|Use SQL/dbt|
+|--------------------------------------|--------------|-----------|
+|Pivot for heatmap chart               |✓             |—          |
+|Join query result with local CSV      |✓             |—          |
+|Prototype transform before formalizing|✓             |—          |
+|Persistent aggregation in pipeline    |—             |✓          |
+|Reusable business logic               |—             |✓          |
+|Needs version control + testing       |—             |✓          |
+
+-----
+
+### Server: postgres-mcp (Database Layer)
+
+**Responsibility:** Data loading, querying, schema management, performance analysis, and geospatial operations.
+
+#### Tool Categories
+
+|Category|Tools                                                                               |Description                           |
+|--------|------------------------------------------------------------------------------------|--------------------------------------|
+|Query   |`list_schemas`, `list_tables`, `get_table_schema`, `execute_query`, `query_geometry`|Read operations                       |
+|Analysis|`explain_query`, `recommend_indexes`, `health_check`                                |Performance insights                  |
+|Write   |`execute_write`, `load_dataframe`                                                   |Data modification                     |
+|DDL     |`execute_ddl`, `get_schema_snapshot`                                                |Schema management with change tracking|
+
+#### DDL Change Tracking
+
+`execute_ddl` returns structured output for downstream automation:
+
+```json
+{
+  "success": true,
+  "operation": "CREATE TABLE",
+  "affected_objects": [
+    {
+      "type": "table",
+      "schema": "public",
+      "name": "customer_orders",
+      "change": "created"
+    }
+  ],
+  "timestamp": "2025-01-22T14:30:00Z"
+}
+```
+
+This enables documentation updates, ERD regeneration (via Mermaid Chart MCP), or other automated responses.
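
As a rough sketch of such downstream automation, a consumer could parse the payload and derive follow-up actions. The action strings (`regenerate_erd:...`, `update_docs:...`) and the function name are hypothetical; only the payload shape comes from the example above.

```python
import json

# Payload copied from the execute_ddl example above.
DDL_RESULT = """
{
  "success": true,
  "operation": "CREATE TABLE",
  "affected_objects": [
    {"type": "table", "schema": "public", "name": "customer_orders", "change": "created"}
  ],
  "timestamp": "2025-01-22T14:30:00Z"
}
"""


def follow_up_actions(raw: str) -> list[str]:
    """Return hypothetical follow-up actions for one execute_ddl result."""
    result = json.loads(raw)
    if not result.get("success"):
        return []  # a failed statement changed nothing, so nothing to refresh
    actions: list[str] = []
    for obj in result.get("affected_objects", []):
        qualified = f'{obj["schema"]}.{obj["name"]}'
        if obj["type"] == "table":
            # Assumption: any table-level change makes the ERD stale.
            actions.append(f"regenerate_erd:{qualified}")
        actions.append(f"update_docs:{qualified}:{obj['change']}")
    return actions


if __name__ == "__main__":
    print(follow_up_actions(DDL_RESULT))
```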
+
+-----
+
+### Server: dbt-mcp (Transform Layer)
+
+**Responsibility:** Model execution, lineage, documentation, and YAML generation for local dbt-core projects.
+
+**Note:** Official dbt-mcp is Cloud-only. This server wraps local dbt-core CLI.
+
+#### Tool Categories
+
+|Category     |Tools                                          |Description             |
+|-------------|-----------------------------------------------|------------------------|
+|Discovery    |`parse_manifest`, `list_models`, `list_sources`|Project exploration     |
+|Model        |`get_model`, `get_lineage`, `compile_sql`      |Model inspection        |
+|Execution    |`run_model`, `test_model`, `get_run_results`   |dbt CLI wrapper         |
+|Documentation|`generate_yaml`                                |Auto-generate schema.yml|
+
+#### Lineage Output
+
+`get_lineage` outputs a Mermaid-formatted DAG that the existing Mermaid Chart MCP can render.
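
A minimal sketch of how lineage might be turned into Mermaid text, assuming the lineage is already available as a mapping from each model to its upstream parents. The function name, the input shape, and the example model names are assumptions, and node names are assumed to be valid Mermaid identifiers.

```python
def lineage_to_mermaid(parents: dict[str, list[str]]) -> str:
    """Render a model -> upstream-parents mapping as a Mermaid flowchart."""
    lines = ["graph LR"]
    # Sort for stable output so regenerated diagrams diff cleanly.
    for model in sorted(parents):
        for parent in sorted(parents[model]):
            lines.append(f"    {parent} --> {model}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical project: two staging models feeding one fact model.
    example = {
        "stg_orders": ["raw_orders"],
        "fct_orders": ["stg_orders", "stg_customers"],
    }
    print(lineage_to_mermaid(example))
```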
+
+-----
+
+### Internal Dependency Flow (data-platform)
+
+```
+files → pandas-mcp → postgres-mcp ↔ dbt-mcp
+              ↑______________|
+           (query results for reshaping)
+```
+
+|Flow             |Description                         |
+|-----------------|------------------------------------|
+|files → pandas   |Entry point for raw data            |
+|pandas → postgres|Schema inference, bulk loading      |
+|postgres ↔ dbt   |dbt queries marts, postgres executes|
+|postgres → pandas|Query results for reshaping         |
+|dbt → pandas     |Model outputs for visualization prep|
+
+-----
+
+### Agents (data-platform)
+
+|Agent           |Trigger                   |Sequence                                                                    |
+|----------------|--------------------------|----------------------------------------------------------------------------|
+|`data_ingestion`|User provides file        |read_file → profile → infer_schema → execute_ddl → load_dataframe → validate|
+|`model_analysis`|User asks about dbt model |get_model → get_lineage → explain_query → test_model → synthesize           |
+|`full_pipeline` |File to materialized model|data_ingestion → create dbt model → run_model                               |
+
+**Behavior when dbt-mcp absent:**
+
+|Agent           |Behavior                             |
+|----------------|-------------------------------------|
+|`data_ingestion`|Runs fully (no dbt steps)            |
+|`model_analysis`|Skipped—surfaces “dbt not configured”|
+|`full_pipeline` |Stops after load, prompts user       |
+
+-----
+
+### Commands (data-platform)
+
+|Command                         |Maps To                        |
+|--------------------------------|-------------------------------|
+|`/ingest {file}`                |`data_ingestion` agent         |
+|`/profile {file}`               |`pandas-mcp.profile`           |
+|`/pivot {data} by {cols}`       |`pandas-mcp.pivot`             |
+|`/merge {left} {right} on {key}`|`pandas-mcp.merge`             |
+|`/explain {query}`              |`postgres-mcp.explain_query`   |
+|`/schema {table}`               |`postgres-mcp.get_table_schema`|
+|`/lineage {model}`              |`dbt-mcp.get_lineage`          |
+|`/run {model}`                  |`dbt-mcp.run_model`            |
+|`/test {model}`                 |`dbt-mcp.test_model`           |
+
+dbt commands return a graceful “dbt-mcp not loaded” message when the server is unavailable.
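
Command routing with the graceful fallback could look something like the sketch below. The templates and targets are taken from the table above; the routing function, its return shape, and the whitespace-delimited argument matching are assumptions for illustration.

```python
import re

# Subset of the command table above.
COMMANDS = {
    "/profile {file}": "pandas-mcp.profile",
    "/schema {table}": "postgres-mcp.get_table_schema",
    "/lineage {model}": "dbt-mcp.get_lineage",
    "/merge {left} {right} on {key}": "pandas-mcp.merge",
}


def _template_to_regex(template: str) -> re.Pattern:
    """Turn '/merge {left} {right} on {key}' into a regex with named groups."""
    parts = re.split(r"(\{\w+\})", template)
    pattern = "".join(
        f"(?P<{part[1:-1]}>\\S+)" if part.startswith("{") else re.escape(part)
        for part in parts
    )
    return re.compile(f"^{pattern}$")


def route(command: str, loaded_servers: set[str]) -> dict:
    """Resolve a chat command to its target tool, or explain why it cannot run."""
    for template, target in COMMANDS.items():
        match = _template_to_regex(template).match(command.strip())
        if not match:
            continue
        server = target.split(".", 1)[0]
        if server not in loaded_servers:
            return {"ok": False, "message": f"{server} not loaded"}
        return {"ok": True, "target": target, "args": match.groupdict()}
    return {"ok": False, "message": "unknown command"}


if __name__ == "__main__":
    print(route("/lineage orders", {"pandas-mcp", "postgres-mcp"}))
```

Arguments are matched as single whitespace-free tokens here, so file paths containing spaces would need quoting rules that this sketch does not attempt.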
+
+-----
+
+## Plugin: viz-platform
+
+### Servers
+
+|Server  |Responsibility                                           |
+|--------|---------------------------------------------------------|
+|dmc-mcp |Version-locked component registry, prop validation       |
+|dash-mcp|Charts, layouts, pages, theming—validates against dmc-mcp|
+
+-----
+
+### Server: dmc-mcp (Component Constraint Layer)
+
+**Responsibility:** Single source of truth for Dash Mantine Components API. Prevents Claude from hallucinating deprecated props or non-existent components.
+
+**Problem solved:** DMC versions introduce breaking changes. Claude’s training data mixes versions. Runtime errors from invalid props waste cycles.
+
+#### Tool Categories
+
+|Category     |Tools                |Description                          |
+|-------------|---------------------|-------------------------------------|
+|Discovery    |`list_components`    |What exists in installed version     |
+|Introspection|`get_component_props`|Valid props, types, defaults         |
+|Validation   |`validate_component` |Check component definition before use|
+
+#### Usage Pattern
+
+Claude queries dmc-mcp first:
+
+1. “What props does `dmc.Select` accept?” → `get_component_props`
+1. Build component with valid props
+1. Pass to dash-mcp for rendering
+
+dash-mcp validates against dmc-mcp before rendering. Invalid components fail fast with actionable errors.
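
The fail-fast check might be sketched like this. The registry contents are illustrative placeholders and may not match any real DMC release; in the proposed design they would come from the installed version rather than a hard-coded dict.

```python
# Hypothetical registry snapshot for one installed DMC version. The prop
# names are placeholders for illustration, not an authoritative DMC API.
REGISTRY = {
    "Select": {"data", "value", "label", "placeholder"},
    "Button": {"children", "variant", "color"},
}


def validate_component(name: str, props: dict) -> list[str]:
    """Return actionable error strings; an empty list means the definition is valid."""
    if name not in REGISTRY:
        return [f"unknown component '{name}'; known: {sorted(REGISTRY)}"]
    allowed = REGISTRY[name]
    return [
        f"'{prop}' is not a valid prop for {name}; valid props: {sorted(allowed)}"
        for prop in props
        if prop not in allowed
    ]


if __name__ == "__main__":
    # 'options' is not in the placeholder registry, so this reports one error.
    print(validate_component("Select", {"options": []}))
```

Listing the valid props in each error is what makes the failure actionable: the caller can correct the definition without a second lookup.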
+
+-----
+
+### Server: dash-mcp (Visualization Layer)
+
+**Responsibility:** Chart generation, dashboard layouts, page structure, theming system, and export.
+
+**Philosophy:** Single server, multiple concerns. Tools are namespaced but share context (theme tokens flow to charts automatically).
+
+#### Tool Categories
+
+|Category  |Tools                                                               |Description                    |
+|----------|--------------------------------------------------------------------|-------------------------------|
+|`chart_*` |`chart_create`, `chart_configure_interaction`                       |Data visualization (Plotly)    |
+|`layout_*`|`layout_create`, `layout_add_filter`, `layout_set_grid`             |Dashboard composition          |
+|`page_*`  |`page_create`, `page_add_navbar`, `page_set_auth`                   |App-level structure            |
+|`theme_*` |`theme_create`, `theme_extend`, `theme_validate`, `theme_export_css`|Design tokens, component styles|
+
+#### Design Token Structure
+
+Themes are built from design tokens—single source of truth for visual consistency:
+
+```yaml
+tokens:
+  colors:
+    primary: "#228be6"
+    secondary: "#868e96"
+    background:
+      base: "#ffffff"
+      subtle: "#f8f9fa"
+    text:
+      primary: "#212529"
+      muted: "#868e96"
+  
+  spacing:
+    xs: "4px"
+    sm: "8px"
+    md: "16px"
+    lg: "24px"
+  
+  typography:
+    fontFamily: "Inter, sans-serif"
+    fontSize:
+      sm: "14px"
+      md: "16px"
+  
+  radii:
+    sm: "4px"
+    md: "8px"
+```
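
As one possible reading of `theme_export_css`, the nested tokens above could be flattened into CSS custom properties. The naming scheme (`--colors-primary`) and both function names are assumptions for illustration; the proposal does not define the export format.

```python
def flatten_tokens(tokens: dict, prefix: str = "") -> dict[str, str]:
    """Flatten nested token groups into dash-joined names."""
    flat: dict[str, str] = {}
    for key, value in tokens.items():
        name = f"{prefix}-{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_tokens(value, name))
        else:
            flat[name] = str(value)
    return flat


def tokens_to_css(tokens: dict) -> str:
    """Emit a :root block with one custom property per token."""
    lines = [":root {"]
    for name, value in flatten_tokens(tokens).items():
        lines.append(f"  --{name}: {value};")
    lines.append("}")
    return "\n".join(lines)


# A few values taken from the token structure above.
TOKENS = {
    "colors": {"primary": "#228be6", "background": {"base": "#ffffff"}},
    "spacing": {"md": "16px"},
}

if __name__ == "__main__":
    print(tokens_to_css(TOKENS))
```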
+
+#### Component Style Registry
+
+Per-component overrides ensuring consistency:
+
+|Component     |Registered Style              |Purpose                        |
+|--------------|------------------------------|-------------------------------|
+|`kpi_card`    |Shadow, padding, border-radius|All KPIs look identical        |
+|`data_table`  |Header bg, row hover, border  |Tables share appearance        |
+|`filter_panel`|Background, spacing, alignment|Filters positioned consistently|
+|`chart_card`  |Title typography, padding     |Chart containers unified       |
+
+-----
+
+### Internal Dependency Flow (viz-platform)
+
+```
+dmc-mcp ← dash-mcp
+   ↑          |
+   └──────────┘
+   (validation before render)
+```
+
+dash-mcp always validates component definitions against dmc-mcp. No direct data dependency—data comes from external sources.
+
+-----
+
+### Agents (viz-platform)
+
+|Agent            |Trigger                           |Sequence                                                              |
+|-----------------|----------------------------------|----------------------------------------------------------------------|
+|`theme_setup`    |New project or brand consistency  |list_themes → theme_create → register_component_style → theme_validate|
+|`layout_builder` |User wants dashboard structure    |layout_create → layout_add_filter → theme_apply → preview             |
+|`component_check`|Before rendering any DMC component|get_component_props → validate_component → proceed or error           |
+
+-----
+
+### Commands (viz-platform)
+
+|Command                |Maps To                                     |
+|-----------------------|--------------------------------------------|
+|`/chart {type}`        |`dash-mcp.chart_create` (expects data input)|
+|`/dashboard {template}`|`layout_builder` agent                      |
+|`/theme {name}`        |`dash-mcp.theme_apply`                      |
+|`/theme new {name}`    |`dash-mcp.theme_create`                     |
+|`/theme css {name}`    |`dash-mcp.theme_export_css`                 |
+|`/component {name}`    |`dmc-mcp.get_component_props`               |
+
+-----
+
+## Cross-Plugin Interactions
+
+### How It Works
+
+MCP servers don’t call each other. Claude orchestrates:
+
+1. Server A returns output to Claude
+1. Claude interprets and determines next step
+1. Claude passes relevant data to Server B
+
+### Documentation Layers
+
+|Layer             |Location               |Purpose                             |
+|------------------|-----------------------|------------------------------------|
+|Plugin docs       |Each plugin’s README.md|Declares inputs/outputs             |
+|Claude.md         |Project root           |Cross-plugin agents for this project|
+|contract-validator|Separate plugin        |Validates compatibility             |
+|doc-guardian      |Separate plugin        |Catches drift within each project   |
+
+### Interface Contracts
+
+Each plugin declares what it produces and accepts:
+
+**data-platform outputs:**
+
+- `data_ref`: In-memory DataFrame reference
+- `query_result`: Row set from postgres-mcp
+- `model_output`: Materialized table reference from dbt-mcp
+- `schema_snapshot`: Full schema state for documentation
+
+**viz-platform inputs:**
+
+- Accepts `data_ref`, `query_result`, or `model_output` as data source
+- Validates all DMC components against dmc-mcp before rendering
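
The declared contracts lend themselves to a simple set comparison. The sketch below uses the output and input names listed above; the `PluginInterface` class and the report keys are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PluginInterface:
    """Declared contract of one plugin (hypothetical structure)."""
    name: str
    produces: frozenset[str] = field(default_factory=frozenset)
    accepts: frozenset[str] = field(default_factory=frozenset)


def check_compatibility(producer: PluginInterface, consumer: PluginInterface) -> dict:
    """Compare what the producer emits with what the consumer accepts."""
    shared = producer.produces & consumer.accepts
    return {
        "compatible": bool(shared),
        "accepted": sorted(shared),
        "unconsumed": sorted(producer.produces - consumer.accepts),
        "unsatisfied": sorted(consumer.accepts - producer.produces),
    }


DATA_PLATFORM = PluginInterface(
    "data-platform",
    produces=frozenset({"data_ref", "query_result", "model_output", "schema_snapshot"}),
)
VIZ_PLATFORM = PluginInterface(
    "viz-platform",
    accepts=frozenset({"data_ref", "query_result", "model_output"}),
)

if __name__ == "__main__":
    print(check_compatibility(DATA_PLATFORM, VIZ_PLATFORM))
```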
+
+### Cross-Plugin Agents (defined in Claude.md)
+
+|Agent               |Trigger                                        |Sequence                                                                                                                     |
+|--------------------|-----------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------|
+|`dashboard_builder` |User requests visualization of database content|postgres-mcp.execute_query → pandas-mcp.pivot (if needed) → dmc-mcp.validate_component → dash-mcp.chart_create → dash-mcp.layout_create|
+|`visualization_prep`|Query result needs reshaping                   |postgres-mcp.execute_query → pandas-mcp.reshape → dash-mcp.chart_create                                                      |
+
+### Validation: contract-validator
+
+Separate plugin for cross-plugin validation. See **Plugin: contract-validator** section for full specification.
+
+**Key distinction from doc-guardian:**
+
+- doc-guardian: “did code change break docs?” (within a project)
+- contract-validator: “do plugins work together?” (across plugins)
+
+-----
+
+## Plugin: contract-validator
+
+### Purpose
+
+Validates cross-plugin compatibility and Claude.md agent definitions. Ensures plugins can actually work together before runtime failures occur.
+
+**Problem solved:** Plugins declare interfaces in README. Claude.md references tools across plugins. Without validation:
+
+- Agents reference tools that don’t exist
+- viz-platform expects input format data-platform doesn’t produce
+- Plugin updates break workflows silently
+
+-----
+
+### What It Reads
+
+|Source            |Purpose                                             |
+|------------------|----------------------------------------------------|
+|Plugin README.md  |Extract declared inputs/outputs                     |
+|Claude.md         |Extract agent definitions and tool references       |
+|MCP server schemas|Verify tools actually exist with expected signatures|
+
+-----
+
+### Tool Categories
+
+|Category|Tools                                                                |Description                      |
+|--------|---------------------------------------------------------------------|---------------------------------|
+|Parse   |`parse_plugin_interface`, `parse_claude_md_agents`                   |Extract structured data from docs|
+|Validate|`validate_compatibility`, `validate_agent_refs`, `validate_data_flow`|Check contracts match            |
+|Report  |`generate_compatibility_report`, `list_issues`                       |Output findings                  |
+
+#### Tool Details
+
+**`parse_plugin_interface`**
+
+- Input: Plugin path or README content
+- Output: Structured interface (inputs accepted, outputs produced, tool names)
+
+**`parse_claude_md_agents`**
+
+- Input: Claude.md path or content
+- Output: List of agents with their tool sequences
+
+**`validate_compatibility`**
+
+- Input: Two plugin interfaces
+- Output: Compatibility report (what A produces that B accepts, gaps)
+
+**`validate_agent_refs`**
+
+- Input: Agent definition, list of available plugins
+- Output: Missing tools, invalid sequences
+
+**`validate_data_flow`**
+
+- Input: Agent sequence
+- Output: Verification that each step’s output matches next step’s expected input
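
A minimal sketch of `validate_agent_refs` and `validate_data_flow`, assuming each tool's accepted input types and output type are known. The `TOOLS` table, its type names (including `chart`), and the issue message wording are invented for this example.

```python
# Hypothetical tool signatures: tool -> (input types accepted, output type).
# An empty input set means the tool starts a sequence and takes no data input.
TOOLS = {
    "postgres-mcp.execute_query": (set(), "query_result"),
    "pandas-mcp.pivot": ({"data_ref", "query_result"}, "data_ref"),
    "dash-mcp.chart_create": ({"data_ref", "query_result", "model_output"}, "chart"),
}


def validate_agent_refs(sequence: list[str]) -> list[str]:
    """Report every step that names a tool no loaded server exposes."""
    return [f"missing tool: {step}" for step in sequence if step not in TOOLS]


def validate_data_flow(sequence: list[str]) -> list[str]:
    """Check that each step's output type is accepted by the following step."""
    issues: list[str] = []
    for prev, nxt in zip(sequence, sequence[1:]):
        if prev not in TOOLS or nxt not in TOOLS:
            continue  # unknown tools are reported by validate_agent_refs
        output = TOOLS[prev][1]
        accepted = TOOLS[nxt][0]
        if accepted and output not in accepted:
            issues.append(f"{prev} produces '{output}' but {nxt} accepts {sorted(accepted)}")
    return issues


if __name__ == "__main__":
    agent = ["postgres-mcp.execute_query", "pandas-mcp.pivot", "dash-mcp.chart_create"]
    print(validate_agent_refs(agent), validate_data_flow(agent))
```

Keeping the two checks separate means a missing tool is reported once, as a reference error, rather than again as a data-flow error.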
+
+-----
+
+### Agents (contract-validator)
+
+|Agent            |Trigger                        |Sequence                                                                                                                                                 |
+|-----------------|-------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|
+|`full_validation`|User runs `/validate-contracts`|parse all plugin interfaces → parse Claude.md → validate_compatibility for each pair → validate_agent_refs for each agent → generate_compatibility_report|
+|`agent_check`    |User runs `/check-agent {name}`|parse_claude_md_agents → find agent → validate_agent_refs → validate_data_flow → report issues                                                           |
+
+-----
+
+### Commands
+
+|Command              |Maps To                                 |Description                           |
+|---------------------|----------------------------------------|--------------------------------------|
+|`/validate-contracts`|`full_validation` agent                 |Full project validation               |
+|`/check-agent {name}`|`agent_check` agent                     |Validate single agent definition      |
+|`/list-interfaces`   |`parse_plugin_interface` for all plugins|Show what each plugin produces/accepts|
+
+-----
+
+### Output Format
+
+**Compatibility Report:**
+
+```
+## Contract Validation Report
+
+### Plugin Interfaces
+- data-platform: produces [data_ref, query_result, model_output, schema_snapshot]
+- viz-platform: accepts [data_ref, query_result, model_output]
+
+### Compatibility Matrix
+| Producer | Consumer | Status | Notes |
+|----------|----------|--------|-------|
+| data-platform | viz-platform | ✓ Compatible | data_ref, query_result, model_output accepted; schema_snapshot not consumed |
+
+### Agent Validation
+| Agent | Status | Issues |
+|-------|--------|--------|
+| dashboard_builder | ✓ Valid | — |
+| model_analysis | ⚠ Warning | dbt-mcp optional; agent is skipped if not loaded |
+
+### Issues Found
+- None
+
+### Warnings
+- Agent `model_analysis` depends on optional server `dbt-mcp`
+```
+
+**Issue Types:**
+
+|Type               |Severity|Example                                                                 |
+|-------------------|--------|------------------------------------------------------------------------|
+|Missing tool       |Error   |Agent references `pandas-mcp.transform` but tool is `pandas-mcp.reshape`|
+|Interface mismatch |Error   |viz-platform expects `chart_data` but data-platform produces `data_ref` |
+|Optional dependency|Warning |Agent uses dbt-mcp which may not be loaded                              |
+|Undeclared output  |Warning |Plugin produces output not listed in README                             |
+
+-----
+
+### Integration with doc-guardian
+
+**Separation of concerns:**
+
+|Plugin            |Scope                             |Trigger                 |
+|------------------|----------------------------------|------------------------|
+|doc-guardian      |Code ↔ docs drift within a project|PostToolUse (Write/Edit)|
+|contract-validator|Plugin ↔ plugin compatibility     |On-demand or CI hook    |
+
+contract-validator does NOT watch for file changes. It runs on demand or as a CI step.
+
+**Potential future integration:** doc-guardian could trigger contract-validator when Claude.md or plugin README changes. Not required for v1.
+
+-----
+
+## Diagramming Approach
+
+No diagram-mcp server. Use existing Mermaid Chart MCP.
+
+**For ERDs:**
+
+- postgres-mcp exposes schema metadata via `get_schema_snapshot`
+- Claude generates Mermaid syntax
+- Mermaid Chart MCP renders
+
+**For dbt lineage:**
+
+- dbt-mcp.get_lineage outputs Mermaid-formatted DAG
+- Mermaid Chart MCP renders
+
+This avoids the complexity of draw.io XML generation while maintaining documentation capability.
+
+-----
+
+## Implementation Order
+
+|Phase|Plugin            |Server      |Rationale                                      |
+|-----|------------------|------------|-----------------------------------------------|
+|1    |data-platform     |pandas-mcp  |Entry point, no dependencies                   |
+|2    |data-platform     |postgres-mcp|Load from Phase 1, query capabilities          |
+|3    |data-platform     |dbt-mcp     |Transform layer, requires postgres-mcp         |
+|4    |viz-platform      |dmc-mcp     |Constraint layer, no dependencies              |
+|5    |viz-platform      |dash-mcp    |Visualization, validates against dmc-mcp       |
+|6    |contract-validator|—           |Validates all above, requires stable interfaces|
+
+**Notes:**
+
+- Phases 1-3 (data-platform) and 4-5 (viz-platform) can proceed in parallel
+- contract-validator (Phase 6) should wait until plugin interfaces stabilize
+- doc-guardian already exists; update scope documentation only
+
+-----
+
+## Open Questions
+
+### Data Reference Passing
+
+How do servers share `data_ref` objects? Options:
+
+- **Temporary files with URIs**: Portable, but adds I/O overhead
+- **Arrow IPC**: Efficient, but requires both servers to support it
+
+**Recommendation:** Arrow IPC for efficiency, with a file fallback for compatibility.
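
The temporary-file option can be sketched with the standard library alone. CSV is used here purely to keep the example dependency-free; the function names and the choice of format are assumptions, and an Arrow IPC variant would need pyarrow on both sides.

```python
import csv
import tempfile
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def write_data_ref(rows: list[dict]) -> str:
    """Producer side: persist rows to a temp CSV and return a file:// URI.

    Assumes `rows` is non-empty and every row has the same keys.
    """
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".csv", newline="", encoding="utf-8", delete=False
    ) as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    return Path(handle.name).resolve().as_uri()


def read_data_ref(uri: str) -> list[dict]:
    """Consumer side: resolve a file:// URI back into rows (values are strings)."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError(f"unsupported data_ref scheme: {parsed.scheme!r}")
    path = Path(url2pathname(parsed.path))
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    uri = write_data_ref([{"id": "1", "region": "north"}])
    print(uri)
    print(read_data_ref(uri))
```

The round trip loses column types, since CSV returns every value as a string. That is part of the overhead the comparison above alludes to. Cleanup of the temporary file is left out of this sketch.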
+
+### Authentication
+
+Should postgres-mcp handle connection strings directly, or use a secrets manager pattern?
+
+### Theme Storage
+
+Where do custom themes persist?
+
+- Local config file (`~/.dash-mcp/themes/`)
+- Project-level (alongside dbt_project.yml)
+- Database table (for shared team themes)
+
+### dbt Project Discovery
+
+Auto-detect `dbt_project.yml` in common locations, or require explicit path?
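
Both options can coexist: an explicit path wins, otherwise the server searches. The sketch below only walks upward from a starting directory, which is one interpretation of "common locations". The function name and signature are assumptions.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_dbt_project(start: Path, explicit: Optional[Path] = None) -> Optional[Path]:
    """Return the directory containing dbt_project.yml, or None if not found."""
    if explicit is not None:
        # An explicit path is never second-guessed: it either holds the file or fails.
        return explicit if (explicit / "dbt_project.yml").is_file() else None
    start = start.resolve()
    # Check the starting directory first, then each ancestor up to the root.
    for directory in (start, *start.parents):
        if (directory / "dbt_project.yml").is_file():
            return directory
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        (root / "dbt_project.yml").write_text("name: demo\n")
        nested = root / "models" / "staging"
        nested.mkdir(parents=True)
        print(find_dbt_project(nested) == root)
```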
+
+-----
+
+## Technology Stack
+
+|Layer          |Technology             |Notes                      |
+|---------------|-----------------------|---------------------------|
+|MCP Framework  |FastMCP                |Or manual MCP SDK          |
+|Python         |3.11+                  |Type hints, async support  |
+|Data Processing|pandas                 |Core DataFrame ops         |
+|Arrow          |pyarrow                |Parquet, efficient memory  |
+|Database       |psycopg                |Async-ready Postgres driver|
+|Geospatial     |geoalchemy2            |PostGIS integration        |
+|dbt            |dbt-core               |CLI wrapper                |
+|Visualization  |plotly                 |Figure generation          |
+|UI Components  |dash-mantine-components|Version-locked via dmc-mcp |
+
+-----
+
+## Summary
+
+### Core Plugins
+
+|Plugin            |Servers/Scope                          |Key Characteristic                                    |
+|------------------|---------------------------------------|------------------------------------------------------|
+|data-platform     |pandas-mcp, postgres-mcp, dbt-mcp      |Optional server loading per project                   |
+|viz-platform      |dmc-mcp, dash-mcp                      |dmc-mcp validates before dash-mcp renders             |
+|contract-validator|Interface parsing, compatibility checks|Validates cross-plugin contracts and agent definitions|
+
+### Supporting Plugins (Existing)
+
+|Plugin           |Purpose                             |
+|-----------------|------------------------------------|
+|doc-guardian     |Code-to-docs drift (unchanged scope)|
+|Mermaid Chart MCP|Diagram rendering                   |
+
+### Interaction Model
+
+```
+Plugin READMEs     →  declare inputs/outputs
+Claude.md          →  define cross-plugin agents  
+contract-validator →  validate compatibility
+doc-guardian       →  catch drift within projects
+```
+
+**Flow:** Plugins declare interfaces. Claude.md defines workflows. contract-validator enforces compatibility. doc-guardian handles internal drift.