Add Change V04.0.0: Proposal

2026-01-25 15:31:30 +00:00
parent 6dff0e7a2b
commit 09bc26c471

@@ -0,0 +1,655 @@
# MCP Data Platform — Architecture Reference
*Plugin taxonomy, server responsibilities, and interaction patterns for Leo's data marketplace*
-----
## Overview
Two plugins serving distinct domains, designed for independent or combined use.
|Plugin |Servers |Domain |
|-----------------|---------------------------------|-----------------------------------------|
|**data-platform**|pandas-mcp, postgres-mcp, dbt-mcp|Ingestion, storage, transformation |
|**viz-platform** |dmc-mcp, dash-mcp |Component validation, dashboards, theming|
**Key principles:**
- MCP servers are independent processes; they don't import each other
- Claude orchestrates cross-server data flow at runtime
- Plugins ship multiple servers; projects load only what they need
- Claude.md defines project-specific workflows spanning plugins
-----
## Component Definitions
|Component Type|Definition |Runtime Context |
|--------------|--------------------------------------------------------------------------------------------------------------------|----------------------------------------------------|
|**MCP Server**|Standalone service exposing tools via Model Context Protocol. One server = one domain responsibility. |Long-running process, spawned by Claude Desktop/Code|
|**Tool** |Single callable function within an MCP server. Atomic operation with defined input schema and output. |Invoked per-request by LLM |
|**Resource** |Read-only data exposed by MCP server (files, schemas, configs). Discoverable but not executable. |Static or cached |
|**Agent** |Orchestration layer that chains multiple tool calls across servers. Lives in Claude's reasoning, not in MCP servers.|LLM-driven, multi-step |
|**Command** |User-facing shortcut (e.g., `/ingest`) that triggers predefined tool sequences. |Chat interface trigger |
-----
## Plugin: data-platform
### Server Loading
A single plugin ships all three servers. Which servers load is determined by project config, not environment variables.
|Server |Default|Optional|
|------------|-------|--------|
|pandas-mcp |✓ |— |
|postgres-mcp|✓ |— |
|dbt-mcp |— |✓ |
**Example project configs:**
```yaml
# Web app project (no dbt)
mcp_servers:
- pandas-mcp
- postgres-mcp
```
```yaml
# Data engineering project (full stack)
mcp_servers:
- pandas-mcp
- postgres-mcp
- dbt-mcp
```
Agents check server availability at runtime. If dbt-mcp isn't loaded, dbt-related steps are skipped or surface “not available for this project.”
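The runtime availability check can be sketched as a filter over an agent's planned steps. This is a minimal illustration, not the plugin's implementation: `Step` and `plan_steps` are hypothetical names, and the notice text simply reuses the wording above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    server: str  # e.g. "dbt-mcp"
    tool: str    # e.g. "run_model"


def plan_steps(steps: list[Step], loaded: set[str]) -> tuple[list[Step], list[str]]:
    """Split planned steps into runnable ones and user-facing notices."""
    runnable: list[Step] = []
    notices: list[str] = []
    for step in steps:
        if step.server in loaded:
            runnable.append(step)
        else:
            # Hypothetical message format, modeled on the text above.
            notices.append(f"{step.server}.{step.tool}: not available for this project")
    return runnable, notices


# Web app project: dbt-mcp is not in the project config.
loaded_servers = {"pandas-mcp", "postgres-mcp"}
planned = [Step("postgres-mcp", "execute_query"), Step("dbt-mcp", "run_model")]
runnable, notices = plan_steps(planned, loaded_servers)
```

Here the `execute_query` step stays runnable and the `run_model` step turns into a notice instead of a failure.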
-----
### Server: pandas-mcp (Data Shaping Layer)
**Responsibility:** File ingestion, data profiling, schema inference, and utility shaping operations.
**Philosophy:** SQL-first for persistent transforms (use dbt). Pandas for:
- Pre-database ingestion (profiling, validation, schema inference)
- Visualization prep (reshaping query results for chart formats)
- Ad-hoc operations (prototyping, merging with local files)
#### Tool Categories
|Category |Tools |Description |
|---------|-----------------------------------------------------------------|-----------------------------------|
|Ingestion|`read_file`, `write_file`, `detect_encoding` |File I/O with format auto-detection|
|Profiling|`profile`, `validate`, `sample` |Data quality assessment |
|Schema |`infer_schema` |Generate DDL from data structure |
|Shaping |`reshape`, `pivot`, `melt`, `merge`, `add_columns`, `filter_rows`|Transform any data reference |
#### Data Reference Sources
pandas-mcp accepts `data_ref` from multiple origins:
|Source |How It Arrives |
|------------------|-------------------------|
|Local file |`read_file` tool |
|Query result |Passed from postgres-mcp |
|dbt model output |Passed from dbt-mcp |
|Previous transform|Chained from shaping tool|
#### When to Use Shaping Tools
|Scenario |Use pandas-mcp|Use SQL/dbt|
|--------------------------------------|--------------|-----------|
|Pivot for heatmap chart |✓ |— |
|Join query result with local CSV |✓ |— |
|Prototype transform before formalizing|✓ |— |
|Persistent aggregation in pipeline |— |✓ |
|Reusable business logic |— |✓ |
|Needs version control + testing |— |✓ |
-----
### Server: postgres-mcp (Database Layer)
**Responsibility:** Data loading, querying, schema management, performance analysis, and geospatial operations.
#### Tool Categories
|Category|Tools |Description |
|--------|------------------------------------------------------------------------------------|--------------------------------------|
|Query |`list_schemas`, `list_tables`, `get_table_schema`, `execute_query`, `query_geometry`|Read operations |
|Analysis|`explain_query`, `recommend_indexes`, `health_check` |Performance insights |
|Write |`execute_write`, `load_dataframe` |Data modification |
|DDL |`execute_ddl`, `get_schema_snapshot` |Schema management with change tracking|
#### DDL Change Tracking
`execute_ddl` returns structured output for downstream automation:
```json
{
  "success": true,
  "operation": "CREATE TABLE",
  "affected_objects": [
    {
      "type": "table",
      "schema": "public",
      "name": "customer_orders",
      "change": "created"
    }
  ],
  "timestamp": "2025-01-22T14:30:00Z"
}
```
This enables documentation updates, ERD regeneration (via Mermaid Chart MCP), or other automated responses.
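As a rough sketch of that downstream automation, a consumer could map each affected object to follow-up actions. The function name `follow_up_actions` and the action strings are assumptions made for illustration; only the payload shape comes from the example above.

```python
import json


def follow_up_actions(ddl_result_json: str) -> list[str]:
    """Derive follow-up actions from an execute_ddl result payload."""
    result = json.loads(ddl_result_json)
    if not result.get("success"):
        return []  # Nothing changed, so nothing to regenerate.
    actions: list[str] = []
    for obj in result.get("affected_objects", []):
        qualified = f'{obj["schema"]}.{obj["name"]}'
        if obj["type"] == "table":
            # Table changes affect the ERD (hypothetical action label).
            actions.append(f"regenerate ERD: {qualified} {obj['change']}")
        actions.append(f"update docs: {qualified}")
    return actions


payload = json.dumps({
    "success": True,
    "operation": "CREATE TABLE",
    "affected_objects": [
        {"type": "table", "schema": "public", "name": "customer_orders", "change": "created"}
    ],
    "timestamp": "2025-01-22T14:30:00Z",
})
actions = follow_up_actions(payload)
```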
-----
### Server: dbt-mcp (Transform Layer)
**Responsibility:** Model execution, lineage, documentation, and YAML generation for local dbt-core projects.
**Note:** Official dbt-mcp is Cloud-only. This server wraps local dbt-core CLI.
#### Tool Categories
|Category |Tools |Description |
|-------------|-----------------------------------------------|------------------------|
|Discovery |`parse_manifest`, `list_models`, `list_sources`|Project exploration |
|Model |`get_model`, `get_lineage`, `compile_sql` |Model inspection |
|Execution |`run_model`, `test_model`, `get_run_results` |dbt CLI wrapper |
|Documentation|`generate_yaml` |Auto-generate schema.yml|
#### Lineage Output
`get_lineage` outputs Mermaid-formatted DAG, compatible with existing Mermaid Chart MCP for rendering.
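One plausible shape for that output is a Mermaid flowchart with one edge per parent/child pair. The sketch below assumes the lineage is already available as a mapping from model name to parent names; `lineage_to_mermaid` and the model names are hypothetical.

```python
def lineage_to_mermaid(parents: dict[str, list[str]]) -> str:
    """Render a model -> parents mapping as a left-to-right Mermaid flowchart."""
    lines = ["graph LR"]
    for model in sorted(parents):
        for parent in sorted(parents[model]):
            # Edges point from upstream parent to downstream model.
            lines.append(f"    {parent} --> {model}")
    return "\n".join(lines)


# Hypothetical dbt models, for illustration only.
dag = lineage_to_mermaid({
    "stg_orders": ["raw_orders"],
    "fct_orders": ["stg_orders", "stg_customers"],
})
```

Sorting both levels keeps the output stable between runs, which makes diffs of generated diagrams easier to review.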
-----
### Internal Dependency Flow (data-platform)
```
files → pandas-mcp → postgres-mcp ↔ dbt-mcp
            ↑______________|
      (query results for reshaping)
```
|Flow |Description |
|-----------------|------------------------------------|
|files → pandas |Entry point for raw data |
|pandas → postgres|Schema inference, bulk loading |
|postgres ↔ dbt |dbt queries marts, postgres executes|
|postgres → pandas|Query results for reshaping |
|dbt → pandas |Model outputs for visualization prep|
-----
### Agents (data-platform)
|Agent |Trigger |Sequence |
|----------------|--------------------------|----------------------------------------------------------------------------|
|`data_ingestion`|User provides file |read_file → profile → infer_schema → execute_ddl → load_dataframe → validate|
|`model_analysis`|User asks about dbt model |get_model → get_lineage → explain_query → test_model → synthesize |
|`full_pipeline` |File to materialized model|data_ingestion → create dbt model → run_model |
**Behavior when dbt-mcp absent:**
|Agent |Behavior |
|----------------|-------------------------------------|
|`data_ingestion`|Runs fully (no dbt steps) |
|`model_analysis`|Skipped—surfaces “dbt not configured”|
|`full_pipeline` |Stops after load, prompts user |
-----
### Commands (data-platform)
|Command |Maps To |
|--------------------------------|-------------------------------|
|`/ingest {file}` |`data_ingestion` agent |
|`/profile {file}` |`pandas-mcp.profile` |
|`/pivot {data} by {cols}` |`pandas-mcp.pivot` |
|`/merge {left} {right} on {key}`|`pandas-mcp.merge` |
|`/explain {query}` |`postgres-mcp.explain_query` |
|`/schema {table}` |`postgres-mcp.get_table_schema`|
|`/lineage {model}` |`dbt-mcp.get_lineage` |
|`/run {model}` |`dbt-mcp.run_model` |
|`/test {model}` |`dbt-mcp.test_model` |
dbt commands return graceful “dbt-mcp not loaded” when unavailable.
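A command dispatcher along these lines could look like the sketch below. It covers only the single-argument commands from the table, and `COMMANDS` and `resolve_command` are illustrative names rather than part of the plugin.

```python
import re

# Subset of the command table above: command -> (owner, target).
# The owner "agent" marks entries that map to an agent, not a server tool.
COMMANDS = {
    "ingest": ("agent", "data_ingestion"),
    "profile": ("pandas-mcp", "profile"),
    "explain": ("postgres-mcp", "explain_query"),
    "schema": ("postgres-mcp", "get_table_schema"),
    "lineage": ("dbt-mcp", "get_lineage"),
    "run": ("dbt-mcp", "run_model"),
    "test": ("dbt-mcp", "test_model"),
}


def resolve_command(text: str, loaded: set[str]) -> str:
    """Map a slash command to its target, or to a graceful message."""
    match = re.fullmatch(r"/(\w+)\s+(.+)", text.strip())
    if not match or match.group(1) not in COMMANDS:
        return f"unknown command: {text.strip()}"
    owner, target = COMMANDS[match.group(1)]
    if owner != "agent" and owner not in loaded:
        return f"{owner} not loaded"
    prefix = "" if owner == "agent" else f"{owner}."
    return f"{prefix}{target}({match.group(2)})"
```

With only pandas-mcp and postgres-mcp loaded, `/lineage orders` resolves to the "dbt-mcp not loaded" message while `/schema customer_orders` resolves normally.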
-----
## Plugin: viz-platform
### Servers
|Server |Responsibility |
|--------|---------------------------------------------------------|
|dmc-mcp |Version-locked component registry, prop validation |
|dash-mcp|Charts, layouts, pages, theming—validates against dmc-mcp|
-----
### Server: dmc-mcp (Component Constraint Layer)
**Responsibility:** Single source of truth for Dash Mantine Components API. Prevents Claude from hallucinating deprecated props or non-existent components.
**Problem solved:** DMC versions introduce breaking changes. Claude's training data mixes versions. Runtime errors from invalid props waste cycles.
#### Tool Categories
|Category |Tools |Description |
|-------------|---------------------|-------------------------------------|
|Discovery |`list_components` |What exists in installed version |
|Introspection|`get_component_props`|Valid props, types, defaults |
|Validation |`validate_component` |Check component definition before use|
#### Usage Pattern
Claude queries dmc-mcp first:
1. “What props does `dmc.Select` accept?” → `get_component_props`
1. Build component with valid props
1. Pass to dash-mcp for rendering
dash-mcp validates against dmc-mcp before rendering. Invalid components fail fast with actionable errors.
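The fail-fast check can be sketched as a lookup against a version-locked registry. The registry contents below are illustrative and not read from an installed DMC version; `REGISTRY` and the error wording are assumptions.

```python
# Hypothetical registry entry: component name -> {prop name: expected type}.
REGISTRY: dict[str, dict[str, type]] = {
    "Select": {"data": list, "value": str, "label": str, "searchable": bool},
}


def validate_component(name: str, props: dict) -> list[str]:
    """Return actionable errors; an empty list means the definition is valid."""
    if name not in REGISTRY:
        return [f"unknown component: {name}"]
    allowed = REGISTRY[name]
    errors: list[str] = []
    for prop, value in props.items():
        if prop not in allowed:
            errors.append(f"{name}: unknown prop '{prop}'")
        elif not isinstance(value, allowed[prop]):
            errors.append(f"{name}: prop '{prop}' expects {allowed[prop].__name__}")
    return errors
```

A misspelled prop such as `searchabel` is reported before anything is rendered, which is the behavior the usage pattern relies on.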
-----
### Server: dash-mcp (Visualization Layer)
**Responsibility:** Chart generation, dashboard layouts, page structure, theming system, and export.
**Philosophy:** Single server, multiple concerns. Tools are namespaced but share context (theme tokens flow to charts automatically).
#### Tool Categories
|Category |Tools |Description |
|----------|--------------------------------------------------------------------|-------------------------------|
|`chart_*` |`chart_create`, `chart_configure_interaction` |Data visualization (Plotly) |
|`layout_*`|`layout_create`, `layout_add_filter`, `layout_set_grid` |Dashboard composition |
|`page_*` |`page_create`, `page_add_navbar`, `page_set_auth` |App-level structure |
|`theme_*` |`theme_create`, `theme_extend`, `theme_validate`, `theme_export_css`|Design tokens, component styles|
#### Design Token Structure
Themes are built from design tokens—single source of truth for visual consistency:
```yaml
tokens:
  colors:
    primary: "#228be6"
    secondary: "#868e96"
    background:
      base: "#ffffff"
      subtle: "#f8f9fa"
    text:
      primary: "#212529"
      muted: "#868e96"
  spacing:
    xs: "4px"
    sm: "8px"
    md: "16px"
    lg: "24px"
  typography:
    fontFamily: "Inter, sans-serif"
    fontSize:
      sm: "14px"
      md: "16px"
  radii:
    sm: "4px"
    md: "8px"
```
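To illustrate how nested tokens might become CSS for `theme_export_css`, the sketch below flattens a token tree into CSS custom properties. The naming scheme (path segments joined with hyphens) and the function name `to_css_variables` are assumptions, not the tool's defined output.

```python
def to_css_variables(tokens: dict, prefix: str = "") -> list[str]:
    """Flatten nested design tokens into CSS custom property declarations."""
    lines: list[str] = []
    for key, value in tokens.items():
        name = f"{prefix}-{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse so colors.background.base becomes colors-background-base.
            lines.extend(to_css_variables(value, name))
        else:
            lines.append(f"--{name}: {value};")
    return lines


tokens = {
    "colors": {"primary": "#228be6", "background": {"base": "#ffffff"}},
    "spacing": {"md": "16px"},
}
declarations = to_css_variables(tokens)
css = ":root {\n" + "\n".join(f"  {line}" for line in declarations) + "\n}"
```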
#### Component Style Registry
Per-component overrides ensuring consistency:
|Component |Registered Style |Purpose |
|--------------|------------------------------|-------------------------------|
|`kpi_card` |Shadow, padding, border-radius|All KPIs look identical |
|`data_table` |Header bg, row hover, border |Tables share appearance |
|`filter_panel`|Background, spacing, alignment|Filters positioned consistently|
|`chart_card` |Title typography, padding |Chart containers unified |
-----
### Internal Dependency Flow (viz-platform)
```
dmc-mcp ← dash-mcp
   ↑          |
   └──────────┘
(validation before render)
```
dash-mcp always validates component definitions against dmc-mcp. No direct data dependency—data comes from external sources.
-----
### Agents (viz-platform)
|Agent |Trigger |Sequence |
|-----------------|----------------------------------|----------------------------------------------------------------------|
|`theme_setup` |New project or brand consistency |list_themes → theme_create → register_component_style → theme_validate|
|`layout_builder` |User wants dashboard structure |layout_create → layout_add_filter → theme_apply → preview |
|`component_check`|Before rendering any DMC component|get_component_props → validate_component → proceed or error |
-----
### Commands (viz-platform)
|Command |Maps To |
|-----------------------|--------------------------------------------|
|`/chart {type}` |`dash-mcp.chart_create` (expects data input)|
|`/dashboard {template}`|`layout_builder` agent |
|`/theme {name}` |`dash-mcp.theme_apply` |
|`/theme new {name}` |`dash-mcp.theme_create` |
|`/theme css {name}` |`dash-mcp.theme_export_css` |
|`/component {name}` |`dmc-mcp.get_component_props` |
-----
## Cross-Plugin Interactions
### How It Works
MCP servers don't call each other. Claude orchestrates:
1. Server A returns output to Claude
1. Claude interprets and determines next step
1. Claude passes relevant data to Server B
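The three steps above can be sketched with two stand-in functions that never reference each other. Only the orchestrating function knows about both. The stubs and their return shapes are hypothetical; real calls would go through the MCP client.

```python
def execute_query(sql: str) -> list[dict]:
    """Stand-in for postgres-mcp.execute_query; returns canned rows."""
    return [{"region": "north", "total": 10}, {"region": "south", "total": 7}]


def chart_create(rows: list[dict], x: str, y: str) -> dict:
    """Stand-in for dash-mcp.chart_create; returns a plain chart spec."""
    return {"type": "bar", "x": [r[x] for r in rows], "y": [r[y] for r in rows]}


def orchestrate() -> dict:
    # Step 1: Server A returns output to the orchestrator.
    rows = execute_query("SELECT region, total FROM sales")
    # Step 2: the orchestrator inspects the output and picks the next step.
    if not rows:
        return {"type": "empty"}
    # Step 3: only the relevant data is passed to Server B.
    return chart_create(rows, x="region", y="total")
```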
### Documentation Layers
|Layer |Location |Purpose |
|------------------|-----------------------|------------------------------------|
|Plugin docs |Each plugin's README.md|Declares inputs/outputs |
|Claude.md |Project root |Cross-plugin agents for this project|
|contract-validator|Separate plugin |Validates compatibility |
|doc-guardian |Separate plugin |Catches drift within each project |
### Interface Contracts
Each plugin declares what it produces and accepts:
**data-platform outputs:**
- `data_ref`: In-memory DataFrame reference
- `query_result`: Row set from postgres-mcp
- `model_output`: Materialized table reference from dbt-mcp
- `schema_snapshot`: Full schema state for documentation
**viz-platform inputs:**
- Accepts `data_ref`, `query_result`, or `model_output` as data source
- Validates all DMC components against dmc-mcp before rendering
### Cross-Plugin Agents (defined in Claude.md)
|Agent |Trigger |Sequence |
|--------------------|-----------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------|
|`dashboard_builder` |User requests visualization of database content|postgres-mcp.execute_query → pandas-mcp.pivot (if needed) → dmc-mcp.validate_component → dash-mcp.chart_create → dash-mcp.layout_create|
|`visualization_prep`|Query result needs reshaping |postgres-mcp.execute_query → pandas-mcp.reshape → dash-mcp.chart_create |
### Validation: contract-validator
Separate plugin for cross-plugin validation. See **Plugin: contract-validator** section for full specification.
**Key distinction from doc-guardian:**
- doc-guardian: “did code change break docs?” (within a project)
- contract-validator: “do plugins work together?” (across plugins)
-----
## Plugin: contract-validator
### Purpose
Validates cross-plugin compatibility and Claude.md agent definitions. Ensures plugins can actually work together before runtime failures occur.
**Problem solved:** Plugins declare interfaces in README. Claude.md references tools across plugins. Without validation:
- Agents reference tools that don't exist
- viz-platform expects input format data-platform doesn't produce
- Plugin updates break workflows silently
-----
### What It Reads
|Source |Purpose |
|------------------|----------------------------------------------------|
|Plugin README.md |Extract declared inputs/outputs |
|Claude.md |Extract agent definitions and tool references |
|MCP server schemas|Verify tools actually exist with expected signatures|
-----
### Tool Categories
|Category|Tools |Description |
|--------|---------------------------------------------------------------------|---------------------------------|
|Parse |`parse_plugin_interface`, `parse_claude_md_agents` |Extract structured data from docs|
|Validate|`validate_compatibility`, `validate_agent_refs`, `validate_data_flow`|Check contracts match |
|Report |`generate_compatibility_report`, `list_issues` |Output findings |
#### Tool Details
**`parse_plugin_interface`**
- Input: Plugin path or README content
- Output: Structured interface (inputs accepted, outputs produced, tool names)
**`parse_claude_md_agents`**
- Input: Claude.md path or content
- Output: List of agents with their tool sequences
**`validate_compatibility`**
- Input: Two plugin interfaces
- Output: Compatibility report (what A produces that B accepts, gaps)
**`validate_agent_refs`**
- Input: Agent definition, list of available plugins
- Output: Missing tools, invalid sequences
**`validate_data_flow`**
- Input: Agent sequence
- Output: Verification that each step's output matches the next step's expected input
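Two of these checks reduce to set operations, sketched below under the assumption that interfaces have already been parsed into plain Python collections. The return shapes are illustrative and not a defined schema for the tools.

```python
def validate_compatibility(produces: set[str], accepts: set[str]) -> dict:
    """Compare what plugin A produces with what plugin B accepts."""
    accepted = produces & accepts
    return {
        "compatible": bool(accepted),
        "accepted": sorted(accepted),
        "unconsumed": sorted(produces - accepts),  # Produced but never accepted.
    }


def validate_agent_refs(sequence: list[str], available: dict[str, set[str]]) -> list[str]:
    """Return 'server.tool' references that do not resolve to a known tool."""
    missing: list[str] = []
    for ref in sequence:
        server, _, tool = ref.partition(".")
        if tool not in available.get(server, set()):
            missing.append(ref)
    return missing


report = validate_compatibility(
    {"data_ref", "query_result", "model_output", "schema_snapshot"},
    {"data_ref", "query_result", "model_output"},
)
missing = validate_agent_refs(
    ["postgres-mcp.execute_query", "pandas-mcp.transform", "dash-mcp.chart_create"],
    {
        "postgres-mcp": {"execute_query"},
        "pandas-mcp": {"reshape", "pivot"},
        "dash-mcp": {"chart_create"},
    },
)
```

Here `schema_snapshot` shows up as unconsumed rather than as an error, and `pandas-mcp.transform` is flagged as a missing tool.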
-----
### Agents (contract-validator)
|Agent |Trigger |Sequence |
|-----------------|-------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|
|`full_validation`|User runs `/validate-contracts`|parse all plugin interfaces → parse Claude.md → validate_compatibility for each pair → validate_agent_refs for each agent → generate_compatibility_report|
|`agent_check` |User runs `/check-agent {name}`|parse_claude_md_agents → find agent → validate_agent_refs → validate_data_flow → report issues |
-----
### Commands
|Command |Maps To |Description |
|---------------------|----------------------------------------|--------------------------------------|
|`/validate-contracts`|`full_validation` agent |Full project validation |
|`/check-agent {name}`|`agent_check` agent |Validate single agent definition |
|`/list-interfaces` |`parse_plugin_interface` for all plugins|Show what each plugin produces/accepts|
-----
### Output Format
**Compatibility Report:**
```
## Contract Validation Report
### Plugin Interfaces
- data-platform: produces [data_ref, query_result, model_output, schema_snapshot]
- viz-platform: accepts [data_ref, query_result, model_output]
### Compatibility Matrix
| Producer | Consumer | Status | Notes |
|----------|----------|--------|-------|
| data-platform | viz-platform | ✓ Compatible | data_ref, query_result, model_output accepted |
### Agent Validation
| Agent | Status | Issues |
|-------|--------|--------|
| dashboard_builder | ✓ Valid | — |
| model_analysis | ⚠ Warning | dbt-mcp optional; agent is skipped if not loaded |
### Issues Found
- None
### Warnings
- Agent `model_analysis` depends on optional server `dbt-mcp`
```
**Issue Types:**
|Type |Severity|Example |
|-------------------|--------|------------------------------------------------------------------------|
|Missing tool |Error |Agent references `pandas-mcp.transform` but tool is `pandas-mcp.reshape`|
|Interface mismatch |Error |viz-platform expects `chart_data` but data-platform produces `data_ref` |
|Optional dependency|Warning |Agent uses dbt-mcp which may not be loaded |
|Undeclared output |Warning |Plugin produces output not listed in README |
-----
### Integration with doc-guardian
**Separation of concerns:**
|Plugin |Scope |Trigger |
|------------------|----------------------------------|------------------------|
|doc-guardian |Code ↔ docs drift within a project|PostToolUse (Write/Edit)|
|contract-validator|Plugin ↔ plugin compatibility |On-demand or CI hook |
contract-validator does NOT watch for file changes. It runs on demand or as a CI step.
**Potential future integration:** doc-guardian could trigger contract-validator when Claude.md or plugin README changes. Not required for v1.
-----
## Diagramming Approach
No diagram-mcp server. Use existing Mermaid Chart MCP.
**For ERDs:**
- postgres-mcp exposes schema metadata via `get_schema_snapshot`
- Claude generates Mermaid syntax
- Mermaid Chart MCP renders
**For dbt lineage:**
- dbt-mcp.get_lineage outputs Mermaid-formatted DAG
- Mermaid Chart MCP renders
This avoids the complexity of draw.io XML generation while maintaining documentation capability.
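For the ERD path, the transformation from schema metadata to Mermaid text might look like the sketch below. The snapshot shape (table name mapped to `(column, type)` pairs) is an assumption, since the output format of `get_schema_snapshot` is not specified here.

```python
def snapshot_to_erd(snapshot: dict[str, list[tuple[str, str]]]) -> str:
    """Render a table -> [(column, type)] mapping as a Mermaid erDiagram."""
    lines = ["erDiagram"]
    for table, columns in snapshot.items():
        lines.append(f"    {table} {{")
        for name, sql_type in columns:
            # Mermaid entity attributes are written as "<type> <name>".
            lines.append(f"        {sql_type} {name}")
        lines.append("    }")
    return "\n".join(lines)


# Hypothetical snapshot contents, for illustration only.
erd = snapshot_to_erd({"customer_orders": [("id", "integer"), ("customer_id", "integer")]})
```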
-----
## Implementation Order
|Phase|Plugin |Server |Rationale |
|-----|------------------|------------|-----------------------------------------------|
|1 |data-platform |pandas-mcp |Entry point, no dependencies |
|2 |data-platform |postgres-mcp|Load from Phase 1, query capabilities |
|3 |data-platform |dbt-mcp |Transform layer, requires postgres-mcp |
|4 |viz-platform |dmc-mcp |Constraint layer, no dependencies |
|5 |viz-platform |dash-mcp |Visualization, validates against dmc-mcp |
|6 |contract-validator|— |Validates all above, requires stable interfaces|
**Notes:**
- Phases 1-3 (data-platform) and 4-5 (viz-platform) can proceed in parallel
- contract-validator (Phase 6) should wait until plugin interfaces stabilize
- doc-guardian already exists; update scope documentation only
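The ordering and its parallelism can be checked mechanically with a topological sort. The dependency map below is read off the Rationale column and is an interpretation of it; `build_waves` is a hypothetical helper built on the standard-library `graphlib`.

```python
from graphlib import TopologicalSorter

# component -> components it depends on (interpreted from the table above).
DEPENDS_ON = {
    "pandas-mcp": set(),
    "postgres-mcp": {"pandas-mcp"},
    "dbt-mcp": {"postgres-mcp"},
    "dmc-mcp": set(),
    "dash-mcp": {"dmc-mcp"},
    "contract-validator": {"pandas-mcp", "postgres-mcp", "dbt-mcp", "dmc-mcp", "dash-mcp"},
}


def build_waves(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group components into waves that can be built in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = build_waves(DEPENDS_ON)
```

Each wave contains components whose dependencies are already complete, so the data-platform and viz-platform tracks advance side by side and contract-validator lands last.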
-----
## Open Questions
### Data Reference Passing
How do servers share `data_ref` objects? Options:
- **Temporary files with URIs**: Portable but I/O overhead
- **Arrow IPC**: Efficient but requires both servers to support it
- **Recommendation**: Arrow IPC for efficiency, file fallback for compatibility
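The file fallback can be sketched with the standard library alone: one server writes rows to a temporary file and returns a `file://` URI as the `data_ref`, and another resolves it. CSV is used here only to keep the example dependency-free. The Arrow IPC path is not shown, and `publish_rows` and `resolve_rows` are hypothetical names.

```python
import csv
import tempfile
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def publish_rows(rows: list[dict]) -> str:
    """Write non-empty rows to a temp CSV and return a file:// URI as the data_ref."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".csv", newline="", delete=False, encoding="utf-8"
    ) as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    return Path(handle.name).as_uri()


def resolve_rows(data_ref: str) -> list[dict]:
    """Read rows back from a file:// data_ref; values come back as strings."""
    path = Path(url2pathname(urlparse(data_ref).path))
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

The round trip loses type information (every value returns as a string), which is one concrete cost of the file fallback compared with a typed format.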
### Authentication
Should postgres-mcp handle connection strings directly, or use a secrets manager pattern?
### Theme Storage
Where do custom themes persist?
- Local config file (`~/.dash-mcp/themes/`)
- Project-level (alongside dbt_project.yml)
- Database table (for shared team themes)
### dbt Project Discovery
Auto-detect `dbt_project.yml` in common locations, or require explicit path?
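One way to combine both options is to honor an explicit path when given and otherwise walk up from the working directory. This is a sketch of that compromise, not a decision; `find_dbt_project` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def find_dbt_project(start: Path, explicit: Optional[Path] = None) -> Optional[Path]:
    """Return the directory containing dbt_project.yml, or None if not found."""
    if explicit is not None:
        # An explicit path is never second-guessed with a search.
        return explicit if (explicit / "dbt_project.yml").is_file() else None
    for directory in [start, *start.parents]:
        if (directory / "dbt_project.yml").is_file():
            return directory
    return None
```

Walking upward finds the nearest enclosing project first, which matters when a repository contains more than one dbt project.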
-----
## Technology Stack
|Layer |Technology |Notes |
|---------------|-----------------------|---------------------------|
|MCP Framework |FastMCP |Or manual MCP SDK |
|Python |3.11+ |Type hints, async support |
|Data Processing|pandas |Core DataFrame ops |
|Arrow |pyarrow |Parquet, efficient memory |
|Database |psycopg |Async-ready Postgres driver|
|Geospatial |geoalchemy2 |PostGIS integration |
|dbt |dbt-core |CLI wrapper |
|Visualization |plotly |Figure generation |
|UI Components |dash-mantine-components|Version-locked via dmc-mcp |
-----
## Summary
### Core Plugins
|Plugin |Servers/Scope |Key Characteristic |
|------------------|---------------------------------------|------------------------------------------------------|
|data-platform |pandas-mcp, postgres-mcp, dbt-mcp |Optional server loading per project |
|viz-platform |dmc-mcp, dash-mcp |dmc-mcp validates before dash-mcp renders |
|contract-validator|Interface parsing, compatibility checks|Validates cross-plugin contracts and agent definitions|
### Supporting Plugins (Existing)
|Plugin |Purpose |
|-----------------|------------------------------------|
|doc-guardian |Code-to-docs drift (unchanged scope)|
|Mermaid Chart MCP|Diagram rendering |
### Interaction Model
```
Plugin READMEs → declare inputs/outputs
Claude.md → define cross-plugin agents
contract-validator → validate compatibility
doc-guardian → catch drift within projects
```
**Flow:** Plugins declare interfaces. Claude.md defines workflows. contract-validator enforces compatibility. doc-guardian handles internal drift.