refactor: extract skills from commands across 8 plugins
Refactored commands to extract reusable skills following the Commands → Skills separation pattern. Each command is now <50 lines and references skill files for detailed knowledge.

Plugins refactored:
- claude-config-maintainer: 5 commands → 7 skills
- code-sentinel: 3 commands → 2 skills
- contract-validator: 5 commands → 6 skills
- data-platform: 10 commands → 6 skills
- doc-guardian: 5 commands → 6 skills (replaced nested dir)
- git-flow: 8 commands → 7 skills

Skills contain: workflows, validation rules, conventions, reference data, tool documentation.
Commands now contain: YAML frontmatter, agent assignment, skills list, brief workflow steps, parameters.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

72
plugins/data-platform/skills/data-profiling.md
Normal file
@@ -0,0 +1,72 @@

# Data Profiling

## Profiling Workflow

1. **Get data reference** via `list_data`
2. **Generate statistics** via `describe`
3. **Analyze quality** (nulls, duplicates, types, outliers)
4. **Calculate score** and generate report

## Quality Checks

### Null Analysis
- Calculate null percentage per column
- **PASS**: < 5% nulls
- **WARN**: 5-20% nulls
- **FAIL**: > 20% nulls

### Duplicate Detection
- Check for fully duplicated rows
- **PASS**: 0% duplicates
- **WARN**: < 1% duplicates
- **FAIL**: >= 1% duplicates

### Type Consistency
- Identify mixed-type columns
- Flag numeric columns with string values
- **PASS**: Consistent types
- **FAIL**: Mixed types detected

### Outlier Detection (IQR Method)
- Calculate Q1, Q3, IQR = Q3 - Q1
- Outliers: values < Q1 - 1.5*IQR or > Q3 + 1.5*IQR
- **PASS**: < 1% outliers
- **WARN**: 1-5% outliers
- **FAIL**: > 5% outliers
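
A minimal pandas sketch of these checks (a hypothetical helper, not the plugin's actual implementation; thresholds mirror the defaults above):

```python
import pandas as pd

def check_quality(df: pd.DataFrame) -> dict:
    """Apply the null, duplicate, and IQR outlier checks described above."""
    results = {"outliers": {}}

    # Null analysis: null percentage per column
    null_pct = df.isna().mean() * 100
    results["nulls"] = {
        col: "PASS" if p < 5 else ("WARN" if p <= 20 else "FAIL")
        for col, p in null_pct.items()
    }

    # Duplicate detection: share of fully duplicated rows
    dup_pct = df.duplicated().mean() * 100
    results["duplicates"] = (
        "PASS" if dup_pct == 0 else ("WARN" if dup_pct < 1 else "FAIL")
    )

    # Outlier detection (IQR method) on numeric columns only
    for col in df.select_dtypes("number").columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        outside = (df[col] < q1 - 1.5 * iqr) | (df[col] > q3 + 1.5 * iqr)
        pct = outside.mean() * 100
        results["outliers"][col] = (
            "PASS" if pct < 1 else ("WARN" if pct <= 5 else "FAIL")
        )
    return results
```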

## Quality Scoring

| Component | Weight | Formula |
|-----------|--------|---------|
| Nulls | 30% | 100 - (avg_null_pct * 2) |
| Duplicates | 20% | 100 - (dup_pct * 50) |
| Type consistency | 25% | 100 if clean, 0 if mixed |
| Outliers | 25% | 100 - (avg_outlier_pct * 10) |

Final score: weighted average, clamped to 0-100
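
Worked through the table, a sketch of the scoring arithmetic (function and argument names are illustrative):

```python
def _clamp(x: float) -> float:
    return max(0.0, min(100.0, x))

def quality_score(avg_null_pct: float, dup_pct: float,
                  types_clean: bool, avg_outlier_pct: float) -> float:
    """Weighted quality score per the table above, components clamped to 0-100."""
    components = [
        (0.30, _clamp(100 - avg_null_pct * 2)),      # nulls
        (0.20, _clamp(100 - dup_pct * 50)),          # duplicates
        (0.25, 100.0 if types_clean else 0.0),       # type consistency
        (0.25, _clamp(100 - avg_outlier_pct * 10)),  # outliers
    ]
    return sum(weight * score for weight, score in components)

quality_score(3, 0.5, True, 2)  # -> 88.2
```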

## Report Format

```
=== Data Quality Report ===
Dataset: [data_ref]
Rows: X | Columns: Y
Overall Score: XX/100 [PASS/WARN/FAIL]

--- Column Analysis ---
| Column | Nulls | Dups | Type | Outliers | Status |
|--------|-------|------|------|----------|--------|
| col1   | X.X%  | -    | type | X.X%     | PASS   |

--- Issues Found ---
[WARN/FAIL] Column 'X': Issue description

--- Recommendations ---
1. Suggested remediation steps
```

## Strict Mode

With `--strict` flag:
- **WARN** at 1% nulls (vs 5%)
- **FAIL** at 5% nulls (vs 20%)

85
plugins/data-platform/skills/dbt-workflow.md
Normal file
@@ -0,0 +1,85 @@

# dbt Workflow

## Pre-Validation (MANDATORY)

**Always run `dbt_parse` before any dbt operation.**

This validates:
- dbt_project.yml syntax
- Model SQL syntax
- schema.yml definitions
- Deprecated syntax (dbt 1.9+)

If validation fails, show errors and STOP.

## Model Selection Syntax

| Pattern | Meaning |
|---------|---------|
| `model_name` | Single model |
| `+model_name` | Model and upstream dependencies |
| `model_name+` | Model and downstream dependents |
| `+model_name+` | Model with all dependencies |
| `tag:name` | Models with specific tag |
| `path:models/staging` | Models in path |
| `test_type:schema` | Schema tests only |
| `test_type:data` | Data tests only |

## Execution Workflow

1. **Parse**: `dbt_parse` - Validate project
2. **Run**: `dbt_run` - Execute models
3. **Test**: `dbt_test` - Run tests
4. **Build**: `dbt_build` - Run + test together

## Test Types

### Schema Tests
Defined in `schema.yml`:
- `unique` - No duplicate values
- `not_null` - No null values
- `accepted_values` - Value in allowed list
- `relationships` - Foreign key integrity
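
Hedged pandas equivalents of the four built-ins, for intuition only (dbt runs these as SQL in the warehouse; frames and columns here are illustrative):

```python
import pandas as pd

orders = pd.DataFrame({"order_id": [1, 2, 3],
                       "customer_id": [10, 20, 10],
                       "status": ["open", "closed", "open"]})
customers = pd.DataFrame({"customer_id": [10, 20]})

assert orders["order_id"].is_unique                                 # unique
assert orders["order_id"].notna().all()                             # not_null
assert orders["status"].isin(["open", "closed"]).all()              # accepted_values
assert orders["customer_id"].isin(customers["customer_id"]).all()   # relationships
```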

### Data Tests
Custom SQL in `tests/` directory:
- Return rows that fail assertion
- Zero rows = pass, any rows = fail
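
The same zero-rows convention, sketched in pandas rather than SQL (illustrative only):

```python
import pandas as pd

payments = pd.DataFrame({"payment_id": [1, 2, 3], "amount": [10.0, -5.0, 8.0]})

# A data test selects the rows that violate the assertion
failing = payments[payments["amount"] < 0]

# Zero rows = pass, any rows = fail
print("PASS" if failing.empty else f"FAIL ({len(failing)} failing rows)")
```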

## Materialization Types

| Type | Description |
|------|-------------|
| `view` | Virtual table, always fresh |
| `table` | Physical table, full rebuild |
| `incremental` | Append/merge new rows only |
| `ephemeral` | CTE, no physical object |

## Exit Codes

| Code | Meaning |
|------|---------|
| 0 | Success |
| 1 | Test/run failure |
| 2 | dbt error (parse failure) |
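
A hedged orchestration sketch against the dbt CLI directly (the MCP tools presumably wrap something similar; `my_model` is a placeholder, and the exit-code mapping follows the table above):

```python
import subprocess

def run_dbt(selection: str) -> str:
    """Run `dbt build` on a selection and map the exit code to a status."""
    result = subprocess.run(
        ["dbt", "build", "--select", selection],
        capture_output=True, text=True,
    )
    statuses = {0: "success", 1: "test/run failure", 2: "dbt error (parse failure)"}
    return statuses.get(result.returncode, f"unknown exit code {result.returncode}")

# Model plus all upstream and downstream dependencies:
print(run_dbt("+my_model+"))
```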

## Result Formatting

```
=== dbt [Operation] Results ===
Project: [project_name]
Selection: [selection_pattern]

--- Summary ---
Total: X models/tests
PASS: X (%)
FAIL: X (%)
WARN: X (%)
SKIP: X (%)

--- Details ---
[Model/Test details with status]

--- Failure Details ---
[Error messages and remediation]
```

73
plugins/data-platform/skills/lineage-analysis.md
Normal file
@@ -0,0 +1,73 @@

# Lineage Analysis

## Lineage Workflow

1. **Get lineage data** via `dbt_lineage`
2. **Build dependency graph** (upstream + downstream)
3. **Visualize** (ASCII tree or Mermaid)
4. **Report** critical path and refresh implications

## ASCII Tree Format

```
Sources:
|-- raw_customers (source)
|-- raw_orders (source)

model_name (materialization)
|-- upstream:
|   |-- stg_model (view)
|   |-- raw_source (source)
|-- downstream:
    |-- fct_model (incremental)
    |-- rpt_model (table)
```

## Mermaid Diagram Format

```mermaid
flowchart LR
    subgraph Sources
        raw_data[(raw_data)]
    end

    subgraph Staging
        stg_model[stg_model]
    end

    subgraph Marts
        dim_model{{dim_model}}
    end

    raw_data --> stg_model
    stg_model --> dim_model
```

## Mermaid Node Shapes

| Materialization | Shape | Syntax |
|-----------------|-------|--------|
| source | Cylinder | `[(name)]` |
| view | Rectangle | `[name]` |
| table | Hexagon | `{{name}}` |
| incremental | Hexagon | `{{name}}` |
| ephemeral | Parallelogram | `[/name/]` |
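
A sketch of generating the flowchart from lineage edges (the edge-list shape here is an assumption, not `dbt_lineage`'s actual return format; node shapes follow the table above):

```python
def node(name: str, kind: str) -> str:
    """Render one Mermaid node with the shape for its materialization."""
    if kind == "source":
        return f"{name}[({name})]"
    if kind in ("table", "incremental"):
        return f"{name}{{{{{name}}}}}"   # renders as {{name}}, a hexagon
    if kind == "ephemeral":
        return f"{name}[/{name}/]"
    return f"{name}[{name}]"             # view and default: rectangle

def to_mermaid(target: str, edges: list[tuple[str, str]],
               kinds: dict[str, str]) -> str:
    """Render (parent, child) lineage edges as a Mermaid flowchart."""
    nodes = {n for edge in edges for n in edge} | {target}
    lines = ["flowchart LR"]
    lines += [f"    {node(n, kinds.get(n, 'view'))}" for n in sorted(nodes)]
    lines += [f"    {parent} --> {child}" for parent, child in edges]
    # Highlight the target model (see the styling snippet below)
    lines.append(f"    style {target} fill:#f96,stroke:#333,stroke-width:2px")
    return "\n".join(lines)

print(to_mermaid("dim_model",
                 [("raw_data", "stg_model"), ("stg_model", "dim_model")],
                 {"raw_data": "source", "stg_model": "view", "dim_model": "table"}))
```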

## Mermaid Options

| Flag | Description |
|------|-------------|
| `--direction TB` | Top-to-bottom (default: LR) |
| `--depth N` | Limit lineage depth |

## Styling Target Model

```mermaid
style target_model fill:#f96,stroke:#333,stroke-width:2px
```

## Usage Tips

1. **Documentation**: Copy Mermaid to README.md
2. **GitHub/GitLab**: Both render Mermaid natively
3. **Live Editor**: https://mermaid.live for interactive editing

69
plugins/data-platform/skills/mcp-tools-reference.md
Normal file
@@ -0,0 +1,69 @@

# MCP Tools Reference

## pandas Tools

| Tool | Description |
|------|-------------|
| `read_csv` | Load CSV file into DataFrame |
| `read_parquet` | Load Parquet file into DataFrame |
| `read_json` | Load JSON/JSONL file into DataFrame |
| `to_csv` | Export DataFrame to CSV |
| `to_parquet` | Export DataFrame to Parquet |
| `describe` | Get statistical summary (count, mean, std, min, max) |
| `head` | Preview first N rows |
| `tail` | Preview last N rows |
| `filter` | Filter rows by condition |
| `select` | Select specific columns |
| `groupby` | Aggregate data by columns |
| `join` | Join two DataFrames |
| `list_data` | List all loaded DataFrames |
| `drop_data` | Remove DataFrame from memory |

## PostgreSQL Tools

| Tool | Description |
|------|-------------|
| `pg_connect` | Establish database connection |
| `pg_query` | Execute SELECT query, return DataFrame |
| `pg_execute` | Execute INSERT/UPDATE/DELETE |
| `pg_tables` | List tables in schema |
| `pg_columns` | Get column info for table |
| `pg_schemas` | List available schemas |
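
A plausible sketch of what `pg_query` does under the hood, assuming the server pairs asyncpg with pandas (an assumption; the real implementation may differ):

```python
import asyncio
import asyncpg
import pandas as pd

async def pg_query(url: str, sql: str) -> pd.DataFrame:
    """Run a SELECT and return the rows as a DataFrame."""
    conn = await asyncpg.connect(url, timeout=5)
    try:
        rows = await conn.fetch(sql)
        # asyncpg returns Record objects; each converts cleanly to a dict
        return pd.DataFrame([dict(r) for r in rows])
    finally:
        await conn.close()

# df = asyncio.run(pg_query("postgresql://user:pass@host:5432/db",
#                           "SELECT schemaname, tablename FROM pg_tables LIMIT 5"))
```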

## PostGIS Tools

| Tool | Description |
|------|-------------|
| `st_tables` | List tables with geometry columns |
| `st_geometry_type` | Get geometry type for column |
| `st_srid` | Get SRID for geometry column |
| `st_extent` | Get bounding box for geometry |

## dbt Tools

| Tool | Description |
|------|-------------|
| `dbt_parse` | Validate project (ALWAYS RUN FIRST) |
| `dbt_run` | Execute models |
| `dbt_test` | Run tests |
| `dbt_build` | Run + test together |
| `dbt_compile` | Compile SQL without execution |
| `dbt_ls` | List dbt resources |
| `dbt_docs_generate` | Generate documentation manifest |
| `dbt_lineage` | Get model dependencies |

## Tool Selection Guidelines

**For data loading:**
- Files: `read_csv`, `read_parquet`, `read_json`
- Database: `pg_query`

**For data exploration:**
- Schema: `describe`, `pg_columns`, `st_tables`
- Preview: `head`, `tail`
- Available data: `list_data`, `pg_tables`

**For dbt operations:**
- Always start with `dbt_parse` for validation
- Use `dbt_lineage` for dependency analysis
- Use `dbt_compile` to see rendered SQL
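
In plain pandas terms, the load-then-explore flow these tools map onto looks roughly like this (path and column names are illustrative):

```python
import pandas as pd

# Load: read_csv / read_parquet / read_json equivalents
df = pd.read_csv("data/customers.csv")

# Explore: describe for statistics, head/tail for previews
print(df.describe())
print(df.head(10))

# Reshape: filter / select / groupby equivalents
active = df[df["status"] == "active"][["id", "region"]]
print(active.groupby("region").size())

# Export to Parquet for efficient re-loading
df.to_parquet("data/customers.parquet")
```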

108
plugins/data-platform/skills/setup-workflow.md
Normal file
@@ -0,0 +1,108 @@

# Setup Workflow

## Important Context

- **This workflow uses Bash, Read, Write, AskUserQuestion tools** - NOT MCP tools
- **MCP tools won't work until after setup + session restart**
- **PostgreSQL and dbt are optional** - pandas tools work without them

## Phase 1: Environment Validation

### Check Python Version
```bash
python3 --version
```
Requires Python 3.10+. If below, stop and inform the user.

## Phase 2: MCP Server Setup

### Locate MCP Server
Check both paths:
```bash
# Installed marketplace
ls -la ~/.claude/plugins/marketplaces/leo-claude-mktplace/mcp-servers/data-platform/

# Source
ls -la ~/claude-plugins-work/mcp-servers/data-platform/
```

### Check/Create Virtual Environment
```bash
# Check
ls -la /path/to/mcp-servers/data-platform/.venv/bin/python

# Create if missing
cd /path/to/mcp-servers/data-platform
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
deactivate
```

## Phase 3: PostgreSQL Configuration (Optional)

### Config Location
`~/.config/claude/postgres.env`

### Config Format
```bash
# PostgreSQL Configuration
POSTGRES_URL=postgresql://user:pass@host:5432/db
```

Set permissions: `chmod 600 ~/.config/claude/postgres.env`

### Test Connection
```bash
source ~/.config/claude/postgres.env && python3 -c "
import asyncio, asyncpg
async def test():
    conn = await asyncpg.connect('$POSTGRES_URL', timeout=5)
    ver = await conn.fetchval('SELECT version()')
    await conn.close()
    print(f'SUCCESS: {ver.split(\",\")[0]}')
asyncio.run(test())
"
```

## Phase 4: dbt Configuration (Optional)

dbt is **project-level** (auto-detected via `dbt_project.yml`).

For subdirectory projects, set in `.env`:
```
DBT_PROJECT_DIR=./transform
DBT_PROFILES_DIR=~/.dbt
```

### Check dbt Installation
```bash
dbt --version
```

## Phase 5: Validation

### Verify MCP Server
```bash
cd /path/to/mcp-servers/data-platform
.venv/bin/python -c "from mcp_server.server import DataPlatformMCPServer; print('OK')"
```

## Memory Limits

Default: 100,000 rows per DataFrame

Override in project `.env`:
```
DATA_PLATFORM_MAX_ROWS=500000
```

For larger datasets (see the sketch below):
- Use chunked processing (`chunk_size` parameter)
- Filter data before loading
- Store to Parquet for efficient re-loading
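
For example, a chunked-ingest sketch using pandas' own `chunksize` (the `chunk_size` parameter presumably maps onto something like this; path and filter column are illustrative):

```python
import pandas as pd

MAX_ROWS = 100_000  # DATA_PLATFORM_MAX_ROWS default

# Stream the file and filter each chunk so the full dataset is never in memory
chunks = [
    chunk[chunk["year"] == 2024]
    for chunk in pd.read_csv("data/big_file.csv", chunksize=50_000)
]
df = pd.concat(chunks, ignore_index=True)
assert len(df) <= MAX_ROWS, "still too large - tighten the filter"

# Store to Parquet for efficient re-loading
df.to_parquet("data/big_file_2024.parquet")
```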

## Session Restart

After setup, restart the Claude Code session for MCP tools to become available.

45
plugins/data-platform/skills/visual-header.md
Normal file
@@ -0,0 +1,45 @@

# Visual Header

## Standard Format

Display at the start of every command execution:

```
+----------------------------------------------------------------------+
| DATA-PLATFORM - [Command Name]                                        |
+----------------------------------------------------------------------+
```
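
The padding arithmetic, as a throwaway sketch (illustrative only; commands emit this header as plain text, not via code):

```python
def header(command_name: str, inner_width: int = 70) -> str:
    """Render the standard visual header box at a fixed inner width."""
    bar = "+" + "-" * inner_width + "+"
    body = f" DATA-PLATFORM - {command_name}".ljust(inner_width)
    return "\n".join([bar, f"|{body}|", bar])

print(header("Data Profile"))
```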

## Command Headers

| Command | Header Text |
|---------|-------------|
| initial-setup | Setup Wizard |
| ingest | Ingest |
| profile | Data Profile |
| schema | Schema Explorer |
| data-quality | Data Quality |
| run | dbt Run |
| dbt-test | dbt Tests |
| lineage | Lineage |
| lineage-viz | Lineage Visualization |
| explain | Model Explanation |

## Summary Box Format

For completion summaries:

```
+============================================================+
| DATA-PLATFORM [OPERATION] COMPLETE                         |
+============================================================+
| Component: [Status]                                        |
| Component: [Status]                                        |
+============================================================+
```

## Status Indicators

- Success: `[check]` or `Ready`
- Warning: `[!]` or `Partial`
- Failure: `[X]` or `Failed`