feat(plugins): implement Sprint 4 commands (#241-#258)

Sprint 4 - Plugin Commands implementation adding 18 new user-facing commands across 8 plugins as part of V5.2.0 Plugin Enhancements.

**projman:**
- #241: /sprint-diagram - Mermaid visualization of sprint issues

**pr-review:**
- #242: Confidence threshold config (PR_REVIEW_CONFIDENCE_THRESHOLD)
- #243: /pr-diff - Formatted diff with inline review comments

**data-platform:**
- #244: /data-quality - DataFrame quality checks (nulls, duplicates, outliers)
- #245: /lineage-viz - dbt lineage as Mermaid diagrams
- #246: /dbt-test - Formatted dbt test runner

**viz-platform:**
- #247: /chart-export - Export charts to PNG/SVG/PDF via kaleido
- #248: /accessibility-check - Color blind validation (WCAG contrast)
- #249: /breakpoints - Responsive layout configuration

**contract-validator:**
- #250: /dependency-graph - Plugin dependency visualization

**doc-guardian:**
- #251: /changelog-gen - Generate changelog from conventional commits
- #252: /doc-coverage - Documentation coverage metrics
- #253: /stale-docs - Flag outdated documentation

**claude-config-maintainer:**
- #254: /config-diff - Track CLAUDE.md changes over time
- #255: /config-lint - 31 lint rules for CLAUDE.md best practices

**cmdb-assistant:**
- #256: /cmdb-topology - Infrastructure topology diagrams
- #257: /change-audit - NetBox audit trail queries
- #258: /ip-conflicts - Detect IP conflicts and overlaps

Closes #241, #242, #243, #244, #245, #246, #247, #248, #249, #250, #251, #252, #253, #254, #255, #256, #257, #258

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 12:02:26 -05:00
parent 8a957b1b69
commit 9698e8724d
36 changed files with 4295 additions and 22 deletions
--- a/plugins/data-platform/commands/data-quality.md
+++ b/plugins/data-platform/commands/data-quality.md
@@ -0,0 +1,103 @@
+# /data-quality - Data Quality Assessment
+
+Comprehensive data quality check for DataFrames with pass/warn/fail scoring.
+
+## Usage
+
+```
+/data-quality <data_ref> [--strict]
+```
+
+## Workflow
+
+1. **Get data reference**:
+   - If no data_ref provided, use `list_data` to show available options
+   - Validate the data_ref exists
+
+2. **Null analysis**:
+   - Calculate null percentage per column
+   - **PASS**: < 5% nulls
+   - **WARN**: 5-20% nulls
+   - **FAIL**: > 20% nulls
+
+3. **Duplicate detection**:
+   - Check for fully duplicated rows
+   - **PASS**: 0% duplicates
+   - **WARN**: > 0% and < 1% duplicates
+   - **FAIL**: >= 1% duplicates
+
+4. **Type consistency**:
+   - Identify mixed-type columns (object columns with mixed content)
+   - Flag columns that could be numeric but contain strings
+   - **PASS**: All columns have consistent types
+   - **FAIL**: Mixed types detected
+
+5. **Outlier detection** (numeric columns):
+   - Use IQR method (values below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR)
+   - Report percentage of outliers per column
+   - **PASS**: < 1% outliers
+   - **WARN**: 1-5% outliers
+   - **FAIL**: > 5% outliers
+
+6. **Generate quality report**:
+   - Overall quality score (0-100)
+   - Per-column breakdown
+   - Recommendations for remediation
+
+## Report Format
+
+```
+=== Data Quality Report ===
+Dataset: sales_data
+Rows: 10,000 | Columns: 15
+Overall Score: 82/100 [PASS]
+
+--- Column Analysis ---
+| Column       | Nulls | Dups | Type     | Outliers | Status |
+|--------------|-------|------|----------|----------|--------|
+| customer_id  | 0.0%  | -    | int64    | 0.2%     | PASS   |
+| email        | 2.3%  | -    | object   | -        | PASS   |
+| amount       | 15.2% | -    | float64  | 3.1%     | WARN   |
+| created_at   | 0.0%  | -    | datetime | -        | PASS   |
+
+--- Issues Found ---
+[WARN] Column 'amount': 15.2% null values (threshold: 5%)
+[WARN] Column 'amount': 3.1% outliers detected
+[FAIL] 1.2% duplicate rows detected (120 rows)
+
+--- Recommendations ---
+1. Investigate null values in 'amount' column
+2. Review outliers in 'amount' - may be data entry errors
+3. Remove or deduplicate 120 duplicate rows
+```
+
+## Options
+
+| Flag | Description |
+|------|-------------|
+| `--strict` | Use stricter thresholds (WARN at 1% nulls, FAIL at 5%) |
+
+## Examples
+
+```
+/data-quality sales_data
+/data-quality df_customers --strict
+```
+
+## Scoring
+
+| Component | Weight | Scoring |
+|-----------|--------|---------|
+| Nulls | 30% | 100 - (avg_null_pct * 2) |
+| Duplicates | 20% | 100 - (dup_pct * 50) |
+| Type consistency | 25% | 100 if clean, 0 if mixed |
+| Outliers | 25% | 100 - (avg_outlier_pct * 10) |
+
+Final score: weighted average of the component scores, clamped to the 0-100 range
+
+## Available Tools
+
+Use these MCP tools:
+- `describe` - Get statistical summary (for outlier detection)
+- `head` - Preview data
+- `list_data` - List available DataFrames
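
The thresholds and scoring table in `data-quality.md` can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the plugin's implementation: every function name here is hypothetical, the `--strict` boundaries are assumed to mirror the shape of the default ones (PASS below the WARN threshold, FAIL above the FAIL threshold), each component score is assumed to be clamped to 0-100 before weighting, and the quartile convention (`method="inclusive"`) is a choice the document does not specify.

```python
from statistics import quantiles


def clamp(value, low=0.0, high=100.0):
    """Restrict a score to the [low, high] range."""
    return max(low, min(high, value))


def null_status(null_pct, strict=False):
    """PASS/WARN/FAIL for a column's null percentage.

    Default: PASS < 5, WARN 5-20, FAIL > 20.
    Strict (assumed boundaries): PASS < 1, WARN 1-5, FAIL > 5.
    """
    warn_at, fail_above = (1.0, 5.0) if strict else (5.0, 20.0)
    if null_pct < warn_at:
        return "PASS"
    return "FAIL" if null_pct > fail_above else "WARN"


def duplicate_status(dup_pct):
    """PASS at 0% duplicates, WARN below 1%, FAIL at 1% or more."""
    if dup_pct == 0:
        return "PASS"
    return "WARN" if dup_pct < 1.0 else "FAIL"


def outlier_status(outlier_pct):
    """PASS < 1% outliers, WARN 1-5%, FAIL > 5%."""
    if outlier_pct < 1.0:
        return "PASS"
    return "FAIL" if outlier_pct > 5.0 else "WARN"


def outlier_pct(values):
    """Percentage of values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    if len(values) < 2:
        return 0.0  # quartiles are undefined for fewer than two points
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = sum(1 for v in values if v < low or v > high)
    return 100.0 * outliers / len(values)


def quality_score(avg_null_pct, dup_pct, types_clean, avg_outlier_pct):
    """Weighted average of the four component scores from the Scoring table."""
    components = [
        (0.30, 100 - avg_null_pct * 2),          # Nulls
        (0.20, 100 - dup_pct * 50),              # Duplicates
        (0.25, 100.0 if types_clean else 0.0),   # Type consistency
        (0.25, 100 - avg_outlier_pct * 10),      # Outliers
    ]
    # Clamping each component is an assumption; the doc only bounds the final score.
    return clamp(sum(weight * clamp(score) for weight, score in components))
```

With these definitions, the `amount` column from the sample report (15.2% nulls, 3.1% outliers) classifies as WARN on both checks, and the 1.2% duplicate rate classifies as FAIL, matching the "Issues Found" section.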