Add single-line visual headers to 66 command files across 10 plugins:
- clarity-assist (2 commands): 💬
- claude-config-maintainer (5 commands): ⚙️
- cmdb-assistant (11 commands): 🖥️
- code-sentinel (3 commands): 🔒
- contract-validator (5 commands): ✅
- data-platform (10 commands): 📊
- doc-guardian (5 commands): 📝
- git-flow (8 commands): 🔀
- pr-review (7 commands): 🔍
- viz-platform (10 commands): 🎨
Each command now displays a consistent header at execution start:
┌────────────────────────────────────────────────────────────────┐
│ [icon] PLUGIN-NAME · Command Description │
└────────────────────────────────────────────────────────────────┘
Addresses #275 (other plugin commands visual output)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
3.3 KiB
/data-quality - Data Quality Assessment
Visual Output
When executing this command, display the plugin header:
┌──────────────────────────────────────────────────────────────────┐
│ 📊 DATA-PLATFORM · Data Quality │
└──────────────────────────────────────────────────────────────────┘
Then proceed with the assessment.
Comprehensive data quality check for DataFrames with pass/warn/fail scoring.
Usage
/data-quality <data_ref> [--strict]
Workflow
- Get data reference:
  - If no data_ref provided, use list_data to show available options
  - Validate the data_ref exists
- Null analysis:
  - Calculate null percentage per column
  - PASS: < 5% nulls
  - WARN: 5-20% nulls
  - FAIL: > 20% nulls
- Duplicate detection:
  - Check for fully duplicated rows
  - PASS: 0% duplicates
  - WARN: < 1% duplicates
  - FAIL: >= 1% duplicates
- Type consistency:
  - Identify mixed-type columns (object columns with mixed content)
  - Flag columns that could be numeric but contain strings
  - PASS: All columns have consistent types
  - FAIL: Mixed types detected
- Outlier detection (numeric columns):
  - Use IQR method (values beyond 1.5 * IQR)
  - Report percentage of outliers per column
  - PASS: < 1% outliers
  - WARN: 1-5% outliers
  - FAIL: > 5% outliers
- Generate quality report:
  - Overall quality score (0-100)
  - Per-column breakdown
  - Recommendations for remediation
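The thresholds in the workflow above can be sketched in plain Python. This is an illustrative sketch only, not the plugin's implementation: the function names (`null_status`, `duplicate_status`, `type_status`, `outlier_status`, `iqr_outlier_pct`) are hypothetical, the strict-mode boundaries are assumed to use the same comparisons as the defaults, and "mixed types" is approximated as more than one Python type among the non-null values. The IQR fences are taken as Q1 - 1.5 * IQR and Q3 + 1.5 * IQR, with quartiles computed by the standard library's inclusive method.

```python
from statistics import quantiles


def null_status(null_pct: float, strict: bool = False) -> str:
    """PASS < 5%, WARN 5-20%, FAIL > 20% (assumed 1% / 5% with --strict)."""
    warn_at, fail_above = (1.0, 5.0) if strict else (5.0, 20.0)
    if null_pct < warn_at:
        return "PASS"
    if null_pct <= fail_above:
        return "WARN"
    return "FAIL"


def duplicate_status(dup_pct: float) -> str:
    """PASS at 0%, WARN below 1%, FAIL at 1% or more."""
    if dup_pct == 0:
        return "PASS"
    if dup_pct < 1.0:
        return "WARN"
    return "FAIL"


def type_status(values: list) -> str:
    """Assumption: a column is 'mixed' if its non-null values have more than one type."""
    kinds = {type(v) for v in values if v is not None}
    return "PASS" if len(kinds) <= 1 else "FAIL"


def outlier_status(outlier_pct: float) -> str:
    """PASS < 1%, WARN 1-5%, FAIL > 5%."""
    if outlier_pct < 1.0:
        return "PASS"
    if outlier_pct <= 5.0:
        return "WARN"
    return "FAIL"


def iqr_outlier_pct(values: list) -> float:
    """Percentage of values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = sum(1 for v in values if v < low or v > high)
    return 100.0 * outliers / len(values)
```

For example, `null_status(15.2)` returns `"WARN"` under the default thresholds, while `null_status(2.3, strict=True)` also returns `"WARN"` because strict mode lowers the warning boundary to 1%.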
Report Format
=== Data Quality Report ===
Dataset: sales_data
Rows: 10,000 | Columns: 15
Overall Score: 82/100 [PASS]
--- Column Analysis ---
| Column | Nulls | Dups | Type | Outliers | Status |
|--------------|-------|------|----------|----------|--------|
| customer_id | 0.0% | - | int64 | 0.2% | PASS |
| email | 2.3% | - | object | - | PASS |
| amount | 15.2% | - | float64 | 3.1% | WARN |
| created_at | 0.0% | - | datetime | - | PASS |
--- Issues Found ---
[WARN] Column 'amount': 15.2% null values (threshold: 5%)
[WARN] Column 'amount': 3.1% outliers detected
[FAIL] 1.2% duplicate rows detected (120 rows)
--- Recommendations ---
1. Investigate null values in 'amount' column
2. Review outliers in 'amount' - may be data entry errors
3. Remove the 120 duplicate rows
Options
| Flag | Description |
|---|---|
| --strict | Use stricter thresholds (WARN at 1% nulls, FAIL at 5%) |
Examples
/data-quality sales_data
/data-quality df_customers --strict
Scoring
| Component | Weight | Scoring |
|---|---|---|
| Nulls | 30% | 100 - (avg_null_pct * 2) |
| Duplicates | 20% | 100 - (dup_pct * 50) |
| Type consistency | 25% | 100 if clean, 0 if mixed |
| Outliers | 25% | 100 - (avg_outlier_pct * 10) |
Final score: weighted average of the component scores, clamped to the range 0-100
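As a worked illustration of the scoring table, the weighted average can be sketched as below. The function name `quality_score` and its parameters are hypothetical, and the sketch assumes only the final score is clamped (component scores may go negative before weighting); the source does not say whether components are clamped individually.

```python
def quality_score(avg_null_pct: float, dup_pct: float,
                  types_clean: bool, avg_outlier_pct: float) -> float:
    """Weighted average of the four component scores, clamped to 0-100."""
    # (weight, component score) pairs taken from the Scoring table.
    components = [
        (0.30, 100 - avg_null_pct * 2),       # Nulls
        (0.20, 100 - dup_pct * 50),           # Duplicates
        (0.25, 100 if types_clean else 0),    # Type consistency
        (0.25, 100 - avg_outlier_pct * 10),   # Outliers
    ]
    weighted = sum(weight * score for weight, score in components)
    # Assumption: clamp only the final result, not each component.
    return max(0.0, min(100.0, weighted))


# Hypothetical inputs: 5% average nulls, 1.2% duplicates, clean types, 1% average outliers.
score = quality_score(avg_null_pct=5.0, dup_pct=1.2, types_clean=True, avg_outlier_pct=1.0)
print(round(score, 1))
```

With these inputs the components are 90, 40, 100, and 90, so the weighted sum is 27 + 8 + 25 + 22.5 = 82.5. Note how steeply duplicates are penalized: at 2% duplicates the component score already reaches 0.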
Available Tools
Use these MCP tools:
- describe - Get statistical summary (for outlier detection)
- head - Preview data
- list_data - List available DataFrames