Refactored commands to extract reusable skills following the Commands → Skills separation pattern. Each command is now <50 lines and references skill files for detailed knowledge. Plugins refactored: - claude-config-maintainer: 5 commands → 7 skills - code-sentinel: 3 commands → 2 skills - contract-validator: 5 commands → 6 skills - data-platform: 10 commands → 6 skills - doc-guardian: 5 commands → 6 skills (replaced nested dir) - git-flow: 8 commands → 7 skills Skills contain: workflows, validation rules, conventions, reference data, tool documentation Commands now contain: YAML frontmatter, agent assignment, skills list, brief workflow steps, parameters Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1.7 KiB
1.7 KiB
Data Profiling
Profiling Workflow
- Get data reference via
list_data - Generate statistics via
describe - Analyze quality (nulls, duplicates, types, outliers)
- Calculate score and generate report
Quality Checks
Null Analysis
- Calculate null percentage per column
- PASS: < 5% nulls
- WARN: 5-20% nulls
- FAIL: > 20% nulls
Duplicate Detection
- Check for fully duplicated rows
- PASS: 0% duplicates
- WARN: < 1% duplicates
- FAIL: >= 1% duplicates
Type Consistency
- Identify mixed-type columns
- Flag numeric columns with string values
- PASS: Consistent types
- FAIL: Mixed types detected
Outlier Detection (IQR Method)
- Calculate Q1, Q3, IQR = Q3 - Q1
- Outliers: values < Q1 - 1.5IQR or > Q3 + 1.5IQR
- PASS: < 1% outliers
- WARN: 1-5% outliers
- FAIL: > 5% outliers
Quality Scoring
| Component | Weight | Formula |
|---|---|---|
| Nulls | 30% | 100 - (avg_null_pct * 2) |
| Duplicates | 20% | 100 - (dup_pct * 50) |
| Type consistency | 25% | 100 if clean, 0 if mixed |
| Outliers | 25% | 100 - (avg_outlier_pct * 10) |
Final score: Weighted average, capped at 0-100
Report Format
=== Data Quality Report ===
Dataset: [data_ref]
Rows: X | Columns: Y
Overall Score: XX/100 [PASS/WARN/FAIL]
--- Column Analysis ---
| Column | Nulls | Dups | Type | Outliers | Status |
|--------|-------|------|------|----------|--------|
| col1 | X.X% | - | type | X.X% | PASS |
--- Issues Found ---
[WARN/FAIL] Column 'X': Issue description
--- Recommendations ---
1. Suggested remediation steps
Strict Mode
With --strict flag:
- WARN at 1% nulls (vs 5%)
- FAIL at 5% nulls (vs 20%)