diff --git a/RFC-Perf-Sentinel-Plugin.-.md b/RFC-Perf-Sentinel-Plugin.-.md new file mode 100644 index 0000000..fa660e3 --- /dev/null +++ b/RFC-Perf-Sentinel-Plugin.-.md @@ -0,0 +1,644 @@ +# RFC: perf-sentinel Plugin + +**Status:** Proposal +**Created:** 2026-01-28 +**Author:** Claude Code Analysis +**Priority:** Medium +**Estimated Effort:** 2 sprints +**Depends On:** None (can be implemented independently) + +--- + +## Executive Summary + +Create a new plugin dedicated to performance profiling, benchmarking, and optimization of the plugin/hook ecosystem. This plugin automates the kind of manual analysis performed in RFC-Hook-Efficiency-Improvements, providing continuous monitoring and regression detection. + +--- + +## Motivation + +### Problem Statement + +1. **Manual analysis is time-consuming** - The hook efficiency review required reading 15+ files and manual assessment +2. **No regression detection** - Performance degradations go unnoticed until they become painful +3. **No baseline tracking** - We don't know if changes improve or worsen performance +4. **Optimization suggestions require expertise** - Not all users can identify inefficiencies + +### Goals + +1. Automate performance audits +2. Establish and track performance baselines +3. Detect regressions before they accumulate +4. Provide actionable optimization recommendations +5. Keep monitoring overhead minimal (< 10ms per session) + +--- + +## Proposed Solution + +### Plugin Overview + +| Attribute | Value | +|-----------|-------| +| Name | `perf-sentinel` | +| Version | 1.0.0 | +| Category | Developer Tools | +| Dependencies | None (external tools optional) | + +### Directory Structure + +``` +plugins/perf-sentinel/ +├── .claude-plugin/ +│ └── plugin.json +├── commands/ +│ ├── perf-audit.md # Full performance audit +│ ├── perf-baseline.md # Establish performance baseline +│ ├── perf-compare.md # Compare against baseline +│ ├── benchmark-hooks.md # Benchmark specific hooks +│ ├── profile-session.md # Profile session startup/operations +│ └── perf-optimize.md # Generate optimization suggestions +├── hooks/ +│ ├── hooks.json +│ └── session-metrics.sh # Lightweight session-end metrics +├── agents/ +│ ├── perf-analyzer.md # Analyze data, generate recommendations +│ └── benchmark-runner.md # Run systematic benchmarks +├── scripts/ +│ ├── hook-profiler.sh # Profile individual hooks +│ ├── metrics-collector.sh # Collect system metrics +│ └── lib/ +│ ├── timing.sh # Timing utilities +│ ├── metrics.sh # Metrics collection/storage +│ └── reporting.sh # Report generation +├── skills/ +│ └── perf-patterns/ +│ └── common-issues.md # Knowledge base of common perf issues +└── README.md +``` + +--- + +## Feature Specifications + +### Command: `/perf-audit` + +**Purpose:** Automated performance audit (what we did manually for hooks) + +**Behavior:** +1. Enumerate all hooks across all plugins +2. Measure execution time of each hook (dry run with mock input) +3. Analyze hook scripts for known anti-patterns +4. Check for redundant operations across hooks +5. Generate report with recommendations + +**Output Example:** +``` +[perf-sentinel] Performance Audit - 2026-01-28 +============================================== + +HOOK ANALYSIS (16 hooks across 12 plugins) + +SessionStart Hooks (7): +┌─────────────────────────────────┬──────────┬─────────────────────────────┐ +│ Hook │ Time │ Concerns │ +├─────────────────────────────────┼──────────┼─────────────────────────────┤ +│ projman/startup-check.sh │ 342ms │ ⚠️ HTTP calls, version check │ +│ cmdb-assistant/startup-check.sh │ 287ms │ ⚠️ HTTP calls (NetBox API) │ +│ contract-validator/auto-valid.. │ 156ms │ MD5 hashing │ +│ data-platform/startup-check.sh │ 134ms │ ⚠️ Python + DB connection │ +│ claude-config-maintainer/enf.. │ 45ms │ File I/O │ +│ pr-review/startup-check.sh │ 38ms │ OK │ +│ viz-platform (inline echo) │ 2ms │ ⚠️ No value │ +└─────────────────────────────────┴──────────┴─────────────────────────────┘ +Total SessionStart overhead: 1,004ms + +PostToolUse Hooks (4): +┌─────────────────────────────────┬──────────┬─────────────────────────────┐ +│ Hook │ Time │ Concerns │ +├─────────────────────────────────┼──────────┼─────────────────────────────┤ +│ project-hygiene/cleanup.sh │ 89ms │ ⚠️ find across project │ +│ contract-validator/breaking-.. │ 67ms │ git operations │ +│ data-platform/schema-diff-ch.. │ 34ms │ OK │ +│ doc-guardian/notify.sh │ 12ms │ OK │ +└─────────────────────────────────┴──────────┴─────────────────────────────┘ +Avg PostToolUse overhead: 50ms per edit + +REDUNDANCY ANALYSIS + - MCP venv check: duplicated in 3 hooks + - Git remote check: duplicated in 2 hooks + - JSON parsing: 8 independent implementations + +ANTI-PATTERNS DETECTED + 1. HTTP calls in SessionStart (2 hooks) + 2. Full project find in PostToolUse (1 hook) + 3. Broad Bash matcher triggering on all commands (2 hooks) + +RECOMMENDATIONS (see /perf-optimize for details) + 1. Consolidate SessionStart hooks → est. -400ms + 2. Add cooldown to project-hygiene → est. -50ms/edit + 3. Remove viz-platform echo hook → est. -2ms + 4. Add early-exit to git-flow hooks → est. -20ms/cmd + +Overall Score: 62/100 (Needs Improvement) +``` + +**Anti-patterns to detect:** +| Pattern | Detection Method | Severity | +|---------|------------------|----------| +| HTTP calls in SessionStart | grep for `curl`, `wget`, API URLs | High | +| Full project `find` | grep for `find "$PROJECT"` or `find .` | High | +| Python spawn for simple tasks | grep for `python` in shell scripts | Medium | +| No early exit | Check if script bails out quickly for irrelevant input | Medium | +| Broad Bash matcher | Check hooks.json for `"matcher": "Bash"` | Medium | +| Duplicate logic | Compare function signatures across hooks | Low | + +--- + +### Command: `/perf-baseline` + +**Purpose:** Establish performance baseline for current version + +**Behavior:** +1. Run multiple session startup measurements (default: 10) +2. Run multiple edit cycle measurements (default: 10) +3. Calculate statistics (mean, stddev, p95) +4. Store baseline with version tag + +**Parameters:** +- `--runs N` - Number of measurement runs (default: 10) +- `--tag TAG` - Custom tag (default: version from marketplace.json) + +**Output Example:** +``` +[perf-sentinel] Creating baseline for v5.4.0 +============================================ + +Running session startup measurements (10 runs)... + Run 1: 487ms + Run 2: 512ms + ... + Run 10: 478ms + +Running edit cycle measurements (10 runs)... + Run 1: 156ms + ... + +Baseline Statistics: +┌────────────────────┬─────────┬─────────┬─────────┬─────────┐ +│ Metric │ Mean │ StdDev │ P50 │ P95 │ +├────────────────────┼─────────┼─────────┼─────────┼─────────┤ +│ Session startup │ 487ms │ 42ms │ 481ms │ 534ms │ +│ Edit cycle │ 156ms │ 23ms │ 152ms │ 189ms │ +│ Hook count │ 16 │ - │ - │ - │ +│ Total hook code │ 1,493 │ - │ - │ - │ +└────────────────────┴─────────┴─────────┴─────────┴─────────┘ + +Baseline saved: ~/.cache/claude-plugins/perf-sentinel/baselines/v5.4.0.json +``` + +**Storage Format:** +```json +{ + "version": "5.4.0", + "timestamp": "2026-01-28T12:00:00Z", + "measurements": { + "session_startup": { + "runs": [487, 512, 478, ...], + "mean": 487, + "stddev": 42, + "p50": 481, + "p95": 534 + }, + "edit_cycle": { + "runs": [156, 162, 148, ...], + "mean": 156, + "stddev": 23, + "p50": 152, + "p95": 189 + } + }, + "metadata": { + "hook_count": 16, + "total_hook_lines": 1493, + "plugins": ["projman", "pr-review", ...] + } +} +``` + +--- + +### Command: `/perf-compare` + +**Purpose:** Compare current performance against baseline + +**Behavior:** +1. Load baseline (latest or specified) +2. Run current measurements +3. Compare with statistical significance testing +4. Flag regressions exceeding threshold (default: 10%) + +**Parameters:** +- `--baseline TAG` - Specific baseline to compare against +- `--threshold N` - Regression threshold percentage (default: 10) + +**Output Example:** +``` +[perf-sentinel] Performance Comparison +====================================== +Baseline: v5.4.0 (2026-01-28) +Current: v5.5.0-dev + + Baseline Current Change Status +┌────────────────────┬───────────┬───────────┬───────────┬──────────┐ +│ Session startup │ 487ms │ 612ms │ +125ms │ ⚠️ +26% │ +│ Edit cycle │ 156ms │ 148ms │ -8ms │ ✅ -5% │ +│ Hook count │ 16 │ 18 │ +2 │ ℹ️ │ +└────────────────────┴───────────┴───────────┴───────────┴──────────┘ + +⚠️ REGRESSION DETECTED: Session startup + +Root Cause Analysis: + New hooks since baseline: + + data-platform/startup-check.sh (SessionStart) → +134ms + + data-platform/schema-diff-check.sh (PostToolUse) + + Modified hooks: + ~ projman/startup-check.sh → +12ms (added version check) + +Recommendation: + Review data-platform SessionStart hook - consider lazy loading + Run /perf-optimize for detailed suggestions +``` + +--- + +### Command: `/benchmark-hooks` + +**Purpose:** Detailed benchmarking of specific hooks + +**Behavior:** +1. Run specified hook(s) multiple times with mock input +2. Provide statistical analysis +3. Optional: use hyperfine if available + +**Parameters:** +- `HOOK_PATH` - Path to specific hook, or "all" +- `--runs N` - Number of runs (default: 20) +- `--warmup N` - Warmup runs (default: 3) + +**Output Example:** +``` +[perf-sentinel] Hook Benchmark: projman/startup-check.sh +======================================================== + +Configuration: + Runs: 20 (+ 3 warmup) + Input: mock SessionStart payload + Tool: hyperfine (detected) + +Results: + Mean: 342.3ms + StdDev: 28.1ms + Min: 298ms + Max: 412ms + P50: 338ms + P95: 389ms + +Time Breakdown (via strace): + Network I/O: 267ms (78%) ← Gitea API calls + File I/O: 42ms (12%) + Process: 33ms (10%) + +Recommendation: + Network calls dominate. Consider: + - Caching API responses + - Making calls async/lazy + - Reducing call frequency +``` + +--- + +### Command: `/perf-optimize` + +**Purpose:** Generate actionable optimization plan + +**Behavior:** +1. Run audit if not recently run +2. Analyze patterns and anti-patterns +3. Generate prioritized optimization plan +4. Estimate impact of each optimization + +**Output Example:** +``` +[perf-sentinel] Optimization Plan +================================= + +Based on audit from 2026-01-28 12:00 + +PRIORITY 1 - High Impact, Low Effort +──────────────────────────────────── +1. Remove viz-platform SessionStart hook + File: plugins/viz-platform/hooks/hooks.json + Action: Delete the hook (provides no value) + Impact: -2ms startup + Effort: 1 line change + +2. Make clarity-assist opt-in + File: plugins/clarity-assist/hooks/vagueness-check.sh + Action: Change default from true to false + Impact: -50ms per prompt (when disabled) + Effort: 1 line change + +PRIORITY 2 - High Impact, Medium Effort +─────────────────────────────────────── +3. Add early-exit to git-flow hooks + Files: plugins/git-flow/hooks/branch-check.sh + plugins/git-flow/hooks/commit-msg-check.sh + Action: Check for "git" in command before full parsing + Impact: -20ms per non-git bash command + Effort: ~10 lines each + +4. Add cooldown to project-hygiene + File: plugins/project-hygiene/hooks/cleanup.sh + Action: Skip if ran in last 5 minutes + Impact: -89ms per edit (when cooling down) + Effort: ~15 lines + +PRIORITY 3 - High Impact, High Effort +───────────────────────────────────── +5. Consolidate SessionStart hooks + Action: Create central dispatcher, remove individual hooks + Impact: -400ms startup (estimated) + Effort: New architecture, ~200 lines + +6. Create shared hook library + Action: Extract common JSON parsing, path filtering + Impact: Reduced maintenance, consistent behavior + Effort: ~100 lines library + refactor all hooks + +ESTIMATED TOTAL IMPACT +────────────────────── +If all optimizations implemented: + Session startup: 487ms → ~180ms (63% reduction) + Edit cycle: 156ms → ~60ms (62% reduction) + +Implementation order recommendation: + Sprint 1: Items 1-4 (quick wins) + Sprint 2: Items 5-6 (architecture) +``` + +--- + +### Hook: Session Metrics (Lightweight) + +**File:** `hooks/hooks.json` +```json +{ + "hooks": { + "SessionEnd": [ + { + "type": "command", + "command": "${CLAUDE_PLUGIN_ROOT}/hooks/session-metrics.sh" + } + ] + } +} +``` + +**Design Principles:** +1. **SessionEnd only** - No per-tool overhead +2. **Fast execution** - Must complete in < 10ms +3. **Minimal I/O** - Single file write +4. **No network** - Never make HTTP calls + +**Script:** `hooks/session-metrics.sh` +```bash +#!/bin/bash +# perf-sentinel session metrics collector +# MUST be lightweight - runs on every session end + +# Metrics directory +METRICS_DIR="$HOME/.cache/claude-plugins/perf-sentinel/sessions" +TODAY=$(date +%Y-%m-%d) +mkdir -p "$METRICS_DIR/$TODAY" 2>/dev/null + +# Collect only what's cheap to measure +SESSION_ID="${CLAUDE_SESSION_ID:-$(date +%s)}" +DURATION="${CLAUDE_SESSION_DURATION_MS:-0}" + +# Single atomic write +cat > "$METRICS_DIR/$TODAY/$SESSION_ID.json" << EOF +{"ts":"$(date -Iseconds)","dur":$DURATION,"id":"$SESSION_ID"} +EOF + +exit 0 +``` + +--- + +### Agent: perf-analyzer + +**Purpose:** AI-powered performance analysis and recommendations + +**Capabilities:** +1. Analyze collected metrics for trends +2. Identify anomalies and regressions +3. Generate natural language recommendations +4. Cross-reference with known performance patterns + +**Model:** `sonnet` (good balance of speed and analysis quality) + +--- + +### Agent: benchmark-runner + +**Purpose:** Execute systematic benchmarks + +**Capabilities:** +1. Run benchmarks with proper isolation +2. Handle warmup and cooldown +3. Statistical analysis of results +4. Integration with external tools when available + +**Model:** `haiku` (simple task execution) + +--- + +## Data Storage + +### Directory Structure +``` +~/.cache/claude-plugins/perf-sentinel/ +├── baselines/ +│ ├── v5.4.0.json +│ ├── v5.5.0.json +│ └── latest -> v5.5.0.json +├── sessions/ +│ ├── 2026-01-27/ +│ │ ├── session-001.json +│ │ └── session-002.json +│ └── 2026-01-28/ +│ └── session-001.json +├── audits/ +│ └── 2026-01-28.json +├── benchmarks/ +│ └── hooks-2026-01-28.json +└── config.json +``` + +### Retention Policy +- Sessions: 30 days +- Audits: 90 days +- Baselines: Indefinite +- Benchmarks: 90 days + +--- + +## External Tool Integration + +### Detection and Fallback + +```bash +# In scripts/lib/timing.sh + +benchmark_command() { + local cmd="$1" + local runs="${2:-10}" + + if command -v hyperfine &> /dev/null; then + # Use hyperfine for statistical benchmarking + hyperfine --runs "$runs" --warmup 3 --export-json /tmp/bench.json "$cmd" + parse_hyperfine_output /tmp/bench.json + else + # Fallback to basic timing + local total=0 + for i in $(seq 1 "$runs"); do + start=$(date +%s%N) + eval "$cmd" > /dev/null 2>&1 + end=$(date +%s%N) + elapsed=$(( (end - start) / 1000000 )) + total=$((total + elapsed)) + done + echo $((total / runs)) + fi +} +``` + +### Supported Tools + +| Tool | Purpose | Required | Detection | +|------|---------|----------|-----------| +| `time` (GNU) | Basic timing | Yes (built-in) | Always available | +| `hyperfine` | Statistical benchmarks | No | `command -v hyperfine` | +| `strace` | Syscall analysis | No | `command -v strace` | +| `ps` | Memory tracking | Yes (built-in) | Always available | +| `perf` | CPU profiling | No | `command -v perf` | + +--- + +## Configuration + +### Config File: `~/.cache/claude-plugins/perf-sentinel/config.json` + +```json +{ + "metrics": { + "enabled": true, + "retention_days": 30 + }, + "thresholds": { + "regression_percent": 10, + "slow_hook_ms": 100, + "slow_session_ms": 500 + }, + "external_tools": { + "prefer_hyperfine": true, + "use_strace": false + }, + "notifications": { + "on_regression": true, + "on_slow_session": false + } +} +``` + +### Environment Variables + +| Variable | Purpose | Default | +|----------|---------|---------| +| `PERF_SENTINEL_ENABLED` | Enable/disable metrics collection | `true` | +| `PERF_SENTINEL_VERBOSE` | Verbose output | `false` | +| `PERF_SENTINEL_THRESHOLD` | Regression threshold % | `10` | + +--- + +## Implementation Plan + +### Phase 1: Core Infrastructure (Sprint 1) +1. Create plugin structure +2. Implement `/perf-audit` command +3. Implement lightweight SessionEnd hook +4. Create scripts/lib/ utilities + +### Phase 2: Baseline System (Sprint 1) +5. Implement `/perf-baseline` command +6. Implement `/perf-compare` command +7. Set up data storage structure + +### Phase 3: Advanced Features (Sprint 2) +8. Implement `/benchmark-hooks` command +9. Implement `/perf-optimize` command +10. Create perf-analyzer agent +11. Add external tool integration + +### Phase 4: Polish (Sprint 2) +12. Add configuration system +13. Create README and documentation +14. Add to marketplace.json +15. Integration testing + +--- + +## Success Metrics + +| Metric | Target | +|--------|--------| +| Audit execution time | < 5 seconds | +| Metrics hook overhead | < 10ms | +| Regression detection accuracy | > 90% | +| False positive rate | < 5% | + +--- + +## Risks and Mitigations + +| Risk | Mitigation | +|------|------------| +| Metrics collection adds overhead | SessionEnd only, < 10ms budget | +| Benchmark results vary by system | Statistical analysis, warmup runs | +| External tools not available | Graceful fallback to built-in | +| Storage grows unbounded | Retention policy, auto-cleanup | + +--- + +## Open Questions + +1. Should `/perf-audit` run automatically on version changes? +2. Should we integrate with CI/CD for automated regression testing? +3. Should baseline comparisons be part of `/sprint-close`? +4. Should we track memory usage in addition to time? + +--- + +## Related RFCs + +- [RFC-Hook-Efficiency-Improvements](./RFC-Hook-Efficiency-Improvements) - The manual analysis that motivated this plugin + +--- + +## Changelog + +| Date | Change | +|------|--------| +| 2026-01-28 | Initial RFC created |