Add "RFC-Perf-Sentinel-Plugin"

2026-01-29 04:33:19 +00:00
parent 8f06f95784
commit 22f276388b
1 changed files with 644 additions and 0 deletions
--- a/RFC-Perf-Sentinel-Plugin.-.md
+++ b/RFC-Perf-Sentinel-Plugin.-.md
@@ -0,0 +1,644 @@
+# RFC: perf-sentinel Plugin
+
+**Status:** Proposal  
+**Created:** 2026-01-28  
+**Author:** Claude Code Analysis  
+**Priority:** Medium  
+**Estimated Effort:** 2 sprints  
+**Depends On:** None (can be implemented independently)
+
+---
+
+## Executive Summary
+
+Create a new plugin dedicated to performance profiling, benchmarking, and optimization of the plugin/hook ecosystem. This plugin automates the kind of manual analysis performed in RFC-Hook-Efficiency-Improvements, providing continuous monitoring and regression detection.
+
+---
+
+## Motivation
+
+### Problem Statement
+
+1. **Manual analysis is time-consuming** - The hook efficiency review required reading 15+ files and manual assessment
+2. **No regression detection** - Performance degradations go unnoticed until they become painful
+3. **No baseline tracking** - We don't know if changes improve or worsen performance
+4. **Optimization suggestions require expertise** - Not all users can identify inefficiencies
+
+### Goals
+
+1. Automate performance audits
+2. Establish and track performance baselines
+3. Detect regressions before they accumulate
+4. Provide actionable optimization recommendations
+5. Keep monitoring overhead minimal (< 10ms per session)
+
+---
+
+## Proposed Solution
+
+### Plugin Overview
+
+| Attribute | Value |
+|-----------|-------|
+| Name | `perf-sentinel` |
+| Version | 1.0.0 |
+| Category | Developer Tools |
+| Dependencies | None (external tools optional) |
+
+### Directory Structure
+
+```
+plugins/perf-sentinel/
+├── .claude-plugin/
+│   └── plugin.json
+├── commands/
+│   ├── perf-audit.md           # Full performance audit
+│   ├── perf-baseline.md        # Establish performance baseline
+│   ├── perf-compare.md         # Compare against baseline
+│   ├── benchmark-hooks.md      # Benchmark specific hooks
+│   ├── profile-session.md      # Profile session startup/operations
+│   └── perf-optimize.md        # Generate optimization suggestions
+├── hooks/
+│   ├── hooks.json
+│   └── session-metrics.sh      # Lightweight session-end metrics
+├── agents/
+│   ├── perf-analyzer.md        # Analyze data, generate recommendations
+│   └── benchmark-runner.md     # Run systematic benchmarks
+├── scripts/
+│   ├── hook-profiler.sh        # Profile individual hooks
+│   ├── metrics-collector.sh    # Collect system metrics
+│   └── lib/
+│       ├── timing.sh           # Timing utilities
+│       ├── metrics.sh          # Metrics collection/storage
+│       └── reporting.sh        # Report generation
+├── skills/
+│   └── perf-patterns/
+│       └── common-issues.md    # Knowledge base of common perf issues
+└── README.md
+```
+
+---
+
+## Feature Specifications
+
+### Command: `/perf-audit`
+
+**Purpose:** Automated performance audit (what we did manually for hooks)
+
+**Behavior:**
+1. Enumerate all hooks across all plugins
+2. Measure execution time of each hook (dry run with mock input)
+3. Analyze hook scripts for known anti-patterns
+4. Check for redundant operations across hooks
+5. Generate report with recommendations
+
+**Output Example:**
+```
+[perf-sentinel] Performance Audit - 2026-01-28
+==============================================
+
+HOOK ANALYSIS (16 hooks across 12 plugins)
+
+SessionStart Hooks (7):
+┌─────────────────────────────────┬──────────┬─────────────────────────────┐
+│ Hook                            │ Time     │ Concerns                    │
+├─────────────────────────────────┼──────────┼─────────────────────────────┤
+│ projman/startup-check.sh        │ 342ms    │ ⚠️ HTTP calls, version check │
+│ cmdb-assistant/startup-check.sh │ 287ms    │ ⚠️ HTTP calls (NetBox API)   │
+│ contract-validator/auto-valid.. │ 156ms    │ MD5 hashing                 │
+│ data-platform/startup-check.sh  │ 134ms    │ ⚠️ Python + DB connection    │
+│ claude-config-maintainer/enf..  │ 45ms     │ File I/O                    │
+│ pr-review/startup-check.sh      │ 38ms     │ OK                          │
+│ viz-platform (inline echo)      │ 2ms      │ ⚠️ No value                  │
+└─────────────────────────────────┴──────────┴─────────────────────────────┘
+Total SessionStart overhead: 1,004ms
+
+PostToolUse Hooks (4):
+┌─────────────────────────────────┬──────────┬─────────────────────────────┐
+│ Hook                            │ Time     │ Concerns                    │
+├─────────────────────────────────┼──────────┼─────────────────────────────┤
+│ project-hygiene/cleanup.sh      │ 89ms     │ ⚠️ find across project       │
+│ contract-validator/breaking-..  │ 67ms     │ git operations              │
+│ data-platform/schema-diff-ch..  │ 34ms     │ OK                          │
+│ doc-guardian/notify.sh          │ 12ms     │ OK                          │
+└─────────────────────────────────┴──────────┴─────────────────────────────┘
+Avg PostToolUse overhead: 50ms per edit
+
+REDUNDANCY ANALYSIS
+  - MCP venv check: duplicated in 3 hooks
+  - Git remote check: duplicated in 2 hooks
+  - JSON parsing: 8 independent implementations
+
+ANTI-PATTERNS DETECTED
+  1. HTTP calls in SessionStart (2 hooks)
+  2. Full project find in PostToolUse (1 hook)
+  3. Broad Bash matcher triggering on all commands (2 hooks)
+
+RECOMMENDATIONS (see /perf-optimize for details)
+  1. Consolidate SessionStart hooks → est. -400ms
+  2. Add cooldown to project-hygiene → est. -50ms/edit
+  3. Remove viz-platform echo hook → est. -2ms
+  4. Add early-exit to git-flow hooks → est. -20ms/cmd
+
+Overall Score: 62/100 (Needs Improvement)
+```
+
+**Anti-patterns to detect:**
+| Pattern | Detection Method | Severity |
+|---------|------------------|----------|
+| HTTP calls in SessionStart | grep for `curl`, `wget`, API URLs | High |
+| Full project `find` | grep for `find "$PROJECT"` or `find .` | High |
+| Python spawn for simple tasks | grep for `python` in shell scripts | Medium |
+| No early exit | Check if script bails out quickly for irrelevant input | Medium |
+| Broad Bash matcher | Check hooks.json for `"matcher": "Bash"` | Medium |
+| Duplicate logic | Compare function signatures across hooks | Low |
+
+---
+
+### Command: `/perf-baseline`
+
+**Purpose:** Establish performance baseline for current version
+
+**Behavior:**
+1. Run multiple session startup measurements (default: 10)
+2. Run multiple edit cycle measurements (default: 10)
+3. Calculate statistics (mean, stddev, p95)
+4. Store baseline with version tag
+
+**Parameters:**
+- `--runs N` - Number of measurement runs (default: 10)
+- `--tag TAG` - Custom tag (default: version from marketplace.json)
+
+**Output Example:**
+```
+[perf-sentinel] Creating baseline for v5.4.0
+============================================
+
+Running session startup measurements (10 runs)...
+  Run 1: 487ms
+  Run 2: 512ms
+  ...
+  Run 10: 478ms
+
+Running edit cycle measurements (10 runs)...
+  Run 1: 156ms
+  ...
+
+Baseline Statistics:
+┌────────────────────┬─────────┬─────────┬─────────┬─────────┐
+│ Metric             │ Mean    │ StdDev  │ P50     │ P95     │
+├────────────────────┼─────────┼─────────┼─────────┼─────────┤
+│ Session startup    │ 487ms   │ 42ms    │ 481ms   │ 534ms   │
+│ Edit cycle         │ 156ms   │ 23ms    │ 152ms   │ 189ms   │
+│ Hook count         │ 16      │ -       │ -       │ -       │
+│ Total hook code    │ 1,493   │ -       │ -       │ -       │
+└────────────────────┴─────────┴─────────┴─────────┴─────────┘
+
+Baseline saved: ~/.cache/claude-plugins/perf-sentinel/baselines/v5.4.0.json
+```
+
+**Storage Format:**
+```json
+{
+  "version": "5.4.0",
+  "timestamp": "2026-01-28T12:00:00Z",
+  "measurements": {
+    "session_startup": {
+      "runs": [487, 512, 478, ...],
+      "mean": 487,
+      "stddev": 42,
+      "p50": 481,
+      "p95": 534
+    },
+    "edit_cycle": {
+      "runs": [156, 162, 148, ...],
+      "mean": 156,
+      "stddev": 23,
+      "p50": 152,
+      "p95": 189
+    }
+  },
+  "metadata": {
+    "hook_count": 16,
+    "total_hook_lines": 1493,
+    "plugins": ["projman", "pr-review", ...]
+  }
+}
+```
+
+---
+
+### Command: `/perf-compare`
+
+**Purpose:** Compare current performance against baseline
+
+**Behavior:**
+1. Load baseline (latest or specified)
+2. Run current measurements
+3. Compare with statistical significance testing
+4. Flag regressions exceeding threshold (default: 10%)
+
+**Parameters:**
+- `--baseline TAG` - Specific baseline to compare against
+- `--threshold N` - Regression threshold percentage (default: 10)
+
+**Output Example:**
+```
+[perf-sentinel] Performance Comparison
+======================================
+Baseline: v5.4.0 (2026-01-28)
+Current:  v5.5.0-dev
+
+                      Baseline    Current     Change      Status
+┌────────────────────┬───────────┬───────────┬───────────┬──────────┐
+│ Session startup    │ 487ms     │ 612ms     │ +125ms    │ ⚠️ +26%   │
+│ Edit cycle         │ 156ms     │ 148ms     │ -8ms      │ ✅ -5%    │
+│ Hook count         │ 16        │ 18        │ +2        │ ℹ️        │
+└────────────────────┴───────────┴───────────┴───────────┴──────────┘
+
+⚠️ REGRESSION DETECTED: Session startup
+
+Root Cause Analysis:
+  New hooks since baseline:
+    + data-platform/startup-check.sh (SessionStart) → +134ms
+    + data-platform/schema-diff-check.sh (PostToolUse)
+  
+  Modified hooks:
+    ~ projman/startup-check.sh → +12ms (added version check)
+
+Recommendation:
+  Review data-platform SessionStart hook - consider lazy loading
+  Run /perf-optimize for detailed suggestions
+```
+
+---
+
+### Command: `/benchmark-hooks`
+
+**Purpose:** Detailed benchmarking of specific hooks
+
+**Behavior:**
+1. Run specified hook(s) multiple times with mock input
+2. Provide statistical analysis
+3. Optional: use hyperfine if available
+
+**Parameters:**
+- `HOOK_PATH` - Path to specific hook, or "all"
+- `--runs N` - Number of runs (default: 20)
+- `--warmup N` - Warmup runs (default: 3)
+
+**Output Example:**
+```
+[perf-sentinel] Hook Benchmark: projman/startup-check.sh
+========================================================
+
+Configuration:
+  Runs: 20 (+ 3 warmup)
+  Input: mock SessionStart payload
+  Tool: hyperfine (detected)
+
+Results:
+  Mean:     342.3ms
+  StdDev:   28.1ms
+  Min:      298ms
+  Max:      412ms
+  P50:      338ms
+  P95:      389ms
+
+Time Breakdown (via strace):
+  Network I/O:  267ms (78%)  ← Gitea API calls
+  File I/O:      42ms (12%)
+  Process:       33ms (10%)
+
+Recommendation:
+  Network calls dominate. Consider:
+  - Caching API responses
+  - Making calls async/lazy
+  - Reducing call frequency
+```
+
+---
+
+### Command: `/perf-optimize`
+
+**Purpose:** Generate actionable optimization plan
+
+**Behavior:**
+1. Run audit if not recently run
+2. Analyze patterns and anti-patterns
+3. Generate prioritized optimization plan
+4. Estimate impact of each optimization
+
+**Output Example:**
+```
+[perf-sentinel] Optimization Plan
+=================================
+
+Based on audit from 2026-01-28 12:00
+
+PRIORITY 1 - High Impact, Low Effort
+────────────────────────────────────
+1. Remove viz-platform SessionStart hook
+   File: plugins/viz-platform/hooks/hooks.json
+   Action: Delete the hook (provides no value)
+   Impact: -2ms startup
+   Effort: 1 line change
+
+2. Make clarity-assist opt-in
+   File: plugins/clarity-assist/hooks/vagueness-check.sh
+   Action: Change default from true to false
+   Impact: -50ms per prompt (when disabled)
+   Effort: 1 line change
+
+PRIORITY 2 - High Impact, Medium Effort
+───────────────────────────────────────
+3. Add early-exit to git-flow hooks
+   Files: plugins/git-flow/hooks/branch-check.sh
+          plugins/git-flow/hooks/commit-msg-check.sh
+   Action: Check for "git" in command before full parsing
+   Impact: -20ms per non-git bash command
+   Effort: ~10 lines each
+
+4. Add cooldown to project-hygiene
+   File: plugins/project-hygiene/hooks/cleanup.sh
+   Action: Skip if ran in last 5 minutes
+   Impact: -89ms per edit (when cooling down)
+   Effort: ~15 lines
+
+PRIORITY 3 - High Impact, High Effort
+─────────────────────────────────────
+5. Consolidate SessionStart hooks
+   Action: Create central dispatcher, remove individual hooks
+   Impact: -400ms startup (estimated)
+   Effort: New architecture, ~200 lines
+
+6. Create shared hook library
+   Action: Extract common JSON parsing, path filtering
+   Impact: Reduced maintenance, consistent behavior
+   Effort: ~100 lines library + refactor all hooks
+
+ESTIMATED TOTAL IMPACT
+──────────────────────
+If all optimizations implemented:
+  Session startup: 487ms → ~180ms (63% reduction)
+  Edit cycle: 156ms → ~60ms (62% reduction)
+
+Implementation order recommendation:
+  Sprint 1: Items 1-4 (quick wins)
+  Sprint 2: Items 5-6 (architecture)
+```
+
+---
+
+### Hook: Session Metrics (Lightweight)
+
+**File:** `hooks/hooks.json`
+```json
+{
+  "hooks": {
+    "SessionEnd": [
+      {
+        "type": "command",
+        "command": "${CLAUDE_PLUGIN_ROOT}/hooks/session-metrics.sh"
+      }
+    ]
+  }
+}
+```
+
+**Design Principles:**
+1. **SessionEnd only** - No per-tool overhead
+2. **Fast execution** - Must complete in < 10ms
+3. **Minimal I/O** - Single file write
+4. **No network** - Never make HTTP calls
+
+**Script:** `hooks/session-metrics.sh`
+```bash
+#!/bin/bash
+# perf-sentinel session metrics collector
+# MUST be lightweight - runs on every session end
+
+# Metrics directory
+METRICS_DIR="$HOME/.cache/claude-plugins/perf-sentinel/sessions"
+TODAY=$(date +%Y-%m-%d)
+mkdir -p "$METRICS_DIR/$TODAY" 2>/dev/null
+
+# Collect only what's cheap to measure
+SESSION_ID="${CLAUDE_SESSION_ID:-$(date +%s)}"
+DURATION="${CLAUDE_SESSION_DURATION_MS:-0}"
+
+# Single atomic write
+cat > "$METRICS_DIR/$TODAY/$SESSION_ID.json" << EOF
+{"ts":"$(date -Iseconds)","dur":$DURATION,"id":"$SESSION_ID"}
+EOF
+
+exit 0
+```
+
+---
+
+### Agent: perf-analyzer
+
+**Purpose:** AI-powered performance analysis and recommendations
+
+**Capabilities:**
+1. Analyze collected metrics for trends
+2. Identify anomalies and regressions
+3. Generate natural language recommendations
+4. Cross-reference with known performance patterns
+
+**Model:** `sonnet` (good balance of speed and analysis quality)
+
+---
+
+### Agent: benchmark-runner
+
+**Purpose:** Execute systematic benchmarks
+
+**Capabilities:**
+1. Run benchmarks with proper isolation
+2. Handle warmup and cooldown
+3. Statistical analysis of results
+4. Integration with external tools when available
+
+**Model:** `haiku` (simple task execution)
+
+---
+
+## Data Storage
+
+### Directory Structure
+```
+~/.cache/claude-plugins/perf-sentinel/
+├── baselines/
+│   ├── v5.4.0.json
+│   ├── v5.5.0.json
+│   └── latest -> v5.5.0.json
+├── sessions/
+│   ├── 2026-01-27/
+│   │   ├── session-001.json
+│   │   └── session-002.json
+│   └── 2026-01-28/
+│       └── session-001.json
+├── audits/
+│   └── 2026-01-28.json
+├── benchmarks/
+│   └── hooks-2026-01-28.json
+└── config.json
+```
+
+### Retention Policy
+- Sessions: 30 days
+- Audits: 90 days
+- Baselines: Indefinite
+- Benchmarks: 90 days
+
+---
+
+## External Tool Integration
+
+### Detection and Fallback
+
+```bash
+# In scripts/lib/timing.sh
+
+benchmark_command() {
+    local cmd="$1"
+    local runs="${2:-10}"
+    
+    if command -v hyperfine &> /dev/null; then
+        # Use hyperfine for statistical benchmarking
+        hyperfine --runs "$runs" --warmup 3 --export-json /tmp/bench.json "$cmd"
+        parse_hyperfine_output /tmp/bench.json
+    else
+        # Fallback to basic timing
+        local total=0
+        for i in $(seq 1 "$runs"); do
+            start=$(date +%s%N)
+            eval "$cmd" > /dev/null 2>&1
+            end=$(date +%s%N)
+            elapsed=$(( (end - start) / 1000000 ))
+            total=$((total + elapsed))
+        done
+        echo $((total / runs))
+    fi
+}
+```
+
+### Supported Tools
+
+| Tool | Purpose | Required | Detection |
+|------|---------|----------|-----------|
+| `time` (GNU) | Basic timing | Yes (built-in) | Always available |
+| `hyperfine` | Statistical benchmarks | No | `command -v hyperfine` |
+| `strace` | Syscall analysis | No | `command -v strace` |
+| `ps` | Memory tracking | Yes (built-in) | Always available |
+| `perf` | CPU profiling | No | `command -v perf` |
+
+---
+
+## Configuration
+
+### Config File: `~/.cache/claude-plugins/perf-sentinel/config.json`
+
+```json
+{
+  "metrics": {
+    "enabled": true,
+    "retention_days": 30
+  },
+  "thresholds": {
+    "regression_percent": 10,
+    "slow_hook_ms": 100,
+    "slow_session_ms": 500
+  },
+  "external_tools": {
+    "prefer_hyperfine": true,
+    "use_strace": false
+  },
+  "notifications": {
+    "on_regression": true,
+    "on_slow_session": false
+  }
+}
+```
+
+### Environment Variables
+
+| Variable | Purpose | Default |
+|----------|---------|---------|
+| `PERF_SENTINEL_ENABLED` | Enable/disable metrics collection | `true` |
+| `PERF_SENTINEL_VERBOSE` | Verbose output | `false` |
+| `PERF_SENTINEL_THRESHOLD` | Regression threshold % | `10` |
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Infrastructure (Sprint 1)
+1. Create plugin structure
+2. Implement `/perf-audit` command
+3. Implement lightweight SessionEnd hook
+4. Create scripts/lib/ utilities
+
+### Phase 2: Baseline System (Sprint 1)
+5. Implement `/perf-baseline` command
+6. Implement `/perf-compare` command
+7. Set up data storage structure
+
+### Phase 3: Advanced Features (Sprint 2)
+8. Implement `/benchmark-hooks` command
+9. Implement `/perf-optimize` command
+10. Create perf-analyzer agent
+11. Add external tool integration
+
+### Phase 4: Polish (Sprint 2)
+12. Add configuration system
+13. Create README and documentation
+14. Add to marketplace.json
+15. Integration testing
+
+---
+
+## Success Metrics
+
+| Metric | Target |
+|--------|--------|
+| Audit execution time | < 5 seconds |
+| Metrics hook overhead | < 10ms |
+| Regression detection accuracy | > 90% |
+| False positive rate | < 5% |
+
+---
+
+## Risks and Mitigations
+
+| Risk | Mitigation |
+|------|------------|
+| Metrics collection adds overhead | SessionEnd only, < 10ms budget |
+| Benchmark results vary by system | Statistical analysis, warmup runs |
+| External tools not available | Graceful fallback to built-in |
+| Storage grows unbounded | Retention policy, auto-cleanup |
+
+---
+
+## Open Questions
+
+1. Should `/perf-audit` run automatically on version changes?
+2. Should we integrate with CI/CD for automated regression testing?
+3. Should baseline comparisons be part of `/sprint-close`?
+4. Should we track memory usage in addition to time?
+
+---
+
+## Related RFCs
+
+- [RFC-Hook-Efficiency-Improvements](./RFC-Hook-Efficiency-Improvements) - The manual analysis that motivated this plugin
+
+---
+
+## Changelog
+
+| Date | Change |
+|------|--------|
+| 2026-01-28 | Initial RFC created |