Add "RFC-Perf-Sentinel-Plugin"

2026-01-29 04:33:19 +00:00
parent 8f06f95784
commit 22f276388b

@@ -0,0 +1,644 @@
# RFC: perf-sentinel Plugin
**Status:** Proposal
**Created:** 2026-01-28
**Author:** Claude Code Analysis
**Priority:** Medium
**Estimated Effort:** 2 sprints
**Depends On:** None (can be implemented independently)
---
## Executive Summary
Create a new plugin dedicated to performance profiling, benchmarking, and optimization of the plugin/hook ecosystem. This plugin automates the kind of manual analysis performed in RFC-Hook-Efficiency-Improvements, providing continuous monitoring and regression detection.
---
## Motivation
### Problem Statement
1. **Manual analysis is time-consuming** - The hook efficiency review required reading 15+ files and manual assessment
2. **No regression detection** - Performance degradations go unnoticed until they become painful
3. **No baseline tracking** - We don't know if changes improve or worsen performance
4. **Optimization suggestions require expertise** - Not all users can identify inefficiencies
### Goals
1. Automate performance audits
2. Establish and track performance baselines
3. Detect regressions before they accumulate
4. Provide actionable optimization recommendations
5. Keep monitoring overhead minimal (< 10ms per session)
---
## Proposed Solution
### Plugin Overview
| Attribute | Value |
|-----------|-------|
| Name | `perf-sentinel` |
| Version | 1.0.0 |
| Category | Developer Tools |
| Dependencies | None (external tools optional) |
### Directory Structure
```
plugins/perf-sentinel/
├── .claude-plugin/
│ └── plugin.json
├── commands/
│ ├── perf-audit.md # Full performance audit
│ ├── perf-baseline.md # Establish performance baseline
│ ├── perf-compare.md # Compare against baseline
│ ├── benchmark-hooks.md # Benchmark specific hooks
│ ├── profile-session.md # Profile session startup/operations
│ └── perf-optimize.md # Generate optimization suggestions
├── hooks/
│ ├── hooks.json
│ └── session-metrics.sh # Lightweight session-end metrics
├── agents/
│ ├── perf-analyzer.md # Analyze data, generate recommendations
│ └── benchmark-runner.md # Run systematic benchmarks
├── scripts/
│ ├── hook-profiler.sh # Profile individual hooks
│ ├── metrics-collector.sh # Collect system metrics
│ └── lib/
│ ├── timing.sh # Timing utilities
│ ├── metrics.sh # Metrics collection/storage
│ └── reporting.sh # Report generation
├── skills/
│ └── perf-patterns/
│ └── common-issues.md # Knowledge base of common perf issues
└── README.md
```
---
## Feature Specifications
### Command: `/perf-audit`
**Purpose:** Automated performance audit (what we did manually for hooks)
**Behavior:**
1. Enumerate all hooks across all plugins
2. Measure execution time of each hook (dry run with mock input)
3. Analyze hook scripts for known anti-patterns
4. Check for redundant operations across hooks
5. Generate report with recommendations
**Output Example:**
```
[perf-sentinel] Performance Audit - 2026-01-28
==============================================
HOOK ANALYSIS (16 hooks across 12 plugins)
SessionStart Hooks (7):
┌─────────────────────────────────┬──────────┬─────────────────────────────┐
│ Hook │ Time │ Concerns │
├─────────────────────────────────┼──────────┼─────────────────────────────┤
│ projman/startup-check.sh │ 342ms │ ⚠️ HTTP calls, version check │
│ cmdb-assistant/startup-check.sh │ 287ms │ ⚠️ HTTP calls (NetBox API) │
│ contract-validator/auto-valid.. │ 156ms │ MD5 hashing │
│ data-platform/startup-check.sh │ 134ms │ ⚠️ Python + DB connection │
│ claude-config-maintainer/enf.. │ 45ms │ File I/O │
│ pr-review/startup-check.sh │ 38ms │ OK │
│ viz-platform (inline echo) │ 2ms │ ⚠️ No value │
└─────────────────────────────────┴──────────┴─────────────────────────────┘
Total SessionStart overhead: 1,004ms
PostToolUse Hooks (4):
┌─────────────────────────────────┬──────────┬─────────────────────────────┐
│ Hook │ Time │ Concerns │
├─────────────────────────────────┼──────────┼─────────────────────────────┤
│ project-hygiene/cleanup.sh │ 89ms │ ⚠️ find across project │
│ contract-validator/breaking-.. │ 67ms │ git operations │
│ data-platform/schema-diff-ch.. │ 34ms │ OK │
│ doc-guardian/notify.sh │ 12ms │ OK │
└─────────────────────────────────┴──────────┴─────────────────────────────┘
Avg PostToolUse overhead: 50ms per edit
REDUNDANCY ANALYSIS
- MCP venv check: duplicated in 3 hooks
- Git remote check: duplicated in 2 hooks
- JSON parsing: 8 independent implementations
ANTI-PATTERNS DETECTED
1. HTTP calls in SessionStart (2 hooks)
2. Full project find in PostToolUse (1 hook)
3. Broad Bash matcher triggering on all commands (2 hooks)
RECOMMENDATIONS (see /perf-optimize for details)
1. Consolidate SessionStart hooks → est. -400ms
2. Add cooldown to project-hygiene → est. -50ms/edit
3. Remove viz-platform echo hook → est. -2ms
4. Add early-exit to git-flow hooks → est. -20ms/cmd
Overall Score: 62/100 (Needs Improvement)
```
**Anti-patterns to detect:**
| Pattern | Detection Method | Severity |
|---------|------------------|----------|
| HTTP calls in SessionStart | grep for `curl`, `wget`, API URLs | High |
| Full project `find` | grep for `find "$PROJECT"` or `find .` | High |
| Python spawn for simple tasks | grep for `python` in shell scripts | Medium |
| No early exit | Check if script bails out quickly for irrelevant input | Medium |
| Broad Bash matcher | Check hooks.json for `"matcher": "Bash"` | Medium |
| Duplicate logic | Compare function signatures across hooks | Low |
---
### Command: `/perf-baseline`
**Purpose:** Establish performance baseline for current version
**Behavior:**
1. Run multiple session startup measurements (default: 10)
2. Run multiple edit cycle measurements (default: 10)
3. Calculate statistics (mean, stddev, p95)
4. Store baseline with version tag
**Parameters:**
- `--runs N` - Number of measurement runs (default: 10)
- `--tag TAG` - Custom tag (default: version from marketplace.json)
**Output Example:**
```
[perf-sentinel] Creating baseline for v5.4.0
============================================
Running session startup measurements (10 runs)...
Run 1: 487ms
Run 2: 512ms
...
Run 10: 478ms
Running edit cycle measurements (10 runs)...
Run 1: 156ms
...
Baseline Statistics:
┌────────────────────┬─────────┬─────────┬─────────┬─────────┐
│ Metric │ Mean │ StdDev │ P50 │ P95 │
├────────────────────┼─────────┼─────────┼─────────┼─────────┤
│ Session startup │ 487ms │ 42ms │ 481ms │ 534ms │
│ Edit cycle │ 156ms │ 23ms │ 152ms │ 189ms │
│ Hook count │ 16 │ - │ - │ - │
│ Total hook code │ 1,493 │ - │ - │ - │
└────────────────────┴─────────┴─────────┴─────────┴─────────┘
Baseline saved: ~/.cache/claude-plugins/perf-sentinel/baselines/v5.4.0.json
```
**Storage Format:**
```json
{
"version": "5.4.0",
"timestamp": "2026-01-28T12:00:00Z",
"measurements": {
"session_startup": {
"runs": [487, 512, 478, ...],
"mean": 487,
"stddev": 42,
"p50": 481,
"p95": 534
},
"edit_cycle": {
"runs": [156, 162, 148, ...],
"mean": 156,
"stddev": 23,
"p50": 152,
"p95": 189
}
},
"metadata": {
"hook_count": 16,
"total_hook_lines": 1493,
"plugins": ["projman", "pr-review", ...]
}
}
```
---
### Command: `/perf-compare`
**Purpose:** Compare current performance against baseline
**Behavior:**
1. Load baseline (latest or specified)
2. Run current measurements
3. Compare with statistical significance testing
4. Flag regressions exceeding threshold (default: 10%)
**Parameters:**
- `--baseline TAG` - Specific baseline to compare against
- `--threshold N` - Regression threshold percentage (default: 10)
**Output Example:**
```
[perf-sentinel] Performance Comparison
======================================
Baseline: v5.4.0 (2026-01-28)
Current: v5.5.0-dev
Baseline Current Change Status
┌────────────────────┬───────────┬───────────┬───────────┬──────────┐
│ Session startup │ 487ms │ 612ms │ +125ms │ ⚠️ +26% │
│ Edit cycle │ 156ms │ 148ms │ -8ms │ ✅ -5% │
│ Hook count │ 16 │ 18 │ +2 │
└────────────────────┴───────────┴───────────┴───────────┴──────────┘
⚠️ REGRESSION DETECTED: Session startup
Root Cause Analysis:
New hooks since baseline:
+ data-platform/startup-check.sh (SessionStart) → +134ms
+ data-platform/schema-diff-check.sh (PostToolUse)
Modified hooks:
~ projman/startup-check.sh → +12ms (added version check)
Recommendation:
Review data-platform SessionStart hook - consider lazy loading
Run /perf-optimize for detailed suggestions
```
---
### Command: `/benchmark-hooks`
**Purpose:** Detailed benchmarking of specific hooks
**Behavior:**
1. Run specified hook(s) multiple times with mock input
2. Provide statistical analysis
3. Optional: use hyperfine if available
**Parameters:**
- `HOOK_PATH` - Path to specific hook, or "all"
- `--runs N` - Number of runs (default: 20)
- `--warmup N` - Warmup runs (default: 3)
**Output Example:**
```
[perf-sentinel] Hook Benchmark: projman/startup-check.sh
========================================================
Configuration:
Runs: 20 (+ 3 warmup)
Input: mock SessionStart payload
Tool: hyperfine (detected)
Results:
Mean: 342.3ms
StdDev: 28.1ms
Min: 298ms
Max: 412ms
P50: 338ms
P95: 389ms
Time Breakdown (via strace):
Network I/O: 267ms (78%) ← Gitea API calls
File I/O: 42ms (12%)
Process: 33ms (10%)
Recommendation:
Network calls dominate. Consider:
- Caching API responses
- Making calls async/lazy
- Reducing call frequency
```
---
### Command: `/perf-optimize`
**Purpose:** Generate actionable optimization plan
**Behavior:**
1. Run audit if not recently run
2. Analyze patterns and anti-patterns
3. Generate prioritized optimization plan
4. Estimate impact of each optimization
**Output Example:**
```
[perf-sentinel] Optimization Plan
=================================
Based on audit from 2026-01-28 12:00
PRIORITY 1 - High Impact, Low Effort
────────────────────────────────────
1. Remove viz-platform SessionStart hook
File: plugins/viz-platform/hooks/hooks.json
Action: Delete the hook (provides no value)
Impact: -2ms startup
Effort: 1 line change
2. Make clarity-assist opt-in
File: plugins/clarity-assist/hooks/vagueness-check.sh
Action: Change default from true to false
Impact: -50ms per prompt (when disabled)
Effort: 1 line change
PRIORITY 2 - High Impact, Medium Effort
───────────────────────────────────────
3. Add early-exit to git-flow hooks
Files: plugins/git-flow/hooks/branch-check.sh
plugins/git-flow/hooks/commit-msg-check.sh
Action: Check for "git" in command before full parsing
Impact: -20ms per non-git bash command
Effort: ~10 lines each
4. Add cooldown to project-hygiene
File: plugins/project-hygiene/hooks/cleanup.sh
Action: Skip if ran in last 5 minutes
Impact: -89ms per edit (when cooling down)
Effort: ~15 lines
PRIORITY 3 - High Impact, High Effort
─────────────────────────────────────
5. Consolidate SessionStart hooks
Action: Create central dispatcher, remove individual hooks
Impact: -400ms startup (estimated)
Effort: New architecture, ~200 lines
6. Create shared hook library
Action: Extract common JSON parsing, path filtering
Impact: Reduced maintenance, consistent behavior
Effort: ~100 lines library + refactor all hooks
ESTIMATED TOTAL IMPACT
──────────────────────
If all optimizations implemented:
Session startup: 487ms → ~180ms (63% reduction)
Edit cycle: 156ms → ~60ms (62% reduction)
Implementation order recommendation:
Sprint 1: Items 1-4 (quick wins)
Sprint 2: Items 5-6 (architecture)
```
---
### Hook: Session Metrics (Lightweight)
**File:** `hooks/hooks.json`
```json
{
"hooks": {
"SessionEnd": [
{
"type": "command",
"command": "${CLAUDE_PLUGIN_ROOT}/hooks/session-metrics.sh"
}
]
}
}
```
**Design Principles:**
1. **SessionEnd only** - No per-tool overhead
2. **Fast execution** - Must complete in < 10ms
3. **Minimal I/O** - Single file write
4. **No network** - Never make HTTP calls
**Script:** `hooks/session-metrics.sh`
```bash
#!/bin/bash
# perf-sentinel session metrics collector
# MUST be lightweight - runs on every session end
# Metrics directory
METRICS_DIR="$HOME/.cache/claude-plugins/perf-sentinel/sessions"
TODAY=$(date +%Y-%m-%d)
mkdir -p "$METRICS_DIR/$TODAY" 2>/dev/null
# Collect only what's cheap to measure
SESSION_ID="${CLAUDE_SESSION_ID:-$(date +%s)}"
DURATION="${CLAUDE_SESSION_DURATION_MS:-0}"
# Single atomic write
cat > "$METRICS_DIR/$TODAY/$SESSION_ID.json" << EOF
{"ts":"$(date -Iseconds)","dur":$DURATION,"id":"$SESSION_ID"}
EOF
exit 0
```
---
### Agent: perf-analyzer
**Purpose:** AI-powered performance analysis and recommendations
**Capabilities:**
1. Analyze collected metrics for trends
2. Identify anomalies and regressions
3. Generate natural language recommendations
4. Cross-reference with known performance patterns
**Model:** `sonnet` (good balance of speed and analysis quality)
---
### Agent: benchmark-runner
**Purpose:** Execute systematic benchmarks
**Capabilities:**
1. Run benchmarks with proper isolation
2. Handle warmup and cooldown
3. Statistical analysis of results
4. Integration with external tools when available
**Model:** `haiku` (simple task execution)
---
## Data Storage
### Directory Structure
```
~/.cache/claude-plugins/perf-sentinel/
├── baselines/
│ ├── v5.4.0.json
│ ├── v5.5.0.json
│ └── latest -> v5.5.0.json
├── sessions/
│ ├── 2026-01-27/
│ │ ├── session-001.json
│ │ └── session-002.json
│ └── 2026-01-28/
│ └── session-001.json
├── audits/
│ └── 2026-01-28.json
├── benchmarks/
│ └── hooks-2026-01-28.json
└── config.json
```
### Retention Policy
- Sessions: 30 days
- Audits: 90 days
- Baselines: Indefinite
- Benchmarks: 90 days
---
## External Tool Integration
### Detection and Fallback
```bash
# In scripts/lib/timing.sh
benchmark_command() {
local cmd="$1"
local runs="${2:-10}"
if command -v hyperfine &> /dev/null; then
# Use hyperfine for statistical benchmarking
hyperfine --runs "$runs" --warmup 3 --export-json /tmp/bench.json "$cmd"
parse_hyperfine_output /tmp/bench.json
else
# Fallback to basic timing
local total=0
for i in $(seq 1 "$runs"); do
start=$(date +%s%N)
eval "$cmd" > /dev/null 2>&1
end=$(date +%s%N)
elapsed=$(( (end - start) / 1000000 ))
total=$((total + elapsed))
done
echo $((total / runs))
fi
}
```
### Supported Tools
| Tool | Purpose | Required | Detection |
|------|---------|----------|-----------|
| `time` (GNU) | Basic timing | Yes (built-in) | Always available |
| `hyperfine` | Statistical benchmarks | No | `command -v hyperfine` |
| `strace` | Syscall analysis | No | `command -v strace` |
| `ps` | Memory tracking | Yes (built-in) | Always available |
| `perf` | CPU profiling | No | `command -v perf` |
---
## Configuration
### Config File: `~/.cache/claude-plugins/perf-sentinel/config.json`
```json
{
"metrics": {
"enabled": true,
"retention_days": 30
},
"thresholds": {
"regression_percent": 10,
"slow_hook_ms": 100,
"slow_session_ms": 500
},
"external_tools": {
"prefer_hyperfine": true,
"use_strace": false
},
"notifications": {
"on_regression": true,
"on_slow_session": false
}
}
```
### Environment Variables
| Variable | Purpose | Default |
|----------|---------|---------|
| `PERF_SENTINEL_ENABLED` | Enable/disable metrics collection | `true` |
| `PERF_SENTINEL_VERBOSE` | Verbose output | `false` |
| `PERF_SENTINEL_THRESHOLD` | Regression threshold % | `10` |
---
## Implementation Plan
### Phase 1: Core Infrastructure (Sprint 1)
1. Create plugin structure
2. Implement `/perf-audit` command
3. Implement lightweight SessionEnd hook
4. Create scripts/lib/ utilities
### Phase 2: Baseline System (Sprint 1)
5. Implement `/perf-baseline` command
6. Implement `/perf-compare` command
7. Set up data storage structure
### Phase 3: Advanced Features (Sprint 2)
8. Implement `/benchmark-hooks` command
9. Implement `/perf-optimize` command
10. Create perf-analyzer agent
11. Add external tool integration
### Phase 4: Polish (Sprint 2)
12. Add configuration system
13. Create README and documentation
14. Add to marketplace.json
15. Integration testing
---
## Success Metrics
| Metric | Target |
|--------|--------|
| Audit execution time | < 5 seconds |
| Metrics hook overhead | < 10ms |
| Regression detection accuracy | > 90% |
| False positive rate | < 5% |
---
## Risks and Mitigations
| Risk | Mitigation |
|------|------------|
| Metrics collection adds overhead | SessionEnd only, < 10ms budget |
| Benchmark results vary by system | Statistical analysis, warmup runs |
| External tools not available | Graceful fallback to built-in |
| Storage grows unbounded | Retention policy, auto-cleanup |
---
## Open Questions
1. Should `/perf-audit` run automatically on version changes?
2. Should we integrate with CI/CD for automated regression testing?
3. Should baseline comparisons be part of `/sprint-close`?
4. Should we track memory usage in addition to time?
---
## Related RFCs
- [RFC-Hook-Efficiency-Improvements](./RFC-Hook-Efficiency-Improvements) - The manual analysis that motivated this plugin
---
## Changelog
| Date | Change |
|------|--------|
| 2026-01-28 | Initial RFC created |