
RFC: perf-sentinel Plugin

Status: Proposal
Created: 2026-01-28
Author: Claude Code Analysis
Priority: Medium
Estimated Effort: 2 sprints
Depends On: None (can be implemented independently)


Executive Summary

Create a new plugin dedicated to performance profiling, benchmarking, and optimization of the plugin/hook ecosystem. This plugin automates the kind of manual analysis performed in RFC-Hook-Efficiency-Improvements, providing continuous monitoring and regression detection.


Motivation

Problem Statement

  1. Manual analysis is time-consuming - The hook efficiency review required reading 15+ files and manual assessment
  2. No regression detection - Performance degradations go unnoticed until they become painful
  3. No baseline tracking - We don't know if changes improve or worsen performance
  4. Optimization suggestions require expertise - Not all users can identify inefficiencies

Goals

  1. Automate performance audits
  2. Establish and track performance baselines
  3. Detect regressions before they accumulate
  4. Provide actionable optimization recommendations
  5. Keep monitoring overhead minimal (< 10ms per session)

Proposed Solution

Plugin Overview

Attribute     Value
Name          perf-sentinel
Version       1.0.0
Category      Developer Tools
Dependencies  None (external tools optional)

Directory Structure

plugins/perf-sentinel/
├── .claude-plugin/
│   └── plugin.json
├── commands/
│   ├── perf-audit.md           # Full performance audit
│   ├── perf-baseline.md        # Establish performance baseline
│   ├── perf-compare.md         # Compare against baseline
│   ├── benchmark-hooks.md      # Benchmark specific hooks
│   ├── profile-session.md      # Profile session startup/operations
│   └── perf-optimize.md        # Generate optimization suggestions
├── hooks/
│   ├── hooks.json
│   └── session-metrics.sh      # Lightweight session-end metrics
├── agents/
│   ├── perf-analyzer.md        # Analyze data, generate recommendations
│   └── benchmark-runner.md     # Run systematic benchmarks
├── scripts/
│   ├── hook-profiler.sh        # Profile individual hooks
│   ├── metrics-collector.sh    # Collect system metrics
│   └── lib/
│       ├── timing.sh           # Timing utilities
│       ├── metrics.sh          # Metrics collection/storage
│       └── reporting.sh        # Report generation
├── skills/
│   └── perf-patterns/
│       └── common-issues.md    # Knowledge base of common perf issues
└── README.md

Feature Specifications

Command: /perf-audit

Purpose: Automated performance audit, replacing the manual review performed in RFC-Hook-Efficiency-Improvements

Behavior:

  1. Enumerate all hooks across all plugins
  2. Measure execution time of each hook (dry run with mock input)
  3. Analyze hook scripts for known anti-patterns
  4. Check for redundant operations across hooks
  5. Generate report with recommendations
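
A minimal enumeration pass is sketched below; it assumes the plugins/*/hooks/hooks.json layout shown under Directory Structure and that jq is available (names such as PLUGINS_ROOT are illustrative, not part of the spec):

#!/bin/bash
# Sketch: enumerate every hook registered across all plugins
PLUGINS_ROOT="${PLUGINS_ROOT:-plugins}"

for manifest in "$PLUGINS_ROOT"/*/hooks/hooks.json; do
    [ -f "$manifest" ] || continue
    plugin=$(basename "$(dirname "$(dirname "$manifest")")")
    # Emit one "event -> command" line per hook for later timing/analysis
    jq -r '.hooks | to_entries[] | "\(.key)\t\(.value[].command)"' "$manifest" |
        while IFS=$'\t' read -r event cmd; do
            echo "$plugin: $event -> $cmd"
        done
done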

Output Example:

[perf-sentinel] Performance Audit - 2026-01-28
==============================================

HOOK ANALYSIS (16 hooks across 12 plugins)

SessionStart Hooks (7):
┌─────────────────────────────────┬──────────┬─────────────────────────────┐
│ Hook                            │ Time     │ Concerns                    │
├─────────────────────────────────┼──────────┼─────────────────────────────┤
│ projman/startup-check.sh        │ 342ms    │ ⚠️ HTTP calls, version check │
│ cmdb-assistant/startup-check.sh │ 287ms    │ ⚠️ HTTP calls (NetBox API)   │
│ contract-validator/auto-valid.. │ 156ms    │ MD5 hashing                 │
│ data-platform/startup-check.sh  │ 134ms    │ ⚠️ Python + DB connection    │
│ claude-config-maintainer/enf..  │ 45ms     │ File I/O                    │
│ pr-review/startup-check.sh      │ 38ms     │ OK                          │
│ viz-platform (inline echo)      │ 2ms      │ ⚠️ No value                  │
└─────────────────────────────────┴──────────┴─────────────────────────────┘
Total SessionStart overhead: 1,004ms

PostToolUse Hooks (4):
┌─────────────────────────────────┬──────────┬─────────────────────────────┐
│ Hook                            │ Time     │ Concerns                    │
├─────────────────────────────────┼──────────┼─────────────────────────────┤
│ project-hygiene/cleanup.sh      │ 89ms     │ ⚠️ find across project       │
│ contract-validator/breaking-..  │ 67ms     │ git operations              │
│ data-platform/schema-diff-ch..  │ 34ms     │ OK                          │
│ doc-guardian/notify.sh          │ 12ms     │ OK                          │
└─────────────────────────────────┴──────────┴─────────────────────────────┘
Avg PostToolUse overhead: 50ms per edit

REDUNDANCY ANALYSIS
  - MCP venv check: duplicated in 3 hooks
  - Git remote check: duplicated in 2 hooks
  - JSON parsing: 8 independent implementations

ANTI-PATTERNS DETECTED
  1. HTTP calls in SessionStart (2 hooks)
  2. Full project find in PostToolUse (1 hook)
  3. Broad Bash matcher triggering on all commands (2 hooks)

RECOMMENDATIONS (see /perf-optimize for details)
  1. Consolidate SessionStart hooks → est. -400ms
  2. Add cooldown to project-hygiene → est. -50ms/edit
  3. Remove viz-platform echo hook → est. -2ms
  4. Add early-exit to git-flow hooks → est. -20ms/cmd

Overall Score: 62/100 (Needs Improvement)

Anti-patterns to detect:

Pattern                        Detection Method                                    Severity
HTTP calls in SessionStart     grep for curl, wget, API URLs                       High
Full project find              grep for find "$PROJECT" or find .                  High
Python spawn for simple tasks  grep for python in shell scripts                    Medium
No early exit                  Check if script exits quickly on irrelevant input   Medium
Broad Bash matcher             Check hooks.json for "matcher": "Bash"              Medium
Duplicate logic                Compare function signatures across hooks            Low
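
A first-cut detection pass for the static patterns in this table can be plain grep; the expressions below are illustrative rather than a final rule set:

# Sketch: flag known anti-patterns in the hook script passed as $1
check_hook_script() {
    local script="$1"

    # High severity: network calls are especially costly in SessionStart hooks
    grep -Eq 'curl|wget' "$script" && echo "HIGH: HTTP call in $script"

    # High severity: full-project traversal on every trigger
    grep -Eq 'find ("\$PROJECT"|\.)' "$script" && echo "HIGH: full project find in $script"

    # Medium severity: spawning Python for work a shell builtin could do
    grep -q 'python' "$script" && echo "MEDIUM: python spawn in $script"
}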

Command: /perf-baseline

Purpose: Establish performance baseline for current version

Behavior:

  1. Run multiple session startup measurements (default: 10)
  2. Run multiple edit cycle measurements (default: 10)
  3. Calculate statistics (mean, stddev, p95)
  4. Store baseline with version tag

Parameters:

  • --runs N - Number of measurement runs (default: 10)
  • --tag TAG - Custom tag (default: version from marketplace.json)
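
The statistics can come from a small awk pass over the raw run times; a sketch using sample stddev and nearest-rank percentiles (assumes at least two runs):

# Sketch: mean / stddev / p50 / p95 from one runtime (ms) per line on stdin
compute_stats() {
    sort -n | awk '
        { v[NR] = $1; sum += $1 }
        END {
            mean = sum / NR
            for (i = 1; i <= NR; i++) ss += (v[i] - mean) ^ 2
            stddev = sqrt(ss / (NR - 1))     # sample stddev; needs NR >= 2
            p50 = v[int(NR * 0.50 + 0.5)]    # nearest-rank percentiles
            p95 = v[int(NR * 0.95 + 0.5)]
            printf "mean=%.0f stddev=%.0f p50=%d p95=%d\n", mean, stddev, p50, p95
        }'
}

# Usage: printf '%s\n' 487 512 478 | compute_stats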

Output Example:

[perf-sentinel] Creating baseline for v5.4.0
============================================

Running session startup measurements (10 runs)...
  Run 1: 487ms
  Run 2: 512ms
  ...
  Run 10: 478ms

Running edit cycle measurements (10 runs)...
  Run 1: 156ms
  ...

Baseline Statistics:
┌────────────────────┬─────────┬─────────┬─────────┬─────────┐
│ Metric             │ Mean    │ StdDev  │ P50     │ P95     │
├────────────────────┼─────────┼─────────┼─────────┼─────────┤
│ Session startup    │ 487ms   │ 42ms    │ 481ms   │ 534ms   │
│ Edit cycle         │ 156ms   │ 23ms    │ 152ms   │ 189ms   │
│ Hook count         │ 16      │ -       │ -       │ -       │
│ Total hook lines   │ 1,493   │ -       │ -       │ -       │
└────────────────────┴─────────┴─────────┴─────────┴─────────┘

Baseline saved: ~/.cache/claude-plugins/perf-sentinel/baselines/v5.4.0.json

Storage Format:

{
  "version": "5.4.0",
  "timestamp": "2026-01-28T12:00:00Z",
  "measurements": {
    "session_startup": {
      "runs": [487, 512, 478, ...],
      "mean": 487,
      "stddev": 42,
      "p50": 481,
      "p95": 534
    },
    "edit_cycle": {
      "runs": [156, 162, 148, ...],
      "mean": 156,
      "stddev": 23,
      "p50": 152,
      "p95": 189
    }
  },
  "metadata": {
    "hook_count": 16,
    "total_hook_lines": 1493,
    "plugins": ["projman", "pr-review", ...]
  }
}

Command: /perf-compare

Purpose: Compare current performance against baseline

Behavior:

  1. Load baseline (latest or specified)
  2. Run current measurements
  3. Compare with statistical significance testing
  4. Flag regressions exceeding threshold (default: 10%)

Parameters:

  • --baseline TAG - Specific baseline to compare against
  • --threshold N - Regression threshold percentage (default: 10)
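
The core of the comparison is a percentage delta against the stored baseline; a sketch assuming jq and the storage format shown under /perf-baseline:

# Sketch: flag a regression when the current mean exceeds the baseline mean
# current_mean is assumed to be measured the same way /perf-baseline measures it
current_mean="$1"
baseline_mean=$(jq '.measurements.session_startup.mean' "$BASELINE_FILE")
threshold="${PERF_SENTINEL_THRESHOLD:-10}"

delta_pct=$(( (current_mean - baseline_mean) * 100 / baseline_mean ))
if [ "$delta_pct" -gt "$threshold" ]; then
    echo "REGRESSION: session startup +${delta_pct}% vs baseline"
fi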

Output Example:

[perf-sentinel] Performance Comparison
======================================
Baseline: v5.4.0 (2026-01-28)
Current:  v5.5.0-dev

                      Baseline    Current     Change      Status
┌────────────────────┬───────────┬───────────┬───────────┬──────────┐
│ Session startup    │ 487ms     │ 612ms     │ +125ms    │ ⚠️ +26%   │
│ Edit cycle         │ 156ms     │ 148ms     │ -8ms      │ ✅ -5%    │
│ Hook count         │ 16        │ 18        │ +2        │         │
└────────────────────┴───────────┴───────────┴───────────┴──────────┘

⚠️ REGRESSION DETECTED: Session startup

Root Cause Analysis:
  New hooks since baseline:
    + data-platform/startup-check.sh (SessionStart) → +134ms
    + data-platform/schema-diff-check.sh (PostToolUse)
  
  Modified hooks:
    ~ projman/startup-check.sh → +12ms (added version check)

Recommendation:
  Review data-platform SessionStart hook - consider lazy loading
  Run /perf-optimize for detailed suggestions

Command: /benchmark-hooks

Purpose: Detailed benchmarking of specific hooks

Behavior:

  1. Run specified hook(s) multiple times with mock input
  2. Provide statistical analysis
  3. Optional: use hyperfine if available

Parameters:

  • HOOK_PATH - Path to specific hook, or "all"
  • --runs N - Number of runs (default: 20)
  • --warmup N - Warmup runs (default: 3)
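
The dry run amounts to piping a mock payload into the hook and timing it; a sketch (the payload fields are illustrative and should mirror whatever the hook actually reads):

# Sketch: time one hook invocation with a mock SessionStart payload on stdin
MOCK_PAYLOAD='{"hook_event_name":"SessionStart","session_id":"bench-000"}'

start=$(date +%s%N)
echo "$MOCK_PAYLOAD" | bash "$HOOK_PATH" > /dev/null 2>&1
end=$(date +%s%N)
echo "elapsed: $(( (end - start) / 1000000 ))ms"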

Output Example:

[perf-sentinel] Hook Benchmark: projman/startup-check.sh
========================================================

Configuration:
  Runs: 20 (+ 3 warmup)
  Input: mock SessionStart payload
  Tool: hyperfine (detected)

Results:
  Mean:     342.3ms
  StdDev:   28.1ms
  Min:      298ms
  Max:      412ms
  P50:      338ms
  P95:      389ms

Time Breakdown (via strace):
  Network I/O:  267ms (78%)  ← Gitea API calls
  File I/O:      42ms (12%)
  Process:       33ms (10%)

Recommendation:
  Network calls dominate. Consider:
  - Caching API responses
  - Making calls async/lazy
  - Reducing call frequency

Command: /perf-optimize

Purpose: Generate actionable optimization plan

Behavior:

  1. Run audit if not recently run
  2. Analyze patterns and anti-patterns
  3. Generate prioritized optimization plan
  4. Estimate impact of each optimization

Output Example:

[perf-sentinel] Optimization Plan
=================================

Based on audit from 2026-01-28 12:00

PRIORITY 1 - High Impact, Low Effort
────────────────────────────────────
1. Remove viz-platform SessionStart hook
   File: plugins/viz-platform/hooks/hooks.json
   Action: Delete the hook (provides no value)
   Impact: -2ms startup
   Effort: 1 line change

2. Make clarity-assist opt-in
   File: plugins/clarity-assist/hooks/vagueness-check.sh
   Action: Change default from true to false
   Impact: -50ms per prompt (when disabled)
   Effort: 1 line change

PRIORITY 2 - High Impact, Medium Effort
───────────────────────────────────────
3. Add early-exit to git-flow hooks
   Files: plugins/git-flow/hooks/branch-check.sh
          plugins/git-flow/hooks/commit-msg-check.sh
   Action: Check for "git" in command before full parsing
   Impact: -20ms per non-git bash command
   Effort: ~10 lines each

4. Add cooldown to project-hygiene
   File: plugins/project-hygiene/hooks/cleanup.sh
   Action: Skip if ran in last 5 minutes
   Impact: -89ms per edit (when cooling down)
   Effort: ~15 lines

PRIORITY 3 - High Impact, High Effort
─────────────────────────────────────
5. Consolidate SessionStart hooks
   Action: Create central dispatcher, remove individual hooks
   Impact: -400ms startup (estimated)
   Effort: New architecture, ~200 lines

6. Create shared hook library
   Action: Extract common JSON parsing, path filtering
   Impact: Reduced maintenance, consistent behavior
   Effort: ~100 lines library + refactor all hooks

ESTIMATED TOTAL IMPACT
──────────────────────
If all optimizations implemented:
  Session startup: 487ms → ~180ms (63% reduction)
  Edit cycle: 156ms → ~60ms (62% reduction)

Implementation order recommendation:
  Sprint 1: Items 1-4 (quick wins)
  Sprint 2: Items 5-6 (architecture)

Hook: Session Metrics (Lightweight)

File: hooks/hooks.json

{
  "hooks": {
    "SessionEnd": [
      {
        "type": "command",
        "command": "${CLAUDE_PLUGIN_ROOT}/hooks/session-metrics.sh"
      }
    ]
  }
}

Design Principles:

  1. SessionEnd only - No per-tool overhead
  2. Fast execution - Must complete in < 10ms
  3. Minimal I/O - Single file write
  4. No network - Never make HTTP calls

Script: hooks/session-metrics.sh

#!/bin/bash
# perf-sentinel session metrics collector
# MUST be lightweight - runs on every session end

# Metrics directory
METRICS_DIR="$HOME/.cache/claude-plugins/perf-sentinel/sessions"
TODAY=$(date +%Y-%m-%d)
mkdir -p "$METRICS_DIR/$TODAY" 2>/dev/null

# Collect only what's cheap to measure
SESSION_ID="${CLAUDE_SESSION_ID:-$(date +%s)}"
DURATION="${CLAUDE_SESSION_DURATION_MS:-0}"

# Write to a temp file, then rename into place (rename is atomic)
TMP="$METRICS_DIR/$TODAY/.$SESSION_ID.tmp"
cat > "$TMP" << EOF
{"ts":"$(date -Iseconds)","dur":$DURATION,"id":"$SESSION_ID"}
EOF
mv "$TMP" "$METRICS_DIR/$TODAY/$SESSION_ID.json"

exit 0

Agent: perf-analyzer

Purpose: AI-powered performance analysis and recommendations

Capabilities:

  1. Analyze collected metrics for trends
  2. Identify anomalies and regressions
  3. Generate natural language recommendations
  4. Cross-reference with known performance patterns

Model: sonnet (good balance of speed and analysis quality)


Agent: benchmark-runner

Purpose: Execute systematic benchmarks

Capabilities:

  1. Run benchmarks with proper isolation
  2. Handle warmup and cooldown
  3. Statistical analysis of results
  4. Integration with external tools when available

Model: haiku (simple task execution)


Data Storage

Directory Structure

~/.cache/claude-plugins/perf-sentinel/
├── baselines/
│   ├── v5.4.0.json
│   ├── v5.5.0.json
│   └── latest -> v5.5.0.json
├── sessions/
│   ├── 2026-01-27/
│   │   ├── session-001.json
│   │   └── session-002.json
│   └── 2026-01-28/
│       └── session-001.json
├── audits/
│   └── 2026-01-28.json
├── benchmarks/
│   └── hooks-2026-01-28.json
└── config.json

Retention Policy

  • Sessions: 30 days
  • Audits: 90 days
  • Baselines: Indefinite
  • Benchmarks: 90 days
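
Enforcing the policy fits in a few lines of find against the layout above; a sketch (the real implementation would read the retention values from config.json):

# Sketch: prune stored metrics past their retention window
CACHE="$HOME/.cache/claude-plugins/perf-sentinel"

find "$CACHE/sessions"   -mindepth 1 -mtime +30 -delete 2>/dev/null
find "$CACHE/audits"     -type f -mtime +90 -delete 2>/dev/null
find "$CACHE/benchmarks" -type f -mtime +90 -delete 2>/dev/null
# baselines/ is deliberately left alone (kept indefinitely)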

External Tool Integration

Detection and Fallback

# In scripts/lib/timing.sh

benchmark_command() {
    local cmd="$1"
    local runs="${2:-10}"
    
    if command -v hyperfine &> /dev/null; then
        # Use hyperfine for statistical benchmarking
        hyperfine --runs "$runs" --warmup 3 --export-json /tmp/bench.json "$cmd"
        parse_hyperfine_output /tmp/bench.json
    else
        # Fallback: wall-clock timing via GNU date's nanosecond counter
        local total=0 start end elapsed i
        for i in $(seq 1 "$runs"); do
            start=$(date +%s%N)
            eval "$cmd" > /dev/null 2>&1
            end=$(date +%s%N)
            elapsed=$(( (end - start) / 1000000 ))  # ns -> ms
            total=$((total + elapsed))
        done
        echo $((total / runs))  # mean runtime in ms
    fi
}
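
parse_hyperfine_output is not defined above; a plausible jq-based sketch, assuming hyperfine's standard --export-json schema (times reported in seconds under results[]):

# Sketch: extract the mean runtime (ms) from hyperfine's JSON export
parse_hyperfine_output() {
    local json="$1"
    # hyperfine reports seconds; convert to whole milliseconds
    jq -r '.results[0].mean * 1000 | round' "$json"
}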

Supported Tools

Tool        Purpose                  Required        Detection
time (GNU)  Basic timing             Yes (built-in)  Always available
hyperfine   Statistical benchmarks   No              command -v hyperfine
strace      Syscall analysis         No              command -v strace
ps          Memory tracking          Yes (built-in)  Always available
perf        CPU profiling            No              command -v perf

Configuration

Config File: ~/.cache/claude-plugins/perf-sentinel/config.json

{
  "metrics": {
    "enabled": true,
    "retention_days": 30
  },
  "thresholds": {
    "regression_percent": 10,
    "slow_hook_ms": 100,
    "slow_session_ms": 500
  },
  "external_tools": {
    "prefer_hyperfine": true,
    "use_strace": false
  },
  "notifications": {
    "on_regression": true,
    "on_slow_session": false
  }
}

Environment Variables

Variable                 Purpose                             Default
PERF_SENTINEL_ENABLED    Enable/disable metrics collection   true
PERF_SENTINEL_VERBOSE    Verbose output                      false
PERF_SENTINEL_THRESHOLD  Regression threshold (%)            10
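
Every script in the plugin should honor PERF_SENTINEL_ENABLED before doing any work; a minimal guard for the top of session-metrics.sh:

# Bail out immediately when metrics collection is disabled
if [ "${PERF_SENTINEL_ENABLED:-true}" != "true" ]; then
    exit 0
fi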

Implementation Plan

Phase 1: Core Infrastructure (Sprint 1)

  1. Create plugin structure
  2. Implement /perf-audit command
  3. Implement lightweight SessionEnd hook
  4. Create scripts/lib/ utilities

Phase 2: Baseline System (Sprint 1)

  1. Implement /perf-baseline command
  2. Implement /perf-compare command
  3. Set up data storage structure

Phase 3: Advanced Features (Sprint 2)

  1. Implement /benchmark-hooks command
  2. Implement /perf-optimize command
  3. Create perf-analyzer agent
  4. Add external tool integration

Phase 4: Polish (Sprint 2)

  1. Add configuration system
  2. Create README and documentation
  3. Add to marketplace.json
  4. Integration testing

Success Metrics

Metric                         Target
Audit execution time           < 5 seconds
Metrics hook overhead          < 10ms
Regression detection accuracy  > 90%
False positive rate            < 5%

Risks and Mitigations

Risk                               Mitigation
Metrics collection adds overhead   SessionEnd only, < 10ms budget
Benchmark results vary by system   Statistical analysis, warmup runs
External tools not available       Graceful fallback to built-in timing
Storage grows unbounded            Retention policy, auto-cleanup

Open Questions

  1. Should /perf-audit run automatically on version changes?
  2. Should we integrate with CI/CD for automated regression testing?
  3. Should baseline comparisons be part of /sprint-close?
  4. Should we track memory usage in addition to time?


Changelog

Date        Change
2026-01-28  Initial RFC created