leo-claude-mktplace/plugins/projman/skills/runaway-detection.md
lmiranda 2e65b60725 refactor(projman): extract skills and consolidate commands
Major refactoring of projman plugin architecture:

Skills Extraction (17 new files):
- Extracted reusable knowledge from commands and agents into skills/
- branch-security, dependency-management, git-workflow, input-detection
- issue-conventions, lessons-learned, mcp-tools-reference, planning-workflow
- progress-tracking, repo-validation, review-checklist, runaway-detection
- setup-workflows, sprint-approval, task-sizing, test-standards, wiki-conventions

Command Consolidation (17 → 12 commands):
- /setup: consolidates initial-setup, project-init, project-sync (--full/--quick/--sync)
- /debug: consolidates debug-report, debug-review (report/review modes)
- /test: consolidates test-check, test-gen (run/gen modes)
- /sprint-status: absorbs sprint-diagram via --diagram flag

Architecture Cleanup:
- Remove plugin-level mcp-servers/ symlinks (6 plugins)
- Remove plugin README.md files (12 files, ~2000 lines)
- Update all documentation to reflect new command structure
- Fix documentation drift in CONFIGURATION.md, COMMANDS-CHEATSHEET.md

Commands are now thin dispatchers (~20-50 lines) that reference skills.
Agents reference skills for domain knowledge instead of inline content.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 15:02:16 -05:00

| name | description |
| --- | --- |
| runaway-detection | Detecting and handling stuck agents |

Runaway Detection

Purpose

Defines how to detect stuck agents and how to intervene once one is found.

When to Use

  • Orchestrator agent: When monitoring dispatched agents
  • Executor agent: Self-monitoring during execution

Warning Signs

| Sign | Threshold | Action |
| --- | --- | --- |
| No progress comment | 30+ minutes | Investigate |
| Same phase repeated | 20+ tool calls | Consider stopping |
| Same error 3+ times | Immediately | Stop agent |
| Approaching budget | 80% of limit | Post checkpoint |
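
The thresholds in this table can be expressed as a small check. The following is a minimal sketch, not part of projman: `AgentMetrics`, its field names, and `warning_actions` are hypothetical, and it assumes the orchestrator has already extracted these numbers from the agent's progress comments and that the budget is counted in tool calls.

```python
from dataclasses import dataclass


@dataclass
class AgentMetrics:
    """Hypothetical snapshot of one agent; field names are illustrative."""
    minutes_since_progress: float
    calls_in_same_phase: int
    same_error_count: int
    calls_used: int
    call_budget: int


def warning_actions(m: AgentMetrics) -> list[str]:
    """Return the actions from the Warning Signs table that apply to `m`."""
    actions = []
    if m.minutes_since_progress >= 30:
        actions.append("Investigate")
    if m.calls_in_same_phase >= 20:
        actions.append("Consider stopping")
    if m.same_error_count >= 3:
        actions.append("Stop agent")
    # Integer comparison avoids floating-point rounding at exactly 80%.
    if m.call_budget > 0 and m.calls_used * 100 >= m.call_budget * 80:
        actions.append("Post checkpoint")
    return actions
```

Returning every matching action, rather than only the most severe one, leaves the final decision to the caller.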

Agent Timeout Guidelines

| Task Size | Expected Duration | Intervention Point |
| --- | --- | --- |
| XS | ~5-10 min | 15 min no progress |
| S | ~10-20 min | 30 min no progress |
| M | ~20-40 min | 45 min no progress |
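
As an illustration, the intervention points can be kept in a lookup keyed by task size. This is a sketch under assumptions: `INTERVENTION_MINUTES` and `should_intervene` are hypothetical names, and sizes outside the table are treated as an error rather than guessed.

```python
# Intervention points in minutes, copied from the Agent Timeout Guidelines table.
INTERVENTION_MINUTES = {"XS": 15, "S": 30, "M": 45}


def should_intervene(task_size: str, minutes_without_progress: float) -> bool:
    """True once an agent has reached the intervention point for its task size."""
    try:
        limit = INTERVENTION_MINUTES[task_size.upper()]
    except KeyError:
        # The table only covers XS, S, and M; do not invent a limit for other sizes.
        raise ValueError(f"No timeout guideline for task size {task_size!r}") from None
    return minutes_without_progress >= limit
```

For example, `should_intervene("S", 25)` is false, while the same 25 minutes of silence on an XS task is already past its 15-minute point.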

Detection Protocol

  1. Read the latest progress comment - Check tool call count and phase
  2. Compare to the previous comment - Has the phase or step count advanced?
  3. Check for error patterns - Is the same error repeating?
  4. Evaluate time elapsed - Is the agent beyond its expected duration?
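
The four steps above can be sketched as a comparison of two progress snapshots. This is only an illustration: `ProgressSnapshot` and `find_stuck_signals` are hypothetical, and the sketch assumes the progress comments have already been parsed into these fields (the comment format itself is not defined in this skill).

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ProgressSnapshot:
    """Hypothetical parsed progress comment; field names are illustrative."""
    tool_calls: int
    phase: str
    steps_completed: int
    errors: list[str] = field(default_factory=list)


def find_stuck_signals(previous, latest, minutes_elapsed, expected_minutes):
    """Return human-readable findings; an empty list means no sign of trouble."""
    signals = []
    # Steps 1-2: read the latest snapshot and compare it to the previous one.
    calls_spent = latest.tool_calls - previous.tool_calls
    stalled = (latest.phase == previous.phase
               and latest.steps_completed <= previous.steps_completed)
    if stalled and calls_spent > 0:
        signals.append(
            f'no progress in phase "{latest.phase}" over {calls_spent} tool calls')
    # Step 3: look for one error occurring three or more times (total, not consecutive).
    if latest.errors:
        error, count = Counter(latest.errors).most_common(1)[0]
        if count >= 3:
            signals.append(f'"{error}" seen {count} times')
    # Step 4: compare elapsed time with the expected duration for the task.
    if minutes_elapsed > expected_minutes:
        signals.append(
            f"{minutes_elapsed} min elapsed, expected at most {expected_minutes}")
    return signals
```

The findings are plain strings so they can be pasted into an assessment or an issue comment.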

Intervention Protocol

When you detect that an agent may be stuck:

Step 1: Assess

Agent Status Check for #45:
- Last progress: 25 minutes ago
- Phase: "Testing" (same as 20 tool calls ago)
- Errors: "ModuleNotFoundError" (3 times)
- Assessment: LIKELY STUCK

Step 2: Stop Agent

# If TaskStop available
TaskStop(task_id="agent-id")

Step 3: Update Issue Status

update_issue(
    repo="org/repo",
    issue_number=45,
    labels=["Status/Failed", *other_labels]  # other_labels: any other labels to set on the issue
)

Step 4: Add Explanation Comment

add_comment(
    repo="org/repo",
    number=45,
    body="""## Agent Intervention
**Reason:** No progress detected for 25 minutes / repeated errors
**Last Status:** Testing phase, ModuleNotFoundError x3
**Action:** Stopped agent, requires human review

### What Was Completed
- [x] Created auth/jwt_service.py
- [x] Implemented generate_token()

### What Remains
- [ ] Fix import issue
- [ ] Write tests
- [ ] Commit

### Recommendation
- Check for missing dependency in requirements.txt
- May need manual intervention to resolve import
"""
)

Self-Monitoring (Executor)

Executors should monitor their own progress, using the circuit breakers and self-check template below:

Circuit Breakers

  • Same error 3 times: Stop and report
  • 80% of tool call budget: Post checkpoint
  • File not found 3 times: Stop and ask for help
  • Test failing same way 5 times: Stop and report
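
A minimal sketch of how an executor might track these circuit breakers follows. The `CircuitBreaker` class, its method names, and the failure category names are hypothetical; only the limits (3, 3, 5, and 80%) come from the list above.

```python
from collections import Counter

# Limits copied from the Circuit Breakers list; category names are illustrative.
LIMITS = {"error": 3, "file_not_found": 3, "test_failure": 5}


class CircuitBreaker:
    """Counts tool calls and repeated failures for one executor run."""

    def __init__(self, call_budget: int):
        self.call_budget = call_budget
        self.calls_used = 0
        self.checkpoint_posted = False
        self._seen = Counter()

    def record_call(self) -> bool:
        """Count one tool call; return True once, when 80% of the budget is reached."""
        self.calls_used += 1
        at_limit = self.calls_used * 100 >= self.call_budget * 80
        if at_limit and not self.checkpoint_posted:
            self.checkpoint_posted = True
            return True
        return False

    def record_failure(self, category: str, message: str) -> bool:
        """Count one failure; return True when the executor should stop."""
        key = (category, message)
        self._seen[key] += 1
        return self._seen[key] >= LIMITS[category]
```

Keying the counter on both category and message means only identical failures accumulate toward a limit, which matches the "same error" and "failing the same way" wording.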

Self-Check Template

Self-check at tool call 45/100:
- Progress: 4/7 steps completed
- Current phase: Testing
- Errors encountered: 1 (resolved)
- Remaining budget: 55 calls
- Status: ON TRACK
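
If the self-check is generated rather than typed by hand, a small formatter keeps the fields consistent. This is a sketch only; `format_self_check` and its parameter names are hypothetical and simply reproduce the template above.

```python
def format_self_check(calls_used: int, call_budget: int, steps_done: int,
                      steps_total: int, phase: str, errors_note: str,
                      status: str) -> str:
    """Render the self-check template; remaining budget is derived, not passed in."""
    remaining = call_budget - calls_used
    return "\n".join([
        f"Self-check at tool call {calls_used}/{call_budget}:",
        f"- Progress: {steps_done}/{steps_total} steps completed",
        f"- Current phase: {phase}",
        f"- Errors encountered: {errors_note}",
        f"- Remaining budget: {remaining} calls",
        f"- Status: {status}",
    ])
```

Calling `format_self_check(45, 100, 4, 7, "Testing", "1 (resolved)", "ON TRACK")` reproduces the example shown in the template.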

Recovery Actions

After stopping a stuck agent:

  1. Preserve work - Branch and commits remain
  2. Document state - Checkpoint in issue comment
  3. Identify cause - What caused the loop?
  4. Plan recovery:
    • Manual completion
    • Different approach
    • Break down further
    • Assign to human

Common Stuck Patterns

| Pattern | Cause | Solution |
| --- | --- | --- |
| Import loop | Missing dependency | Add to requirements |
| Test loop | Non-deterministic test | Fix test isolation |
| Validation loop | Error message not changing | Improve error specificity |
| File not found | Wrong path | Verify path exists |
| Permission denied | File ownership | Check permissions |
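
For errors that name themselves, the table can double as a lookup. The sketch below is an assumption-laden illustration: `suggest_fix` and the mapping from Python exception names to patterns are hypothetical, and test loops and validation loops are left out because they cannot be recognized from an exception name alone.

```python
# Cause and solution per pattern, copied from the Common Stuck Patterns table.
STUCK_PATTERNS = {
    "Import loop": ("Missing dependency", "Add to requirements"),
    "Test loop": ("Non-deterministic test", "Fix test isolation"),
    "Validation loop": ("Error message not changing", "Improve error specificity"),
    "File not found": ("Wrong path", "Verify path exists"),
    "Permission denied": ("File ownership", "Check permissions"),
}

# Assumption: the agent is running Python, so these built-in exception names appear.
EXCEPTION_TO_PATTERN = {
    "ModuleNotFoundError": "Import loop",
    "FileNotFoundError": "File not found",
    "PermissionError": "Permission denied",
}


def suggest_fix(error_text: str):
    """Return (pattern, cause, solution) for a recognized error, else None."""
    for exception_name, pattern in EXCEPTION_TO_PATTERN.items():
        if exception_name in error_text:
            cause, solution = STUCK_PATTERNS[pattern]
            return pattern, cause, solution
    return None
```

An unrecognized error returns `None`, leaving the diagnosis to whoever reviews the stopped agent.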