Major refactoring of projman plugin architecture:

Skills Extraction (17 new files):
- Extracted reusable knowledge from commands and agents into skills/
- branch-security, dependency-management, git-workflow, input-detection
- issue-conventions, lessons-learned, mcp-tools-reference, planning-workflow
- progress-tracking, repo-validation, review-checklist, runaway-detection
- setup-workflows, sprint-approval, task-sizing, test-standards, wiki-conventions

Command Consolidation (17 → 12 commands):
- /setup: consolidates initial-setup, project-init, project-sync (--full/--quick/--sync)
- /debug: consolidates debug-report, debug-review (report/review modes)
- /test: consolidates test-check, test-gen (run/gen modes)
- /sprint-status: absorbs sprint-diagram via --diagram flag

Architecture Cleanup:
- Remove plugin-level mcp-servers/ symlinks (6 plugins)
- Remove plugin README.md files (12 files, ~2000 lines)
- Update all documentation to reflect new command structure
- Fix documentation drift in CONFIGURATION.md, COMMANDS-CHEATSHEET.md

Commands are now thin dispatchers (~20-50 lines) that reference skills.
Agents reference skills for domain knowledge instead of inline content.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
3.4 KiB
| name | description |
|---|---|
| runaway-detection | Detecting and handling stuck agents |
Runaway Detection
Purpose
Defines how to detect stuck agents and intervention protocols.
When to Use
- Orchestrator agent: When monitoring dispatched agents
- Executor agent: Self-monitoring during execution
Warning Signs
| Sign | Threshold | Action |
|---|---|---|
| No progress comment | 30+ minutes | Investigate |
| Same phase repeated | 20+ tool calls | Consider stopping |
| Same error repeated | 3+ times | Stop agent immediately |
| Approaching budget | 80% of limit | Post checkpoint |
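The thresholds in the table above can be checked mechanically. The following is a minimal sketch, not the plugin's implementation: the `AgentSnapshot` fields and the `warning_actions` function are hypothetical names, and it assumes the orchestrator can already obtain these four measurements for an agent.

```python
from dataclasses import dataclass


@dataclass
class AgentSnapshot:
    """Hypothetical summary of what the orchestrator knows about one agent."""
    minutes_since_progress: float
    calls_in_same_phase: int
    same_error_count: int
    calls_used: int
    call_budget: int


def warning_actions(snap: AgentSnapshot) -> list[str]:
    """Return every action from the Warning Signs table that currently applies."""
    actions = []
    if snap.minutes_since_progress >= 30:
        actions.append("Investigate")
    if snap.calls_in_same_phase >= 20:
        actions.append("Consider stopping")
    if snap.same_error_count >= 3:
        actions.append("Stop agent")
    # 80% of the budget, written with integers to avoid float rounding.
    if snap.call_budget > 0 and snap.calls_used * 5 >= snap.call_budget * 4:
        actions.append("Post checkpoint")
    return actions
```

Returning a list rather than a single action reflects that several signs can fire at once, for example a repeated error inside a phase that has also stalled.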
Agent Timeout Guidelines
| Task Size | Expected Duration | Intervention Point |
|---|---|---|
| XS | ~5-10 min | 15 min no progress |
| S | ~10-20 min | 30 min no progress |
| M | ~20-40 min | 45 min no progress |
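As a small illustration, the intervention points above can be kept in a lookup table. The names `INTERVENTION_MINUTES` and `needs_intervention` are assumptions made for this sketch; only the three sizes and their minute values come from the table.

```python
# Minutes without progress before intervening, per task size (from the table above).
INTERVENTION_MINUTES = {"XS": 15, "S": 30, "M": 45}


def needs_intervention(task_size: str, minutes_without_progress: float) -> bool:
    """Return True once an agent has gone too long without progress for its task size."""
    limit = INTERVENTION_MINUTES.get(task_size.upper())
    if limit is None:
        # Sizes other than XS, S, and M are not listed, so refuse to guess a limit.
        raise ValueError(f"no intervention point defined for task size {task_size!r}")
    return minutes_without_progress >= limit
```

For example, `needs_intervention("XS", 15)` is true, while `needs_intervention("M", 30)` is not.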
Detection Protocol
- Read latest progress comment: check tool call count and phase
- Compare to previous: is progress happening?
- Check for error patterns: is the same error repeating?
- Evaluate time elapsed: is it beyond the expected duration?
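One way to combine these checks into a single verdict is sketched below. The `Progress` record, the `assess` function, and the rule for combining the checks are all assumptions of this sketch; the skill itself does not prescribe how the four checks are weighed. The "INVESTIGATE" label is likewise invented here for the in-between case.

```python
from dataclasses import dataclass, field


@dataclass
class Progress:
    """Hypothetical parsed form of one progress comment."""
    phase: str
    steps_done: int
    errors: list[str] = field(default_factory=list)


def assess(previous: Progress, latest: Progress,
           minutes_elapsed: float, expected_max_minutes: float) -> str:
    """Compare two progress comments and return a coarse verdict."""
    # Progress means more completed steps or a move to a different phase.
    made_progress = (latest.steps_done > previous.steps_done
                     or latest.phase != previous.phase)
    # The same error seen three or more times counts as a repeating pattern.
    repeated_error = any(latest.errors.count(e) >= 3 for e in set(latest.errors))
    overdue = minutes_elapsed > expected_max_minutes

    if repeated_error or (not made_progress and overdue):
        return "LIKELY STUCK"
    if not made_progress or overdue:
        return "INVESTIGATE"
    return "ON TRACK"
```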
Intervention Protocol
When you detect an agent may be stuck:
Step 1: Assess
Agent Status Check for #45:
- Last progress: 25 minutes ago
- Phase: "Testing" (same as 20 tool calls ago)
- Errors: "ModuleNotFoundError" (3 times)
- Assessment: LIKELY STUCK
Step 2: Stop Agent
# If the TaskStop tool is available
TaskStop(task_id="agent-id")
Step 3: Update Issue Status
update_issue(
repo="org/repo",
issue_number=45,
labels=["Status/Failed", *other_labels]  # other_labels: the issue's remaining labels
)
Step 4: Add Explanation Comment
add_comment(
repo="org/repo",
number=45,
body="""## Agent Intervention
**Reason:** No progress detected for 25 minutes / repeated errors
**Last Status:** Testing phase, ModuleNotFoundError x3
**Action:** Stopped agent, requires human review
### What Was Completed
- [x] Created auth/jwt_service.py
- [x] Implemented generate_token()
### What Remains
- [ ] Fix import issue
- [ ] Write tests
- [ ] Commit
### Recommendation
- Check for missing dependency in requirements.txt
- May need manual intervention to resolve import
"""
)
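The comment body in Step 4 follows a fixed layout, so it can be assembled from lists instead of being written by hand each time. This is a minimal sketch; `build_intervention_comment` and its parameters are hypothetical helpers and not part of any tool referenced above.

```python
def build_intervention_comment(reason: str, last_status: str, action: str,
                               completed: list[str], remaining: list[str],
                               recommendations: list[str]) -> str:
    """Assemble the Markdown body of an intervention comment."""
    lines = [
        "## Agent Intervention",
        f"**Reason:** {reason}",
        f"**Last Status:** {last_status}",
        f"**Action:** {action}",
        "### What Was Completed",
        *[f"- [x] {item}" for item in completed],   # checked boxes
        "### What Remains",
        *[f"- [ ] {item}" for item in remaining],   # unchecked boxes
        "### Recommendation",
        *[f"- {item}" for item in recommendations],
    ]
    return "\n".join(lines)
```

The returned string would then be passed as the `body` argument of the comment call shown in Step 4.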
Self-Monitoring (Executor)
Executors should self-monitor:
Circuit Breakers
- Same error 3 times: Stop and report
- 80% of tool call budget: Post checkpoint
- File not found 3 times: Stop and ask for help
- Test failing same way 5 times: Stop and report
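The circuit breakers above amount to a few counters. The sketch below shows one possible shape, assuming the executor can report each tool call and each failure to a helper object. The class name, the category keys in `LIMITS`, and the `signature` argument are hypothetical; only the limits and actions come from the list.

```python
from collections import Counter
from typing import Optional

# (limit, action) per failure category; the category names are hypothetical.
LIMITS = {
    "error": (3, "Stop and report"),
    "file_not_found": (3, "Stop and ask for help"),
    "test_failure": (5, "Stop and report"),
}


class CircuitBreaker:
    def __init__(self, call_budget: int) -> None:
        self.call_budget = call_budget
        self.calls_used = 0
        self.failures = Counter()
        self.checkpoint_posted = False

    def record_call(self) -> Optional[str]:
        """Count one tool call; return 'Post checkpoint' once, at 80% of budget."""
        self.calls_used += 1
        # Integer form of calls_used / call_budget >= 0.8.
        if not self.checkpoint_posted and self.calls_used * 5 >= self.call_budget * 4:
            self.checkpoint_posted = True
            return "Post checkpoint"
        return None

    def record_failure(self, kind: str, signature: str) -> Optional[str]:
        """Count a failure by (kind, signature); return an action once its limit is hit."""
        limit, action = LIMITS[kind]
        self.failures[(kind, signature)] += 1
        if self.failures[(kind, signature)] >= limit:
            return action
        return None
```

Keying failures by a signature, such as the error message, means three different errors do not trip the breaker, while the same one three times does.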
Self-Check Template
Self-check at tool call 45/100:
- Progress: 4/7 steps completed
- Current phase: Testing
- Errors encountered: 1 (resolved)
- Remaining budget: 55 calls
- Status: ON TRACK
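For illustration, the template above can be produced by a small formatting function. `render_self_check` and its parameter names are assumptions of this sketch; the remaining budget is derived from the two call counts rather than passed in.

```python
def render_self_check(calls_used: int, call_budget: int, steps_done: int,
                      steps_total: int, phase: str, errors: int,
                      status: str, error_note: str = "") -> str:
    """Render a self-check report in the layout of the template above."""
    # Append a note such as "resolved" in parentheses when one is given.
    error_line = f"{errors} ({error_note})" if error_note else str(errors)
    lines = [
        f"Self-check at tool call {calls_used}/{call_budget}:",
        f"- Progress: {steps_done}/{steps_total} steps completed",
        f"- Current phase: {phase}",
        f"- Errors encountered: {error_line}",
        f"- Remaining budget: {call_budget - calls_used} calls",
        f"- Status: {status}",
    ]
    return "\n".join(lines)
```

Calling `render_self_check(45, 100, 4, 7, "Testing", 1, "ON TRACK", "resolved")` reproduces the template text shown above.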
Recovery Actions
After stopping a stuck agent:
- Preserve work: branch and commits remain
- Document state: checkpoint in issue comment
- Identify cause: what caused the loop?
- Plan recovery:
  - Manual completion
  - Different approach
  - Break down further
  - Assign to human
Common Stuck Patterns
| Pattern | Cause | Solution |
|---|---|---|
| Import loop | Missing dependency | Add to requirements |
| Test loop | Non-deterministic test | Fix test isolation |
| Validation loop | Error message not changing | Improve error specificity |
| File not found | Wrong path | Verify path exists |
| Permission denied | File ownership | Check permissions |
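Some of these patterns can be recognized from the error text alone. The sketch below assumes a Python project and matches on the names of built-in exceptions; the `STUCK_PATTERNS` list and `suggest_solution` function are hypothetical. The test and validation loops are left out because they cannot be identified from a single message.

```python
from typing import Optional

# (exception-name markers, pattern, solution); patterns and solutions are from the table above.
STUCK_PATTERNS = [
    (("ModuleNotFoundError", "ImportError"), "Import loop", "Add to requirements"),
    (("FileNotFoundError",), "File not found", "Verify path exists"),
    (("PermissionError",), "Permission denied", "Check permissions"),
]


def suggest_solution(error_text: str) -> Optional[tuple[str, str]]:
    """Return (pattern, solution) for the first matching pattern, or None."""
    for markers, pattern, solution in STUCK_PATTERNS:
        if any(marker in error_text for marker in markers):
            return pattern, solution
    return None
```

A `None` result means the error matches none of the listed markers and needs a human or a richer classifier.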