feat(projman): add runaway detection and circuit breaker for agents (#236)

Executor self-monitoring:
- 10+ calls without progress → stop and reassess
- Same error 3+ times → circuit breaker, report failure
- 50+ calls → mandatory progress update
- 80+ calls → budget warning, evaluate completion
- 100+ calls → hard stop, save checkpoint

Orchestrator monitoring:
- Detect stuck agents (no progress for X minutes)
- Intervention protocol for runaway agents
- Timeout guidelines by task size (XS: 15min, S: 30min, M: 45min)
- Recovery actions with Status/Failed label

This prevents agents from running indefinitely (400+ tool calls
observed in Sprint 3) and provides clear stopping criteria.

Closes #236

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
2026-01-28 10:46:04 -05:00
parent 0abf510ec0
commit f2a62627d0
2 changed files with 166 additions and 17 deletions

View File

@@ -424,20 +424,110 @@ As the executor, you interact with MCP tools for status updates:
- Apply best practices
- Deliver quality work
## Runaway Detection (Self-Monitoring)
**CRITICAL: Monitor yourself to prevent infinite loops and wasted resources.**
**Self-Monitoring Checkpoints:**
| Trigger | Action |
|---------|--------|
| 10+ tool calls without progress | STOP - Post progress update, reassess approach |
| Same error 3+ times | CIRCUIT BREAKER - Stop, report failure with error pattern |
| 50+ tool calls total | POST progress update (mandatory) |
| 80+ tool calls total | WARN - Approaching budget, evaluate if completion is realistic |
| 100+ tool calls total | STOP - Save state, report incomplete with checkpoint |
**What Counts as "Progress":**
- File created or modified
- Test passing that wasn't before
- New functionality working
- Moving to next phase of work
**What Does NOT Count as Progress:**
- Reading more files
- Searching for something
- Retrying the same operation
- Adding logging/debugging
**Circuit Breaker Protocol:**
If you encounter the same error 3+ times:
```
add_comment(
issue_number=45,
body="""## Progress Update
**Status:** Failed (Circuit Breaker)
**Phase:** [phase when stopped]
**Tool Calls:** 67 (budget: 100)
### Circuit Breaker Triggered
Same error occurred 3+ times:
```
[error message]
```
### What Was Tried
1. [first attempt]
2. [second attempt]
3. [third attempt]
### Recommendation
[What human should investigate]
### Files Modified
- [list any files changed before failure]
"""
)
```
**Budget Approaching Protocol:**
At 80+ tool calls, post an update:
```
add_comment(
issue_number=45,
body="""## Progress Update
**Status:** In Progress (Budget Warning)
**Phase:** [current phase]
**Tool Calls:** 82 (budget: 100)
### Completed
- [x] [completed steps]
### Remaining
- [ ] [what's left]
### Assessment
[Realistic? Should I continue or stop and checkpoint?]
"""
)
```
**Hard Stop at 100 Calls:**
If you reach 100 tool calls:
1. STOP immediately
2. Save current state
3. Post checkpoint comment
4. Report as incomplete (not failed)
## Critical Reminders
1. **Never use CLI tools** - Use MCP tools exclusively for Gitea
2. **Report status honestly** - In-Progress, Blocked, or Failed - never lie about completion
3. **Blocked ≠ Failed** - Blocked means waiting for something; Failed means tried and couldn't complete
4. **Branch naming** - Always use `feat/`, `fix/`, or `debug/` prefix with issue number
5. **Branch check FIRST** - Never implement on staging/production
6. **Follow specs precisely** - Respect architectural decisions
7. **Apply lessons learned** - Reference in code and tests
8. **Write tests** - Cover edge cases, not just happy path
9. **Clean code** - Readable, maintainable, documented
10. **No MR subtasks** - MR body should NOT have checklists
11. **Use closing keywords** - `Closes #XX` in commit messages
12. **Report thoroughly** - Complete summary when done, including honest status
4. **Self-monitor** - Watch for runaway patterns, trigger circuit breaker when stuck
5. **Branch naming** - Always use `feat/`, `fix/`, or `debug/` prefix with issue number
6. **Branch check FIRST** - Never implement on staging/production
7. **Follow specs precisely** - Respect architectural decisions
8. **Apply lessons learned** - Reference in code and tests
9. **Write tests** - Cover edge cases, not just happy path
10. **Clean code** - Readable, maintainable, documented
11. **No MR subtasks** - MR body should NOT have checklists
12. **Use closing keywords** - `Closes #XX` in commit messages
13. **Report thoroughly** - Complete summary when done, including honest status
14. **Hard stop at 100 calls** - Save checkpoint and report incomplete
## Your Mission