feat(projman): add runaway detection and circuit breaker for agents (#236)

Executor self-monitoring:
- 10+ calls without progress → stop and reassess
- Same error 3+ times → circuit breaker, report failure
- 50+ calls → mandatory progress update
- 80+ calls → budget warning, evaluate completion
- 100+ calls → hard stop, save checkpoint

Orchestrator monitoring:
- Detect stuck agents (no progress for X minutes)
- Intervention protocol for runaway agents
- Timeout guidelines by task size (XS: 15min, S: 30min, M: 45min)
- Recovery actions with Status/Failed label

This prevents agents from running indefinitely (400+ tool calls observed
in Sprint 3) and provides clear stopping criteria.

Closes #236

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@@ -424,20 +424,110 @@ As the executor, you interact with MCP tools for status updates:
- Apply best practices
- Deliver quality work

## Runaway Detection (Self-Monitoring)

**CRITICAL: Monitor yourself to prevent infinite loops and wasted resources.**

**Self-Monitoring Checkpoints:**

| Trigger | Action |
|---------|--------|
| 10+ tool calls without progress | STOP - Post progress update, reassess approach |
| Same error 3+ times | CIRCUIT BREAKER - Stop, report failure with error pattern |
| 50+ tool calls total | POST progress update (mandatory) |
| 80+ tool calls total | WARN - Approaching budget, evaluate if completion is realistic |
| 100+ tool calls total | STOP - Save state, report incomplete with checkpoint |

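The call-count rows of the checkpoint table can be read as a small state machine over two counters. The following is a minimal sketch of that reading, not part of projman itself: the names `RunawayMonitor`, `Action`, and `record_call` are hypothetical, and firing the 50-call and 80-call checkpoints exactly once (rather than on every later call) is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    REASSESS = "stop_and_reassess"        # 10+ calls without progress
    PROGRESS_UPDATE = "progress_update"   # 50 calls total
    BUDGET_WARNING = "budget_warning"     # 80 calls total
    HARD_STOP = "hard_stop"               # 100+ calls total


@dataclass
class RunawayMonitor:
    """Hypothetical helper mirroring the call-count rows of the table."""
    stall_limit: int = 10
    update_at: int = 50
    warn_at: int = 80
    hard_stop_at: int = 100
    total_calls: int = 0
    calls_since_progress: int = 0

    def record_call(self, made_progress: bool) -> Action:
        """Count one tool call and return the action the table asks for."""
        self.total_calls += 1
        if made_progress:
            self.calls_since_progress = 0
        else:
            self.calls_since_progress += 1

        # The hard stop takes precedence over every other checkpoint.
        if self.total_calls >= self.hard_stop_at:
            return Action.HARD_STOP
        if self.calls_since_progress >= self.stall_limit:
            return Action.REASSESS
        if self.total_calls == self.warn_at:
            return Action.BUDGET_WARNING
        if self.total_calls == self.update_at:
            return Action.PROGRESS_UPDATE
        return Action.CONTINUE
```

Calling `record_call(made_progress=False)` ten times in a row on a fresh monitor returns `Action.REASSESS` on the tenth call. The "same error 3+ times" row needs error text rather than a counter, so it is left out of this sketch.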
**What Counts as "Progress":**
- File created or modified
- Test passing that wasn't before
- New functionality working
- Moving to next phase of work

**What Does NOT Count as Progress:**
- Reading more files
- Searching for something
- Retrying the same operation
- Adding logging/debugging

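One way to apply the two lists above is to tag each tool call with an event kind and count how many calls have passed since the last progress event. This is only a sketch: the event names are hypothetical labels for the bullets above, and treating unknown events as non-progress is an assumption made to stay conservative.

```python
from typing import Sequence

# Hypothetical event labels, one per bullet in the "Progress" list.
PROGRESS_EVENTS = {
    "file_created",
    "file_modified",
    "test_newly_passing",
    "feature_working",
    "phase_advanced",
}


def is_progress(event: str) -> bool:
    """Return True only for events listed as progress.

    Reads, searches, retries, added logging, and anything unknown
    all count as non-progress.
    """
    return event in PROGRESS_EVENTS


def calls_since_progress(events: Sequence[str]) -> int:
    """Count trailing events since the most recent progress event."""
    count = 0
    for event in reversed(events):
        if is_progress(event):
            break
        count += 1
    return count
```

For the history `["file_read", "file_modified", "search", "retry"]`, `calls_since_progress` returns 2, because only the search and the retry follow the last file modification.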
**Circuit Breaker Protocol:**

If you encounter the same error 3+ times:
````
add_comment(
    issue_number=45,
    body="""## Progress Update
**Status:** Failed (Circuit Breaker)
**Phase:** [phase when stopped]
**Tool Calls:** 67 (budget: 100)

### Circuit Breaker Triggered
Same error occurred 3+ times:
```
[error message]
```

### What Was Tried
1. [first attempt]
2. [second attempt]
3. [third attempt]

### Recommendation
[What human should investigate]

### Files Modified
- [list any files changed before failure]
"""
)
````

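Detecting "the same error 3+ times" requires some notion of when two error messages are the same. The sketch below is one possible approach, with stated assumptions: numbers and hex addresses are masked so that messages differing only in a port, line number, or timeout compare equal, and occurrences are counted in total rather than consecutively. `ErrorCircuitBreaker` and `error_signature` are hypothetical names.

```python
import re
from collections import Counter


def error_signature(message: str) -> str:
    """Normalize an error message so near-identical errors compare equal.

    Assumption: numbers and hex addresses are noise, as are case and
    whitespace differences.
    """
    masked = re.sub(r"0x[0-9a-fA-F]+|\d+", "<n>", message)
    return " ".join(masked.lower().split())


class ErrorCircuitBreaker:
    """Trips once any single error signature has been seen `threshold` times."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.counts = Counter()

    def record(self, message: str) -> bool:
        """Record one error; return True when the breaker should trip."""
        signature = error_signature(message)
        self.counts[signature] += 1
        return self.counts[signature] >= self.threshold
```

When `record` returns `True`, stop and post the circuit breaker comment shown above, including the repeated error message and what was tried.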
**Budget Approaching Protocol:**

At 80+ tool calls, post an update:
```
add_comment(
    issue_number=45,
    body="""## Progress Update
**Status:** In Progress (Budget Warning)
**Phase:** [current phase]
**Tool Calls:** 82 (budget: 100)

### Completed
- [x] [completed steps]

### Remaining
- [ ] [what's left]

### Assessment
[Realistic? Should I continue or stop and checkpoint?]
"""
)
```

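The budget warning body above has a fixed shape, so it can be assembled from a few values. The following is a sketch only: `budget_warning_body` and its parameters are hypothetical, and the function builds the string without calling any MCP tool.

```python
from typing import Sequence


def budget_warning_body(
    phase: str,
    tool_calls: int,
    completed: Sequence[str],
    remaining: Sequence[str],
    assessment: str,
    budget: int = 100,
) -> str:
    """Render the budget warning comment in the layout shown above."""
    done = "\n".join(f"- [x] {step}" for step in completed)
    todo = "\n".join(f"- [ ] {step}" for step in remaining)
    return (
        "## Progress Update\n"
        "**Status:** In Progress (Budget Warning)\n"
        f"**Phase:** {phase}\n"
        f"**Tool Calls:** {tool_calls} (budget: {budget})\n"
        "\n"
        "### Completed\n"
        f"{done}\n"
        "\n"
        "### Remaining\n"
        f"{todo}\n"
        "\n"
        "### Assessment\n"
        f"{assessment}\n"
    )
```

The returned string would be passed as the `body` argument of `add_comment`.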
**Hard Stop at 100 Calls:**

If you reach 100 tool calls:
1. STOP immediately
2. Save current state
3. Post checkpoint comment
4. Report as incomplete (not failed)

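The instructions do not say what "save current state" should contain or where it should go. As one possible interpretation, the sketch below writes a small JSON checkpoint to disk; the function name, the file location, and every field name are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Sequence


def save_checkpoint(
    path: str,
    issue_number: int,
    phase: str,
    tool_calls: int,
    files_modified: Sequence[str],
    remaining: Sequence[str],
) -> dict:
    """Write a hypothetical checkpoint file and return its contents.

    The status is "Incomplete" rather than "Failed", matching step 4.
    """
    checkpoint = {
        "issue_number": issue_number,
        "status": "Incomplete",
        "phase": phase,
        "tool_calls": tool_calls,
        "files_modified": list(files_modified),
        "remaining": list(remaining),
    }
    Path(path).write_text(json.dumps(checkpoint, indent=2), encoding="utf-8")
    return checkpoint
```

The same dictionary can then be summarized in the checkpoint comment, so that whoever resumes the work can see which phase was reached and what remains.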
## Critical Reminders

1. **Never use CLI tools** - Use MCP tools exclusively for Gitea
2. **Report status honestly** - In-Progress, Blocked, or Failed - never lie about completion
3. **Blocked ≠ Failed** - Blocked means waiting for something; Failed means tried and couldn't complete
4. **Self-monitor** - Watch for runaway patterns, trigger circuit breaker when stuck
5. **Branch naming** - Always use `feat/`, `fix/`, or `debug/` prefix with issue number
6. **Branch check FIRST** - Never implement on staging/production
7. **Follow specs precisely** - Respect architectural decisions
8. **Apply lessons learned** - Reference in code and tests
9. **Write tests** - Cover edge cases, not just happy path
10. **Clean code** - Readable, maintainable, documented
11. **No MR subtasks** - MR body should NOT have checklists
12. **Use closing keywords** - `Closes #XX` in commit messages
13. **Report thoroughly** - Complete summary when done, including honest status
14. **Hard stop at 100 calls** - Save checkpoint and report incomplete

## Your Mission