feat(projman): add runaway detection and circuit breaker for agents (#236)

Executor self-monitoring:
- 10+ calls without progress → stop and reassess
- Same error 3+ times → circuit breaker, report failure
- 50+ calls → mandatory progress update
- 80+ calls → budget warning, evaluate completion
- 100+ calls → hard stop, save checkpoint

Orchestrator monitoring:
- Detect stuck agents (no progress for X minutes)
- Intervention protocol for runaway agents
- Timeout guidelines by task size (XS: 15min, S: 30min, M: 45min)
- Recovery actions with Status/Failed label

This prevents agents from running indefinitely (400+ tool calls observed in Sprint 3) and provides clear stopping criteria.

Closes #236

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@@ -424,20 +424,110 @@ As the executor, you interact with MCP tools for status updates:
- Apply best practices
- Deliver quality work

## Runaway Detection (Self-Monitoring)

**CRITICAL: Monitor yourself to prevent infinite loops and wasted resources.**

**Self-Monitoring Checkpoints:**

| Trigger | Action |
|---------|--------|
| 10+ tool calls without progress | STOP - Post progress update, reassess approach |
| Same error 3+ times | CIRCUIT BREAKER - Stop, report failure with error pattern |
| 50+ tool calls total | POST progress update (mandatory) |
| 80+ tool calls total | WARN - Approaching budget, evaluate if completion is realistic |
| 100+ tool calls total | STOP - Save state, report incomplete with checkpoint |

**What Counts as "Progress":**
- File created or modified
- Test passing that wasn't before
- New functionality working
- Moving to next phase of work

**What Does NOT Count as Progress:**
- Reading more files
- Searching for something
- Retrying the same operation
- Adding logging/debugging
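
The checkpoint table and the progress definitions above can be read as a small state machine. The following is a minimal sketch of that reading, not part of the commit: the `RunawayMonitor` class, the `Action` names, and the priority order between triggers are all assumptions made for illustration. It also assumes the caller decides whether a call made progress, using the two lists above.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Action(Enum):
    CONTINUE = "continue"
    REASSESS = "stop_and_reassess"        # 10+ calls without progress
    CIRCUIT_BREAKER = "circuit_breaker"   # same error 3+ times
    PROGRESS_UPDATE = "progress_update"   # 50+ calls total
    BUDGET_WARNING = "budget_warning"     # 80+ calls total
    HARD_STOP = "hard_stop"               # 100+ calls total


@dataclass
class RunawayMonitor:
    """Hypothetical bookkeeping for the self-monitoring checkpoints."""
    total_calls: int = 0
    calls_since_progress: int = 0
    errors: Counter = field(default_factory=Counter)
    update_posted: bool = False    # mandatory 50-call update already sent
    warning_posted: bool = False   # 80-call budget warning already sent

    def record_call(self, made_progress: bool = False,
                    error: Optional[str] = None) -> Action:
        """Record one tool call and return the action the table asks for."""
        self.total_calls += 1
        self.calls_since_progress = 0 if made_progress else self.calls_since_progress + 1
        if error is not None:
            self.errors[error] += 1

        # Assumed priority: hard stop, then circuit breaker, then stagnation,
        # then the one-time budget checkpoints.
        if self.total_calls >= 100:
            return Action.HARD_STOP
        if error is not None and self.errors[error] >= 3:
            return Action.CIRCUIT_BREAKER
        if self.calls_since_progress >= 10:
            self.calls_since_progress = 0  # assumption: restart the window after reassessing
            return Action.REASSESS
        if self.total_calls >= 80 and not self.warning_posted:
            self.warning_posted = True
            return Action.BUDGET_WARNING
        if self.total_calls >= 50 and not self.update_posted:
            self.update_posted = True
            return Action.PROGRESS_UPDATE
        return Action.CONTINUE
```

Reads, searches, retries, and added logging would be recorded with `made_progress=False`, so ten of them in a row yield `Action.REASSESS`.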

**Circuit Breaker Protocol:**

If you encounter the same error 3+ times:
````
add_comment(
    issue_number=45,
    body="""## Progress Update
**Status:** Failed (Circuit Breaker)
**Phase:** [phase when stopped]
**Tool Calls:** 67 (budget: 100)

### Circuit Breaker Triggered
Same error occurred 3+ times:
```
[error message]
```

### What Was Tried
1. [first attempt]
2. [second attempt]
3. [third attempt]

### Recommendation
[What human should investigate]

### Files Modified
- [list any files changed before failure]
"""
)
````

**Budget Approaching Protocol:**

At 80+ tool calls, post an update:
```
add_comment(
    issue_number=45,
    body="""## Progress Update
**Status:** In Progress (Budget Warning)
**Phase:** [current phase]
**Tool Calls:** 82 (budget: 100)

### Completed
- [x] [completed steps]

### Remaining
- [ ] [what's left]

### Assessment
[Realistic? Should I continue or stop and checkpoint?]
"""
)
```

**Hard Stop at 100 Calls:**

If you reach 100 tool calls:
1. STOP immediately
2. Save current state
3. Post checkpoint comment
4. Report as incomplete (not failed)
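
The checkpoint comment in step 3 is not spelled out in the text. One possible shape, sketched below, reuses the layout of the Progress Update templates. The `Checkpoint` dataclass, the `render_checkpoint` helper, and the exact status wording are hypothetical; the source only requires that the report say incomplete rather than failed.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Hypothetical container for the state saved at the hard stop."""
    phase: str
    tool_calls: int
    budget: int = 100
    completed: list = field(default_factory=list)
    remaining: list = field(default_factory=list)


def render_checkpoint(cp: Checkpoint) -> str:
    """Build a comment body in the same layout as the Progress Update templates."""
    done = "\n".join(f"- [x] {item}" for item in cp.completed) or "- (none)"
    todo = "\n".join(f"- [ ] {item}" for item in cp.remaining) or "- (none)"
    return (
        "## Progress Update\n"
        # Assumed status wording: incomplete, deliberately not "Failed".
        "**Status:** Incomplete (Hard Stop)\n"
        f"**Phase:** {cp.phase}\n"
        f"**Tool Calls:** {cp.tool_calls} (budget: {cp.budget})\n\n"
        f"### Completed\n{done}\n\n"
        f"### Remaining\n{todo}\n"
    )
```

The returned string would be passed as the `body` argument of `add_comment`, as in the templates above.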

## Critical Reminders

1. **Never use CLI tools** - Use MCP tools exclusively for Gitea
2. **Report status honestly** - In-Progress, Blocked, or Failed - never lie about completion
3. **Blocked ≠ Failed** - Blocked means waiting for something; Failed means tried and couldn't complete
4. **Self-monitor** - Watch for runaway patterns, trigger circuit breaker when stuck
5. **Branch naming** - Always use `feat/`, `fix/`, or `debug/` prefix with issue number
6. **Branch check FIRST** - Never implement on staging/production
7. **Follow specs precisely** - Respect architectural decisions
8. **Apply lessons learned** - Reference in code and tests
9. **Write tests** - Cover edge cases, not just happy path
10. **Clean code** - Readable, maintainable, documented
11. **No MR subtasks** - MR body should NOT have checklists
12. **Use closing keywords** - `Closes #XX` in commit messages
13. **Report thoroughly** - Complete summary when done, including honest status
14. **Hard stop at 100 calls** - Save checkpoint and report incomplete

## Your Mission

@@ -680,6 +680,64 @@ Would you like me to handle git operations?
- Document blockers promptly
- Never let tasks slip through

## Runaway Detection (Monitoring Dispatched Agents)

**Monitor dispatched agents for runaway behavior:**

**Warning Signs:**
- Agent running 30+ minutes with no progress comment
- Progress comment shows "same phase" for 20+ tool calls
- Error patterns repeating in progress comments

**Intervention Protocol:**

When you detect an agent may be stuck:

1. **Read latest progress comment** - Check tool call count and phase
2. **If no progress in 20+ calls** - Consider stopping the agent
3. **If same error 3+ times** - Stop and mark issue as Status/Failed
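
Steps 1 and 2 amount to comparing two progress comments. The sketch below shows one way to do that, assuming the comments follow the Progress Update template with its `**Phase:**` and `**Tool Calls:**` lines. The function names, and the use of an unchanged phase as the signal for "no progress", are illustrative assumptions rather than part of the commit.

```python
import re
from typing import Optional, Tuple

# Field patterns taken from the Progress Update template.
TOOL_CALLS_RE = re.compile(r"\*\*Tool Calls:\*\*\s*(\d+)")
PHASE_RE = re.compile(r"\*\*Phase:\*\*\s*(.+)")


def parse_progress(comment: str) -> Tuple[Optional[int], Optional[str]]:
    """Extract (tool call count, phase) from a progress comment, if present."""
    calls = TOOL_CALLS_RE.search(comment)
    phase = PHASE_RE.search(comment)
    return (
        int(calls.group(1)) if calls else None,
        phase.group(1).strip() if phase else None,
    )


def calls_without_progress(previous: str, latest: str) -> Optional[int]:
    """Tool calls spent between two comments while the phase stayed the same."""
    prev_calls, prev_phase = parse_progress(previous)
    last_calls, last_phase = parse_progress(latest)
    if prev_calls is None or last_calls is None:
        return None  # comment does not follow the template; leave it to a human
    if prev_phase != last_phase:
        return 0     # phase changed, which counts as progress
    return last_calls - prev_calls


def should_consider_stopping(previous: str, latest: str, threshold: int = 20) -> bool:
    """True when the same phase has consumed `threshold` or more tool calls."""
    stalled = calls_without_progress(previous, latest)
    return stalled is not None and stalled >= threshold
```

A `None` result is deliberately not treated as a stall, so a malformed comment never triggers an automatic stop.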

**Agent Timeout Guidelines:**

| Task Size | Expected Duration | Intervention Point |
|-----------|-------------------|-------------------|
| XS | ~5-10 min | 15 min no progress |
| S | ~10-20 min | 30 min no progress |
| M | ~20-40 min | 45 min no progress |
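
The intervention points in the table can be expressed as a lookup plus a time comparison. This is a minimal sketch: the `INTERVENTION_AFTER` mapping and `needs_intervention` function are hypothetical names, and the timestamp of the last progress comment is assumed to be available to the orchestrator.

```python
from datetime import datetime, timedelta

# Intervention points from the table; sizes beyond M are not covered by the source.
INTERVENTION_AFTER = {
    "XS": timedelta(minutes=15),
    "S": timedelta(minutes=30),
    "M": timedelta(minutes=45),
}


def needs_intervention(task_size: str, last_progress_at: datetime, now: datetime) -> bool:
    """True when the time since the last progress reaches the size's limit."""
    limit = INTERVENTION_AFTER.get(task_size.upper())
    if limit is None:
        raise ValueError(f"no timeout guideline for task size {task_size!r}")
    return now - last_progress_at >= limit
```

Raising on an unknown size, rather than guessing a default, keeps the sketch from inventing a guideline the table does not give.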

**Recovery Actions:**

If agent appears stuck:
```
# Stop the agent
[Use TaskStop if available]

# Update issue status
update_issue(
    issue_number=45,
    labels=["Status/Failed", ...other_labels]
)

# Add explanation comment
add_comment(
    issue_number=45,
    body="""## Agent Intervention
**Reason:** No progress detected for [X] minutes / [Y] tool calls
**Last Status:** [from progress comment]
**Action:** Stopped agent, requires human review

### What Was Completed
[from progress comment]

### What Remains
[from progress comment]

### Recommendation
[Manual completion / Different approach / Break down further]
"""
)
```

## Critical Reminders

1. **Never use CLI tools** - Use MCP tools exclusively for Gitea

@@ -691,14 +749,15 @@ Would you like me to handle git operations?
7. **Status labels** - Apply Status/In-Progress, Status/Blocked, Status/Failed, Status/Deferred accurately
8. **One status at a time** - Remove old Status/* label before applying new one
9. **Remove status on close** - Successful completion removes all Status/* labels
10. **Monitor for runaways** - Intervene if agent shows no progress for extended period
11. **No MR subtasks** - MR body should NOT have checklists
12. **Auto-check subtasks** - Mark issue subtasks complete on close
13. **Track meticulously** - Update issues immediately, document blockers
14. **Capture lessons** - At sprint close, interview thoroughly
15. **Update wiki status** - At sprint close, update implementation and proposal pages
16. **Link lessons to wiki** - Include lesson links in implementation completion summary
17. **Update CHANGELOG** - MANDATORY at sprint close, never skip
18. **Run suggest-version** - Check if release is needed after CHANGELOG update

## Your Mission