Leo Miranda edited this page 2026-01-28 15:09:52 +00:00
Sprint 3 - Agent Runaway Detection and Timeout Handling
Metadata
- Implementation: Change V5.2.0: Plugin Enhancements Proposal (Sprint 3 Hooks)
- Issues: #225, #226, #227, #228, #229, #230 ([Sprint 3] feat: Implement breaking change detection for contract-validator)
- Sprint: Sprint 3
Context
Background agents were spawned to implement hook functionality. Some agents ran for extended periods without completing.
Problem
Agents made 400+ tool calls over approximately 1 hour without completing their tasks. They got stuck in loops or kept exploring tangential paths instead of finishing the core implementation. Stopping them and committing the partial work required manual intervention.
Solution
- Stopped the runaway agents manually
- Reviewed what work had been completed
- Committed the completed portions
- Finished remaining work in the main session
Prevention
Agent design best practices:
- Give agents NARROW, SPECIFIC tasks (not broad "implement feature X")
- Include explicit completion criteria in the agent prompt
- Set maximum tool call limits when possible
- Break large tasks into smaller subtasks with checkpoints
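The practices above can be sketched as a small task template that forces every agent prompt to carry a narrow goal, explicit completion criteria, and a tool call budget. This is only an illustration: `AgentTask`, its fields, the example criteria, and the budget of 40 are hypothetical and not part of Claude Code. The budget is stated in the prompt text, so the agent is asked to respect it; nothing here enforces it.

```python
from dataclasses import dataclass, field


@dataclass
class AgentTask:
    """Hypothetical container for one narrowly scoped agent task."""

    goal: str
    completion_criteria: list[str]
    max_tool_calls: int = 100  # assumed budget, stated in the prompt only
    out_of_scope: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the task as prompt text with explicit stop conditions."""
        lines = [f"Task: {self.goal}", "", "You are done when:"]
        lines += [f"- {criterion}" for criterion in self.completion_criteria]
        if self.out_of_scope:
            lines += ["", "Do not:"]
            lines += [f"- {item}" for item in self.out_of_scope]
        lines += [
            "",
            f"Stop and report after at most {self.max_tool_calls} tool calls, "
            "even if the task is unfinished.",
        ]
        return "\n".join(lines)


# Example values are illustrative, based on the GOOD task under "Task scoping".
task = AgentTask(
    goal=(
        "Create hooks/hooks.json with a UserPromptSubmit hook "
        "that runs detect-vagueness.sh"
    ),
    completion_criteria=[
        "hooks/hooks.json exists and is valid JSON",
        "The UserPromptSubmit hook runs detect-vagueness.sh",
    ],
    max_tool_calls=40,
    out_of_scope=["Refactor or edit unrelated plugin files"],
)
prompt = task.to_prompt()
print(prompt)
```

Keeping the criteria as a list makes it harder to hand an agent a task with no defined end state, which is the failure described under Problem.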
Monitoring:
- Check agent progress periodically (every 15-20 minutes)
- If an agent exceeds 100 tool calls, review whether it is still making progress
- Look for repetitive patterns (same files being read/edited repeatedly)
- Be ready to intervene and salvage partial work
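The monitoring checks above can be approximated with a small review function, assuming the agent's tool calls are available as a list of (tool, target) records. That log format, the `ToolCall` type, the `review_agent` helper, and the repeat threshold of 5 are assumptions made for illustration; only the 100-call review threshold comes from this page.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class ToolCall(NamedTuple):
    """Hypothetical record of one tool call made by an agent."""

    tool: str    # e.g. "Read" or "Edit"
    target: str  # file path or other main argument


def review_agent(
    calls: Iterable[ToolCall],
    max_calls: int = 100,   # review threshold from the Monitoring list
    max_repeats: int = 5,   # assumed cutoff for "repetitive" access
) -> list[str]:
    """Return human-readable warnings for a possibly runaway agent."""
    calls = list(calls)
    warnings = []
    if len(calls) > max_calls:
        warnings.append(
            f"{len(calls)} tool calls exceeds review threshold of {max_calls}"
        )
    # Identical (tool, target) pairs repeated many times suggest a loop.
    for (tool, target), count in sorted(Counter(calls).items()):
        if count > max_repeats:
            warnings.append(f"{tool} on {target} repeated {count} times")
    return warnings


log = [ToolCall("Read", "hooks/hooks.json")] * 6
log.append(ToolCall("Edit", "detect-vagueness.sh"))
for warning in review_agent(log):
    print(warning)
```

A warning is a prompt for a human to look at the agent, not proof that it is stuck; an agent legitimately iterating on one file would trip the same check.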
Task scoping:
- BAD: "Implement the vagueness detection hook"
- GOOD: "Create hooks/hooks.json with a UserPromptSubmit hook that runs detect-vagueness.sh"
Tags
agents, timeout, runaway, claude-code, sprint-3