lmiranda 2e65b60725 refactor(projman): extract skills and consolidate commands

Major refactoring of projman plugin architecture:

Skills Extraction (17 new files):
- Extracted reusable knowledge from commands and agents into skills/
- branch-security, dependency-management, git-workflow, input-detection
- issue-conventions, lessons-learned, mcp-tools-reference, planning-workflow
- progress-tracking, repo-validation, review-checklist, runaway-detection
- setup-workflows, sprint-approval, task-sizing, test-standards, wiki-conventions

Command Consolidation (17 → 12 commands):
- /setup: consolidates initial-setup, project-init, project-sync (--full/--quick/--sync)
- /debug: consolidates debug-report, debug-review (report/review modes)
- /test: consolidates test-check, test-gen (run/gen modes)
- /sprint-status: absorbs sprint-diagram via --diagram flag

Architecture Cleanup:
- Remove plugin-level mcp-servers/ symlinks (6 plugins)
- Remove plugin README.md files (12 files, ~2000 lines)
- Update all documentation to reflect new command structure
- Fix documentation drift in CONFIGURATION.md, COMMANDS-CHEATSHEET.md

Commands are now thin dispatchers (~20-50 lines) that reference skills.
Agents reference skills for domain knowledge instead of inline content.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-30 15:02:16 -05:00

3.4 KiB


name	description
runaway-detection	Detecting and handling stuck agents

Runaway Detection

Purpose

Defines how to detect stuck agents and the protocols for intervening.

When to Use

- Orchestrator agent: when monitoring dispatched agents
- Executor agent: when self-monitoring during execution

Warning Signs

Sign	Threshold	Action
No progress comment	30+ minutes	Investigate
Same phase repeated	20+ tool calls	Consider stopping
Same error repeated	3+ times	Stop agent immediately
Approaching budget	80% of limit	Post checkpoint
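
The table above can be read as a set of independent threshold checks. The following is a minimal sketch, assuming a hypothetical `AgentMetrics` snapshot whose field names are illustrative and not part of projman; only the thresholds and action names come from the table.

```python
from dataclasses import dataclass


@dataclass
class AgentMetrics:
    """Hypothetical snapshot of a dispatched agent (field names are assumptions)."""
    minutes_since_progress: float
    calls_in_same_phase: int
    max_same_error_count: int
    budget_used_fraction: float  # 0.0 to 1.0


def warning_actions(m: AgentMetrics) -> list[str]:
    """Return every action from the Warning Signs table that currently applies."""
    actions = []
    if m.max_same_error_count >= 3:      # same error 3+ times
        actions.append("Stop agent")
    if m.calls_in_same_phase >= 20:      # same phase for 20+ tool calls
        actions.append("Consider stopping")
    if m.minutes_since_progress >= 30:   # no progress comment for 30+ minutes
        actions.append("Investigate")
    if m.budget_used_fraction >= 0.80:   # 80% of the tool call limit
        actions.append("Post checkpoint")
    return actions
```

The most severe action is listed first, so a caller that acts only on the first entry will stop the agent before doing anything else.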

Agent Timeout Guidelines

Task Size	Expected Duration	Intervention Point
XS	~5-10 min	15 min no progress
S	~10-20 min	30 min no progress
M	~20-40 min	45 min no progress
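
As a rough illustration, the intervention points can be kept in a lookup keyed by task size. The function name `needs_intervention` is hypothetical; the minute values are taken from the table, which covers only XS, S, and M.

```python
# Intervention points from the Agent Timeout Guidelines table, in minutes.
INTERVENTION_MINUTES = {"XS": 15, "S": 30, "M": 45}


def needs_intervention(task_size: str, minutes_without_progress: float) -> bool:
    """True once an agent has gone without progress for the table's limit."""
    try:
        limit = INTERVENTION_MINUTES[task_size]
    except KeyError:
        # The table defines no guideline for other sizes, so fail loudly.
        raise ValueError(f"No timeout guideline for task size {task_size!r}") from None
    return minutes_without_progress >= limit
```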

Detection Protocol

1. Read latest progress comment: check tool call count and phase
2. Compare to previous: is progress happening?
3. Check for error patterns: is the same error repeating?
4. Evaluate time elapsed: is it beyond the expected duration?
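
The protocol above could be sketched as a comparison of two consecutive progress snapshots. `ProgressSnapshot` is a hypothetical, already-parsed form of a progress comment (this document does not define the comment format), and the 20-call and 3-error limits are borrowed from the Warning Signs table.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ProgressSnapshot:
    """Hypothetical parsed progress comment; the real format may differ."""
    tool_calls: int
    phase: str
    errors: list[str] = field(default_factory=list)


def detect_stuck(previous, latest, minutes_elapsed, expected_minutes):
    """Return a list of findings; an empty list means no sign of a stuck agent."""
    findings = []
    # Steps 1 and 2: compare the latest comment with the previous one.
    calls_since = latest.tool_calls - previous.tool_calls
    if latest.phase == previous.phase and calls_since >= 20:
        findings.append(f"phase {latest.phase!r} unchanged for {calls_since} tool calls")
    # Step 3: look for the same error repeating.
    for error, count in Counter(latest.errors).items():
        if count >= 3:
            findings.append(f"{error} repeated {count} times")
    # Step 4: compare elapsed time with the expected duration for the task size.
    if minutes_elapsed > expected_minutes:
        findings.append(f"{minutes_elapsed} min elapsed, expected at most {expected_minutes}")
    return findings
```

Returning findings rather than a single boolean keeps the evidence available for the assessment and the explanation comment written during intervention.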

Intervention Protocol

When you detect that an agent may be stuck:

Step 1: Assess

Agent Status Check for #45:
- Last progress: 25 minutes ago
- Phase: "Testing" (same as 20 tool calls ago)
- Errors: "ModuleNotFoundError" (3 times)
- Assessment: LIKELY STUCK

Step 2: Stop Agent

# If TaskStop available
TaskStop(task_id="agent-id")

Step 3: Update Issue Status

update_issue(
    repo="org/repo",
    issue_number=45,
    # other_labels: the labels already on the issue that should be kept
    labels=["Status/Failed", *other_labels]
)

Step 4: Add Explanation Comment

add_comment(
    repo="org/repo",
    number=45,
    body="""## Agent Intervention
**Reason:** No progress detected for 25 minutes / repeated errors
**Last Status:** Testing phase, ModuleNotFoundError x3
**Action:** Stopped agent, requires human review

### What Was Completed
- [x] Created auth/jwt_service.py
- [x] Implemented generate_token()

### What Remains
- [ ] Fix import issue
- [ ] Write tests
- [ ] Commit

### Recommendation
- Check for missing dependency in requirements.txt
- May need manual intervention to resolve import
"""
)
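
To keep intervention comments consistent, the body could be assembled by a small helper and then passed as `body` to `add_comment`. This is a sketch only: `build_intervention_comment` and its parameters are hypothetical names, and the layout simply mirrors the example comment above.

```python
def build_intervention_comment(reason, last_status, completed, remaining, recommendations):
    """Assemble the Markdown body for an Agent Intervention comment."""
    lines = [
        "## Agent Intervention",
        f"**Reason:** {reason}",
        f"**Last Status:** {last_status}",
        "**Action:** Stopped agent, requires human review",
        "",
        "### What Was Completed",
        *[f"- [x] {item}" for item in completed],   # checked boxes for finished work
        "",
        "### What Remains",
        *[f"- [ ] {item}" for item in remaining],   # unchecked boxes for open work
        "",
        "### Recommendation",
        *[f"- {item}" for item in recommendations],
    ]
    return "\n".join(lines) + "\n"


body = build_intervention_comment(
    reason="No progress detected for 25 minutes / repeated errors",
    last_status="Testing phase, ModuleNotFoundError x3",
    completed=["Created auth/jwt_service.py", "Implemented generate_token()"],
    remaining=["Fix import issue", "Write tests", "Commit"],
    recommendations=["Check for missing dependency in requirements.txt"],
)
```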

Self-Monitoring (Executor)

Executors should monitor their own progress and stop or post a checkpoint as soon as a circuit breaker trips:

Circuit Breakers

- Same error 3 times: Stop and report
- 80% of tool call budget: Post checkpoint
- File not found 3 times: Stop and ask for help
- Test failing same way 5 times: Stop and report
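
One way an executor might track these breakers is with a small stateful counter. The sketch below is an assumption about how that could look: the `CircuitBreaker` class and the category keys are hypothetical, while the limits and actions come from the list above.

```python
from collections import Counter

# Limits and actions from the Circuit Breakers list; category keys are illustrative.
LIMITS = {"error": 3, "file_not_found": 3, "test_failure": 5}
ACTIONS = {
    "error": "Stop and report",
    "file_not_found": "Stop and ask for help",
    "test_failure": "Stop and report",
}


class CircuitBreaker:
    def __init__(self, tool_call_budget: int):
        self.budget = tool_call_budget
        self.tool_calls = 0
        self.checkpoint_posted = False
        self.failures = Counter()

    def record_tool_call(self):
        """Count a tool call; return 'Post checkpoint' once, at 80% of the budget."""
        self.tool_calls += 1
        # Integer comparison avoids floating-point rounding at the 80% boundary.
        if not self.checkpoint_posted and self.tool_calls * 100 >= 80 * self.budget:
            self.checkpoint_posted = True
            return "Post checkpoint"
        return None

    def record_failure(self, category: str, signature: str):
        """Count an identical failure; return the action once its limit is reached."""
        key = (category, signature)
        self.failures[key] += 1
        if self.failures[key] >= LIMITS[category]:
            return ACTIONS[category]
        return None
```

Failures are keyed by both category and signature, so three different errors do not trip the breaker; only the same failure repeating does.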

Self-Check Template

Self-check at tool call 45/100:
- Progress: 4/7 steps completed
- Current phase: Testing
- Errors encountered: 1 (resolved)
- Remaining budget: 55 calls
- Status: ON TRACK
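
The template can be filled in mechanically, with the remaining budget derived from the call count. The helper below is a hypothetical sketch; `format_self_check` and its parameters are illustrative names, and the status is supplied by the caller.

```python
def format_self_check(tool_calls, budget, steps_done, steps_total,
                      phase, errors_summary, status):
    """Render the self-check template; remaining budget is budget minus calls used."""
    remaining = budget - tool_calls
    return "\n".join([
        f"Self-check at tool call {tool_calls}/{budget}:",
        f"- Progress: {steps_done}/{steps_total} steps completed",
        f"- Current phase: {phase}",
        f"- Errors encountered: {errors_summary}",
        f"- Remaining budget: {remaining} calls",
        f"- Status: {status}",
    ])


report = format_self_check(45, 100, 4, 7, "Testing", "1 (resolved)", "ON TRACK")
```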

Recovery Actions

After stopping a stuck agent:

1. Preserve work: branch and commits remain
2. Document state: checkpoint in issue comment
3. Identify cause: what caused the loop?
4. Plan recovery:
   - Manual completion
   - Different approach
   - Break down further
   - Assign to human

Common Stuck Patterns

Pattern	Cause	Solution
Import loop	Missing dependency	Add to requirements
Test loop	Non-deterministic test	Fix test isolation
Validation loop	Error message not changing	Improve error specificity
File not found	Wrong path	Verify path exists
Permission denied	File ownership	Check permissions
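
Some rows of this table can be recognised directly from an error message. The sketch below assumes a simple keyword match: the keywords are illustrative guesses, not part of projman. Test loops and validation loops are left out because they show up as repetition over time, not as a single message.

```python
# (pattern, keywords, solution); patterns and solutions come from the table above,
# the keywords are assumptions chosen for illustration.
STUCK_PATTERNS = [
    ("Import loop", ("modulenotfounderror", "importerror"), "Add to requirements"),
    ("File not found", ("filenotfounderror", "no such file"), "Verify path exists"),
    ("Permission denied", ("permissionerror", "permission denied"), "Check permissions"),
]


def classify_error(message: str):
    """Return (pattern, solution) for a recognised error message, else None."""
    text = message.lower()
    for pattern, keywords, solution in STUCK_PATTERNS:
        if any(keyword in text for keyword in keywords):
            return pattern, solution
    return None
```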
