From cef09b35b8b9c8b567a610909935f7a4d49c2760 Mon Sep 17 00:00:00 2001
From: Leo Miranda
Date: Tue, 3 Feb 2026 03:37:06 +0000
Subject: [PATCH] Update "lessons%2Fpatterns%2Fmcp-venv-symlinks-lost-on-marketplace-update---5-hour-debug-loop.-"

---
 ...arketplace-update---5-hour-debug-loop.-.md | 215 +++++++++++++++---
 1 file changed, 187 insertions(+), 28 deletions(-)

diff --git a/lessons%2Fpatterns%2Fmcp-venv-symlinks-lost-on-marketplace-update---5-hour-debug-loop.-.md b/lessons%2Fpatterns%2Fmcp-venv-symlinks-lost-on-marketplace-update---5-hour-debug-loop.-.md
index 3810c95..ad97049 100644
--- a/lessons%2Fpatterns%2Fmcp-venv-symlinks-lost-on-marketplace-update---5-hour-debug-loop.-.md
+++ b/lessons%2Fpatterns%2Fmcp-venv-symlinks-lost-on-marketplace-update---5-hour-debug-loop.-.md
@@ -6,7 +6,7 @@
 
 ## Context
 
-User was working on branch `fix/startup-hook-venv-cache-path` to fix venv path issues in startup hooks. After marketplace reinstalls, plugins repeatedly failed to load with "1 error" message.
+User was working on branch `fix/startup-hook-venv-cache-path` to fix venv path issues in startup hooks. After marketplace reinstalls, plugins repeatedly failed to load with "1 error" message. User had to reinstall at least 4 times in 30 minutes.
 
 ## Problem
 
@@ -16,18 +16,186 @@ User was working on branch `fix/startup-hook-venv-cache-path` to fix venv path i
 4. **No postInstall hooks:** Claude Code marketplaces do NOT support postInstall hooks (feature requested in GitHub issue #11240)
 5. **SessionStart hooks can't fix it:** MCP servers fail BEFORE hooks run, so startup-check.sh can't auto-create symlinks in time
 
-## What Was Tried (All Failed to Identify Root Cause)
+## Complete Debug Trace - Everything Claude Did
 
-- ✅ Verified plugin.json files are valid JSON
-- ✅ Verified hooks.json format is correct
-- ✅ Verified hook scripts are executable
-- ✅ Verified MCP server references match .mcp.json
-- ✅ Verified commands directories exist
-- ✅ Ran validate-marketplace.sh - all passed
-- ✅ Verified cache venvs exist and work
-- ✅ MCP servers start successfully when invoked manually
+### 1. Initial Plugin Status Check
+User ran `/plugin` command showing:
+- contract-validator: failed to load - 1 error
+- projman: failed to load - 1 error
+- plugin:pyright-lsp:pyright: failed to load - 1 error
+- data-platform: enabled but startup hook showed "MCP venv missing"
 
-**Critical Gap:** Claude Code does NOT expose the actual plugin load error message. User sees "failed to load - 1 error" with no way to see WHAT the error is.
+### 2. Read Plugin JSON Files
+```bash
+# Read source plugin.json files
+/home/lmiranda/claude-plugins-work/plugins/projman/.claude-plugin/plugin.json
+/home/lmiranda/claude-plugins-work/plugins/contract-validator/.claude-plugin/plugin.json
+```
+**Result:** Both valid JSON with mcp_servers references
+
+### 3. Ran verify-hooks.sh
+```bash
+./scripts/verify-hooks.sh
+```
+**Result:** All 7 hooks verified OK (doc-guardian, code-sentinel, projman, pr-review, project-hygiene, data-platform, cmdb-assistant)
+
+### 4. Checked Installed MCP Servers Directory
+```bash
+ls -la ~/.claude/plugins/marketplaces/leo-claude-mktplace/mcp-servers/
+```
+**Result:** Found 5 servers: contract-validator, data-platform, gitea, netbox, viz-platform
+
+### 5. Checked .venv in Each Installed MCP Server
+```bash
+for dir in ~/.claude/plugins/marketplaces/leo-claude-mktplace/mcp-servers/*/; do
+ ls -la "$dir".venv
+done
+```
+**Result:** ALL .venv directories MISSING in installed path
+- contract-validator: NO .venv
+- data-platform: NO .venv
+- gitea: NO .venv
+- netbox: NO .venv
+- viz-platform: NO .venv
+
+### 6. Checked .venv in Source MCP Servers
+```bash
+for dir in /home/lmiranda/claude-plugins-work/mcp-servers/*/; do
+ ls -la "$dir".venv
+done
+```
+**Result:** ALL have symlinks pointing to cache:
+- contract-validator/.venv -> ~/.cache/claude-mcp-venvs/leo-claude-mktplace/contract-validator/.venv
+- data-platform/.venv -> ~/.cache/claude-mcp-venvs/leo-claude-mktplace/data-platform/.venv
+- gitea/.venv -> ~/.cache/claude-mcp-venvs/leo-claude-mktplace/gitea/.venv
+- netbox/.venv -> ~/.cache/claude-mcp-venvs/leo-claude-mktplace/netbox/.venv
+- viz-platform/.venv -> ~/.cache/claude-mcp-venvs/leo-claude-mktplace/viz-platform/.venv
+
+### 7. Checked Cache Directory Exists
+```bash
+ls -la ~/.cache/claude-mcp-venvs/leo-claude-mktplace/
+```
+**Result:** Cache exists with all 5 server venvs from Jan 27
+
+### 8. Ran setup-venvs.sh in Installed Path
+```bash
+cd ~/.claude/plugins/marketplaces/leo-claude-mktplace && ./scripts/setup-venvs.sh
+```
+**Result:** Created all symlinks successfully
+
+### 9. Verified Symlinks Created
+```bash
+for dir in ~/.claude/plugins/marketplaces/leo-claude-mktplace/mcp-servers/*/; do
+ readlink "$dir.venv"
+done
+```
+**Result:** All 5 symlinks now point to cache
+
+### 10. Searched for postInstall Hook Support
+Used claude-code-guide agent to research if marketplace.json supports postInstall hooks.
+**Result:** NOT SUPPORTED - feature requested in GitHub issue #11240
+
+### 11. Read Startup Hook Scripts
+```bash
+# Read data-platform startup hook
+/home/lmiranda/claude-plugins-work/plugins/data-platform/hooks/startup-check.sh
+```
+**Result:** Hook checks cache first, then local, but only WARNS - doesn't auto-fix
+
+### 12. Modified data-platform Startup Hook (USELESS)
+Added code to auto-create symlink if missing. **THIS WAS POINTLESS** because MCP servers fail BEFORE hooks run.
+
+### 13. Checked MCP Server Dependencies per Plugin
+```bash
+grep "mcp_servers" /home/lmiranda/claude-plugins-work/plugins/*/.claude-plugin/plugin.json
+```
+**Result:**
+- cmdb-assistant → netbox
+- contract-validator → contract-validator
+- data-platform → data-platform
+- projman → gitea
+- pr-review → gitea
+- viz-platform → viz-platform
+
+### 14. Read .mcp.json Configuration
+```bash
+cat /home/lmiranda/claude-plugins-work/.mcp.json
+```
+**Result:** All 5 MCP servers configured with relative paths to run.sh
+
+### 15. Verified Cache Venvs Have Python Binary
+```bash
+ls -la ~/.cache/claude-mcp-venvs/leo-claude-mktplace/data-platform/.venv/bin/python
+```
+**Result:** Exists (symlink to python3)
+
+### 16. Tested MCP Server Startup Manually
+```bash
+cd ~/.claude/plugins/marketplaces/leo-claude-mktplace && ./mcp-servers/gitea/run.sh &
+```
+**Result:** Server starts and runs successfully
+
+### 17. Checked Plugin JSON Validity
+```bash
+python3 -c "import json; json.load(open('plugin.json')); print('VALID')"
+```
+**Result:** Both projman and contract-validator JSON valid
+
+### 18. Checked Commands Directories
+```bash
+ls ~/.claude/plugins/marketplaces/leo-claude-mktplace/plugins/projman/commands/
+ls ~/.claude/plugins/marketplaces/leo-claude-mktplace/plugins/contract-validator/commands/
+```
+**Result:** Both exist with command files
+
+### 19. Checked Hook Script Permissions
+```bash
+ls -la ~/.claude/plugins/marketplaces/.../hooks/*.sh
+```
+**Result:** All executable (755)
+
+### 20. Checked MCP Server References Match
+Compared mcp_servers in plugin.json vs entries in .mcp.json
+**Result:** All match
+
+### 21. Ran validate-marketplace.sh
+```bash
+cd ~/.claude/plugins/marketplaces/leo-claude-mktplace && ./scripts/validate-marketplace.sh
+```
+**Result:** ALL VALIDATIONS PASSED (12 plugins, all hooks, all MCP servers)
+
+### 22. Checked for Claude Logs
+```bash
+ls ~/.claude/logs/
+ls ~/.config/claude-code/logs/
+```
+**Result:** No log directories exist
+
+### 23. Compared Source vs Installed Versions
+**Result:** Version mismatch (not the cause):
+- projman: Source 3.4.0, Installed 3.3.0
+- contract-validator: Source 1.2.0, Installed 1.1.0
+
+### 24. Read run.sh to Check Venv Logic
+```bash
+cat /home/lmiranda/claude-plugins-work/mcp-servers/gitea/run.sh
+```
+**Result:** run.sh ALREADY checks cache first (line 10-11), so should work without local symlink
+
+## Critical Finding
+
+**The run.sh scripts already check cache FIRST.** If cache venv exists, MCP server should start. But plugins still fail with "1 error".
+
+**Claude Code hides the actual error message.** There is NO way for users to see WHAT the error is. This makes debugging impossible.
+
+## What Claude Got Wrong
+
+1. **Said "Found it!" at step 5** - Wrong. Identified symptom (missing symlinks) but not why plugins fail when cache venv exists
+2. **Said "Found it!" at step 6** - Wrong. Symlinks in source don't matter, run.sh checks cache first
+3. **Modified startup hook** - Useless. MCP servers fail before hooks run
+4. **Spent 20+ tool calls checking things that were fine** - JSON valid, hooks valid, permissions valid, references valid
+5. **Never identified the actual error** - Because Claude Code doesn't expose it
+6. **Didn't recognize that run.sh already handles cache paths** - Should have stopped investigating after reading run.sh
 
 ## Actual Fix (Temporary)
 
@@ -37,31 +205,22 @@ cd ~/.claude/plugins/marketplaces/leo-claude-mktplace
 ./scripts/setup-venvs.sh
 ```
 
-This creates symlinks in the installed path. But they get wiped on next update.
+This creates symlinks. But they get wiped on next update.
 
-## Proper Fix Needed
+## Real Root Cause (UNKNOWN)
 
-1. **Option A:** Make `run.sh` auto-create symlinks before starting MCP server
-2. **Option B:** Have Claude Code preserve symlinks during marketplace updates
-3. **Option C:** Wait for postInstall hooks feature (GitHub issue #11240)
+The run.sh scripts check cache first. Cache venvs exist. MCP servers start manually. But plugins fail to load with "1 error".
+
+**We never found the actual cause because Claude Code hides error messages.**
 
 ## Prevention
 
 1. **After ANY marketplace reinstall/update:** Run `setup-venvs.sh` in installed path
-2. **Document this prominently:** Add to README and CLAUDE.md
-3. **Consider:** Making all MCP paths absolute to cache instead of relative with symlinks
-
-## Claude AI Failures in This Session
-
-1. Said "Found it!" multiple times without actually finding the root cause
-2. Kept checking things that turned out to be fine
-3. Did not identify that the actual error message is hidden from users
-4. Took 5+ hours of user time without resolving the underlying issue
-5. The SessionStart hook fix added won't help because MCP servers fail before hooks run
+2. **File a bug report:** Claude Code should expose plugin load errors
+3. **Consider:** Making MCP server paths absolute in .mcp.json instead of relative
 
 ## Tags
 
-plugins, mcp, venv, symlinks, marketplace, installation, critical-bug
-
+plugins, mcp, venv, symlinks, marketplace, installation, critical-bug, time-waste, claude-failure, hidden-errors
 ---
 **Tags:** plugins, mcp, venv, symlinks, marketplace, critical-bug, time-waste, claude-failure
\ No newline at end of file