---
description: Interactive setup wizard for data-platform plugin - configures MCP server and optional PostgreSQL/dbt
---

# Data Platform Setup Wizard

This command sets up the data-platform plugin with pandas, PostgreSQL, and dbt integration.

## Important Context

- **This command uses Bash, Read, Write, and AskUserQuestion tools** - NOT MCP tools
- **MCP tools won't work until after setup + session restart**
- **PostgreSQL and dbt are optional** - pandas tools work without them

---

## Phase 1: Environment Validation

### Step 1.1: Check Python Version

```bash
python3 --version
```

Requires Python 3.10+. If the version is lower, stop setup and inform the user.

### Step 1.2: Check for Required Libraries

```bash
python3 -c "import sys; print(f'Python {sys.version_info.major}.{sys.version_info.minor}')"
```

---

## Phase 2: MCP Server Setup

### Step 2.1: Locate Data Platform MCP Server

The MCP server should be at the marketplace root:

```bash
# If running from installed marketplace
ls -la ~/.claude/plugins/marketplaces/leo-claude-mktplace/mcp-servers/data-platform/ 2>/dev/null || echo "NOT_FOUND_INSTALLED"

# If running from source
ls -la ~/claude-plugins-work/mcp-servers/data-platform/ 2>/dev/null || echo "NOT_FOUND_SOURCE"
```

Determine the correct path based on which exists.

### Step 2.2: Check Virtual Environment

```bash
ls -la /path/to/mcp-servers/data-platform/.venv/bin/python 2>/dev/null && echo "VENV_EXISTS" || echo "VENV_MISSING"
```

### Step 2.3: Create Virtual Environment (if missing)

```bash
cd /path/to/mcp-servers/data-platform && \
  python3 -m venv .venv && \
  source .venv/bin/activate && \
  pip install --upgrade pip && \
  pip install -r requirements.txt && \
  deactivate
```

**Note:** This may take a few minutes due to pandas, pyarrow, and dbt dependencies.

---

## Phase 3: PostgreSQL Configuration (Optional)

### Step 3.1: Ask About PostgreSQL

Use AskUserQuestion:
- Question: "Do you want to configure PostgreSQL database access?"
- Header: "PostgreSQL"
- Options:
  - "Yes, I have a PostgreSQL database"
  - "No, I'll only use pandas/dbt tools"

**If user chooses "No":** Skip to Phase 4.

### Step 3.2: Create Config Directory

```bash
mkdir -p ~/.config/claude
```

### Step 3.3: Check PostgreSQL Configuration

```bash
cat ~/.config/claude/postgres.env 2>/dev/null || echo "FILE_NOT_FOUND"
```

**If file exists with valid URL:** Skip to Step 3.6.

**If missing or has placeholders:** Continue.

### Step 3.4: Gather PostgreSQL Information

Use AskUserQuestion:
- Question: "What is your PostgreSQL connection URL format?"
- Header: "DB Format"
- Options:
  - "Standard: postgresql://user:pass@host:5432/db"
  - "PostGIS: postgresql://user:pass@host:5432/db (with PostGIS extension)"
  - "Other (I'll provide the full URL)"

Ask the user to provide the connection URL.

### Step 3.5: Create Configuration File

```bash
cat > ~/.config/claude/postgres.env << 'EOF'
# PostgreSQL Configuration
# Generated by data-platform /initial-setup
POSTGRES_URL=
EOF
chmod 600 ~/.config/claude/postgres.env
```

Set `POSTGRES_URL` to the URL collected in Step 3.4 (the template above leaves it blank).

### Step 3.6: Test PostgreSQL Connection (if configured)

```bash
source ~/.config/claude/postgres.env && python3 -c "
import asyncio
import asyncpg

async def test():
    try:
        conn = await asyncpg.connect('$POSTGRES_URL', timeout=5)
        ver = await conn.fetchval('SELECT version()')
        await conn.close()
        print(f'SUCCESS: {ver.split(\",\")[0]}')
    except Exception as e:
        print(f'FAILED: {e}')

asyncio.run(test())
"
```

Report result:
- SUCCESS: Connection works
- FAILED: Show error and suggest fixes

---

## Phase 4: dbt Configuration (Optional)

### Step 4.1: Ask About dbt

Use AskUserQuestion:
- Question: "Do you use dbt for data transformations in your projects?"
- Header: "dbt"
- Options:
  - "Yes, I have dbt projects"
  - "No, I don't use dbt"

**If user chooses "No":** Skip to Phase 5.

### Step 4.2: dbt Discovery

dbt configuration is **project-level** (not system-level). The plugin auto-detects dbt projects by looking for `dbt_project.yml`.
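This kind of auto-detection can be sketched in Python. The helper below is a hypothetical illustration of the pattern (walking up from the working directory until `dbt_project.yml` is found), not the plugin's actual implementation:

```python
from pathlib import Path
from typing import Optional

def find_dbt_project(start: Path) -> Optional[Path]:
    """Walk upward from `start` and return the first directory that
    contains dbt_project.yml, or None if no dbt project is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / "dbt_project.yml").is_file():
            return candidate
    return None
```

A `DBT_PROJECT_DIR` override would simply bypass a walk like this and point directly at the project directory.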
Inform user:

```
dbt projects are detected automatically when you work in a directory
containing dbt_project.yml.

If your dbt project is in a subdirectory, you can set DBT_PROJECT_DIR
in your project's .env file:

DBT_PROJECT_DIR=./transform
DBT_PROFILES_DIR=~/.dbt
```

### Step 4.3: Check dbt Installation

```bash
dbt --version 2>/dev/null || echo "DBT_NOT_FOUND"
```

**If not found:** Inform user that dbt CLI tools require dbt-core to be installed globally or in the project.

---

## Phase 5: Validation

### Step 5.1: Verify MCP Server

```bash
cd /path/to/mcp-servers/data-platform && .venv/bin/python -c "from mcp_server.server import DataPlatformMCPServer; print('MCP Server OK')"
```

### Step 5.2: Summary

```
╔════════════════════════════════════════════════════════════╗
║  DATA-PLATFORM SETUP COMPLETE                              ║
╠════════════════════════════════════════════════════════════╣
║  MCP Server:        ✓ Ready                                ║
║  pandas Tools:      ✓ Available (14 tools)                 ║
║  PostgreSQL Tools:  [✓/✗] [Status based on config]         ║
║  PostGIS Tools:     [✓/✗] [Status based on PostGIS]        ║
║  dbt Tools:         [✓/✗] [Status based on discovery]      ║
╚════════════════════════════════════════════════════════════╝
```

### Step 5.3: Session Restart Notice

---

**⚠️ Session Restart Required**

Restart your Claude Code session for MCP tools to become available.

**After restart, you can:**
- Run `/ingest` to load data from files or database
- Run `/profile` to analyze DataFrame statistics
- Run `/schema` to explore database/DataFrame schema
- Run `/run` to execute dbt models (if configured)
- Run `/lineage` to view dbt model dependencies

---

## Memory Limits

The data-platform plugin has a default row limit of 100,000 rows per DataFrame. For larger datasets:

- Use chunked processing (`chunk_size` parameter)
- Filter data before loading
- Store to Parquet for efficient re-loading

You can override the limit by setting in your project `.env`:

```
DATA_PLATFORM_MAX_ROWS=500000
```
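The chunked-processing bullet above keeps peak memory bounded by one chunk rather than the whole file; with pandas this is typically `pd.read_csv(path, chunksize=N)`, which yields DataFrames of at most N rows. The stdlib-only sketch below shows the same pattern (`iter_chunks` is a hypothetical name, not a plugin API):

```python
import csv
from itertools import islice

def iter_chunks(path, chunk_size):
    """Yield lists of at most chunk_size CSV rows, so memory usage is
    bounded by one chunk instead of the full dataset."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        while True:
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            yield chunk
```

Each chunk can be profiled, filtered, or appended to Parquet before the next one is read, so a dataset far above `DATA_PLATFORM_MAX_ROWS` never has to be resident at once.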