[data-platform] filter tool adds unexpected __index_level_0__ column #203

Closed
opened 2026-01-26 22:20:38 +00:00 by lmiranda · 1 comment
Owner

User-Reported Issue

Reported: 2026-01-26T17:20:00-05:00
Reporter: Claude Code via /debug-report (user feedback)

Context

| Field | Value |
|-------|-------|
| Plugin | data-platform |
| Command/Tool | filter |
| Repository | personal-projects/personal-portfolio |
| Working Directory | /home/lmiranda/Repositories/personal/personal-portfolio |
| Branch | development |

Problem Description

Goal

Filter DataFrame rows by condition and get a result with the same schema as the source DataFrame.

What Happened

Problem Type: Unexpected behavior

When using the filter tool on a DataFrame, the resulting DataFrame includes an extra column __index_level_0__ that was not present in the original data.

Example:

Original DataFrame (sales): 4 columns
  - date, product, quantity, price

After filter(sales, 'quantity > 5'): 5 columns
  - date, product, quantity, price, __index_level_0__

This was observed in the list_data output:

{
  "ref": "high_quantity",
  "rows": 7,
  "columns": 5,
  "column_names": ["date", "product", "quantity", "price", "__index_level_0__"]
}

Expected Behavior

The filtered DataFrame should have the same columns as the source DataFrame (4 columns, not 5). The pandas index should either be reset or not exposed as a column in the stored result.

Workaround

No clean workaround identified. Running select after filter to drop the unwanted column might work, but this is not intuitive.

Investigation Hints

Based on the affected plugin/command, relevant files to check:

  • mcp-servers/data-platform/mcp_server/tools/dataframe_ops.py - likely contains filter implementation
  • Look for how the DataFrame is stored after filtering - the index may be getting converted to a column
  • Consider adding .reset_index(drop=True) before storing the filtered result

Suggested Fix

In the filter tool implementation, after applying the pandas query, reset the index:

result = df.query(condition).reset_index(drop=True)

This will drop the original index and create a new sequential index, preventing the __index_level_0__ column from appearing.
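
The index behavior behind this suggestion can be checked in plain pandas. The following is a minimal sketch, not the plugin's code: the `sales` data is a hypothetical stand-in for the DataFrame in the report, and only the column names are taken from the issue.

```python
import pandas as pd

# Hypothetical stand-in for the `sales` DataFrame from the report.
sales = pd.DataFrame(
    {
        "date": ["2026-01-01", "2026-01-02", "2026-01-03", "2026-01-04", "2026-01-05"],
        "product": ["a", "b", "c", "d", "e"],
        "quantity": [2, 9, 3, 7, 8],
        "price": [1.5, 2.0, 3.5, 4.0, 0.5],
    }
)

condition = "quantity > 5"

# query() keeps the labels of the surviving rows, so the index has gaps.
kept = sales.query(condition)
print(list(kept.index))   # [1, 3, 4]

# reset_index(drop=True) discards those labels and renumbers from 0.
fixed = kept.reset_index(drop=True)
print(list(fixed.index))  # [0, 1, 2]
print(type(fixed.index).__name__)  # RangeIndex
```

In both cases the DataFrame itself still has four columns. The extra column can only appear later, when the index is serialized alongside the data.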


Generated by /debug-report (user feedback) - Labels: Type/Bug, Component/API

lmiranda added the Component/API and Type/Bug labels 2026-01-26 22:20:38 +00:00
Author
Owner

Resolution

Fixed in branch fix/data-platform-filter-index (commit 4ed3ed7).

Root cause: df.query() preserves the original DataFrame index. When the filtered result is stored, the index gets converted to a column named __index_level_0__.

Fix: Added .reset_index(drop=True) after the query operation in pandas_tools.py:333:

filtered = df.query(condition).reset_index(drop=True)

Merged via PR #204.


Reference: personal-projects/leo-claude-mktplace#203