@tonychang04
Contributor

@tonychang04 tonychang04 commented Oct 22, 2025

Change Type

  • ✨ feat
  • 🐛 fix
  • ♻️ refactor
  • 💄 style
  • 👷 build
  • ⚡️ perf
  • 📝 docs
  • 🔨 chore

Description of Change

  1. Support Insforge and Supabase as MCP targets for the postgres dataset, so the behavior of the three MCPs (postgres, Insforge, Supabase) can be compared on the same tasks.
  2. Added Claude Sonnet 4.5 to the benchmark tests.

Testing:
Started Supabase locally via the Supabase CLI and ran the dataset against it.
Started a local Insforge instance and ran the dataset against it.

# Insforge (for Insforge tasks)
INSFORGE_API_KEY="ik_xxx"
INSFORGE_BACKEND_URL="http://localhost:7130"

# Supabase (for Supabase tasks)
# Run: supabase init && supabase start
# Get keys from: supabase status
SUPABASE_API_URL="http://localhost:54321"
SUPABASE_API_KEY="sb_secret_"  # Using secret key for full access
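
Before a long benchmark run it can help to check that the variables above are actually set. The following is a minimal sketch, not part of this PR: the variable names are taken from the description, while `REQUIRED_ENV` and `missing_env` are hypothetical names introduced for illustration.

```python
from __future__ import annotations

import os
from typing import Mapping

# Variable names come from the PR description; this mapping and the helper
# below are hypothetical and do not exist in the repository.
REQUIRED_ENV = {
    "insforge": ("INSFORGE_API_KEY", "INSFORGE_BACKEND_URL"),
    "supabase": ("SUPABASE_API_URL", "SUPABASE_API_KEY"),
}


def missing_env(mcp: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or empty for the given MCP."""
    return [name for name in REQUIRED_ENV[mcp] if not env.get(name)]


if __name__ == "__main__":
    for mcp in REQUIRED_ENV:
        missing = missing_env(mcp)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{mcp}: {status}")
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.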

python -m pipeline \
  --mcp supabase \
  --models claude-sonnet-4.5 \
  --exp-name supabase-comprehensive-sonnet45 \
  --tasks all \
  --k 4

python -m pipeline \
  --mcp insforge \
  --models claude-sonnet-4.5 \
  --exp-name insforge-comprehensive-sonnet45 \
  --tasks all \
  --k 4
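
Since the two runs differ only in the MCP name and experiment name, the commands can be generated rather than typed by hand. This is a rough sketch under that assumption: the flags are the ones shown above, and `pipeline_command` is a hypothetical helper, not something the repository provides.

```python
from __future__ import annotations

import shlex


def pipeline_command(mcp: str, model: str, exp_name: str,
                     tasks: str = "all", k: int = 4) -> list[str]:
    """Build the argument list for one benchmark run (flags as used in this PR)."""
    return [
        "python", "-m", "pipeline",
        "--mcp", mcp,
        "--models", model,
        "--exp-name", exp_name,
        "--tasks", tasks,
        "--k", str(k),
    ]


if __name__ == "__main__":
    for mcp in ("supabase", "insforge"):
        cmd = pipeline_command(mcp, "claude-sonnet-4.5", f"{mcp}-comprehensive-sonnet45")
        # Print a copy-pasteable shell line; pass `cmd` to subprocess.run to execute it.
        print(shlex.join(cmd))
```

Building an argument list instead of a shell string avoids the line-continuation mistakes that are easy to make when copying multi-line commands.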

Additional Information

[Image: Screenshot 2025-10-21 at 22 01 24]

These are the benchmark results. We can also provide the runs and the files!

tonychang04 and others added 5 commits October 17, 2025 16:03
Adds complete support for benchmarking Insforge Backend-as-a-Service via MCP.

## What's Added:
- **Insforge MCP Service**: New service configuration in `src/services.py`
- **State Management**: `InsforgeStateManager` handles backend setup via `prepare_environment.py` scripts
- **Login Helper**: `InsforgeLoginHelper` validates backend connectivity
- **Task Manager**: `InsforgeTaskManager` manages Insforge-specific tasks
- **MCP Integration**: Added Insforge to both `base_agent.py` and `mcpmark_agent.py`
- **Docker Support**: Updated `run-task.sh` with Insforge container configuration
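
The commit message does not show how `InsforgeLoginHelper` validates connectivity. As an illustration only, a basic reachability check could look like the sketch below, which assumes a plain TCP connection to the host and port in `INSFORGE_BACKEND_URL` is a sufficient signal; `backend_reachable` is a hypothetical function, not the helper's actual logic.

```python
import socket
from urllib.parse import urlsplit


def backend_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(url)
    if not parts.hostname:
        return False
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(backend_reachable("http://localhost:7130"))
```

A TCP check only shows that something is listening; it does not verify the API key, which a real login helper would presumably also need to do.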

## How It Works:
- Uses `@insforge/mcp` npm package for MCP server
- Requires `INSFORGE_API_KEY` and `INSFORGE_BACKEND_URL` environment variables
- Can test SQL tasks by symlinking `tasks/insforge -> tasks/postgres`
- Enables comparison between direct SQL (postgres-mcp) and REST API (insforge) approaches
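
The symlink step mentioned above can be sketched in a few lines. This assumes the `tasks/postgres` directory exists under a given tasks root and that a relative link is acceptable; `link_insforge_tasks` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def link_insforge_tasks(tasks_root: Path) -> Path:
    """Create tasks_root/insforge as a symlink to tasks_root/postgres if absent."""
    source = tasks_root / "postgres"
    link = tasks_root / "insforge"
    if not source.is_dir():
        raise FileNotFoundError(f"expected task directory at {source}")
    if link.is_symlink() or link.exists():
        # Leave whatever is already there untouched.
        return link
    # A relative target keeps the link valid if the repository is moved.
    link.symlink_to("postgres", target_is_directory=True)
    return link


if __name__ == "__main__":
    print(link_insforge_tasks(Path("tasks")))
```

On Windows, creating symlinks may require elevated privileges or developer mode, so copying the directory could be a simpler fallback there.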

## Configuration:
```bash
# In .mcp_env
INSFORGE_API_KEY="your-api-key"
INSFORGE_BACKEND_URL="http://localhost:7130"
```

## Usage:
```bash
python -m pipeline \
  --mcp insforge \
  --models claude-sonnet-4 \
  --exp-name insforge-test \
  --tasks your_task \
  --k 1
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@tonychang04 tonychang04 changed the title from "Feat/add insforge mcp support" to "Feat/add insforge + supabase mcp support to evaluate against postgres" Oct 22, 2025
@zjwu0522 zjwu0522 requested review from arvinxx and zjwu0522 October 22, 2025 05:17
@zjwu0522
Collaborator

Thanks for the PR! We will look into this ASAP.

Collaborator

@zjwu0522 zjwu0522 left a comment

lgtm, cc @arvinxx

@zjwu0522 zjwu0522 merged commit 528b589 into eval-sys:main Oct 31, 2025