A Model Context Protocol (MCP) server, built on the FastMCP framework, that provides browser automation. It features session-based instance management, TTL-based cleanup, PDF generation, file downloads, cookie management, and configurable browser options. All browser operations are implemented via browser-use.
- Session-Based Management: Each MCP session gets its own isolated browser instance automatically
- Advanced Browser Control: Full browser automation with Playwright backend (via browser-use)
- PDF Generation: Convert web pages to PDF with custom formatting options
- File Operations: Download/upload files, manage file system, and access all temp files
- Cookie Management: Set, get, and manage browser cookies for authentication
- Screenshot Capture: Take full-page, viewport, or element screenshots
- Tab Management: Create, switch, and close browser tabs
- Content Extraction: Extract and search page content
- Session Persistence: Automatic cleanup with configurable TTL
- Multi-Instance Support: Run multiple isolated browser sessions
- Configurable Security: All browser security settings are configurable via API
- Install Dependencies: Using uv (recommended):
    uv sync --all-extras
- Install the Browser:
    uv run playwright install --with-deps chromium
- Start the Server: Using uv (recommended):
    uv run main.py
- Basic Usage (Direct SessionBrowserManager):
    # Direct usage without MCP protocol (for testing/development)
    from browser_fastmcp_server import SessionBrowserManager, BrowserConfig
    import asyncio

    async def main():
        # Create session manager
        manager = SessionBrowserManager(max_instances=5, default_ttl=300)
        await manager.start_cleanup_task()

        # Create a new browser session
        session_id = "test_session_123"
        instance = await manager.get_or_create_session_instance(
            session_id, BrowserConfig(headless=True)
        )

        # Navigate to a website
        browser_session = instance.browser_session
        await browser_session.navigate("https://example.com")

        # Get page elements
        state_summary = await browser_session.get_state_summary(cache_clickable_elements_hashes=True)
        print(f"Interactive elements: {len(state_summary.selector_map)}")

        # Take a screenshot
        page = await browser_session.get_current_page()
        screenshot_bytes = await page.screenshot(full_page=True)

        # Close session when done
        await manager.close_session(session_id)
        await manager.shutdown()

    if __name__ == "__main__":
        asyncio.run(main())
Install test dependencies and run all tests:
uv run python -m pytest test_browser_workflow_test.py test_browser_fastmcp_client.py test_browser_test.py -v
- create_chrome_instance(headless, viewport_width, viewport_height) → Create a new browser session; returns session_id
- close_instance(session_id) → Close a specific session
- get_instance_info(session_id) → Get info for a session
- check_browser_health(session_id) → Check the health status of a browser session and provide recovery suggestions
- get_browser_status() → List all sessions
- close_all_instances() → Close all sessions
- set_browser_config(session_id, headless, no_sandbox, user_agent, viewport_width, viewport_height, disable_web_security) → Set browser config (restarts the browser if needed)
- get_browser_config(session_id) → Get current config
- navigate_to(session_id, url, new_tab=False) → Go to any URL (optionally in a new tab)
- navigate_back(session_id) / navigate_forward(session_id) → History navigation
- refresh_page(session_id) → Refresh the current page
- get_page_state(session_id) → List interactive elements with indices
- get_tabs_info(session_id) → List all open tabs
- switch_tab(session_id, page_id) → Switch between tabs
- close_tab(session_id, page_id) → Close a specific tab
- click_element(session_id, index) → Click element by index
- click_element_by_xpath(session_id, xpath) → Click element by XPath
- input_text(session_id, index, text) → Type into form fields
- set_element_value(session_id, index, value) → Set input/select value directly
- get_element_info(session_id, index=None, xpath=None) → Get element info (by index or XPath)
- send_keys(session_id, keys) → Send keyboard shortcuts
- upload_file(session_id, index, file_path) → Upload files to forms
- get_dropdown_options(session_id, index) → Inspect select elements
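The element tools above are usually chained: read the page state to learn the element indices, then type and click. The following is a minimal sketch of such a chain, not part of the server. `run_steps`, `LOGIN_STEPS`, and `fake_call_tool` are hypothetical names, and the index values are placeholders that would normally come from `get_page_state`. Only the tool and parameter names are taken from the list above.

```python
import asyncio
from typing import Any, Awaitable, Callable

# A step is (tool_name, arguments). Tool and parameter names follow the tool
# list; the index values are made-up placeholders for illustration.
Step = tuple[str, dict[str, Any]]
CallTool = Callable[[str, dict[str, Any]], Awaitable[Any]]


async def run_steps(call_tool: CallTool, session_id: str, steps: list[Step]) -> list[Any]:
    """Run steps in order, adding session_id to every call (hypothetical helper)."""
    results = []
    for name, args in steps:
        results.append(await call_tool(name, {"session_id": session_id, **args}))
    return results


LOGIN_STEPS: list[Step] = [
    ("get_page_state", {}),                         # discover element indices
    ("input_text", {"index": 3, "text": "alice"}),  # type into a form field
    ("click_element", {"index": 5}),                # submit
]


async def fake_call_tool(name: str, args: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for a real client's call_tool so the sketch runs without a browser."""
    return {"tool": name, "args": args}


demo_results = asyncio.run(run_steps(fake_call_tool, "demo-session", LOGIN_STEPS))
```

With a real client, `client.call_tool` could be passed in place of `fake_call_tool`, assuming it accepts a tool name and an argument dictionary as in the client examples in this README.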
- take_screenshot(session_id, target=None, width=None, height=None, full_page=True, quality=90, format="png") → Capture screenshots
- generate_pdf(session_id, url=None, html_content=None, output_filename=None, ...) → Save page as PDF
- download_file(session_id, url, output_filename=None, timeout=30) → Download files from URLs
- download_image(session_id, image_url, output_filename=None, timeout=30) → Download images specifically
- set_cookie(session_id, name, value, domain, path, http_only, secure, same_site, expires, max_age) → Set browser cookies
- get_cookies(session_id, domain=None) → Retrieve current cookies
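Because `set_cookie` takes many parameters, it can help to assemble and check the arguments before calling the tool. The sketch below is one way to do that. `build_cookie_args` is a hypothetical helper, and the accepted `same_site` values ("Strict", "Lax", "None") are an assumption based on the standard cookie attribute, not something this README specifies.

```python
from typing import Any, Optional

# Assumed values, taken from the standard SameSite cookie attribute.
VALID_SAME_SITE = {"Strict", "Lax", "None"}


def build_cookie_args(
    session_id: str,
    name: str,
    value: str,
    domain: str,
    path: str = "/",
    http_only: bool = True,
    secure: bool = True,
    same_site: str = "Lax",
    max_age: Optional[int] = None,
) -> dict[str, Any]:
    """Build an argument dict for set_cookie (hypothetical helper)."""
    if same_site not in VALID_SAME_SITE:
        raise ValueError(f"same_site must be one of {sorted(VALID_SAME_SITE)}")
    # Browsers reject SameSite=None cookies that are not marked Secure.
    if same_site == "None" and not secure:
        raise ValueError("same_site='None' requires secure=True")
    args: dict[str, Any] = {
        "session_id": session_id,
        "name": name,
        "value": value,
        "domain": domain,
        "path": path,
        "http_only": http_only,
        "secure": secure,
        "same_site": same_site,
    }
    if max_age is not None:
        args["max_age"] = max_age
    return args


cookie_args = build_cookie_args("demo-session", "auth", "token123", "example.com", max_age=3600)
```

The resulting dictionary could then be passed as the arguments of a `set_cookie` tool call.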
- scroll_page(session_id, direction="down") → Scroll up/down
- extract_content(session_id, query) → Extract text content
- wait(seconds) → Pause execution
- browser_tips() → Get automation best practices
- search_bing(session_id, query) → Bing search
- browser://status → Manager and sessions status
- browser://instances → All sessions info
- browser://instance/{id}/page → Session page info
- browser://instance/{id}/tabs → Session tabs
- browser://instance/{id}/screenshots → Session screenshots
- browser://instance/{id}/status → Session status (detailed)
- browser://instance/{id}/files → Session temp files
- browser://instance/{id}/cookies → Session cookies
- browser://instance/{id}/file/{relative_path} → Read a file in session temp
- browser://help → This help
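The per-session resources above share one URI shape, so a client can build and parse them with a small helper. This is a sketch under that assumption. `instance_uri` and `parse_instance_uri` are hypothetical names and are not part of the server.

```python
import re
from typing import Optional

# Resource kinds listed above for browser://instance/{id}/...
INSTANCE_KINDS = {"page", "tabs", "screenshots", "status", "files", "cookies", "file"}

_INSTANCE_RE = re.compile(
    r"^browser://instance/(?P<id>[^/]+)/(?P<kind>[^/]+)(?:/(?P<rest>.+))?$"
)


def instance_uri(session_id: str, kind: str, relative_path: Optional[str] = None) -> str:
    """Build a per-session resource URI (hypothetical helper)."""
    if kind not in INSTANCE_KINDS:
        raise ValueError(f"unknown resource kind: {kind}")
    if kind == "file":
        if not relative_path:
            raise ValueError("kind 'file' needs a relative_path")
        return f"browser://instance/{session_id}/file/{relative_path}"
    return f"browser://instance/{session_id}/{kind}"


def parse_instance_uri(uri: str) -> Optional[dict]:
    """Split a per-session URI into its parts; return None for any other URI."""
    match = _INSTANCE_RE.match(uri)
    if match is None or match.group("kind") not in INSTANCE_KINDS:
        return None
    return {
        "session_id": match.group("id"),
        "kind": match.group("kind"),
        "relative_path": match.group("rest"),
    }


tabs_uri = instance_uri("abc", "tabs")
file_uri = instance_uri("abc", "file", "shots/a.png")
```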
Configure the server using environment variables:
# Maximum number of concurrent browser instances
BROWSER_MAXIMUM_INSTANCES=10
# Session TTL in seconds (default: 30 minutes)
BROWSER_INSTANCE_TTL=1800
# Command execution timeout in seconds
BROWSER_EXECUTE_TIMEOUT=30
# Cleanup interval in seconds
BROWSER_CLEANUP_INTERVAL=60
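To illustrate how these variables might be consumed, here is a minimal sketch that reads them into a settings object. `ServerSettings` and `load_settings` are hypothetical names, and using the sample values above as defaults is an assumption; only the 30-minute TTL is stated as a default in this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ServerSettings:
    # Defaults mirror the sample values above (an assumption, see lead-in).
    max_instances: int = 10
    instance_ttl: int = 1800
    execute_timeout: int = 30
    cleanup_interval: int = 60


def load_settings(env: Mapping[str, str] = os.environ) -> ServerSettings:
    """Read the BROWSER_* variables, falling back to defaults when unset."""

    def read_int(name: str, default: int) -> int:
        raw = env.get(name)
        if raw is None or raw.strip() == "":
            return default
        value = int(raw)  # raises ValueError on non-numeric input
        if value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {raw!r}")
        return value

    return ServerSettings(
        max_instances=read_int("BROWSER_MAXIMUM_INSTANCES", 10),
        instance_ttl=read_int("BROWSER_INSTANCE_TTL", 1800),
        execute_timeout=read_int("BROWSER_EXECUTE_TIMEOUT", 30),
        cleanup_interval=read_int("BROWSER_CLEANUP_INTERVAL", 60),
    )


custom = load_settings({"BROWSER_INSTANCE_TTL": "600"})
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test.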
Built-in prompts for common automation scenarios:
- web_testing(url, test_scenario) → Web testing workflows
- data_extraction(url, data_type) → Data extraction strategies
- form_filling(url, form_data) → Automated form filling (returns conversation)
- automation_troubleshooting() → Debugging help
- Add to Claude Desktop Configuration: Edit your Claude Desktop configuration file (usually at ~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
    {
      "mcpServers": {
        "browser-mcp": {
          "command": "uv",
          "args": ["run", "fastmcp", "run", "/path/to/browser-mcp/browser_fastmcp_server.py"],
          "env": {
            "BROWSER_MAXIMUM_INSTANCES": "5",
            "BROWSER_INSTANCE_TTL": "1800"
          }
        }
      }
    }
- Restart Claude Desktop to load the MCP server
- Start Using: The browser automation tools will now be available in your Claude conversations
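If the configuration file already lists other MCP servers, the new entry has to be merged in rather than pasted over them. The sketch below shows one way to do that merge in memory. `add_browser_mcp` is a hypothetical helper, the `other` server entry is made up for the demonstration, and the server path is the same placeholder used in the configuration step.

```python
import json
from typing import Any


def add_browser_mcp(
    config: dict[str, Any],
    server_path: str,
    max_instances: int = 5,
    instance_ttl: int = 1800,
) -> dict[str, Any]:
    """Return a copy of config with a browser-mcp entry added (hypothetical helper)."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["browser-mcp"] = {
        "command": "uv",
        "args": ["run", "fastmcp", "run", server_path],
        # Environment values must be strings in the JSON config.
        "env": {
            "BROWSER_MAXIMUM_INSTANCES": str(max_instances),
            "BROWSER_INSTANCE_TTL": str(instance_ttl),
        },
    }
    merged["mcpServers"] = servers
    return merged


# Made-up existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = add_browser_mcp(existing, "/path/to/browser-mcp/browser_fastmcp_server.py")
print(json.dumps(updated, indent=2))
```

The helper copies the dictionaries it changes, so the original configuration object is left untouched until the result is written back to disk.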
Method 1: Network-based MCP Client (via HTTP/SSE)
import asyncio
import json
from mcp import ClientSession
from mcp.client.sse import sse_client

async def main():
    # Connect to the running server via network
    async with sse_client("http://localhost:8000/sse") as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize session
            await session.initialize()
            # Start browser
            info = await session.call_tool("create_chrome_instance", {"headless": True})
            # Assumes the first content item of the result is JSON text containing session_id
            session_id = json.loads(info.content[0].text)["session_id"]
            # Navigate to website
            await session.call_tool("navigate_to", {"session_id": session_id, "url": "https://example.com"})
            # Take screenshot
            await session.call_tool("take_screenshot", {"session_id": session_id})
            # Close session
            await session.call_tool("close_instance", {"session_id": session_id})

if __name__ == "__main__":
    asyncio.run(main())
Method 2: Direct Client (No Network)
import asyncio
from fastmcp import Client
from browser_fastmcp_server import mcp as browsers_mcp

async def main():
    # Direct client connection (no network)
    client = Client(browsers_mcp)
    async with client:
        # Start browser
        session = await client.call_tool("create_chrome_instance", {"headless": True})
        session_id = session.data.session_id
        # Navigate to website
        await client.call_tool("navigate_to", {"session_id": session_id, "url": "https://example.com"})
        # Take screenshot
        await client.call_tool("take_screenshot", {"session_id": session_id})
        # Close session
        await client.call_tool("close_instance", {"session_id": session_id})

if __name__ == "__main__":
    asyncio.run(main())
For server deployments requiring authentication, modify main.py to set an AuthProvider before startup:
Basic Authentication:
from fastmcp.auth import BasicAuth
# Add this before mcp.run()
mcp.auth = BasicAuth(username="admin", password="password")
JWT Authentication (Recommended for Production):
For more advanced authentication, we recommend using fastmcp-authentication:
from fastmcp_authentication import BearerAuthProvider

JWKS_URI = "http://localhost:8080/.well-known/jwks.json"
auth = BearerAuthProvider(
    jwks_uri=JWKS_URI,
    issuer="http://localhost:8080",
    audience="localhost:8080",
    algorithm="RS256"
)
mcp.auth = auth
- Web Testing: Automated functional, security, and performance testing
- Data Scraping: Extract structured data from websites
- Form Automation: Fill and submit web forms programmatically
- Content Monitoring: Track changes in web content
- Screenshot Documentation: Capture visual evidence for reports
- PDF Generation: Convert web pages to PDF documents
- Session Management: Handle authenticated workflows
- Session isolation between MCP clients
- Secure cookie management with HttpOnly and Secure flags
- Configurable browser security settings (CORS, sandbox, etc.)
- Automatic cleanup of temporary files
- TTL-based session expiration
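TTL-based expiration can be pictured as a registry that records when each session was last used and periodically sweeps out the stale ones. The sketch below illustrates that idea only. It is not the server's actual implementation, and `TTLRegistry` with its methods is a hypothetical name. The injectable clock exists so the behavior can be checked without waiting.

```python
import time
from typing import Callable, Dict, List


class TTLRegistry:
    """Track last-use times and report sessions idle for at least ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._last_used: Dict[str, float] = {}

    def touch(self, session_id: str) -> None:
        """Record activity for a session, creating it if needed."""
        self._last_used[session_id] = self._clock()

    def sweep(self) -> List[str]:
        """Remove and return the ids of all expired sessions."""
        now = self._clock()
        expired = [sid for sid, used in self._last_used.items() if now - used >= self._ttl]
        for sid in expired:
            del self._last_used[sid]
        return expired


# Deterministic demo with a fake clock and a 30-minute TTL.
fake_now = [0.0]
registry = TTLRegistry(1800, clock=lambda: fake_now[0])
registry.touch("a")
fake_now[0] = 1000.0
registry.touch("b")
fake_now[0] = 1800.0
first_sweep = registry.sweep()   # "a" has been idle for 1800 s, "b" for 800 s
fake_now[0] = 2800.0
second_sweep = registry.sweep()  # "b" has now been idle for 1800 s
```

In a real server the sweep would run on a timer (compare BROWSER_CLEANUP_INTERVAL) and each expired id would also trigger closing the browser and deleting its temporary files.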
Build the image:
docker build -t browser-mcp .
Run the server (default: port 8000, SSE transport):
docker run -p 8000:8000 browser-mcp
You can override startup parameters via environment variables:
docker run -e MCP_PORT=9000 -e MCP_TRANSPORT=http -e MCP_HOST=0.0.0.0 -p 9000:9000 browser-mcp
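When the startup parameters change, the URL a client connects to changes with them. The sketch below derives that URL from the three variables. `client_url` is a hypothetical helper. The `/sse` path matches the network client example in this README, while the `/mcp` path for the `http` transport is an assumption that should be checked against the running server.

```python
# Endpoint paths per transport. "/sse" appears in this README's client example;
# "/mcp" for the "http" transport is an assumption.
TRANSPORT_PATHS = {"sse": "/sse", "http": "/mcp"}


def client_url(host: str, port: int, transport: str) -> str:
    """Build the URL a client would use for the given startup settings."""
    if transport not in TRANSPORT_PATHS:
        raise ValueError(f"unsupported transport: {transport}")
    # A server bound to 0.0.0.0 listens on all interfaces; clients on the
    # same machine connect through localhost.
    connect_host = "localhost" if host == "0.0.0.0" else host
    return f"http://{connect_host}:{port}{TRANSPORT_PATHS[transport]}"


default_url = client_url("0.0.0.0", 8000, "sse")
custom_url = client_url("127.0.0.1", 9000, "http")
```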