|
| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Development Commands |
| 6 | + |
| 7 | +### Environment Setup |
| 8 | +```bash |
| 9 | +# Create and activate virtual environment |
| 10 | +uv venv |
| 11 | +source .venv/bin/activate |
| 12 | + |
| 13 | +# Install with test dependencies |
| 14 | +uv pip install -e ".[test]" |
| 15 | +``` |
| 16 | + |
| 17 | +### Testing |
| 18 | +```bash |
| 19 | +# Run all tests with coverage |
| 20 | +python -m pytest |
| 21 | + |
| 22 | +# Run specific test file |
| 23 | +python -m pytest tests/tools/test_search.py |
| 24 | + |
| 25 | +# Run tests with verbose output |
| 26 | +python -m pytest -v |
| 27 | +``` |
| 28 | + |
| 29 | +### Running the Server |
| 30 | +```bash |
| 31 | +# Run as module |
| 32 | +python -m arxiv_mcp_server |
| 33 | + |
| 34 | +# Or via entry point |
| 35 | +arxiv-mcp-server |
| 36 | +``` |
| 37 | + |
| 38 | +## Architecture Overview |
| 39 | + |
| 40 | +This is an **MCP (Model Context Protocol) server** that gives AI models access to arXiv research papers. The codebase follows a modular architecture with four main layers: |
| 41 | + |
| 42 | +### Core Components |
| 43 | + |
| 44 | +1. **Server Layer** (`server.py`): Main MCP server implementation that handles tool registration and request routing |
| 45 | +2. **Tools Layer** (`tools/`): Individual MCP tools for paper operations: |
| 46 | + - `search.py`: Advanced arXiv paper search with filtering |
| 47 | + - `download.py`: Paper download and storage management |
| 48 | + - `list_papers.py`: List locally stored papers |
| 49 | + - `read_paper.py`: Read paper content from storage |
| 50 | +3. **Resource Management** (`resources/papers.py`): `PaperManager` class handles paper storage, PDF-to-markdown conversion using pymupdf4llm, and local caching |
| 51 | +4. **Configuration** (`config.py`): Pydantic-based settings with environment variable support |
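To make the "tool registration and request routing" role of the server layer concrete, here is a minimal sketch of a name-to-handler registry. It does not use the MCP SDK and is not the code in `server.py`; `TOOLS`, `register`, `call_tool`, and the placeholder `list_papers` handler are hypothetical names chosen for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

# Hypothetical handler type: takes the tool arguments, returns any result.
Handler = Callable[[Dict[str, Any]], Awaitable[Any]]

# Hypothetical registry mapping tool names to async handlers.
TOOLS: Dict[str, Handler] = {}


def register(name: str) -> Callable[[Handler], Handler]:
    """Decorator that records a handler under the given tool name."""
    def decorator(fn: Handler) -> Handler:
        TOOLS[name] = fn
        return fn
    return decorator


@register("list_papers")
async def list_papers(arguments: Dict[str, Any]) -> list:
    # Placeholder body; a real tool would read from paper storage.
    return []


async def call_tool(name: str, arguments: Dict[str, Any]) -> Any:
    """Route a request to the registered handler, or fail on an unknown name."""
    handler = TOOLS.get(name)
    if handler is None:
        raise ValueError(f"Unknown tool: {name}")
    return await handler(arguments)


if __name__ == "__main__":
    print(asyncio.run(call_tool("list_papers", {})))
```

Keeping routing in one function means an unknown tool name fails in a single, predictable place.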
| 52 | + |
| 53 | +### Key Design Patterns |
| 54 | + |
| 55 | +- **MCP Protocol Compliance**: All tools follow MCP specification with proper type definitions |
| 56 | +- **Async-First**: Built on asyncio with aiofiles for non-blocking I/O operations |
| 57 | +- **Storage Strategy**: Papers are downloaded as PDFs, converted to markdown, and stored locally; the PDF is cleaned up afterwards |
| 58 | +- **Error Handling**: Comprehensive error handling with user-friendly messages throughout the tool chain |
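The storage strategy above can be sketched roughly as follows. This is an assumption-laden illustration, not the `PaperManager` implementation: `store_as_markdown` is a hypothetical helper, the converter is passed in as a plain callable standing in for the real PDF-to-markdown step, and it uses synchronous `pathlib` I/O rather than aiofiles to stay self-contained.

```python
import tempfile
from pathlib import Path
from typing import Callable


def store_as_markdown(pdf_path: Path, convert: Callable[[Path], str]) -> Path:
    """Convert a downloaded PDF to markdown, keep the markdown, drop the PDF.

    `convert` is a stand-in for the real conversion step (assumption).
    """
    md_path = pdf_path.with_suffix(".md")
    if md_path.exists():
        # Already converted earlier: reuse the cached markdown file.
        return md_path
    md_path.write_text(convert(pdf_path), encoding="utf-8")
    pdf_path.unlink()  # PDF cleanup once the markdown is safely written
    return md_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        pdf = Path(tmp) / "example.pdf"
        pdf.write_bytes(b"%PDF-placeholder")
        result = store_as_markdown(pdf, lambda p: "# Example Paper\n")
        print(result.name, pdf.exists())
```

Writing the markdown before deleting the PDF means a failed conversion leaves the original download in place.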
| 59 | + |
| 60 | +### Configuration |
| 61 | + |
| 62 | +Environment variables (all optional with sensible defaults): |
| 63 | +- `ARXIV_STORAGE_PATH`: Paper storage location (default: `~/.arxiv-mcp-server/papers`) |
| 64 | +- `ARXIV_MAX_RESULTS`: Search results limit (default: 50) |
| 65 | +- `ARXIV_REQUEST_TIMEOUT`: API timeout in seconds (default: 60) |
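As a rough illustration of how these variables and defaults fit together, the sketch below reads them with the standard library only. It deliberately swaps in a plain dataclass for the Pydantic-based settings in `config.py`, so the `Settings` fields and the `load_settings` function are hypothetical; only the variable names and default values come from the list above.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    # Field names are illustrative; the real config class may differ.
    storage_path: Path
    max_results: int
    request_timeout: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    return Settings(
        storage_path=Path(
            env.get("ARXIV_STORAGE_PATH", "~/.arxiv-mcp-server/papers")
        ).expanduser(),
        max_results=int(env.get("ARXIV_MAX_RESULTS", "50")),
        request_timeout=int(env.get("ARXIV_REQUEST_TIMEOUT", "60")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing the environment in as a mapping keeps the function easy to exercise in tests without touching `os.environ`.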
| 66 | + |
| 67 | +### Testing Strategy |
| 68 | + |
| 69 | +Tests use pytest with async support and comprehensive mocking: |
| 70 | +- `conftest.py` provides shared fixtures for mock arXiv papers and HTTP responses |
| 71 | +- Tests cover both unit-level tool functionality and integration scenarios |
| 72 | +- Mock-based approach avoids external API calls during testing |
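A mock-based test in this style might look like the sketch below. It is not taken from the repository's test suite: `search_titles` is a hypothetical helper, the client is a bare `AsyncMock` rather than a fixture from `conftest.py`, and `asyncio.run` is used in place of pytest's async support so the example runs on its own.

```python
import asyncio
from unittest.mock import AsyncMock


async def search_titles(client, query: str) -> list:
    """Hypothetical tool helper: return the titles the client reports."""
    results = await client.search(query)
    return [paper["title"] for paper in results]


def test_search_titles_uses_mock_client() -> None:
    # The mock replaces the network client, so no external API is called.
    client = AsyncMock()
    client.search.return_value = [{"title": "Example Paper"}]

    titles = asyncio.run(search_titles(client, "example query"))

    assert titles == ["Example Paper"]
    client.search.assert_awaited_once_with("example query")


if __name__ == "__main__":
    test_search_titles_uses_mock_client()
```

Asserting on the awaited call as well as the return value checks that the query actually reached the client.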