@blazickjp (Owner)

Summary

Problem

The arXiv API returns irrelevant results when queries lack field specifiers and are sorted by submission date. For example, searching for "quantum computing" would return recent papers about video generation or robotics that don't contain those terms.

Solution

Automatically convert plain queries to use the all: field specifier:

  • "quantum computing" → all:quantum AND all:computing
  • "transformer" → all:transformer
  • "neural networks" (quoted) → all:"neural networks"
  • Queries with existing field specifiers (ti:, abs:, au:, etc.) are not modified
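
The conversion rules above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name `add_field_specifiers` and the regex `FIELD_SPECIFIER_RE` are hypothetical, and the prefix list is assumed from the arXiv API query syntax. The sketch does not special-case boolean operators (AND, OR, ANDNOT) typed inside an otherwise plain query.

```python
import re

# Assumed list of arXiv field prefixes (hypothetical constant for this sketch).
FIELD_SPECIFIER_RE = re.compile(r"\b(?:all|ti|abs|au|co|jr|cat|rn|id):")


def add_field_specifiers(query: str) -> str:
    """Prefix plain search terms with 'all:' and join them with AND."""
    query = query.strip()
    if not query:
        return query
    # Queries that already use a field specifier are returned unchanged.
    if FIELD_SPECIFIER_RE.search(query):
        return query
    # Keep double-quoted phrases together; split everything else on whitespace.
    tokens = re.findall(r'"[^"]+"|\S+', query)
    return " AND ".join(f"all:{token}" for token in tokens)


if __name__ == "__main__":
    for q in ["quantum computing", "transformer", '"neural networks"', "ti:attention"]:
        print(q, "->", add_field_specifiers(q))
```

Matching the prefix with a word boundary avoids treating a word that merely ends in "au" or "id" followed by a colon as a field specifier.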

Test plan

  • Added comprehensive test coverage for the query transformation logic
  • Verified existing tests still pass
  • Manually tested various query types to confirm improved relevance
  • Code formatted with black

Example results

Before fix:

Query: "quantum computing"
Results: LayerFlow (video generation), Object-centric 3D Motion Field, etc.
Relevance: 1/5 papers actually about quantum computing

After fix:

Query: "quantum computing" → "all:quantum AND all:computing"
Results: Bridging Quantum Chemistry and MaxCut, Entanglement renormalization circuits, etc.
Relevance: 4/5 papers actually about quantum computing
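
To make the before/after difference concrete, the sketch below builds the request URL for each query against the public arXiv API endpoint, sorted by submission date. The helper name `build_search_url` is hypothetical, and the server in this repository may issue its requests through a client library instead of building URLs directly. No network request is made.

```python
from urllib.parse import urlencode

ARXIV_API_URL = "http://export.arxiv.org/api/query"


def build_search_url(search_query: str, max_results: int = 5) -> str:
    """Build an arXiv API query URL sorted by submission date (newest first)."""
    params = {
        "search_query": search_query,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API_URL}?{urlencode(params)}"


# Plain query, as sent before the fix.
before_url = build_search_url("quantum computing")
# Query with explicit field specifiers, as sent after the fix.
after_url = build_search_url("all:quantum AND all:computing")

if __name__ == "__main__":
    print(before_url)
    print(after_url)
```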

🤖 Generated with Claude Code

Add automatic field specifiers to plain search queries to improve relevance.
The arXiv API returns irrelevant results when queries lack field specifiers
and are sorted by submission date.

Changes:
- Convert plain queries to use 'all:' field specifier
- Multi-word queries use AND operator between terms
- Preserve quoted phrases and existing field specifiers
- Add comprehensive test coverage for the fix

In a manual check of the query "quantum computing", relevant results rose
from 1 of 5 (~20%) to 4 of 5 (~80%).

Fixes #33

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@blazickjp blazickjp merged commit f0bddb6 into main Jun 6, 2025
7 checks passed
12458 pushed a commit to 12458/arxiv-mcp-server that referenced this pull request Jul 18, 2025
…ssue-33

Fix search returning irrelevant results when sorted by date

Development

Successfully merging this pull request may close these issues.

Search papers result return non relevant papers.
