
👋 Hi, everyone!
We are ByteDance Seed team.

You can get to know us better through the following channels👇


WideSearch: Benchmarking Agentic Broad Info-Seeking

arXiv · Project Homepage · Hugging Face Dataset

We will release the arXiv paper soon. Stay tuned!

News

[2025/08/11] 🔥 We release the WideSearch Benchmark.

Introduction

From Tedious Labor to Autonomous Agent

Many real-world information-gathering tasks are not hard, just huge. Consider a financial analyst compiling key metrics for all companies in a sector, or a job seeker collecting every vacancy that meets their criteria. The challenge isn't cognitive complexity, but the sheer scale and repetitive nature of the work: a critical productivity bottleneck.

WideSearch is designed to evaluate an agent's ability to automate these tasks, shifting from laborious manual collection to efficient, automated workflows. This shift, however, introduces novel failure modes like hallucination and incompleteness, making rigorous evaluation essential.

A New Paradigm: Wide vs. Deep

Current research primarily focuses on "deep" tasks. DeepSearch tackles the "I can't find it" problem of locating hidden facts, while DeepResearch addresses the "I can't write it well" problem of synthesizing reports.

In sharp contrast, WideSearch tackles the "I could do it, but the sheer volume is overwhelming" problem. It requires agents to systematically find and organize large-scale information into a structured output, shifting the primary challenge from deep search to achieving exhaustiveness and fidelity at scale.

Experiments

We evaluate both single-agent and multi-agent modes, and manually conduct end-to-end tests of commercial AI systems through their web interfaces. In addition, we randomly select 20 questions and invite human annotators to complete them. The experimental results are shown below:

[Figure: experiment results]

Quickstart

Set up environment

Install the dependencies; see prepare-env.sh for details.

```bash
git clone https://github.com/ByteDance-Seed/WideSearch.git
cd WideSearch
sh prepare-env.sh
source .venv/bin/activate
```

Configuration

  1. Implement custom search tools in src/agent/tools.py (a sketch follows this list)
  2. Configure model parameters in src/utils/config.py
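As an illustration, a custom search tool might look roughly like the sketch below. The function name, endpoint, and result schema here are assumptions made for the example; the actual interface the agent expects is defined in src/agent/tools.py, so adapt the sketch to match it.

```python
# Hypothetical sketch of a custom search tool. The endpoint and the
# {"title", "url", "snippet"} result schema are placeholders; match
# whatever interface src/agent/tools.py actually defines.
import requests


def search(query: str, top_k: int = 5) -> list[dict]:
    """Query a search backend and return normalized result dicts."""
    resp = requests.get(
        "https://example.com/search",  # placeholder search endpoint
        params={"q": query, "k": top_k},
        timeout=10,
    )
    resp.raise_for_status()
    return [
        {
            "title": item.get("title", ""),
            "url": item.get("url", ""),
            "snippet": item.get("snippet", ""),
        }
        for item in resp.json().get("results", [])
    ]
```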

Inference and Evaluation

Run the following command to perform inference and evaluation:

```bash
python3 scripts/run_infer_and_eval_batching.py \
    --trial_num={your_trial_num} \
    --model_config_name={your_model_config_name} \
    --response_root={your_response_root} \
    --result_save_root={your_result_save_root} \
    --stage={infer|eval|both}
```
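For example, a run that performs inference and then evaluation in one pass might look like this, where my_model names a configuration you added in src/utils/config.py and all other values are illustrative:

```bash
# Illustrative invocation: 3 trials, both stages, local output folders.
python3 scripts/run_infer_and_eval_batching.py \
    --trial_num=3 \
    --model_config_name=my_model \
    --response_root=./outputs/responses \
    --result_save_root=./outputs/results \
    --stage=both
```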

License

This project is licensed under the MIT License. See the LICENSE file for details.

Citation

If you find WideSearch useful for your research and applications, feel free to give us a star ⭐ and cite us using:

```bibtex
@misc{wong2025widesearchbenchmarkingagenticbroad,
      title={WideSearch: Benchmarking Agentic Broad Info-Seeking},
      author={Ryan Wong and Jiawei Wang and Junjie Zhao and Li Chen and Yan Gao and Long Zhang and Xuan Zhou and Zuo Wang and Kai Xiang and Ge Zhang and Wenhao Huang and Yang Wang and Ke Wang},
      year={2025},
      eprint={2508.07999},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.07999},
}
```
