Jr. AI Scientist and Its Risk Report: Autonomous Scientific Exploration from a Baseline Paper [arXiv]
*Equal Contribution
We believe that understanding the limits and risks of AI Scientists is essential to ensuring safe and sustainable AI progress.
The current quality of AI Scientists is still far from sufficient, and how to further improve them remains an open challenge. Moreover, most existing AI Scientist systems report primarily on success cases and pay little attention to risks and failures. This has hindered the academic community from accurately understanding AI Scientists.
To this end, we present the development of a state-of-the-art AI Scientist and provide a comprehensive report on its limitations, risks, and failures, with the goal of fostering a more accurate understanding of AI Scientists within the research community.
- Development of Jr. AI Scientist. We develop Jr. AI Scientist, a state-of-the-art autonomous AI scientist system that mimics the core research workflow of an early-career human scientist: Given the baseline paper from the human mentor, it analyzes its limitations, formulates novel hypotheses for improvement, validates them through rigorous experimentation, and writes a paper with the results. Unlike previous approaches that assume full automation or operate on small-scale code, Jr. AI Scientist follows a well-defined research workflow and leverages modern coding agents to handle complex, multi-file implementations, leading to scientifically valuable contributions.
- Comprehensive Reporting of Risks and Limitations. We comprehensively report the important limitations and various risks identified during development. These include the potential for review-score hacking, difficulties in ensuring proper citation, challenges in managing ablation experiment results, and the problem of detecting fabricated results.
We hope these insights will deepen understanding of current progress and risks in AI Scientist development.
We created our papers based on LoCoOp [NeurIPS2023], Min-K%++ [ICLR2025], and GL-MCM [IJCV2025].
Each generated paper is available under generated_papers/.
We are currently pausing the public release of Jr. AI Scientist because we have not yet been able to properly assess the potential negative impacts that AI Scientist might have on the academic community.
We plan to proceed with open-sourcing once we have gained a sufficient understanding of these risks.
We are collecting issues and feedback from the community through this form, which covers issues with the paper, suggestions, and any other inquiries. Submissions can be made either anonymously or with your name.
If you find our work helpful, please consider citing:
@article{miyai2025juniorai,
  title={Jr. AI Scientist and Its Risk Report: Autonomous Scientific Exploration from a Baseline Paper},
  author={Miyai, Atsuyuki and Toyooka, Mashiro and Otonari, Takashi and Zhao, Zaiying and Aizawa, Kiyoharu},
  journal={arXiv preprint arXiv:2511.04583},
  year={2025}
}
