Advancing Mobile GUI Agents: A Verifier-Driven Approach to Practical Deployment

Gaole Dai1,2, Shiqi Jiang1, Ting Cao1, Yuanchun Li3, Yuqing Yang1, Rui Tan2, Mo Li4, Lili Qiu1

1 Microsoft Research 2 Nanyang Technological University 3 AIR @ Tsinghua University 4 Hong Kong University of Science & Technology

Paper PDF Project Page Code Repo

Overview

Introducing V-Droid, the first mobile GUI agent with near-real-time, high-quality decision-making ability. Unlike traditional agents that rely on large language models (LLMs) to generate an action at every step, V-Droid employs LLMs as verifiers that evaluate candidate actions before one is chosen.
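The verifier-driven idea can be illustrated with a minimal sketch: score every candidate action, then execute the highest-scoring one. This is not the released V-Droid code. The names `select_action`, `verify`, and `toy_verify` are hypothetical, and the toy scorer is a word-overlap stand-in for an LLM verifier, used only so the example runs on its own.

```python
from typing import Callable, Sequence


def select_action(
    candidates: Sequence[str],
    verify: Callable[[str, str, str], float],
    task: str,
    screen: str,
) -> str:
    """Return the candidate action that the verifier scores highest.

    `verify(task, screen, action)` is a hypothetical scoring function;
    in a real agent it would be backed by an LLM verifier.
    """
    if not candidates:
        raise ValueError("no candidate actions to verify")
    # Each candidate is scored independently, so these calls could be batched.
    scores = [verify(task, screen, action) for action in candidates]
    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index]


def toy_verify(task: str, screen: str, action: str) -> float:
    """Toy stand-in for a verifier: fraction of task words found in the action."""
    task_words = set(task.lower().split())
    action_words = set(action.lower().split())
    return len(task_words & action_words) / max(len(task_words), 1)


if __name__ == "__main__":
    actions = ["click Settings", "click Wi-Fi toggle", "scroll down"]
    print(select_action(actions, toy_verify, "turn on wi-fi", "home screen"))
```

Because each candidate is scored independently, the scoring calls share the same task and screen context, which is what makes parallel verification attractive.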

V-Droid features:

  1. Discretized Action Space & Prefilling-Only Workflow: Accelerates decision-making by verifying candidate actions in parallel using prefix caching.

  2. Pair-Wise Progress Preference Training: Enhances the verifier’s decision-making and self-correction capabilities through progress-aware training.

  3. Scalable Human-Agent Joint Annotation: V-Droid quickly takes the lead role in the annotation process after just two training rounds, significantly reducing overhead while boosting performance.
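As a rough illustration of pair-wise preference training, the sketch below computes a logistic (Bradley-Terry style) loss on a pair of verifier scores. This README does not state the training objective, so this loss is an assumption: a common pairwise formulation, not necessarily the one V-Droid uses. The function name and the scalar scores are hypothetical.

```python
import math


def pairwise_preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Logistic pairwise loss: -log(sigmoid(score_preferred - score_rejected)).

    Assumed objective for illustration only. The loss is small when the
    action that makes progress is scored above the one that does not.
    """
    margin = score_preferred - score_rejected
    # -log(sigmoid(m)) equals softplus(-m); this form avoids overflow.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


if __name__ == "__main__":
    # Correct ordering gives a lower loss than the reversed ordering.
    print(pairwise_preference_loss(2.0, 0.0))
    print(pairwise_preference_loss(0.0, 2.0))
```

Under this assumed objective, a verifier is penalized whenever it ranks a step that stalls or regresses above a step that advances the task.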

V-Droid sets a new state of the art in mobile task automation, with task success rates of 59.5% on AndroidWorld, 38.3% on AndroidLab, and 49% on MobileAgentBench, outperforming existing agents by 9.5%, 2.1%, and 9%, respectively. V-Droid also achieves a low latency of 0.7 seconds per decision, which is 32.8× faster than existing agents.
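For a sense of scale, the two latency figures above imply a per-decision latency for the compared agents. This assumes the speedup is the ratio of baseline latency to V-Droid latency, which the text suggests but does not define:

```latex
t_{\text{baseline}} \approx 32.8 \times t_{\text{V-Droid}} = 32.8 \times 0.7\,\text{s} \approx 23\,\text{s per decision}
```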

Code & Weights

The complete codebase and model weights will be released shortly. Stay tuned!

Citation

If you use this work, please cite:

@article{dai2025advancing,
  title={Advancing Mobile GUI Agents: A Verifier-Driven Approach to Practical Deployment},
  author={Dai, Gaole and Jiang, Shiqi and Cao, Ting and Li, Yuanchun and Yang, Yuqing and Tan, Rui and Li, Mo and Qiu, Lili},
  journal={arXiv preprint arXiv:2503.15937},
  year={2025}
}
