Haiyang Mei, Difei Gao, Xiaopeng Wei, Xin Yang, Mike Zheng Shou
TrustScorer evaluates the trustworthiness of GUI agent actions and enables selective human intervention when the action trust score is low, combining human precision with AI efficiency.
TrustScorer takes as input the user query q, the subtask description d, the action sequence s, and the state observation o, and outputs a trustworthiness label l indicating the likelihood that the action sequence can accomplish the specified subtask.
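A minimal sketch of this interface, assuming a Python API; the class, function, and field names below (ScoringInput, score, run_with_intervention) and the threshold value are illustrative placeholders, not the released code:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScoringInput:
    query: str          # user query q
    subtask: str        # subtask description d
    actions: List[str]  # action sequence s
    observation: str    # state observation o (e.g., a serialized screenshot/UI state)


def score(x: ScoringInput) -> float:
    """Return a trust score in [0, 1]; higher means the action sequence
    is more likely to accomplish the subtask. Stub only: the released
    model would replace this."""
    return 0.0


TRUST_THRESHOLD = 0.5  # illustrative value, not specified in the paper


def run_with_intervention(x: ScoringInput) -> None:
    """Selective human intervention: escalate only low-trust sequences."""
    if score(x) < TRUST_THRESHOLD:
        print("Low trust score: requesting human review.")
    else:
        print("High trust score: executing actions:", x.actions)
```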
TrustBench includes 106 specific tasks across 9 commonly used applications and 718 agent action sequences with corresponding ground-truth annotations.
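One way a single benchmark record could be organized, assuming a JSON layout; all field names here are hypothetical until the dataset is released in December 2025:

```python
import json

# Hypothetical layout of one TrustBench record; the actual schema
# will be defined by the dataset release.
record = {
    "application": "PPT",        # one of the 9 applications
    "task": "...",               # one of the 106 specific tasks
    "query": "...",              # user query q
    "subtask": "...",            # subtask description d
    "actions": ["...", "..."],   # one of the 718 action sequences s
    "observation": "...",        # state observation o
    "label": 1,                  # ground-truth trustworthiness label l
}
print(json.dumps(record, indent=2))
```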
An example from TrustBench on PPT:
The annotation pipeline:
TrustBench will be released in December 2025.
We will release the training, testing, and evaluation code around the end of November 2025.
Our work builds upon AssistGUI.
If you use TrustScorer/TrustBench in your research, please cite it using the following BibTeX entry:
@InProceedings{Mei_2025_MM_TrustScorer,
    author    = {Mei, Haiyang and Gao, Difei and Wei, Xiaopeng and Yang, Xin and Shou, Mike Zheng},
    title     = {Can I Trust You? Advancing GUI Task Automation with Action Trust Score},
    booktitle = {Proceedings of the 33rd ACM International Conference on Multimedia (ACM MM)},
    year      = {2025},
}
Please see the LICENSE file.
E-Mail: Haiyang Mei ([email protected])