🥇 Amazon Nova AI Challenge Winner - ASTRA emerged victorious as the top attacking team in Amazon's global AI safety competition, defeating elite defending teams from universities worldwide in live adversarial evaluation.

PurCL/ASTRA

ASTRA: Autonomous Spatial-Temporal Red-teaming for AI Software Assistants

🏆 Red-Team Winner of the Amazon Nova AI Challenge - the first-ever global tournament where elite university teams battle to harden and hack AI coding assistants

📰 News

🎯 Key Highlights

  • 🏆 Winner of Amazon Nova AI Challenge - Top attacking team category
  • 🥇 $250,000 Prize - Awarded for winning the competition
  • 📊 >=90% Success Rate - In AI assistant safety assessment
📰 Media Coverage

  • Amazon Science - Official announcement of ASTRA as the winning red team tool

🎯 About

[Figures: ASTRA Technical Details; ASTRA System Overview]

ASTRA (Autonomous Spatial-Temporal Red-teaming for AI Software Assistants) is a full-lifecycle red-teaming system. It builds structured, domain-specific knowledge graphs and performs online vulnerability exploration by adaptively probing both the input space (spatial) and the reasoning process (temporal) of the target.

🚀 What Makes ASTRA Different

Unlike existing tools that are either static benchmarks or jailbreak attempts on given benchmarks, ASTRA operates as a complete red-teaming solution:

🔍 1. Structural Domain Modeling

  • Given a target domain, ASTRA performs structural modeling and generates high-quality violation-inducing prompts
  • No pre-defined benchmarks required - ASTRA creates its own test cases systematically
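
As a rough illustration of systematic test-case creation, the sketch below enumerates test-case specifications from a small, hand-written domain model. The dimensions (`language`, `task`, `weakness`), their values, and the `CaseSpec` type are all hypothetical and are not taken from ASTRA's knowledge graphs; the point is only that a structured model yields coverage by construction rather than by a fixed benchmark.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class CaseSpec:
    """One point in the (hypothetical) domain model."""
    language: str
    task: str
    weakness: str


# Hypothetical domain model: each key is one dimension of the target domain.
DOMAIN_MODEL = {
    "language": ["python", "c"],
    "task": ["file upload handler", "SQL query builder"],
    "weakness": ["CWE-22 path traversal", "CWE-89 SQL injection"],
}


def enumerate_specs(model):
    """Return one CaseSpec per combination of dimension values."""
    keys = list(model)
    combos = product(*(model[k] for k in keys))
    return [CaseSpec(**dict(zip(keys, combo))) for combo in combos]


specs = enumerate_specs(DOMAIN_MODEL)
print(len(specs))  # 2 * 2 * 2 = 8 specifications
```

Each specification would then be turned into a concrete prompt by a separate generation step, which this sketch leaves out.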

💬 2. Multi-turn Conversation Framework

  • Uses generated prompts as starting points for comprehensive testing
  • Conducts adaptive multi-round conversations with target systems based on responses
  • Temporal Exploration: Identifies weak links in target system reasoning traces and dynamically adjusts test prompts to exploit discovered vulnerabilities
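
The control flow of such an adaptive session can be sketched as a simple loop. This is a minimal sketch under stated assumptions, not ASTRA's implementation: `run_session`, `target`, `judge`, and `refine` are hypothetical names, and the toy stand-ins at the bottom exist only to make the example runnable.

```python
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # (prompt, reply)


def run_session(
    first_prompt: str,
    target: Callable[[List[Turn], str], str],  # hypothetical: queries the target system
    judge: Callable[[str], bool],              # hypothetical: flags a violating reply
    refine: Callable[[str, str], str],         # hypothetical: adapts the next prompt
    max_turns: int = 5,
) -> Tuple[bool, List[Turn]]:
    """Run one multi-turn session; stop early once the judge flags a reply."""
    history: List[Turn] = []
    prompt = first_prompt
    for _ in range(max_turns):
        reply = target(history, prompt)
        history.append((prompt, reply))
        if judge(reply):
            return True, history
        # Adapt the next prompt based on the previous prompt and the reply.
        prompt = refine(prompt, reply)
    return False, history


# Toy stand-ins so the loop can be exercised without a real model.
def toy_target(history, prompt):
    return "ok" if len(history) >= 2 else "refused"


def toy_judge(reply):
    return reply == "ok"


def toy_refine(prompt, reply):
    return prompt + " (rephrased)"


found, history = run_session("seed prompt", toy_target, toy_judge, toy_refine)
print(found, len(history))  # True 3
```

The `max_turns` bound plays the same role as the `--n_turn` flag described under Basic Usage.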

🎯 3. Self-Evolving Red-teaming

  • Self-evolving capability: Records successful cases and adjusts sampling strategies to target similar prompts, gradually improving success rates
  • Autonomous operation: No human intervention required during testing
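
One simple way to bias sampling toward previously successful cases is success-count weighting, sketched below. This is an illustrative stand-in rather than ASTRA's actual strategy: the `AdaptiveSampler` class, its category names, and the `1 + successes` weighting are all assumptions.

```python
import random
from collections import defaultdict


class AdaptiveSampler:
    """Sample prompt categories with weight proportional to 1 + past successes."""

    def __init__(self, categories, seed=0):
        self.categories = list(categories)
        self.successes = defaultdict(int)
        self.rng = random.Random(seed)  # seeded for reproducible runs

    def weights(self):
        # The +1 keeps unexplored categories reachable.
        return [1 + self.successes[c] for c in self.categories]

    def sample(self):
        return self.rng.choices(self.categories, weights=self.weights(), k=1)[0]

    def record(self, category, success):
        """Record the outcome of one session for the given category."""
        if success:
            self.successes[category] += 1


sampler = AdaptiveSampler(["a", "b", "c"])
for _ in range(3):
    sampler.record("b", True)
print(sampler.weights())  # [1, 4, 1]
```

After three recorded successes, category `b` is four times as likely to be drawn as `a` or `c`, while the other categories are still explored occasionally.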

🚀 Quick Start

✅ Prerequisites

  • 🐍 Python 3.9+
  • 📦 Required dependencies (see requirements.txt)
  • 🔑 API access to LLM providers (OpenAI, Anthropic, etc.)

🛠️ Installation

git clone https://github.com/PurCL/ASTRA
cd ASTRA
pip install -r requirements.txt

▢️ Basic Usage

ASTRA consists multiple stages from knowledge graph construction to online adaptive red-teaming. This section provides a convenient guide on how to run the online adaptive red-teaming component with a new target model.
For detailed usage instructions, see πŸ“˜ USAGE.md.

ASTRA comes with prompts generated for secure code generation and security event guidance domains. You can directly use those prompts to test your target model.

🧰 Specify the configuration of your model in resources/client-config.yaml.
Then run the following command to start the online adaptive red-teaming process:

python3 online/main.py --model_name <name of the blue team model> --log <path to the output log file> --n_session <number of chat sessions> --n_probing <number of initial probing sessions before the chat sessions> --n_turn <maximum number of turns per session>

For example,

python3 online/main.py --model_name phi4m --log log_out/phi4m.jsonl --n_session 200 --n_probing 0 --n_turn 5

📝 This will run 200 chat sessions with the target model phi4m, each with up to 5 turns, and log the results to log_out/phi4m.jsonl.
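
When testing several target models, it can help to build the command programmatically. The sketch below assembles the same command line using only the flags documented above; the `build_command` helper is hypothetical and not part of the repository.

```python
import shlex


def build_command(model_name, log, n_session, n_probing, n_turn):
    """Assemble the documented online/main.py invocation as an argument list."""
    return [
        "python3", "online/main.py",
        "--model_name", model_name,
        "--log", log,
        "--n_session", str(n_session),
        "--n_probing", str(n_probing),
        "--n_turn", str(n_turn),
    ]


cmd = build_command("phi4m", "log_out/phi4m.jsonl", 200, 0, 5)
print(shlex.join(cmd))

# To launch the run from the repository root:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems when model names or log paths contain spaces.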

📧 Contact

For questions, collaborations, or feedback, please contact:

We welcome academic collaborations and industry partnerships!

📄 Citation

If you find ASTRA useful in your research, please cite our paper:

@article{xu2025astra,
  title={ASTRA: Autonomous Spatial-Temporal Red-teaming for AI Software Assistants},
  author={Xu, Xiangzhe and Shen, Guangyu and Su, Zian and Cheng, Siyuan and Guo, Hanxi and Yan, Lu and Chen, Xuan and Jiang, Jiasheng and Jin, Xiaolong and Wang, Chengpeng and others},
  journal={arXiv preprint arXiv:2508.03936},
  year={2025}
}

🙏 Acknowledgments

We would like to thank the following projects and communities for their inspiration and support:


Made with ❤️ for AI Safety Research
