Red-Team Winner of Amazon Nova AI Challenge - First-ever global tournament where elite university teams battle to harden and hack AI coding assistants
Amazon Nova AI Challenge Winner - ASTRA emerged victorious as the top attacking team in Amazon's global AI safety competition, defeating elite defending teams from universities worldwide in live adversarial evaluation.
- Winner of Amazon Nova AI Challenge - Top attacking team category
- $250,000 Prize - Awarded for winning the competition
- >=90% Success Rate - In AI assistant safety assessment
- Amazon Science - Official announcement of ASTRA as the winning red team tool
ASTRA (Autonomous Spatial-Temporal Red-teaming for AI Software Assistants) is a full lifecycle red-teaming system that builds structured domain-specific knowledge graphs and performs online vulnerability exploration by adaptively probing both input space (spatial) and reasoning processes (temporal).
Unlike existing tools that are either static benchmarks or jailbreak attempts on given benchmarks, ASTRA operates as a complete red-teaming solution:
- Given a target domain, ASTRA performs structural modeling and generates high-quality violation-inducing prompts
- No pre-defined benchmarks required - ASTRA creates its own test cases systematically
- Uses generated prompts as starting points for comprehensive testing
- Conducts adaptive multi-round conversations with target systems based on responses
- Temporal Exploration: Identifies weak links in target system reasoning traces and dynamically adjusts test prompts to exploit discovered vulnerabilities
- Self-evolving capability: Records successful cases and adjusts sampling strategies to target similar prompts, gradually improving success rates
- Autonomous operation: No human intervention required during testing
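The self-evolving behavior listed above (recording successful cases and shifting sampling toward similar prompts) can be illustrated with a minimal sketch. This is not ASTRA's actual implementation: the class name `AdaptiveSampler`, the category labels, and the smoothed success-rate weighting are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict


class AdaptiveSampler:
    """Hypothetical sampler that favors prompt categories that succeeded before."""

    def __init__(self, categories, seed=None):
        self.categories = list(categories)
        self.successes = defaultdict(int)  # successful attempts per category
        self.attempts = defaultdict(int)   # total attempts per category
        self.rng = random.Random(seed)

    def weight(self, category):
        # Laplace-smoothed success rate, so untried categories keep a
        # nonzero chance of being explored.
        return (self.successes[category] + 1) / (self.attempts[category] + 2)

    def sample(self):
        # Draw one category with probability proportional to its weight.
        weights = [self.weight(c) for c in self.categories]
        return self.rng.choices(self.categories, weights=weights, k=1)[0]

    def record(self, category, success):
        # Update the statistics after each finished session.
        self.attempts[category] += 1
        if success:
            self.successes[category] += 1


# Illustrative category names only; they are not taken from ASTRA.
sampler = AdaptiveSampler(["sql_injection", "path_traversal"], seed=0)
for _ in range(5):
    sampler.record("sql_injection", success=True)
    sampler.record("path_traversal", success=False)
```

After these updates, `sampler.sample()` draws the category with the better track record more often, while the smoothing term keeps the weaker category from being starved entirely.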
- Python 3.9+
- Required dependencies (see requirements.txt)
- API access to LLM providers (OpenAI, Anthropic, etc.)
git clone https://github.com/PurCL/ASTRA
cd ASTRA
pip install -r requirements.txt
ASTRA consists of multiple stages, from knowledge graph construction to online adaptive red-teaming. This section shows how to run the online adaptive red-teaming component against a new target model.
For detailed usage instructions, see USAGE.md.
ASTRA comes with prompts generated for secure code generation and security event guidance domains. You can directly use those prompts to test your target model.
Specify the configuration of your model in resources/client-config.yaml.
Then run the following command to start the online adaptive red-teaming process:
python3 online/main.py --model_name <name of the blue team model> --log <path to the output log file> --n_session <number of chat sessions> --n_probing <number of initial probing sessions before the chat sessions> --n_turn <maximum number of turns per session>
For example,
python3 online/main.py --model_name phi4m --log log_out/phi4m.jsonl --n_session 200 --n_probing 0 --n_turn 5
This will run 200 chat sessions with the target model phi4m, each with up to 5 turns, and log the results to log_out/phi4m.jsonl.
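When testing several target models, it can be convenient to assemble the same command programmatically. The sketch below uses only the flags documented above; the helper name `build_command` is hypothetical and not part of ASTRA, and the script is assumed to be launched from the repository root.

```python
import shlex


def build_command(model_name, log_path, n_session, n_probing, n_turn):
    """Assemble the argument list for online/main.py from the documented flags."""
    return [
        "python3", "online/main.py",
        "--model_name", model_name,
        "--log", log_path,
        "--n_session", str(n_session),
        "--n_probing", str(n_probing),
        "--n_turn", str(n_turn),
    ]


cmd = build_command("phi4m", "log_out/phi4m.jsonl", 200, 0, 5)
print(shlex.join(cmd))  # shows the shell command that would be run

# To actually launch the run (assumption: current directory is the repo root):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if a log path contains spaces.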
For questions, collaborations, or feedback, please contact:
- Xiangzhe Xu - [email protected]
- Guangyu Shen - [email protected]
We welcome academic collaborations and industry partnerships!
If you find ASTRA useful in your research, please cite our paper:
@article{xu2025astra,
title={ASTRA: Autonomous Spatial-Temporal Red-teaming for AI Software Assistants},
author={Xu, Xiangzhe and Shen, Guangyu and Su, Zian and Cheng, Siyuan and Guo, Hanxi and Yan, Lu and Chen, Xuan and Jiang, Jiasheng and Jin, Xiaolong and Wang, Chengpeng and others},
journal={arXiv preprint arXiv:2508.03936},
year={2025}
}
We would like to thank the following projects and communities for their inspiration and support:
- Amazon Nova AI Challenge - For providing the platform and resources that enabled ASTRA's development and validation