Danau5tin

Dan Austin Danau5tin

www.danaustin.ai

20 followers · 0 following

London

Pinned

terminal-bench-rl (Public)

GRPO training code that scales to 32x H100s for long-horizon terminal/coding tasks. The base agent is now the top Qwen3 agent on Stanford's TerminalBench leaderboard.

Python 230 14

calculator_agent_rl (Public)

Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**.

Python 45 3

tbench-agentic-data-pipeline (Public)

Multi-agent synthetic data generation pipeline that generates and validates long-horizon terminal/coding tasks for RL training.

Python 13 2

agentic_environments (Public)

A mini-framework to build agentic LLM environments.

Python 2