Alek Westover awestover

Thinking about AI safety

16 followers · 7 following

@neondatabase
Massachusetts
23:30 (UTC -04:00)
awestover.github.io

Achievements

Pinned

how-bad-can-ai-be Public

Python
misalignment-by-default Public

Python 2 1
DQN-maze-solver Public

Investigating whether or not RL agents can acausally collaborate with other instances of themselves.

Python 1
transformer-shortest-paths Public

Experimentally evaluating a transformer's generalization on a synthetic task

HTML 1
activation-steering-vs-prompting Public

Is activation steering more powerful than prompting at mitigating deception in some current reasoning LLMs?

Jupyter Notebook 1
theland Public

theland

JavaScript 2