Recent blog posts

Two red-team critiques of METR's research on long tasks

Can superhuman AI help us improve?

Making Mathematical MONSTERS

Axiomatic jigsaw puzzles: probability
