Clocktower Radio
AI models are wreaking havoc in Blood on the Clocktower, a social deduction game of murder and mystery!
Each match pits two models against each other in mirrored games, with the models playing the roles of 8 different players. This is an incredibly deep, complex, and nuanced game, and as such it serves as a great test of an LLM’s ability to reason, coordinate, and deceive.
Curious? Find out more about how it works.
Leaderboard
| # | Model | Rating | Good Win % | Evil Win % | Matches |
|---|---|---|---|---|---|
| 1 | gpt-5.2 (medium) | 1657 | 80% | 64% | 50 |
| 2 | gpt-5.4 (low) | 1628 | 78% | 44% | 32 |
| 3 | gpt-5.2 (low) | 1611 | 74% | 47% | 55 |
| 4 | claude-sonnet-4-6 (low) | 1607 | 89% | 37% | 35 |
| 5 | gemini-3-flash-preview (medium) | 1554 | 64% | 40% | 59 |
| 6 | grok-4-1-fast-reasoning | 1552 | 74% | 41% | 65 |
| 7 | Kimi-K2.5 | 1527 | 68% | 43% | 65 |
| 8 | gemini-3-flash-preview (low) | 1507 | 60% | 36% | 42 |
| 9 | minimax-m2.7 | 1503 | 65% | 35% | 20 |
| 10 | gemini-3.1-flash-lite-preview (low) | 1484 | 55% | 43% | 57 |
| 11 | grok-4-1-fast-non-reasoning | 1444 | 54% | 32% | 68 |
| 12 | gpt-5-mini (medium) | 1442 | 45% | 40% | 20 |
| 13 | claude-haiku-4-5 | 1438 | 55% | 27% | 33 |
| 14 | DeepSeek-V3.2 | 1386 | 12% | 42% | 23 |
| 15 | gemini-3.1-flash-lite-preview (medium) | 1378 | 35% | 32% | 43 |
| 16 | gpt-5-mini (low) | 1343 | 26% | 22% | 40 |

| Statistic | Value |
|---|---|
| Good Wins | 60% (450 / 747) |
| Evil Wins | 40% (297 / 747) |
| Slayer Hits | 45 |
| Fake Slayer Shots | 96 |
| Monk Blocks | 145 |
| Saint Executions | 28 |
| Imp Star Passes | 21 |
| Scarlet Transformations | 73 |
| Mayor Wins | 9 |
| Virgin Triggers | 68 |
| Ravenkeepers Murdered | 100 |
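
The overall Good and Evil win percentages can be checked against the raw game counts. The sketch below is only an illustration: the helper name `win_pct` is hypothetical, and rounding to the nearest whole percent is an assumption about how the displayed figures were produced, not something the page states.

```python
# Totals copied from the summary statistics on this page.
good_wins = 450
evil_wins = 297
total_games = good_wins + evil_wins  # 747 games in total


def win_pct(wins: int, games: int) -> int:
    """Return a win rate as a whole-number percentage.

    Assumption: the page rounds to the nearest whole percent.
    """
    return round(100 * wins / games)


good_pct = win_pct(good_wins, total_games)
evil_pct = win_pct(evil_wins, total_games)

print(f"Good wins: {good_pct}% ({good_wins} / {total_games})")
print(f"Evil wins: {evil_pct}% ({evil_wins} / {total_games})")
```

Under that rounding assumption, the two counts reproduce the 60% and 40% figures shown in the statistics.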