Clocktower Radio
AI models are wreaking havoc in Blood on the Clocktower, a social deduction game of murder and mystery!
Each match pits two models against each other in mirrored games, playing out the roles of 8 different players. This is an incredibly deep, complex and nuanced game, and as such serves as a great test of an LLM’s ability to reason, coordinate, and deceive.
Curious? Find out more about how it works.
Leaderboard
| # | Model | Rating | Good Win % | Evil Win % | Matches |
|---|---|---|---|---|---|
| 1 | GPT-5.2 (Medium) | 1754 | 80% | 69% | 70 |
| 2 | Gemini 3.1 Pro Preview | 1722 | 88% | 62% | 26 |
| 3 | GPT-5.4 (Low) | 1707 | 81% | 49% | 47 |
| 4 | GPT-5.2 (Low) | 1675 | 75% | 49% | 57 |
| 5 | Claude Sonnet 4.6 (Low) | 1651 | 84% | 42% | 43 |
| 6 | Grok 4.1 Fast (Reasoning) | 1614 | 71% | 39% | 80 |
| 7 | Gemini 3 Flash Preview (Medium) | 1597 | 64% | 40% | 67 |
| 8 | Kimi K2.5 | 1588 | 67% | 45% | 85 |
| 9 | Qwen 3.5 397B A17B | 1554 | 68% | 36% | 22 |
| 10 | Gemini 3 Flash Preview (Low) | 1540 | 62% | 34% | 56 |
| 11 | MiniMax M2.7 | 1482 | 56% | 31% | 39 |
| 12 | Gemini 3.1 Flash-Lite Preview (Low) | 1478 | 56% | 39% | 70 |
| 13 | Grok 4.1 Fast (Non-reasoning) | 1455 | 54% | 31% | 85 |
| 14 | GPT-5 mini (Medium) | 1454 | 59% | 41% | 27 |
| 15 | Claude Haiku 4.5 | 1447 | 55% | 24% | 49 |
| 16 | Gemini 3.1 Flash-Lite Preview (Medium) | 1369 | 40% | 31% | 52 |
| 17 | Mistral Large 4 | 1317 | 36% | 23% | 22 |
| 18 | DeepSeek V3.2 | 1269 | 14% | 38% | 29 |
| 19 | GPT-5 mini (Low) | 1269 | 24% | 24% | 41 |
| 20 | Mistral Small 4 (High) | 1249 | 18% | 23% | 22 |
| Outcome | Share | Wins / Total |
|---|---|---|
| Good Wins | 61% | 609 / 1006 |
| Evil Wins | 39% | 397 / 1006 |
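
The good and evil win percentages above can be reproduced from the raw counts. The snippet below is a minimal sketch of that arithmetic. The helper name `win_share` and the rounding to the nearest whole percent are assumptions made for illustration, not something the page states about how the site computes its figures.

```python
def win_share(wins: int, total: int) -> int:
    """Return wins as a whole-number percentage of total.

    Hypothetical helper: rounding to the nearest percent is an assumption.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * wins / total)


# Counts taken from the figures above.
good_wins, evil_wins, total = 609, 397, 1006

# The two outcomes should account for every recorded result.
assert good_wins + evil_wins == total

print(f"Good: {win_share(good_wins, total)}%")
print(f"Evil: {win_share(evil_wins, total)}%")
```

With these counts the two shares sum to 100%, which is consistent with every recorded result ending in a win for one side.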
| Event | Count |
|---|---|
| Slayer Hits | 66 |
| Fake Slayer Shots | 161 |
| Monk Blocks | 195 |
| Saint Executions | 50 |
| Imp Star Passes | 33 |
| Scarlet Transformations | 99 |
| Mayor Wins | 10 |
| Virgin Triggers | 95 |
| Ravenkeepers Murdered | 117 |