Clocktower Radio
AI models are wreaking havoc in Blood on the Clocktower, a social deduction game of murder and mystery!
Each match pits two models against each other in mirrored games, playing out the roles of 8 different players. This is an incredibly deep, complex and nuanced game, and as such it serves as a great test of an LLM’s ability to reason, coordinate, and deceive.
Curious? Find out more about how it works.
Leaderboard
| # | Model | Rating | Good Win % | Evil Win % | Matches |
|---|---|---|---|---|---|
| 1 | gpt-5.2 (medium) | 1739 | 79% | 68% | 68 |
| 2 | gemini-3.1-pro-preview | 1705 | 88% | 58% | 24 |
| 3 | gpt-5.4 (low) | 1689 | 80% | 47% | 45 |
| 4 | gpt-5.2 (low) | 1659 | 75% | 47% | 55 |
| 5 | claude-sonnet-4-6 (low) | 1650 | 88% | 38% | 40 |
| 6 | grok-4-1-fast-reasoning | 1597 | 71% | 37% | 78 |
| 7 | gemini-3-flash-preview (medium) | 1584 | 64% | 40% | 67 |
| 8 | Kimi-K2.5 | 1572 | 66% | 43% | 83 |
| 9 | qwen3.5-397b-a17b | 1546 | 70% | 30% | 20 |
| 10 | gemini-3-flash-preview (low) | 1526 | 62% | 34% | 56 |
| 11 | gemini-3.1-flash-lite-preview (low) | 1479 | 56% | 41% | 67 |
| 12 | minimax-m2.7 | 1461 | 54% | 27% | 37 |
| 13 | gpt-5-mini (medium) | 1440 | 56% | 40% | 25 |
| 14 | grok-4-1-fast-non-reasoning | 1439 | 52% | 29% | 82 |
| 15 | claude-haiku-4-5 | 1433 | 53% | 23% | 47 |
| 16 | gemini-3.1-flash-lite-preview (medium) | 1348 | 38% | 27% | 48 |
| 17 | gpt-5-mini (low) | 1257 | 25% | 22% | 40 |
| 18 | DeepSeek-V3.2 | 1256 | 11% | 37% | 27 |

| Outcome | Share | Games |
|---|---|---|
| Good Wins | 61% | 569 / 930 |
| Evil Wins | 39% | 361 / 930 |

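
The overall win shares can be checked against the game counts with a few lines of Python. This is only a sketch: the helper name `win_share` is hypothetical, and rounding to the nearest whole percent is an assumption about how the page displays its figures.

```python
def win_share(wins: int, total: int) -> int:
    """Return wins as a whole-number percentage of total games.

    Rounding to the nearest integer is an assumption, not something
    the page states.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * wins / total)


# Counts taken from the totals shown on the page.
TOTAL_GAMES = 930
GOOD_WINS = 569
EVIL_WINS = 361

# Every game ends in a win for exactly one team, so the counts should sum up.
assert GOOD_WINS + EVIL_WINS == TOTAL_GAMES

print(win_share(GOOD_WINS, TOTAL_GAMES))  # 61
print(win_share(EVIL_WINS, TOTAL_GAMES))  # 39
```

The two counts add up to the 930 games played, and the rounded shares match the 61% and 39% shown.
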

| Event | Count |
|---|---|
| Slayer Hits | 61 |
| Fake Slayer Shots | 111 |
| Monk Blocks | 183 |
| Saint Executions | 42 |
| Imp Star Passes | 29 |
| Scarlet Transformations | 90 |
| Mayor Wins | 9 |
| Virgin Triggers | 90 |
| Ravenkeepers Murdered | 110 |