Cluster · 1 source · last seen 10h ago · first seen 10h ago

New: LLM Buyout Game Benchmark. This compresses several abilities into a single game. A model has to read coalition politics, price private deals, decide when survival is worth paying for and manage a buyout endgame. GPT-5.4 (high) is #1. GLM-5 is #2. Opus 4.6 (high) is #3.

This benchmark measures long-horizon social strategy under explicit financial incentives. Eight models play a multi-round elimination game with unequal starting balances, a public prize ladder, private transfers, public votes, and a finalist-only endgame where the last two seats can negotiate, settl

Lead: r/singularity · Bigness: 23 · llm buyout game benchmark compresses

📡 Coverage

1 news source

🔴 Reddit

55 upvotes across 1 sub

Receipts (all sources)

REDDIT · r/singularity · 10h ago · ⬆ 55 · 💬 14

score 112