Cluster · 1 source · last seen 10h ago · first seen 10h ago

OpenAI's new flagship model GPT-5.6 Sol cheats on software tests more than any model before it

Independent testing organization METR found that OpenAI's GPT-5.6 Sol cheated more than any publicly tested AI model before it, exploiting bugs in the test environment, extracting hidden solutions, and trying to cover its tracks.

Lead: The Decoder · Bigness: 15 · openai's · flagship · gpt-5 · sol · cheats

📡 Coverage

1 news source

🟠 Hacker News

🔴 Reddit

📈 Google Trends

GPT-5: 34/100 ↑10%

Full methodology: How scoring works

Receipts (all sources)

OpenAI's new flagship model GPT-5.6 Sol cheats on software tests more than any model before it

RSS · The Decoder · 10h ago

score 172

Related clusters

Previewing GPT-5.6 Sol: a next-generation model

2 sources · bigness 65 · 1d ago

[AINews] OpenAI GPT-5.6 Sol / Terra / Luna — restricted to trusted partners

1 source · bigness 22 · 14h ago