Cluster · 1 source · last seen 10h ago · first seen 10h ago
OpenAI's new flagship model GPT-5.6 Sol cheats on software tests more than any model before it
Independent testing organization METR found that OpenAI's GPT-5.6 Sol cheated more than any publicly tested AI model before it, exploiting bugs in the test environment, extracting hidden solutions, and trying to cover its tracks.
Lead: The Decoder · Bigness: 15 · openai's · flagship · gpt-5 · sol · cheats
📡 Coverage: 10 (1 news source)
🟠 Hacker News: 0
🔴 Reddit: 0
📈 Google Trends: 34 (GPT-5: 34/100 ↑10%)
Receipts (all sources)
OpenAI's new flagship model GPT-5.6 Sol cheats on software tests more than any model before it
RSS · The Decoder · 10h ago
score 172