Rising · 1 source · last seen 18h ago · first seen 18h ago

On a difficult new SWE benchmark, ProgramBench, GPT5.5 high/xhigh solves a task for the first time and significantly outperforms Opus 4.7

Link to tweets: https://x.com/KLieret/status/2054215545663144217?s=20

Link to GitHub: [https://github.com/facebookresearch/ProgramBench/](https://github.com/facebookresearch/ProgramBench/)

Link to ProgramBench website: [https://programbench.com/blog/gpt-5-5-first-solve/](https://programbench.co

Lead: r/singularity · Bigness: 32 · difficult, swe, benchmark, programbench, openai

📡 Coverage

1 news source

🟠 Hacker News

🔴 Reddit

437 upvotes across 1 sub

📈 Google Trends

Full methodology: How scoring works

Receipts (all sources)

On a difficult new SWE benchmark, ProgramBench, GPT5.5 high/xhigh solves a task for the first time and significantly outperforms Opus 4.7

REDDIT · r/singularity · 18h ago · ⬆ 437 · 💬 78

score 114