Rising · 1 source · last seen 18h ago · first seen 18h ago
On a difficult new SWE benchmark, ProgramBench, GPT5.5 high/xhigh solves a task for the first time and significantly outperforms Opus 4.7
Link to tweets: https://x.com/KLieret/status/2054215545663144217?s=20
Link to GitHub: https://github.com/facebookresearch/ProgramBench/
Link to ProgramBench website: https://programbench.com/blog/gpt-5-5-first-solve/
Lead: r/singularity · Bigness: 32 · difficult · swe · benchmark · programbench · openai
📡 Coverage: 10 (1 news source)
🟠 Hacker News: 0
🔴 Reddit: 80 (437 upvotes across 1 sub)
📈 Google Trends: 0
Receipts (all sources)
On a difficult new SWE benchmark, ProgramBench, GPT5.5 high/xhigh solves a task for the first time and significantly outperforms Opus 4.7
REDDIT · r/singularity · 18h ago · ⬆ 437 · 💬 78
score 114