Rising · 1 source · last seen 4h ago · first seen 4h ago
The ARC-AGI2 Illusion Of Progress: If Changing the Font Breaks the Model, It Doesn't Understand
Over the past few weeks, with the release of Claude Opus 4.6, Gemini 3.1 Pro, and Gemini 3 Pro Deepthink, which scored a record-breaking 68%, 77%, and 84% on ARC-AGI2, respectively, I became extremely excited and started to believe these new models could kick off recursive self-improvement any minute. Indeed, the
Lead: r/singularity · Bigness: 28 · arc-agi2 · illusion · progress · changing · font
📡 Coverage · 10 · 1 news source
🟠 Hacker News · 0
🔴 Reddit · 68 · 151 upvotes across 1 sub
📈 Google Trends · 0
Full methodology: How scoring works
Receipts (all sources)
The ARC-AGI2 Illusion Of Progress: If Changing the Font Breaks the Model, It Doesn't Understand
REDDIT · r/singularity · 4h ago · ⬆ 151 · 💬 65
score 127