Big · 1 source · last seen 6h ago · first seen 6h ago

SWE-rebench Leaderboard (Feb 2026): GPT-5.4, Qwen3.5, Gemini 3.1 Pro, Step-3.5-Flash and More

Hi, we’ve updated the **SWE-rebench leaderboard** with our **February runs** on **57 fresh GitHub PR tasks** (restricted to PRs created in the previous month). The setup is standard SWE-bench: models read real PR issues, edit code, run tests, and must make the full suite pass. Key observations: *

Lead: r/LocalLLaMA · Bigness: 54

📡 Coverage

1 news source

🟠 Hacker News

🔴 Reddit

97 upvotes across 1 sub

📈 Google Trends

Gemini AI: 76/100 ↑9%

Receipts (all sources)

REDDIT · r/LocalLLaMA · 6h ago · ⬆ 97 · 💬 57

score 123