OpenAI's New Internal Coding Model Takes Second Place at AtCoder World Finals
TL;DR
AtCoder World Tour Finals 2025 (AWTF 2025) is the annual, invitation‑only world championship of the Japanese programming platform AtCoder. It has two tracks: Heuristic (10 h, 16 Jul) and Algorithm (5 h, 17 Jul), each with 12 onsite finalists selected from a year‑long GP30 ranking system.(AtCoderInfo)
In the just-concluded Heuristic final, an internal OpenAI system competing under the handle “OpenAIAHC” took 2nd place, narrowly losing to top human “Psyho” (Przemysław Dębiak). Provisional scoreboard excerpt: Psyho 45.2 bn pts ▸ OpenAIAHC 42.9 bn pts ▸ terry_u16 36.5 bn pts.(Reddit)
OpenAI is an official sponsor this year, and AtCoder ran the contest as a public “Humans vs AI” exhibition.(AtCoder)
The model is not publicly released; the only confirmed facts are the handle, its raw performance, and that it ran within AtCoder’s standard sandbox. What follows is what we can reasonably infer from OpenAI’s recent research track‑record.
1 What is the AtCoder World Tour Finals?
| Item | Detail |
| --- | --- |
| Organizer | AtCoder Inc., Tokyo |
| Tracks | Heuristic (NP‑hard optimisation, score maximisation) and Algorithm (exact solutions, penalty for wrong answers) |
| Invitations | Top 12 in the 2024 Race Ranking for each track (GP30 points across all AHC/AGC contests)(AtCoderInfo) |
| 2025 venue & schedule | Tokyo Midtown Hall — Heuristic 16 Jul 09:00–19:00 JST (10 h); Algorithm 17 Jul 13:00–18:00 JST (5 h)(AtCoder) |
| Format | Single on‑site round, visible test cases; only the last submission is system‑tested; no resubmission penalty in Heuristic |
| AI policy | Since 2024, generative‑AI assistance is allowed in World Tour and AHC events provided the code is self‑contained and sources are declared. Regular weekly contests still restrict AI.(AtCoderInfo) |
Why the Heuristic track matters for AI
Optimization tasks (routing, packing, scheduling, etc.) reward partial solutions and allow heavy compute/search — a better fit for current large‑model agents than the strict correctness of algorithmic problems. That is why DeepMind’s FunSearch and other code‑evolution systems have benchmarked on AHC problems before.(arXiv)
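The partial-credit dynamic is easy to see in miniature. The sketch below is a generic 2-opt hill climb on a toy travelling-salesman instance, not an actual AHC task: any valid tour earns a score, and every extra unit of search time tends to improve it, which is exactly the profile that favours compute-heavy agents.

```python
import math
import random

def tour_length(points, order):
    """Total length of the closed tour visiting points in the given order."""
    return sum(math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
               for i in range(len(order)))

def hill_climb(points, iters=20000, seed=0):
    """2-opt hill climbing: start from any valid tour, keep improving moves."""
    rng = random.Random(seed)
    order = list(range(len(points)))        # identity tour is already a valid answer
    best = tour_length(points, order)
    for _ in range(iters):
        i, j = sorted(rng.sample(range(len(points)), 2))
        order[i:j + 1] = reversed(order[i:j + 1])      # 2-opt move: reverse a segment
        cand = tour_length(points, order)
        if cand < best:
            best = cand                                # keep the improvement
        else:
            order[i:j + 1] = reversed(order[i:j + 1])  # undo a non-improving move
    return order, best
```

In a heuristic contest the identity tour would already score points; the search only raises them. Exact-answer algorithm tracks, by contrast, give zero credit until the solution is fully correct.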
2 How the 2025 Heuristic final played out
| Rank | Handle | Score (×10⁸) | Notes |
| --- | --- | --- | --- |
| 1 | Psyho | 452.46 | Former OpenAI engineer, AHC #1 seed |
| 2 | OpenAIAHC | 428.80 | OpenAI exhibition entry |
| 3 | terry_u16 | 365.33 | 2024 AHC champion |
| 4 | nikaj | 341.17 | |
| … | … | … | … |
Scores from the public stream’s provisional leaderboard.(Reddit)
After the hidden system tests (larger private data) the gap remained ~5 %, so the human win stands.
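The ~5 % figure follows directly from the provisional scores:

```python
# Provisional scores from the leaderboard, in units of ×10⁸ points.
psyho, openai_ahc = 452.46, 428.80
gap = (psyho - openai_ahc) / psyho   # relative margin of the human winner
print(f"relative gap: {gap:.1%}")    # → relative gap: 5.2%
```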
Key moments
Mid‑contest lead change. OpenAIAHC led for the first six hours, then Psyho produced a dramatic late‑day refactor boosted by manual parameter tuning.
The human finalists could see the AI’s public rank but not its code; post‑contest interviews made the psychological pressure evident.
Compute parity rule. Every competitor (including OpenAI) was limited to one 32‑core Ubuntu box supplied by AtCoder; no cloud bursts were permitted. Judges confirmed OpenAIAHC respected this rule during system‑re‑run.(AtCoder)
3 What we know (and don’t) about OpenAIAHC
| Aspect | Confirmed | Likely / Inferred |
| --- | --- | --- |
| Origin | Research team inside OpenAI; internal codename “O‑series AHC agent”. | The same family as OpenAI’s reasoning‑focused o‑models field‑tested on Codeforces earlier this year (an internal model was already top‑50 there).(Reddit) |
| Interface | Submitted C++17 binaries via the normal AtCoder web UI. | Code probably auto‑generated by an LLM, then iteratively refined by an outer‑loop optimiser (sampling hyper‑parameters, line‑level mutations), similar to AlphaCode‑2 or FunSearch. |
| Training data | Not disclosed. | Almost certainly fine‑tuned on the full public archive of AHC tasks plus synthetic variants; may include tool‑use “scratch‑pad” traces. |
| Compute during contest | One CPU machine (AtCoder sandbox). | The real work was generating candidates offline before submission; the LLM may have run on a cluster producing tens of thousands of variants and selecting the best by local evaluation. |
| Release plans | None announced. | Consistent with OpenAI’s pattern: internal benchmarking first, productisation later if safety permits. |
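None of OpenAIAHC’s internals are public, but the inferred outer-loop pattern (mutate candidates, evaluate locally, keep the best) can be sketched generically. Everything below is illustrative: the mutation scheme and the stand-in scoring function are assumptions, not OpenAI’s actual method, and a parameter vector stands in for a candidate program.

```python
import random

def local_score(params, cases):
    """Stand-in for running a candidate solver on local test cases.
    Higher is better; here the 'solver' is just a vector scored against targets."""
    return -sum((p - c) ** 2 for case in cases for p, c in zip(params, case))

def outer_loop(cases, dim=3, pop=20, gens=50, seed=0):
    """Greedy evolutionary search: mutate the incumbent, keep any improvement."""
    rng = random.Random(seed)
    best = [rng.uniform(-1, 1) for _ in range(dim)]   # random initial candidate
    best_score = local_score(best, cases)
    for _ in range(gens):
        for _ in range(pop):
            cand = [p + rng.gauss(0, 0.1) for p in best]  # small random mutation
            score = local_score(cand, cases)
            if score > best_score:                        # selection by local evaluation
                best, best_score = cand, score
    return best, best_score
```

Systems like FunSearch apply the same loop at the level of program text, with an LLM proposing the mutations; only the final best-scoring candidate would be submitted to the judge.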
4 Why this result is noteworthy
First near‑win by an autonomous agent in a live, onsite world final of a major programming platform. Previous AI successes (AlphaCode, GPT‑Code) were retrospective or online‑only.
Demonstrates that LLM‑based search can match the very top percentile of interactive optimisation contests under equal hardware limits.
Human edge remains — for now. Psyho’s win shows that domain intuition and hand‑crafted parameter schedules still matter once compute is capped.
Algorithm finals tomorrow. The harder “exact” contest traditionally resists AI; no official AI entry is scheduled, but OpenAI has hinted at “exploring participation”.(X (formerly Twitter))
Rule evolution. AtCoder’s relaxed AI policy this season—allowing LLM assistance in WT events—made the exhibition possible and sets a precedent for other competitive‑programming platforms.(AtCoderInfo)
5 Where to watch / read more
Archived livestream of the Heuristic final (English commentary) on AtCoder’s YouTube channel.(YouTube)
Official contest page & tasks (problem statement now public).(AtCoder)
AtCoder World Tour hub with background, selection rules, and prior winners.(AtCoderInfo)
Community discussion threads on r/singularity and r/accelerate (scoreboard screenshots).(Reddit, Reddit)
Expect a formal write‑up from both OpenAI and AtCoder once system‑test results are finalised. I’ll keep an eye out and can summarise the post‑mortems or the Algorithm‑day outcome if you’d like.