Google’s new Gemini Pro model has record benchmark scores — again
Frankly speaking, this model feels like it's out of this world and shouldn't exist. Beats Claude Sonnet 4.6 in every way possible. Been testing it extensively. It is the only model to perfectly ace my personal code benchmark so far. Does everything incredibly well, writes extremely clean React, Pyt
Receipts (all sources)
Gemini 3.1 Pro promises a Google LLM capable of handling more complex forms of work.
Frankly speaking, this model feels like it's out of this world and shouldn't exist. Beats Claude Sonnet 4.6 in every way possible. Been testing it extensively. It is the only model to perfectly ace my personal code benchmark so far. Does everything incredibly well, writes extremely clean React, Pyt
Definitely a noticeable improvement. Some notes: * The actual JSONs which were created from the model's output were noticeably *much* longer than 3.0 Pro; the model's increase in output length is very nice 😋 * The model actually created JSONs which were over 50MB long (for which I actually had t