
Gemma 4 on Llama.cpp should be stable now

With the merging of [https://github.com/ggml-org/llama.cpp/pull/21534](https://github.com/ggml-org/llama.cpp/pull/21534), all known Gemma 4 issues in Llama.cpp have been resolved. I've been running Gemma 4 31B on Q5 quants for some time now with no issues. Runtime hints: * remember
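For anyone wanting to try the same setup, a minimal invocation might look like the sketch below. The GGUF filename is a placeholder, and the context size and GPU-layer count are illustrative assumptions, not the poster's exact settings:

```shell
# A minimal sketch: serving a Gemma 4 31B Q5 quant with llama.cpp's server.
# The model filename is a placeholder -- substitute your actual GGUF file.
llama-server \
  -m ./gemma-4-31b-Q5_K_M.gguf \
  -c 8192 \
  -ngl 99 \
  --port 8080
# This exposes an OpenAI-compatible API at http://localhost:8080
```

`-c` sets the context window and `-ngl` the number of layers offloaded to the GPU; adjust both to your hardware.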

📡 Coverage: 10 (1 news source)
🟠 Hacker News: 0
🔴 Reddit: 83 (500 upvotes across 1 sub)
📈 Google Trends: 0
Full methodology: How scoring works

Receipts (all sources)

Gemma 4 on Llama.cpp should be stable now
REDDIT · r/LocalLLaMA · 9h ago · ⬆ 432 · 💬 113
score 127


backend-agnostic tensor parallelism has been merged into llama.cpp
REDDIT · r/LocalLLaMA · 4h ago · ⬆ 68 · 💬 40
score 121

* if you have more than one GPU, your models can now run much faster
* `-sm layer` is the default behaviour, `-sm tensor` is the new thing to try
* "backend-agnostic" means you don't need CUDA to enjoy this
* This is experimental, and in your case the results may be poor (try different models). You have be
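Under the assumptions in the post (the new `tensor` value for `-sm` comes from the linked merge; the model path and prompt are placeholders), comparing the two split modes might look like:

```shell
# Default behaviour: split the model layer-by-layer across GPUs.
llama-cli -m ./model.gguf -ngl 99 -sm layer -p "Hello"

# New, experimental: split individual tensors across GPUs (tensor parallelism).
# "Backend-agnostic" per the post: no CUDA required for this to apply.
llama-cli -m ./model.gguf -ngl 99 -sm tensor -p "Hello"
```

If throughput regresses with `-sm tensor`, the post's advice is to fall back to the default or try a different model.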