
Gemma 4 on Llama.cpp should be stable now

With the merging of [https://github.com/ggml-org/llama.cpp/pull/21534](https://github.com/ggml-org/llama.cpp/pull/21534), all known Gemma 4 issues in Llama.cpp have been resolved. I've been running Gemma 4 31B on Q5 quants for some time now with no issues. Runtime hints: * remember
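For anyone wanting to try the same setup, a minimal invocation might look like the sketch below. The GGUF filename is a placeholder, and the context size and GPU-layer count are illustrative assumptions, not the poster's exact settings:

```shell
# A minimal sketch: serving a Gemma 4 31B Q5 quant with llama.cpp's server.
# The model filename is a placeholder -- substitute your actual GGUF file.
llama-server \
  -m ./gemma-4-31b-Q5_K_M.gguf \
  -c 8192 \
  -ngl 99 \
  --port 8080
# This exposes an OpenAI-compatible API at http://localhost:8080
```

`-c` sets the context window and `-ngl` the number of layers offloaded to the GPU; adjust both to your hardware.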

📡 Coverage: 10 (1 news source)
🟠 Hacker News: 0
🔴 Reddit: 83 (500 upvotes across 1 sub)
📈 Google Trends: 0
Full methodology: How scoring works

Receipts (all sources)

Gemma 4 on Llama.cpp should be stable now
REDDIT · r/LocalLLaMA · 9h ago · ⬆ 432 · 💬 113
score 127


backend-agnostic tensor parallelism has been merged into llama.cpp
REDDIT · r/LocalLLaMA · 4h ago · ⬆ 68 · 💬 40
score 121

* if you have more than one GPU, your models can now run much faster
* `-sm layer` is the default behaviour, `-sm tensor` is the new thing to try
* "backend-agnostic" means you don't need CUDA to enjoy this
* This is experimental, and in your case the results may be poor (try different models). You have be
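Under the assumptions in the post (the new `tensor` value for `-sm` comes from the linked merge; the model path and prompt are placeholders), comparing the two split modes might look like:

```shell
# Default behaviour: split the model layer-by-layer across GPUs.
llama-cli -m ./model.gguf -ngl 99 -sm layer -p "Hello"

# New, experimental: split individual tensors across GPUs (tensor parallelism).
# "Backend-agnostic" per the post: no CUDA required for this to apply.
llama-cli -m ./model.gguf -ngl 99 -sm tensor -p "Hello"
```

If throughput regresses with `-sm tensor`, the post's advice is to fall back to the default or try a different model.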