Gemma 4 on Llama.cpp should be stable now
With the merging of [https://github.com/ggml-org/llama.cpp/pull/21534](https://github.com/ggml-org/llama.cpp/pull/21534), the last of the known Gemma 4 issues in llama.cpp has been fixed. I've been running Gemma 4 31B on Q5 quants for some time now with no issues.

Runtime hints:

* Remember, if you have more than one GPU, your models can now run much faster.
* `-sm layer` is the default behaviour; `-sm tensor` is the new thing to try.
* "Backend-agnostic" means you don't need CUDA to enjoy this.
* This is experimental, and in your case the results may be poor (try different models). You have be
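As a minimal sketch of trying the two split modes side by side (the binary path, model filename, and prompt below are placeholders, not from the post; the `-sm` values are the ones named above):

```shell
# Sketch: running llama.cpp across multiple GPUs with an explicit split mode.
# Model filename and prompt are placeholders -- substitute your own.

# Default behaviour: split the model by layer across available GPUs
./llama-cli -m ./gemma-4-31b-Q5_K_M.gguf -sm layer -p "Hello"

# Experimental: try the new tensor split mode and compare tokens/sec
./llama-cli -m ./gemma-4-31b-Q5_K_M.gguf -sm tensor -p "Hello"
```

Since results vary by model and hardware, it's worth benchmarking both modes on your own setup before settling on one.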