
Gemma 4 31B at 256K Full Context on a Single RTX 5090 — TurboQuant KV Cache Benchmark

Just got Gemma 4 31B running at **full 256K context** on a single RTX 5090 using TurboQuant KV cache compression.

## System Specs

| Component | Spec |
|-----------|------|
| GPU | NVIDIA GeForce RTX 5090 (32GB VRAM) |
| CPU | AMD Ryzen 9 9950X3D (16-core) |
| RAM | 64GB DDR5 |
| OS | Windows 11 |
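
To give a rough sense of why KV cache compression matters at this context length, here is a back-of-the-envelope sketch of KV cache size as a function of bits per element. The layer count, KV head count, and head dimension below are hypothetical placeholders, not Gemma 4 31B's published configuration. The 4-bit figure is only an illustrative width, not TurboQuant's actual setting. The formula also assumes every layer caches keys and values for every token, which overestimates models that use sliding-window attention on some layers.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   tokens: int, bits_per_element: int) -> int:
    """Estimate KV cache size, assuming full attention in every layer."""
    # One key vector and one value vector per layer, per KV head, per token.
    elements = 2 * layers * kv_heads * head_dim * tokens
    return elements * bits_per_element // 8


def to_gib(num_bytes: int) -> float:
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30


# Hypothetical architecture values, chosen only for illustration.
LAYERS, KV_HEADS, HEAD_DIM = 60, 8, 128
CONTEXT = 256 * 1024  # 256K tokens

fp16 = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT, 16)
q4 = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT, 4)

print(f"16-bit KV cache: {to_gib(fp16):.1f} GiB")  # 60.0 GiB
print(f" 4-bit KV cache: {to_gib(q4):.1f} GiB")    # 15.0 GiB
```

Under these placeholder numbers, an uncompressed 16-bit cache alone would exceed the card's 32GB of VRAM before any model weights are loaded, while a 4-bit cache would leave room for them. The real figures depend on the model's actual architecture and the quantization settings used in the post.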



REDDIT · r/LocalLLaMA · 2h ago · ⬆ 29 · 💬 34

score 120