Cluster · 1 source · last seen 2h ago · first seen 2h ago
Gemma 4 31B at 256K Full Context on a Single RTX 5090 — TurboQuant KV Cache Benchmark
Just got Gemma 4 31B running at **full 256K context** on a single RTX 5090 using TurboQuant KV cache compression.

## System Specs

| Component | Spec |
|-----------|------|
| GPU | NVIDIA GeForce RTX 5090 (32GB VRAM) |
| CPU | AMD Ryzen 9 9950X3D (16-core) |
| RAM | 64GB DDR5 |
| OS | Windows 11 |
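For a rough sense of why KV cache compression matters at this context length, here is a back-of-the-envelope sizing sketch. The layer count, KV head count, and head dimension below are hypothetical placeholders, not Gemma 4 31B's actual configuration, and the 4-bit width is an assumption, since the excerpt does not state what bit width TurboQuant used. The formula also assumes every layer uses full attention over the whole context and ignores any per-block scale overhead a quantized cache may carry.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bits_per_value):
    """Bytes needed to store keys and values for one sequence."""
    # Factor of 2 covers one key vector and one value vector per token.
    values = 2 * n_layers * n_kv_heads * head_dim * context_len
    return values * bits_per_value / 8


# Hypothetical placeholder architecture (NOT the real Gemma 4 31B config).
N_LAYERS = 48
N_KV_HEADS = 8
HEAD_DIM = 128
CONTEXT_LEN = 256 * 1024  # 256K tokens

GIB = 2**30
fp16_gib = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT_LEN, 16) / GIB
q4_gib = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT_LEN, 4) / GIB  # assumed 4-bit cache

print(f"FP16 KV cache:  {fp16_gib:.1f} GiB")
print(f"4-bit KV cache: {q4_gib:.1f} GiB")
```

With these placeholder numbers, an uncompressed FP16 cache would not fit in 32GB of VRAM on its own, while a 4-bit cache would leave room for quantized weights. Swap in the real model config and cache bit width to get a figure that means something for this benchmark.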
Lead: r/LocalLLaMA · Bigness: 22 · gemma31b256kfullcontext
📡 Coverage: 10 (1 news source)
🟠 Hacker News: 0
🔴 Reddit: 50 (29 upvotes across 1 sub)
📈 Google Trends: 0
Receipts (all sources)
Gemma 4 31B at 256K Full Context on a Single RTX 5090 — TurboQuant KV Cache Benchmark
REDDIT · r/LocalLLaMA · 2h ago · ⬆ 29 · 💬 34
score 120