Cluster1 sources· last seen 2h ago· first seen 2h ago

Gemma 4 31B at 256K Full Context on a Single RTX 5090 — TurboQuant KV Cache Benchmark

Just got Gemma 4 31B running at **full 256K context** on a single RTX 5090 using TurboQuant KV cache compression. ## System Specs | Component | Spec | |-----------|------| | GPU | NVIDIA GeForce RTX 5090 (32GB VRAM) | | CPU | AMD Ryzen 9 9950X3D (16-core) | | RAM | 64GB DDR5 | | OS | Windows 11 |

Lead: r/LocalLLaMABigness: 22gemma31b256kfullcontext
📡 Coverage
10
1 news source
🟠 Hacker News
0
🔴 Reddit
50
29 upvotes across 1 sub
📈 Google Trends
0
Full methodology: How scoring works

Receipts (all sources)

score 120

Just got Gemma 4 31B running at **full 256K context** on a single RTX 5090 using TurboQuant KV cache compression. ## System Specs | Component | Spec | |-----------|------| | GPU | NVIDIA GeForce RTX 5090 (32GB VRAM) | | CPU | AMD Ryzen 9 9950X3D (16-core) | | RAM | 64GB DDR5 | | OS | Windows 11 |