Cluster · 1 source · last seen 3h ago · first seen 3h ago
Gemma-4-31B NVFP4 inference numbers on 1x RTX Pro 6000
Ran a quick inference sweep on Gemma 4 31B in NVFP4 (using [nvidia/Gemma-4-31B-IT-NVFP4](https://huggingface.co/nvidia/Gemma-4-31B-IT-NVFP4)). The NVFP4 checkpoint is 32GB, half the size of the BF16 checkpoint from Google (63GB), which suggests a mix of BF16 and FP4 that lands at roughly FP8 size overall. This model uses a ton of V…
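A quick back-of-envelope sketch of where that 32GB figure sits (my own arithmetic, not from the post): assuming ~31e9 weights, and assuming NVFP4 stores 4-bit values plus one FP8 scale per 16-element block (~4.5 effective bits per weight), a pure-NVFP4 checkpoint would come out far smaller than 32GB, which is consistent with the post's guess that some tensors stay in BF16.

```python
# Back-of-envelope checkpoint sizes for a ~31B-parameter model.
# Assumptions (mine, not from the post): ~31e9 weights, and NVFP4 at
# ~4.5 bits/weight (4-bit values + one FP8 scale per 16-element block).
PARAMS = 31e9
GB = 1e9

def size_gb(bits_per_weight: float) -> float:
    """Checkpoint size in GB for a given average bits-per-weight."""
    return PARAMS * bits_per_weight / 8 / GB

for name, bits in [("BF16", 16), ("FP8", 8), ("NVFP4 (4b + block scales)", 4.5)]:
    print(f"{name:28s} ~{size_gb(bits):5.1f} GB")

# BF16  ~62.0 GB -> matches the reported 63GB Google checkpoint.
# FP8   ~31.0 GB -> close to the observed 32GB NVFP4 file.
# NVFP4 ~17.4 GB -> a pure-FP4 model can't explain 32GB, so some
#                   tensors (e.g. embeddings) are presumably kept in BF16.
```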
Lead: r/LocalLLaMA · Bigness: 21 · tags: gemma-4-31b, nvfp4, inference, numbers, rtx
📡 Coverage: 10 (1 news source)
🟠 Hacker News: 0
🔴 Reddit: 47 (29 upvotes across 1 sub)
📈 Google Trends: 0
Full methodology: How scoring works
Receipts (all sources)
Gemma-4-31B NVFP4 inference numbers on 1x RTX Pro 6000
REDDIT · r/LocalLLaMA · 3h ago · ⬆ 29 · 💬 15
score 118