Cluster · 1 source · last seen 1d ago · first seen 1d ago

We could be hours (or less than a week) away from true NVFP4 support in Llama.cpp GGUF format 👀

I'm not a contributor myself, but as someone with only 48GB of total usable memory I am so glad to see this coming to fruition so quickly. Previously the best we had for NVFP4 was through [vLLM, which not only can't offload weights to RAM like llama.cpp but also has loads of related bugs](https://www.red
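
Rough context for the 48GB point (an illustrative sketch, not from the post): NVFP4 packs weights as 4-bit E2M1 values in 16-element blocks, each with an FP8 (E4M3) scale, so the weight-only cost works out to roughly 4 + 8/16 = 4.5 bits per parameter before per-tensor scales and any unquantized layers. The helper name below is made up for the sketch, and the exact GGUF overhead will depend on how the llama.cpp implementation lands.

```python
# Illustrative back-of-the-envelope only: rough weight-only size estimates
# showing why a ~4.5-bit format matters on a 48GB machine.
# NVFP4 ~ 4-bit E2M1 values + one FP8 scale per 16-element block ~= 4.5 bpw.

def approx_weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Very rough weight-only footprint in GB, ignoring KV cache and activations."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for label, bits in [("FP16", 16.0), ("Q8_0 (~8.5 bpw)", 8.5), ("NVFP4 (~4.5 bpw)", 4.5)]:
    print(f"70B model @ {label:<17} ~{approx_weight_size_gb(70, bits):.0f} GB")
# FP16 ~140 GB, Q8_0 ~74 GB, NVFP4 ~39 GB -- only the last fits under 48GB.
```

The RAM-offload point in the excerpt refers to llama.cpp's ability to split layers between GPU VRAM and system RAM (the `-ngl` / `--n-gpu-layers` option), which is the behavior the post contrasts with vLLM's GPU-resident design.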

Lead: r/LocalLLaMA · Bigness: 23 · Keywords: hours, less, week, away, true
📡 Coverage: 10 (1 news source)
🟠 Hacker News: 0
🔴 Reddit: 54 (147 upvotes across 1 sub)
📈 Google Trends: 0
Full methodology: How scoring works
