Cluster · 1 source · last seen 2h ago · first seen 2h ago

llama.cpp DeepSeek v4 Flash experimental inference

Hi, [here you can find](https://github.com/antirez/llama.cpp-deepseek-v4-flash) experimental llama.cpp support for DeepSeek v4, and [here](https://huggingface.co/antirez/deepseek-v4-gguf) is the GGUF you can use to run inference with "just" (lol) 128GB of RAM. The model, even quantized at

Lead: r/LocalLLaMA · Bigness: 20 · Tags: meta, cpp, deepseek, flash, experimental
📡 Coverage: 10 (1 news source)
🟠 Hacker News: 0
🔴 Reddit: 43 (15 upvotes across 1 sub)
📈 Google Trends: 0

Receipts (all sources)

llama.cpp DeepSeek v4 Flash experimental inference
REDDIT · r/LocalLLaMA · 2h ago · ⬆ 15 · 💬 24
score 114

