Cluster · 1 source · last seen 2h ago · first seen 2h ago
llama.cpp DeepSeek v4 Flash experimental inference
Hi, [here you can find](https://github.com/antirez/llama.cpp-deepseek-v4-flash) experimental llama.cpp support for DeepSeek v4, and [here](https://huggingface.co/antirez/deepseek-v4-gguf) is the GGUF you can use to run inference with "just" (lol) 128GB of RAM. The model, even quantized at…
Lead: r/LocalLLaMA · Bigness: 20 · Tags: meta, cpp, deepseek, flash, experimental
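For anyone wanting to try the post's setup, here is a minimal sketch of driving a locally built llama-cli from Python. The binary path and GGUF filename are assumptions (build the linked fork and download the file from the linked Hugging Face repo first); the flags used are standard llama.cpp options.

```python
import subprocess

# Assumed local paths: adjust to wherever you built the fork and
# saved the GGUF from the linked repo (filename is illustrative).
LLAMA_CLI = "./llama.cpp-deepseek-v4-flash/build/bin/llama-cli"
MODEL = "./deepseek-v4-flash-Q4_K_M.gguf"

result = subprocess.run(
    [
        LLAMA_CLI,
        "-m", MODEL,                 # model to load
        "-p", "Explain MoE routing in two sentences.",
        "-n", "256",                 # tokens to generate
        "-c", "4096",                # context window
        "--temp", "0.7",             # sampling temperature
    ],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)
```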
📡 Coverage: 10 (1 news source)
🟠 Hacker News: 0
🔴 Reddit: 43 (15 upvotes across 1 sub)
📈 Google Trends: 0
Full methodology: How scoring works
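The scoring formula itself isn't shown on this page, only the methodology link above. Purely as a hypothetical illustration of the shape such a score could take, here is a weighted sum over the four per-source components listed above; the weights are invented and are not the site's actual methodology.

```python
from dataclasses import dataclass

@dataclass
class Signals:
    news: int         # 📡 Coverage component
    hacker_news: int  # 🟠 Hacker News component
    reddit: int       # 🔴 Reddit component
    trends: int       # 📈 Google Trends component

def bigness(s: Signals, weights=(1.0, 1.0, 0.25, 1.0)) -> float:
    """Combine per-source signals into one score (weights invented)."""
    w_news, w_hn, w_reddit, w_trends = weights
    return (w_news * s.news + w_hn * s.hacker_news
            + w_reddit * s.reddit + w_trends * s.trends)

# The cluster above shows Coverage 10, HN 0, Reddit 43, Trends 0
# against a bigness of 20; these invented weights land near that.
print(bigness(Signals(news=10, hacker_news=0, reddit=43, trends=0)))  # 20.75
```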
Receipts (all sources)
llama.cpp DeepSeek v4 Flash experimental inference
REDDIT · r/LocalLLaMA · 2h ago · ⬆ 15 · 💬 24 · score 114
Related clusters
Benchmark: Windows 11 vs Lubuntu 26.04 on Llama.cpp (RTX 5080 + i9-14900KF). I didn't expect the gap to be this big.
1 source · bigness 23 · 4h ago
Experts-Volunteers needed for Vulkan on ik_llama.cpp
1 source · bigness 22 · 8h ago
Will llama.cpp multislot improve speed?
1 source · bigness 17 · 5h ago
[AINews] DeepSeek V4 Pro (1.6T-A49B) and Flash (284B-A13B), Base and Instruct — runnable on Huawei Ascend chips
1 source · bigness 4 · 1d ago