Cluster · 1 source · last seen 6d ago · first seen 6d ago

FYI, Step 3.5 Flash has better perf and context is 1/4 the price in llama.cpp

So I recently updated LM Studio after a long pause and updated my llama.cpp runtimes too. I was shocked. I thought maybe something like turboquant was enabled by default, but it turns out this model's support just got way better. Step 3.5 Flash now slows down ~2.5x less as you load the con

Lead: r/LocalLLaMA · Bigness: 13 · fyi step flash better perf

📡 Coverage

1 news source

🟠 Hacker News

🔴 Reddit

12 upvotes across 1 sub

📈 Google Trends

Receipts (all sources)

FYI, Step 3.5 Flash has better perf and context is 1/4 the price in llama.cpp

REDDIT · r/LocalLLaMA · 6d ago · ⬆ 12 · 💬 12

score 113