Cluster · 1 source · first seen 9h ago · last seen 9h ago
Ran the same models across Strix Halo, RTX 3090, and RTX 5070 because I wanted my own numbers
I kept seeing inference-speed claims for these models and wanted an apples-to-apples comparison on the hardware I actually have. So I built a harness and a public page that dumps every run as YAML. The dataset: 55 runs, three rigs, five backends (rocm, vulkan, cpu, cuda, vllm-cuda), models from 0.
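The post describes a harness that dumps each benchmark run as a YAML record and compares backends across rigs. As a minimal sketch of how such records could be aggregated, here is a hedged example in Python; the field names (`model`, `backend`, `tokens_per_sec`) are assumptions for illustration, since the post does not show the actual schema of the public page:

```python
# Sketch: aggregate the fastest tokens/sec seen per (model, backend) pair.
# Field names are hypothetical; the real harness's YAML schema is not shown.
from collections import defaultdict

def best_by_backend(runs):
    """Return the fastest tokens/sec observed for each (model, backend)."""
    best = defaultdict(float)
    for run in runs:
        key = (run["model"], run["backend"])
        best[key] = max(best[key], run["tokens_per_sec"])
    return dict(best)

# Toy run records standing in for parsed YAML documents.
runs = [
    {"model": "m-7b", "backend": "cuda", "tokens_per_sec": 92.1},
    {"model": "m-7b", "backend": "rocm", "tokens_per_sec": 61.4},
    {"model": "m-7b", "backend": "cuda", "tokens_per_sec": 95.0},
]
print(best_by_backend(runs)[("m-7b", "cuda")])  # 95.0
```

Keeping only the best run per pair is one reasonable choice for an apples-to-apples table; a median across repeated runs would be another.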
Lead: r/LocalLLaMA · Bigness: 22
📡 Coverage
- News: 10 (1 news source)
- 🟠 Hacker News: 0
- 🔴 Reddit: 49 (34 upvotes across 1 sub)
- 📈 Google Trends: 0
Full methodology: How scoring works
Receipts (all sources)
Ran the same models across Strix Halo, RTX 3090, and RTX 5070 because I wanted my own numbers
REDDIT · r/LocalLLaMA · 9h ago · ⬆ 34 · 💬 15
score 110