Cluster · 1 source · first seen 9h ago · last seen 9h ago
Ran the same models across Strix Halo, RTX 3090, and RTX 5070 because I wanted my own numbers
I kept seeing inference-speed claims for these models and wanted an apples-to-apples comparison on the hardware I actually have. So I built a harness and a public page that dumps every run as YAML. The dataset: 55 runs, three rigs, five backends (rocm, vulkan, cpu, cuda, vllm-cuda), models from 0.
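The post describes a harness that dumps each benchmark run as a YAML record and compares backends across rigs. As a minimal sketch of how such records could be aggregated, here is a hedged example in Python; the field names (`model`, `backend`, `tokens_per_sec`) are assumptions for illustration, since the post does not show the actual schema of the public page:

```python
# Sketch: aggregate the fastest tokens/sec seen per (model, backend) pair.
# Field names are hypothetical; the real harness's YAML schema is not shown.
from collections import defaultdict

def best_by_backend(runs):
    """Return the fastest tokens/sec observed for each (model, backend)."""
    best = defaultdict(float)
    for run in runs:
        key = (run["model"], run["backend"])
        best[key] = max(best[key], run["tokens_per_sec"])
    return dict(best)

# Toy run records standing in for parsed YAML documents.
runs = [
    {"model": "m-7b", "backend": "cuda", "tokens_per_sec": 92.1},
    {"model": "m-7b", "backend": "rocm", "tokens_per_sec": 61.4},
    {"model": "m-7b", "backend": "cuda", "tokens_per_sec": 95.0},
]
print(best_by_backend(runs)[("m-7b", "cuda")])  # 95.0
```

Keeping only the best run per pair is one reasonable choice for an apples-to-apples table; a median across repeated runs would be another.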
Lead: r/LocalLLaMA · Bigness: 22
📡 Coverage
- News: 10 (1 news source)
- 🟠 Hacker News: 0
- 🔴 Reddit: 49 (34 upvotes across 1 sub)
- 📈 Google Trends: 0
Full methodology: How scoring works
Receipts (all sources)
Ran the same models across Strix Halo, RTX 3090, and RTX 5070 because I wanted my own numbers
REDDIT · r/LocalLLaMA · 9h ago · ⬆ 34 · 💬 15
score 110