Big · 1 source · last seen 3h ago · first seen 3h ago

microsoft/Phi-4-reasoning-vision-15B · Hugging Face

Phi-4-Reasoning-Vision-15B is a compact open-weight multimodal reasoning model built on the Phi-4-Reasoning language model backbone and the SigLIP-2 vision encoder, using a mid-fusion architecture. In this architecture, the vision encoder first converts images into visual tokens, which are then…

Lead: r/LocalLLaMA · Bigness: 60 · microsoft · phi-4-reasoning-vision-15b · huggingface
📡 Coverage: 10 (1 news source)
🟠 Hacker News: 0
🔴 Reddit: 64 (127 upvotes across 1 sub)
📈 Google Trends: 93 (Microsoft: 93/100 ↑7%)

Receipts (all sources)

microsoft/Phi-4-reasoning-vision-15B · Hugging Face
REDDIT · r/LocalLLaMA · 3h ago · ⬆ 127 · 💬 23
score 128
