Big · 1 source · last seen 3h ago · first seen 3h ago

microsoft/Phi-4-reasoning-vision-15B · Hugging Face

# Phi-4-Reasoning-Vision-15B is a compact open-weight multimodal reasoning model built on the Phi-4-Reasoning language model backbone and the SigLIP-2 vision encoder, using a mid-fusion architecture. In this architecture, the vision encoder first converts images into visual tokens, which are then

Lead: r/LocalLLaMA · Bigness: 60 · microsoft · phi-4-reasoning-vision-15b · huggingface

📡 Coverage

1 news source

🟠 Hacker News

🔴 Reddit

127 upvotes across 1 sub

📈 Google Trends

Microsoft: 93/100 ↑7%

Receipts (all sources)

microsoft/Phi-4-reasoning-vision-15B · Hugging Face

REDDIT · r/LocalLLaMA · 3h ago · ⬆ 127 · 💬 23

score 128