Cluster · 1 source · last seen 4h ago · first seen 4h ago

Built a zero-allocation, header-only C++ Qwen tokenizer that is nearly 20x faster than OpenAI's tiktoken

I'm into HPC and static, zero-allocation, zero-dependency C++ software. I was studying BPE tokenizers and how they work, so I decided to build this project. I hardcoded the Qwen tokenizer for LLM developers. I do know that the whole tokenization phase of LLM inference is worth less than 2% of the whole
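For readers unfamiliar with how BPE tokenizers work, here is a minimal sketch of the core merge loop: start from single-byte symbols and repeatedly apply the highest-priority (lowest-rank) adjacent merge rule until none applies. The tiny merge table is illustrative only, not Qwen's real vocabulary, and the sketch uses `std::vector` for brevity where the project described above avoids allocation.

```cpp
#include <array>
#include <string_view>
#include <vector>

// Hypothetical toy merge table (rank = priority; lower rank merges first).
// A real BPE tokenizer loads tens of thousands of these from the vocab file.
struct MergeRule { std::string_view left, right, merged; };

constexpr std::array<MergeRule, 3> kMerges{{
    {"h", "e", "he"},     // rank 0: highest priority
    {"l", "l", "ll"},     // rank 1
    {"he", "ll", "hell"}, // rank 2
}};

std::vector<std::string_view> bpe(std::string_view word) {
    // Start from single-byte symbols (views into `word`).
    std::vector<std::string_view> syms;
    for (size_t i = 0; i < word.size(); ++i) syms.push_back(word.substr(i, 1));

    for (;;) {
        // Find the leftmost occurrence of the lowest-rank applicable rule.
        size_t best_rank = kMerges.size(), best_pos = 0;
        for (size_t i = 0; i + 1 < syms.size(); ++i)
            for (size_t r = 0; r < kMerges.size(); ++r)
                if (r < best_rank && syms[i] == kMerges[r].left &&
                    syms[i + 1] == kMerges[r].right) {
                    best_rank = r;
                    best_pos = i;
                }
        if (best_rank == kMerges.size()) break;  // no rule applies: done
        // Merge the pair in place and drop the right-hand symbol.
        syms[best_pos] = kMerges[best_rank].merged;
        syms.erase(syms.begin() + best_pos + 1);
    }
    return syms;  // e.g. bpe("hello") yields {"hell", "o"}
}
```

The quadratic rescan above is the simplest correct formulation; fast implementations instead keep a priority queue of candidate pairs, and a fully hardcoded tokenizer can bake the merge table into static data so the hot path performs no lookups into heap-allocated maps at all.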

Lead: r/LocalLLaMA · Bigness: 21 · Keywords: built, zero, allocation, header, qwen
📡 Coverage: 10 (1 news source)
🟠 Hacker News: 0
🔴 Reddit: 48 (36 upvotes across 1 sub)
📈 Google Trends: 0
Full methodology: How scoring works
