Cluster · 1 source · last seen 4h ago · first seen 4h ago
Built a zero-allocation, header-only C++ Qwen tokenizer that is nearly 20x faster than OpenAI's tiktoken
I'm into HPC and into C++ software that is static, zero-allocation, and zero-dependency. I was studying how BPE tokenizers work, so I decided to build this project: a hardcoded Qwen tokenizer for LLM developers. I know that the whole tokenization phase of LLM inference is worth less than 2% of the whole
Lead: r/LocalLLaMA · Bigness: 21 · keywords: built, zero, allocation, header, qwen
📡 Coverage: 10 (1 news source)
🟠 Hacker News: 0
🔴 Reddit: 48 (36 upvotes across 1 sub)
📈 Google Trends: 0
Full methodology: How scoring works
Receipts (all sources)
Built a zero-allocation, header-only C++ Qwen tokenizer that is nearly 20x faster than OpenAI's tiktoken
REDDIT · r/LocalLLaMA · 4h ago · ⬆ 36 · 💬 5
score 118
I'm into HPC and into C++ software that is static, zero-allocation, and zero-dependency. I was studying how BPE tokenizers work, so I decided to build this project: a hardcoded Qwen tokenizer for LLM developers. I know that the whole tokenization phase of LLM inference is worth less than 2% of the whole