Cluster · 1 source · last seen 4h ago · first seen 4h ago

Built a zero-allocation, header-only C++ Qwen tokenizer that is nearly 20x faster than OpenAI's tiktoken

I'm into HPC and static, zero-allocation, zero-dependency C++ software. I was studying BPE tokenizers and how they work, so I decided to build this project. I hardcoded the Qwen tokenizer for LLM developers. I do know that the whole tokenization phase of LLM inference is worth less than 2% of the whole
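For readers unfamiliar with how BPE tokenizers work, here is a minimal sketch of the core merge loop: start from single-byte symbols and repeatedly apply the highest-priority (lowest-rank) adjacent merge rule until none applies. The tiny merge table is illustrative only, not Qwen's real vocabulary, and the sketch uses `std::vector` for brevity where the project described above avoids allocation.

```cpp
#include <array>
#include <string_view>
#include <vector>

// Hypothetical toy merge table (rank = priority; lower rank merges first).
// A real BPE tokenizer loads tens of thousands of these from the vocab file.
struct MergeRule { std::string_view left, right, merged; };

constexpr std::array<MergeRule, 3> kMerges{{
    {"h", "e", "he"},     // rank 0: highest priority
    {"l", "l", "ll"},     // rank 1
    {"he", "ll", "hell"}, // rank 2
}};

std::vector<std::string_view> bpe(std::string_view word) {
    // Start from single-byte symbols (views into `word`).
    std::vector<std::string_view> syms;
    for (size_t i = 0; i < word.size(); ++i) syms.push_back(word.substr(i, 1));

    for (;;) {
        // Find the leftmost occurrence of the lowest-rank applicable rule.
        size_t best_rank = kMerges.size(), best_pos = 0;
        for (size_t i = 0; i + 1 < syms.size(); ++i)
            for (size_t r = 0; r < kMerges.size(); ++r)
                if (r < best_rank && syms[i] == kMerges[r].left &&
                    syms[i + 1] == kMerges[r].right) {
                    best_rank = r;
                    best_pos = i;
                }
        if (best_rank == kMerges.size()) break;  // no rule applies: done
        // Merge the pair in place and drop the right-hand symbol.
        syms[best_pos] = kMerges[best_rank].merged;
        syms.erase(syms.begin() + best_pos + 1);
    }
    return syms;  // e.g. bpe("hello") yields {"hell", "o"}
}
```

The quadratic rescan above is the simplest correct formulation; fast implementations instead keep a priority queue of candidate pairs, and a fully hardcoded tokenizer can bake the merge table into static data so the hot path performs no lookups into heap-allocated maps at all.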

Lead: r/LocalLLaMA · Bigness: 21 · Keywords: built, zero, allocation, header, qwen
📡 Coverage: 10 (1 news source)
🟠 Hacker News: 0
🔴 Reddit: 48 (36 upvotes across 1 sub)
📈 Google Trends: 0
Full methodology: How scoring works
