What changed
Significance-Gain Pair Encoding for LLMs: A Statistical Alternative to Frequency-Based Subword Merging
arXiv:2603.19261v1. Abstract: Subword tokenization is a key design choice for modern language models, including large language models (LLMs), with byte- and ch...
Early report
Major update
Updated Mar 23, 2026, 4:00 AM UTC