Vectorizing the Trie: Efficient Constrained Decoding for LLM-based Generative Retrieval on Accelerators • Paper • 2602.22647 • Published 26 days ago