Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques • jmamou • Mar 24, 2025 • 20
Universal Assisted Generation: Faster Decoding with Any Assistant Model • danielkorat, orenpereg, mber, jmamou, joaogante, lewtun, Nadav-Timor, moshew • Oct 29, 2024 • 61
Faster Assisted Generation with Dynamic Speculation • jmamou, orenpereg, joaogante, lewtun, danielkorat, Nadav-Timor, moshew • Oct 8, 2024 • 51
Accelerate StarCoder with 🤗 Optimum Intel on Xeon: Q8/Q4 and Speculative Decoding • ofirzaf, echarlaix, imargulis, danielkorat, jmamou, guybd, orenpereg, moshew, Haihao, aayasin, FanZhao • Jan 30, 2024 • 9