Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning Paper • 2311.11077 • Published Nov 18, 2023 • 29
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Paper • 2403.03853 • Published Mar 6, 2024 • 66
DarwinLM: Evolutionary Structured Pruning of Large Language Models Paper • 2502.07780 • Published Feb 11, 2025 • 18
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks Paper • 2502.08235 • Published Feb 12, 2025 • 59
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4, 2025 • 258
Small Models, Big Impact: Efficient Corpus and Graph-Based Adaptation of Small Multilingual Language Models for Low-Resource Languages Paper • 2502.10140 • Published Feb 14, 2025 • 9
Fine-Tuning Small Language Models for Domain-Specific AI: An Edge AI Perspective Paper • 2503.01933 • Published Mar 3, 2025 • 13
Fast Inference from Transformers via Speculative Decoding Paper • 2211.17192 • Published Nov 30, 2022 • 11
Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights Paper • 2603.12228 • Published Mar 12 • 12
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Paper • 2603.19220 • Published Mar 19 • 66
Structured 3D Latents for Scalable and Versatile 3D Generation Paper • 2412.01506 • Published Dec 2, 2024 • 89