RAE Collection Collection for Diffusion Transformers with Representation Autoencoders • 1 item • Updated Oct 14, 2025 • 11
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 Dec 18, 2025 • 119
Pre-training Dataset Samples Collection A collection of pre-training datasets samples of sizes 10M, 100M and 1B tokens. Ideal for use in quick experimentation and ablations. • 19 items • Updated Dec 25, 2025 • 18
Article A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons Feb 4, 2025 • 28
AERIS: Argonne Earth Systems Model for Reliable and Skillful Predictions Paper • 2509.13523 • Published Sep 16, 2025 • 7
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 4 days ago • 96
HuggingFaceTB/SmolLM2-1.7B-Instruct Text Generation • 2B • Updated Apr 21, 2025 • 72.7k • 709
SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation Paper • 2504.14396 • Published Apr 19, 2025 • 27