Mishig Davaadorj

mishig

AI & ML interests

NP-completeness, grammars, universality

Recent Activity

updated a dataset about 15 hours ago

huggingchat/papers-content

commented on a paper 3 days ago

Beyond Language Modeling: An Exploration of Multimodal Pretraining

updated a Space 3 days ago

lerobot/visualize_dataset

commented on a paper 3 days ago

Beyond Language Modeling: An Exploration of Multimodal Pretraining

Paper • 2603.03276 • Published 6 days ago • 76

commented on a paper 4 days ago

Speculative Speculative Decoding

Paper • 2603.03251 • Published 6 days ago • 2

commented on a paper 6 days ago

CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

Paper • 2602.24286 • Published 10 days ago • 80

commented on a paper 7 days ago

dLLM: Simple Diffusion Language Modeling

Paper • 2602.22661 • Published 11 days ago • 123

commented on a paper 14 days ago

Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts

Paper • 2602.13367 • Published 24 days ago • 31

New activity in huggingface/badges 14 days ago

Upload 8 files

#33 opened 10 months ago by

commented on a paper 19 days ago

When Models Manipulate Manifolds: The Geometry of a Counting Task

Paper • 2601.04480 • Published Jan 8 • 4

New activity in lerobot/visualize_dataset 22 days ago

Dataset Visualizer

#3 opened 3 months ago by

commented on a paper 25 days ago

Green-VLA: Staged Vision-Language-Action Model for Generalist Robots

Paper • 2602.00919 • Published Jan 31 • 313

commented on 2 papers 26 days ago

OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration

Paper • 2602.05400 • Published Feb 5 • 343

Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published Dec 15, 2025 • 106

commented on 2 papers 27 days ago

Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models

Paper • 2602.07026 • Published Feb 2 • 138

Learning a Generative Meta-Model of LLM Activations

Paper • 2602.06964 • Published about 1 month ago • 3

commented on a paper 29 days ago

Generative Modeling via Drifting

Paper • 2602.04770 • Published Feb 4 • 3

New activity in hf-doc-build/doc-build about 1 month ago

Upload v3.8.1.zip

#54 opened about 1 month ago by

commented on a paper about 1 month ago

Shaping capabilities with token-level data filtering

Paper • 2601.21571 • Published Jan 29 • 27

New activity in hf-doc-build/doc-build about 2 months ago

Delete jobs docs

#53 opened about 2 months ago by

reachy_mini

#52 opened about 2 months ago by

commented on a paper about 2 months ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 229

commented on a paper 2 months ago

Guiding a Diffusion Transformer with the Internal Dynamics of Itself

Paper • 2512.24176 • Published Dec 30, 2025 • 8