MiniMaxAI's Collections
MiniMax-M2
VTP
MiniMax-M1
SynLogic
One-RL-to-See-Them-All
MiniMax-Speech
MiniMax-M2.1
MiniMax-01

VTP

updated Apr 15

Towards Scalable Pre-training of Visual Tokenizers for Generation

Upvote
42

  • MiniMaxAI/VTP-Small-f16d64

    Image Feature Extraction • 0.2B parameters • Updated Dec 16, 2025 • 110 downloads • 14 likes

  • MiniMaxAI/VTP-Base-f16d64

    Image Feature Extraction • Updated Dec 16, 2025 • 102 downloads • 20 likes

  • MiniMaxAI/VTP-Large-f16d64

    Image Feature Extraction • 0.7B parameters • Updated Dec 16, 2025 • 586 downloads • 15 likes

  • Towards Scalable Pre-training of Visual Tokenizers for Generation

    Paper • arXiv:2512.13687 • Published Dec 15, 2025 • 106 upvotes