- Economies of Open Intelligence: Tracing Power & Participation in the Model Ecosystem (Paper 2512.03073)
- Anatomy of a Machine Learning Ecosystem: 2 Million Models on Hugging Face (Paper 2508.06811)
- Bridging the Data Provenance Gap Across Text, Speech and Video (Paper 2412.17847)
- Ecosystem of Large Language Models for Code (Paper 2405.16746)
AI & ML interests
Data and research that use the Hugging Face Hub to study the AI ecosystem
Hugging Face Data for Research
A resource hub for researchers studying AI ecosystem development and adoption using data from the Hugging Face platform.
Why the Hub matters for research
The Hugging Face Hub offers a rich source of data for understanding how the AI ecosystem evolves. Information about models, datasets, Spaces, papers, and community activity is publicly accessible—making it possible to analyze trends in model development, dataset usage, research directions, and adoption patterns over time.
How to access the data
Pre-compiled datasets provide the simplest entry point. We recommend starting with community-maintained snapshots such as:
- cfahlgren1/hub-stats — Daily snapshots of models, datasets, Spaces, papers, and related metadata in Parquet format, suitable for large-scale analysis.
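As a minimal sketch of working with such a snapshot, the example below counts repositories per creation month. The file name `models.parquet` and the column name `createdAt` are assumptions, not confirmed by this page, so check the dataset card before relying on them. Reading an `hf://` path with pandas also assumes that `pandas`, `pyarrow`, and `huggingface_hub` are installed.

```python
from collections import Counter
from datetime import datetime

# Assumed location of the models snapshot; verify the file name on the dataset card.
SNAPSHOT_PATH = "hf://datasets/cfahlgren1/hub-stats/models.parquet"


def load_created_at(path=SNAPSHOT_PATH, column="createdAt"):
    """Read a single column of the snapshot (column name is an assumption)."""
    import pandas as pd  # imported lazily so monthly_counts works without pandas

    frame = pd.read_parquet(path, columns=[column])
    return frame[column].dropna().tolist()


def monthly_counts(timestamps):
    """Count items per creation month, keyed as 'YYYY-MM' in ascending order."""
    counts = Counter()
    for ts in timestamps:
        if isinstance(ts, str):
            # Accept ISO 8601 strings such as '2024-03-01T12:00:00.000Z'.
            ts = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        counts[f"{ts.year:04d}-{ts.month:02d}"] += 1
    return dict(sorted(counts.items()))
```

Calling `monthly_counts(load_created_at())` would then give a coarse view of how many model repositories were created each month, which suits the trend-level questions this kind of snapshot is meant for.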
For custom views or real-time access, use the Hub API.
The API supports programmatic access to repository metadata, search, and more.
See the OpenAPI specification and documentation for details.
Python users can rely on the huggingface_hub client.
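The sketch below shows one way to query the Hub API directly with the Python standard library. It assumes the public listing endpoints under `https://huggingface.co/api` accept `sort`, `direction`, and `limit` query parameters and return JSON objects with `id`, `downloads`, and `likes` fields. Treat those names as assumptions and confirm them against the OpenAPI specification. The helper names `build_list_url` and `fetch_top` are illustrative only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://huggingface.co/api"


def build_list_url(kind="models", sort="downloads", limit=5):
    """Build a listing URL for 'models', 'datasets', or 'spaces'."""
    # direction=-1 requests descending order (assumed from the public API).
    query = urlencode({"sort": sort, "direction": -1, "limit": limit})
    return f"{API_ROOT}/{kind}?{query}"


def fetch_top(kind="models", sort="downloads", limit=5, timeout=30):
    """Fetch a listing and keep a few commonly returned fields."""
    with urlopen(build_list_url(kind, sort, limit), timeout=timeout) as response:
        items = json.load(response)
    return [
        {
            "id": item.get("id"),
            "downloads": item.get("downloads"),
            "likes": item.get("likes"),
        }
        for item in items
    ]
```

For anything beyond a quick query, the `huggingface_hub` client is usually the more convenient route, since methods such as `HfApi.list_models` handle pagination and authentication for you.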
Understanding data limitations
Metrics such as download counts are useful but imperfect. They reflect a complex process shaped by infrastructure, caching, and usage patterns. They work well for:
- Global and temporal trends
- Relative comparisons across resources
- Longitudinal studies of adoption
They are less reliable for fine-grained rankings or absolute comparisons. When designing studies, consider what the data actually measures and how infrastructure and noise may affect your conclusions.
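One way to act on this advice is to compare resources by order of magnitude rather than by exact rank. The sketch below is an illustration under that assumption, not a method prescribed by the Hub. The function names and the sample repository names in the usage note are hypothetical.

```python
from collections import defaultdict


def magnitude_bucket(downloads):
    """Return the order of magnitude of a non-negative integer count.

    Counts below 10 (including 0) fall in bucket 0, 10 to 99 in bucket 1, and so on.
    """
    if downloads < 0:
        raise ValueError("download counts cannot be negative")
    # The digit count avoids floating-point edge cases from log10.
    return len(str(int(downloads))) - 1


def group_by_magnitude(counts):
    """Map each bucket to the sorted names of the resources that fall in it."""
    groups = defaultdict(list)
    for name, downloads in counts.items():
        groups[magnitude_bucket(downloads)].append(name)
    return {bucket: sorted(names) for bucket, names in sorted(groups.items())}
```

With input such as `{"org/model-a": 1200, "org/model-b": 1900, "org/model-c": 15}`, the first two models land in the same bucket, so a small gap between their counts is not read as a meaningful difference in adoption.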
Explore and connect
- Collections — Curated datasets and research using Hub data
- Discussions — Questions, feedback, and collaboration: Join the conversation
We welcome researchers interested in responsible use of Hub data for ecosystem studies.
- Hub Stats: Visualize Hugging Face Hub growth and stats
- Open Model Evolution: Build and explore interactive dashboards in your browser
- Hub Model Tree Stats: Aggregated stats about derived models for an author
- Open Source AI Year In Review 2025: Reviewing Progress of the Open Source Ecosystem