AI & ML interests

Data and research on the AI ecosystem, built on the Hugging Face Hub

Hugging Face Data for Research

A resource hub for researchers studying AI ecosystem development and adoption using data from the Hugging Face platform.

Why the Hub matters for research

The Hugging Face Hub offers a rich source of data for understanding how the AI ecosystem evolves. Information about models, datasets, Spaces, papers, and community activity is publicly accessible—making it possible to analyze trends in model development, dataset usage, research directions, and adoption patterns over time.

How to access the data

Pre-compiled datasets provide the simplest entry point. We recommend starting with community-maintained snapshots such as:

  • cfahlgren1/hub-stats — Daily snapshots of models, datasets, Spaces, papers, and related metadata in Parquet format, suitable for large-scale analysis.
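
As a rough illustration of working with such a snapshot, the sketch below builds a direct-download URL for one Parquet file and reads it with pandas. The helper names `snapshot_url` and `load_snapshot` are hypothetical, and the file name `models.parquet` is an assumption about the repository layout, so check the dataset's file listing first. Reading requires pandas with a Parquet engine such as pyarrow.

```python
from urllib.parse import quote

HUB = "https://huggingface.co"


def snapshot_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct-download URL for a file in a Hub dataset repository."""
    return (
        f"{HUB}/datasets/{repo_id}/resolve/"
        f"{quote(revision, safe='')}/{quote(filename)}"
    )


def load_snapshot(repo_id: str, filename: str):
    """Download one Parquet file and return it as a pandas DataFrame."""
    # Imported lazily so the URL helper stays usable without pandas installed.
    import pandas as pd

    return pd.read_parquet(snapshot_url(repo_id, filename))
```

Calling `load_snapshot("cfahlgren1/hub-stats", "models.parquet")` would download the file and return a DataFrame. Inspect its columns before relying on any particular field, since the schema is set by the dataset maintainer and may change.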

For custom views or real-time access, use the Hub API. The API supports programmatic access to repository metadata, search, and more. See the OpenAPI specification and documentation for details. Python users can rely on the huggingface_hub client.
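
A minimal sketch of a live query follows, using only the Python standard library against the public `/api/models` endpoint. The function names are hypothetical, and the query parameters shown (`sort`, `direction`, `limit`, `search`) are ones the endpoint is believed to accept; confirm them against the OpenAPI specification. The huggingface_hub client wraps the same API and is usually the more convenient choice.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://huggingface.co/api"


def models_query_url(
    search: Optional[str] = None, sort: str = "downloads", limit: int = 5
) -> str:
    """Build a model-listing URL, sorted in descending order of `sort`."""
    params = {"sort": sort, "direction": -1, "limit": limit}
    if search:
        params["search"] = search
    return f"{API_ROOT}/models?{urlencode(params)}"


def fetch_models(
    search: Optional[str] = None, sort: str = "downloads", limit: int = 5
) -> list:
    """Fetch model metadata as a list of dicts (performs a network request)."""
    with urlopen(models_query_url(search, sort, limit), timeout=30) as resp:
        return json.load(resp)
```

For example, `fetch_models(search="bert", limit=3)` should return up to three metadata records. Because the response reflects the Hub at request time, repeated calls can differ, which is one reason to prefer dated snapshots for reproducible analysis.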

Understanding data limitations

Metrics such as download counts are useful but imperfect. They reflect a complex process shaped by infrastructure, caching, and usage patterns. They work well for:

  • Global and temporal trends
  • Relative comparisons across resources
  • Longitudinal studies of adoption

They are less reliable for fine-grained rankings or absolute comparisons. When designing studies, consider what the data actually measures and how infrastructure and noise may affect your conclusions.
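
One way to act on this advice, sketched under stated assumptions, is to compare resources by relative change between two snapshots and by order of magnitude instead of by exact rank. The functions and the repository names in the usage note are hypothetical. The inputs are plain dictionaries mapping a repository ID to a download count taken from two snapshots.

```python
def relative_growth(earlier: dict, later: dict) -> dict:
    """Fractional change per repository present in both snapshots.

    Repositories with a zero count in the earlier snapshot are skipped,
    because relative change is undefined for them.
    """
    return {
        repo: (later[repo] - earlier[repo]) / earlier[repo]
        for repo in earlier.keys() & later.keys()
        if earlier[repo] > 0
    }


def magnitude_bucket(count: int) -> int:
    """Coarse bucket: 0 for counts below 10, 1 for 10-99, 2 for 100-999, ..."""
    # Digit counting avoids floating-point edge cases from logarithms.
    return len(str(count)) - 1 if count > 0 else 0
```

With `earlier = {"org/a": 100, "org/b": 50}` and `later = {"org/a": 150, "org/b": 25}`, `relative_growth` reports growth of 0.5 for `org/a` and -0.5 for `org/b`. Grouping by `magnitude_bucket` treats counts of 120 and 480 as comparable, a deliberately coarse choice meant to absorb noise from caching and infrastructure.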

Explore and connect

  • Collections — Curated datasets and research using Hub data
  • Discussions — Questions, feedback, and collaboration: Join the conversation

We welcome researchers interested in responsible use of Hub data for ecosystem studies.
