WildEval

non-profit

AI & ML interests

None defined yet.

Recent Activity

yuntian-deng authored a paper 21 days ago

Long Grounded Thoughts: Distilling Compositional Visual Reasoning Chains at Scale

yuntian-deng authored a paper 21 days ago

DetectZoo: A Unified Toolkit for AI-Generated Content Detection Across Text, Audio, and Image Modalities

yuntian-deng authored a paper 21 days ago

Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution

spaces 1

Zebra Logic Bench

Explore and evaluate Zebra Logic models

models 0

None public yet

datasets 9

WildEval/ZebraLogic

Viewer • Updated Feb 4, 2025 • 4.26k • 2.31k • 15

WildEval/G-PlanET

Viewer • Updated Aug 1, 2024 • 1.42k • 19 • 1

WildEval/ZeroEval

Viewer • Updated Jul 23, 2024 • 4.61k • 471

WildEval/WildBench-V2

Viewer • Updated May 22, 2024 • 2.05k • 122

WildEval/WildBench-Results-v2-internal

Viewer • Updated May 21, 2024 • 30k • 59

WildEval/WildBench-Results-V2

Viewer • Updated May 20, 2024 • 10.2k • 99

WildEval/WildBench-v2-dev

Viewer • Updated Apr 19, 2024 • 5.99k • 7

WildEval/WildBench-dev

Viewer • Updated Apr 19, 2024 • 14.1k • 11 • 1

WildEval/NaturalChats

Viewer • Updated Apr 18, 2024 • 641k • 1