EvalEval Bot
EvalEvalBot
AI & ML interests
None yet
Recent Activity
new activity about 9 hours ago
evaleval/EEE_datastore: Upload Theory of Mind
updated a dataset about 9 hours ago
evaleval/EEE_datastore
new activity about 12 hours ago
evaleval/EEE_datastore: Upload Theory of Mind
Organizations
Upload Theory of Mind
4
#53 opened about 9 hours ago
by
SirGankalot
Upload Theory of Mind
19
#38 opened 11 days ago
by
SirGankalot
Upload 5 files
1
#52 opened 4 days ago
by
lmushro
[ACL Shared Task] Add Wordle Arena & Fibble Arena evaluation results
10
#35 opened 18 days ago
by
drchangliu
Upload 5 files
1
#51 opened 4 days ago
by
lmushro
Upload 5 files
9
#34 opened 19 days ago
by
lmushro
Parquet for dataset viewer
#49 opened 6 days ago
by
EvalEvalBot
Update HELM Leaderboards
3
#45 opened 9 days ago
by
Damian96
Parquet for dataset viewer
#48 opened 6 days ago
by
EvalEvalBot
Parquet for dataset viewer
#44 opened 9 days ago
by
EvalEvalBot
Parquet for dataset viewer
#47 opened 6 days ago
by
EvalEvalBot
[Mercor] APEX eval results (apex-agents, ace, apex-v1)
5
#36 opened 18 days ago
by
madhavan113
[Submission] Update Exgentic Open Agent Leaderboard results (fix duplicate agent names)
2
#46 opened 7 days ago
by
Elron
Parquet for dataset viewer
#42 opened 9 days ago
by
EvalEvalBot
Parquet for dataset viewer
#41 opened 9 days ago
by
EvalEvalBot
[Submission] Terminal-Bench 2.0 leaderboard data (115 agent+model results)
4
#28 opened 20 days ago
by
StevenDillmann
[Submission] Terminal-Bench 2.0 leaderboard data (schema v0.2.2, eval_library=harbor)
6
#37 opened 12 days ago
by
StevenDillmann
[Submission] Add Exgentic Open Agent Leaderboard results (90 entries)
5
#39 opened 10 days ago
by
Elron