DiagnosticIQ: A Benchmark for LLM-Based Industrial Maintenance Action Recommendation from Symbolic Rules Paper • 2605.08614 • Published 14 days ago • 7
MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments Paper • 2605.09131 • Published 14 days ago • 55
Enterprise Agents and Benchmarks Collection Enterprise agent ecosystem featuring AssetOpsBench (industrial) and ITBench (SRE, FinOps, CISO), CUGA to accelerate AI Automation • 18 items • Updated 3 days ago • 17
view article Article AssetOpsBench: Bridging the Gap Between AI Agent Benchmarks and Industrial Reality ibm-research • Jan 21 • 33
AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance Paper • 2506.03828 • Published Jun 4, 2025 • 20
AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance Paper • 2506.03828 • Published Jun 4, 2025 • 20