by hkust-nlp · Agent Tool · ★ 389
Indexed by AgentSkillsHub · Auto-synced every 8h
The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
Introduction
Toolathlon is a benchmark for assessing language agents' general tool use in realistic environments. It features 600+ diverse tools based on real-world software environments, and each task requires a long-horizon sequence of tool calls to complete. In one demo task, the agent must automatically check assignments in an email inbox and grade them on Canvas.
News
[2025.12.12] 📣 We have set up a new doc
| Stars | 389 |
| Forks | 40 |
| Language | Python |
| Category | Agent Tool |
| Quality Score | 32.2/100 |
| Open Issues | 12 |
| Last Updated | 2026-06-09 |
| Created | 2025-10-24 |
| Platforms | python |
| Est. Tokens | ~23330k |
Looking for a Toolathlon alternative? If you're comparing Toolathlon with other agent tools, these 6 projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.
This repository contains a collection of Agent Skills developed by GudaStudio, enabling seamless collaboration
Supercharge Claude Code with 11 AI agents, 36 commands & 15 skills — the claude-code plugin framework inspired
Skill to give Claude Code (and any coding agent) the ability to generate beautiful and practical Excalidraw di
A collection of Agent skills and Claude Code plugins for HashiCorp products.
A collection of standardized Agent Skills to teach GitHub Copilot, Claude, Gemini and Cursor about modern Andr
Claude Code Skill Factory — A powerful open-source toolkit for building and deploying production-ready Claude
Toolathlon ([ICLR 2026] The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution) is a benchmark for language agents' tool use. It is categorized as an Agent Tool and has 389 GitHub stars.
Toolathlon is primarily written in Python.
You can find installation instructions and usage details in the Toolathlon GitHub repository at github.com/hkust-nlp/Toolathlon. The project has 389 stars and 40 forks.
The top alternatives to Toolathlon on Agent Skills Hub include skills, claude-forge, and excalidraw-diagram-skill. Each takes a different approach to the same problem space; compare them side by side by stars, quality score, and community activity.