Toolathlon — Agent Tool by hkust-nlp

by hkust-nlp · Agent Tool · ★ 389

Indexed by AgentSkillsHub · Auto-synced every 8h

About Toolathlon

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

Introduction

Toolathlon is a benchmark that assesses language agents' general tool use in realistic environments. It features 600+ diverse tools based on real-world software environments, and each task requires a long-horizon sequence of tool calls to complete. In one demo task, the agent must automatically check assignments in an email inbox and grade them on Canvas.

News

[2025.12.12] 📣 We have set up a new doc...

Quick Facts

Stars: 389
Forks: 40
Language: Python
Category: Agent Tool
Quality Score: 32.2/100
Open Issues: 12
Last Updated: 2026-06-09
Created: 2025-10-24
Platforms: python
Est. Tokens: ~23330k

Toolathlon alternative? Top 6 similar tools

Looking for a Toolathlon alternative? If you're comparing Toolathlon with other agent tools, these 6 projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.

  • skills by GuDaStudio · ⭐ 1.9k

    This repository contains a collection of Agent Skills developed by GudaStudio, enabling seamless collaboration

  • claude-forge by sangrokjung · ⭐ 751

    Supercharge Claude Code with 11 AI agents, 36 commands & 15 skills — the claude-code plugin framework inspired

  • excalidraw-diagram-skill by coleam00 · ⭐ 718

    Skill to give Claude Code (and any coding agent) the ability to generate beautiful and practical Excalidraw di

  • agent-skills by hashicorp · ⭐ 639

    A collection of Agent skills and Claude Code plugins for HashiCorp products.

  • awesome-android-agent-skills by new-silvermoon · ⭐ 588

    A collection of standardized Agent Skills to teach GitHub Copilot, Claude, Gemini and Cursor about modern Andr

  • claude-code-skill-factory by alirezarezvani · ⭐ 571

    Claude Code Skill Factory — A powerful open-source toolkit for building and deploying production-ready Claude

Frequently Asked Questions

What is Toolathlon?

Toolathlon, short for "The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution" (ICLR 2026), is a benchmark of language agents' tool use. It is categorized as an Agent Tool and has 389 GitHub stars.

What programming language is Toolathlon written in?

Toolathlon is primarily written in Python.

How do I install or use Toolathlon?

Installation instructions and usage details are in the Toolathlon GitHub repository at github.com/hkust-nlp/Toolathlon. The project has 389 stars and 40 forks.

What are the best alternatives to Toolathlon?

The top alternatives to Toolathlon on Agent Skills Hub include skills, claude-forge, and excalidraw-diagram-skill. Each takes a different approach to the same problem space; compare them side by side by stars, quality score, and community activity.
