by THUDM · Agent Tool · ★ 3.2k
Indexed by AgentSkillsHub · Auto-synced every 8h
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
| Field | Value |
| --- | --- |
| Stars | 3,214 |
| Forks | 240 |
| Language | Python |
| Category | Agent Tool |
| License | Apache-2.0 |
| Quality Score | 40.7/100 |
| Open Issues | 68 |
| Last Updated | 2026-02-08 |
| Created | 2023-07-28 |
| Platforms | python |
| Est. Tokens | ~1970k |
Looking for an AgentBench alternative? If you're comparing AgentBench with other agent tools, these 6 projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.
Harness LLMs with Multi-Agent Programming
A curated list of Generative AI tools, works, models, and references
Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web. Make
A summary of Prompt & LLM papers, open-source data & models, and AIGC applications
Conquer Any Code in VSCode: One-Click Comments, Conversions, UI-to-Code, and AI Batch Processing of Files! In V
Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...
AgentBench is a comprehensive benchmark for evaluating LLMs as agents (ICLR'24). It is categorized as an Agent Tool and has 3.2k GitHub stars.
AgentBench is primarily written in Python. It covers topics such as chatgpt, gpt-4, llm.
You can find installation instructions and usage details in the AgentBench GitHub repository at github.com/THUDM/AgentBench. The project has 3.2k stars and 240 forks, which suggests sustained community interest.
AgentBench is released under the Apache-2.0 license, making it free to use and modify according to the license terms.
The top alternatives to AgentBench on Agent Skills Hub include langroid, awesome-generative-ai, and gptme. Each offers a different approach to the same problem space; compare them side by side by stars, quality score, and community activity.