LLM-Agent-Benchmark-List — Agent Tool by zhangxjohn

by zhangxjohn · Agent Tool · ★ 164

Last updated: 2026-04-16 · Indexed by AgentSkillsHub · Auto-synced every 8h

About LLM-Agent-Benchmark-List

A benchmark list for evaluation of large language models.

agent · benchmark · large-language-models · llm · survey

Quick Facts

Stars: 164
Forks: 11
Category: Agent Tool
License: Apache-2.0
Quality Score: 42.7/100
Open Issues: 3
Last Updated: 2026-04-16
Created: 2024-01-29
Est. Tokens: ~12k

LLM-Agent-Benchmark-List alternative? Top 6 similar tools

Looking for an LLM-Agent-Benchmark-List alternative? If you're comparing LLM-Agent-Benchmark-List with other tools in the Agent Tool category, these 6 projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.

  • bigcodebench by bigcode-project · ⭐ 485

    [ICLR'25] BigCodeBench: Benchmarking Code Generation Towards AGI

  • ClawProBench by suyoumo · ⭐ 576

    ClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with determ

  • LLM-Tool-Survey by quchangle1 · ⭐ 481

    This is the repository for the Tool Learning survey.

  • OpenClawProBench by suyoumo · ⭐ 340

    OpenClawProBench is a live-first benchmark harness for evaluating LLM agents in the OpenClaw runtime with de

  • OpenRCA by microsoft · ⭐ 318

    [ICLR'25] OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?

  • awesome-lifelong-llm-agent by qianlima-lab · ⭐ 279

    TPAMI 2026 | This repository collects awesome survey, resource, and paper for lifelong learning LLM agents

Frequently Asked Questions

What is LLM-Agent-Benchmark-List?

LLM-Agent-Benchmark-List is a benchmark list for evaluation of large language models. It is categorized as an Agent Tool and has 164 GitHub stars.

How do I install or use LLM-Agent-Benchmark-List?

You can find installation instructions and usage details in the LLM-Agent-Benchmark-List GitHub repository at github.com/zhangxjohn/LLM-Agent-Benchmark-List. The project has 164 stars, 11 forks, and 3 open issues.

What license does LLM-Agent-Benchmark-List use?

LLM-Agent-Benchmark-List is released under the Apache-2.0 license, making it free to use and modify according to the license terms.

What are the best alternatives to LLM-Agent-Benchmark-List?

The top alternatives to LLM-Agent-Benchmark-List on Agent Skills Hub include bigcodebench, ClawProBench, and LLM-Tool-Survey. Each takes a different approach to the same problem space; compare them side by side by stars, quality score, and community activity.
