by onejune2018 · Agent Tool · ★ 615
Indexed by AgentSkillsHub · Auto-synced every 8h
Awesome LLM Eval
Awesome-LLM-Eval: a curated list of tools, datasets/benchmarks, demos, leaderboards, papers, docs and models, mainly for the evaluation of Large Language Models and for exploring the boundaries and limits of Generative AI. This is the official project of our survey: Beyond Benchmark: LLMs Evaluation with an Anthropomorphic and Value-oriented Roadmap. NOTE: As we cannot update the arXiv paper in real time, please refer to this repo for the latest updates; the paper may be updated later. We also welcome any pull requests or issues to help us improve this work. Your contributions will be acknowledged in the acknowledgements. If you find our survey useful, please kindly cite our paper.
| Stars | 615 |
| Forks | 51 |
| Category | Agent Tool |
| License | MIT |
| Quality Score | 39.5/100 |
| Open Issues | 9 |
| Last Updated | 2025-11-24 |
| Created | 2023-04-26 |
| Platforms | aws |
| Est. Tokens | ~1132k |
Looking for an Awesome-LLM-Eval alternative? If you're comparing Awesome-LLM-Eval with other agent tools, these 6 projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.
🦙 Integrating LLMs into structured NLP pipelines
User friendly CLI tool for AI tasks. Stop thinking about LLMs and prompts, start getting results!
The SmythOS Runtime Environment (SRE) is an open-source, cloud-native runtime for agentic AI. Secure, modular,
A set of tools that gives agents powerful capabilities.
Agent samples built using the Strands Agents SDK.
♾️ Private Agent Fleet with Spec Coding. Each agent gets their own GPU-accelerated desktop. Run Claude, Codex,
Awesome-LLM-Eval is a curated list of tools, datasets/benchmarks, demos, leaderboards, papers, docs and models, mainly for evaluating foundation LLMs and exploring the technical boundaries of generative AI. It is categorized as an Agent Tool with 615 GitHub stars.
You can find installation instructions and usage details in the Awesome-LLM-Eval GitHub repository at github.com/onejune2018/Awesome-LLM-Eval. The project has 615 stars and 51 forks, indicating an active community.
Awesome-LLM-Eval is released under the MIT license, making it free to use and modify according to the license terms.
The top alternatives to Awesome-LLM-Eval on Agent Skills Hub include spacy-llm, cai, and sre. Each offers a different approach to the same problem space; compare them side-by-side by stars, quality score, and community activity.