chinese-llm-benchmark — Agent Tool by jeinlee1991

by jeinlee1991 · Agent Tool · ★ 6.2k

Last updated: 2026-06-27 · Indexed by AgentSkillsHub · Auto-synced every 8h

About chinese-llm-benchmark

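NoneLinear (非线智能) - ReLE evaluation: a capability benchmark for Chinese-language AI large models (continuously updated).

ReLE (Really Reliable Live Evaluation for LLM), formerly CLiB, currently covers 384 large models. These include commercial models such as chatgpt, gpt-5.5, Google gemini-3.1-pro, Claude-4.8, Wenxin ERNIE-X1.1, ERNIE-5.1, qwen3.7-max, qwen3.7-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as open-source models such as step3.7-flash, kimi-k2.6, ernie4.5, MiniMax-M3, deepseek-v4, Qwen3.6, llama4, Zhipu GLM-5.1, MiMo-V2, LongCat, gemma4, and mistral.

It supports multi-dimensional capability evaluation across 7 domains (education, medicine and mental health, finance, law and public administration, reasoning and mathematical computation, language and instruction following, and agents and tool calling), subdivided into 300 fine-grained dimensions (for example dentistry, high-school Chinese, ...). For details, see our technical report, "ReLE: A Scalable System and Structured Benchmark for Diagnosing Capability Anisotropy in Chinese LLMs".

Media coverage (机器之心): "A hands-on test of 304 Chinese large models worldwide: there is no 'all-round champion', and ReLE breaks the evaluation deadlock with a 70% cost-reduction scheme".

Beyond leaderboards, the project also provides a library of more than 2 million large-model defect cases, so that the community can study, analyze, and improve large models. Free evaluation is offered for your private large models; contact us (the NoneLinear ReLE benchmark team) by adding us on WeChat.

Contents: 🔄 Recent updates · ⚓ Popular LLM evaluation projects on GitHub · 📝 Basic model information · 📊 Leaderboards: 0. Multimodal leaderboard; 1. Overall capability leaderboard (1.1 Reasoning models; 1.2 Commercial models, including paid APIs for open-source models; 1.3 Open-source models); 2. Education leaderboard (2.1 Primary-school subjects; 2.3 High-school entrance exam, TODO; 2.4 High-school subjects; 2.6 Higher education, TODO; 2.7 Postgraduate entrance exam, TODO; 2.8 Teacher certification, TODO); 3. Medicine and mental health leaderboard (3.1 Physicians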

agentic-ai artificial-intelligence llm-agent llm-evaluation

Quick Facts

Stars	6,223
Forks	254
Category	Agent Tool
Quality Score	50.96/100
Open Issues	15
Last Updated	2026-06-27
Created	2023-06-04
Platforms	claude-code, gemini
Est. Tokens	~28k

chinese-llm-benchmark alternatives: top 6 similar tools

Looking for an alternative to chinese-llm-benchmark? Among the Agent Tool projects on Agent Skills Hub, these 6 are the closest alternatives, ranked by topic overlap, star count, and community traction.

PocketFlow by The-Pocket · ⭐ 10.3k
Pocket Flow: 100-line LLM framework. Let Agents build Agents!
generative-ai by genieincodebottle · ⭐ 2.3k
Comprehensive resources on Generative AI, including a detailed roadmap, projects, use cases, interview prepara
AgenticRAG-Survey by asinghcsu · ⭐ 1.5k
Agentic-RAG explores advanced Retrieval-Augmented Generation systems enhanced with AI LLM agents.
sim by simstudioai · ⭐ 28.9k
Build, deploy, and orchestrate AI agents. Sim is the central intelligence layer for your AI workforce.
agentscope by agentscope-ai · ⭐ 27.4k
Build and run agents you can see, understand and trust.
repomix by yamadashy · ⭐ 26.8k
📦 Repomix is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for w

Frequently Asked Questions

What is chinese-llm-benchmark?

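chinese-llm-benchmark is the NoneLinear (非线智能) ReLE evaluation, a continuously updated capability benchmark for Chinese-language AI large models. It currently covers 374 large models, including commercial models such as chatgpt, gpt-5.4, Google gemini-3.1-pro, Claude-4.6, Wenxin ERNIE-X1.1, ERNIE-5.0, qwen3.6-max, qwen3.6-plus, Baichuan, iFlytek Spark, and SenseTime senseChat, as well as step3.5-flash, kimi-k2.. It is categorized as an Agent Tool with 6.2k GitHub stars.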

How do I install or use chinese-llm-benchmark?

You can find installation instructions and usage details in the chinese-llm-benchmark GitHub repository at github.com/jeinlee1991/chinese-llm-benchmark. The project has 6.2k stars and 254 forks, indicating an active community.

What are the best alternatives to chinese-llm-benchmark?

The top alternatives to chinese-llm-benchmark on Agent Skills Hub include PocketFlow, generative-ai, and AgenticRAG-Survey. Each takes a different approach to the same problem space; compare them side by side by stars, quality score, and community activity.
