bench — Agent Tool by arthur-ai

by arthur-ai · Agent Tool · ★ 428


About bench

Bench is a tool for evaluating LLMs for production use cases. Whether you are comparing different LLMs, considering different prompts, or testing generation hyperparameters like temperature and number of tokens, Bench provides one touch point for all your LLM performance evaluation.

If you have encountered a need for any of the following in your LLM work, then Bench can help with your evaluation:

  • to standardize the workflow of LLM evaluation with a common interface across tasks and use cases
  • to test whether open source LLMs can do as well as the top closed-source LLM API providers on your specific data
  • to translate the rankings on LLM leaderboards and benchmarks into scores that you care about for your actual use case

Join the bench community on Discord. For bug fixes and feature requests, please file a GitHub issue.
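As a rough illustration of the kind of comparison described above, the sketch below scores two sets of candidate outputs against shared reference answers using a simple exact-match metric. It is not Bench's API: the function names (exact_match, score_run), the run labels, and the sample data are all hypothetical, chosen only to show the idea of ranking candidates by a score computed on your own data.

```python
from statistics import mean


def exact_match(candidate: str, reference: str) -> float:
    """Return 1.0 if the strings match after trimming and lowercasing, else 0.0."""
    return 1.0 if candidate.strip().lower() == reference.strip().lower() else 0.0


def score_run(candidate_outputs, reference_outputs, scorer=exact_match):
    """Average the scorer over paired candidate and reference outputs."""
    if len(candidate_outputs) != len(reference_outputs):
        raise ValueError("candidate and reference lists must have the same length")
    return mean(scorer(c, r) for c, r in zip(candidate_outputs, reference_outputs))


# Hypothetical reference answers and outputs from two candidate configurations.
references = ["1776", "Paris", "H2O"]
runs = {
    "model_a_temp_0.0": ["1776", "Paris", "H2O"],
    "model_b_temp_0.7": ["1776", "paris ", "water"],
}

# One score per run, so different models, prompts, or settings can be ranked.
scores = {name: score_run(outputs, references) for name, outputs in runs.items()}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

Exact match is only the simplest possible scorer; any function that takes a candidate and a reference and returns a number could be passed as scorer instead.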

llm · mlops

Quick Facts

Stars: 428
Forks: 42
Language: TypeScript
Category: Agent Tool
License: MIT
Quality Score: 61.037885384596/100
Open Issues: 1
Last Updated: 2024-05-10
Created: 2023-07-07
Platforms: node
Est. Tokens: ~738k

bench alternative? Top 6 similar tools

Looking for a bench alternative? If you're comparing bench with other agent tools, these 6 projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.

  • GPTRouter by Writesonic · ⭐ 456

    Smoothly Manage Multiple LLMs (OpenAI, Anthropic, Azure) and Image Models (Dall-E, SDXL), Speed Up Responses,

  • awesome-AI-toolkit by balavenkatesh3322 · ⭐ 200

    A curated, comprehensive collection of open-source AI tools, frameworks, datasets, courses, and seminal papers

  • orloj by OrlojHQ · ⭐ 103

    An orchestration runtime for multi-agent AI systems. Declare agents, tools, and policies as YAML; Orloj schedu

  • mcp-router by mcp-router · ⭐ 2.1k

    A Unified MCP Server Management App (MCP Manager).

  • paperbanana by llmsresearch · ⭐ 2.0k

    Open source implementation and extension of Google Research’s PaperBanana for automated academic figures, diag

  • mcphub.nvim by ravitemer · ⭐ 1.7k

    An MCP client for Neovim that seamlessly integrates MCP servers into your editing workflow with an intuitive i

Frequently Asked Questions

What is bench?

bench is a tool for evaluating LLMs. It is categorized as an Agent Tool and has 428 GitHub stars.

What programming language is bench written in?

bench is primarily written in TypeScript. It covers topics such as llm, mlops.

How do I install or use bench?

You can find installation instructions and usage details in the bench GitHub repository at github.com/arthur-ai/bench. The project has 428 stars and 42 forks, and was last updated on 2024-05-10.

What license does bench use?

bench is released under the MIT license, making it free to use and modify according to the license terms.

What are the best alternatives to bench?

The top alternatives to bench on Agent Skills Hub include GPTRouter, awesome-AI-toolkit, and orloj. Each takes a different approach to the same problem space; compare them side by side by stars, quality score, and community activity.
