by rlancemartin · Agent Tool · ★ 1.1k
Note: See the HuggingFace space for this app: https://huggingface.co/spaces/rlancemartin/auto-evaluator

Note: See the hosted app: https://autoevaluator.langchain.com/

Note: Code for the hosted app is also open source: https://github.com/langchain-ai/auto-evaluator

This is a lightweight evaluation tool for question-answering that uses LangChain to:

- Ask the user to input a set of documents of interest
- Apply an LLM to auto-generate question-answer pairs from these docs
- Generate a question-answering chain with a specified set of UI-chosen configurations
- Use the chain to generate a response to each question
- Use an LLM to score the response relative to the answer
- Explore scoring across various chain configurations

It runs as a Streamlit app.

Inputs:

- Number of questions to auto-generate (if the user does not supply an eval set)
- Method for text splitting
- Chunk size for text splitting
- Chunk overlap for text splitting
- Embedding method for chunks
- Chunk retrieval method
- Neighbors for retrieval
- LLM for summarization of retrieved chunks
- Prompt choice for model self-grading

Blog: https://blog.langchain.dev/auto-eval-of-question-answering-tas
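The evaluation loop described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: every name here (`EvalConfig`, `split_text`, `retrieve`, `run_eval`, `stub_answer`, `stub_grade`) is hypothetical, the retriever is a toy word-overlap ranker rather than an embedding search, and the LLM calls for answering and grading are replaced by simple stand-in functions. The question-generation step is omitted; the eval set is supplied directly.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class EvalConfig:
    """Hypothetical stand-in for a few of the UI-chosen inputs."""
    chunk_size: int = 200     # characters per chunk
    chunk_overlap: int = 50   # characters shared by neighbouring chunks
    num_neighbors: int = 2    # chunks retrieved per question


def split_text(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into fixed-size character chunks with the given overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def _words(text: str) -> set:
    """Lowercased set of word tokens, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, chunks: List[str], k: int) -> List[str]:
    """Toy retriever (assumption): rank chunks by words shared with the question."""
    q_words = _words(question)
    ranked = sorted(chunks, key=lambda c: len(q_words & _words(c)), reverse=True)
    return ranked[:k]


def run_eval(
    docs: str,
    eval_set: List[Tuple[str, str]],
    answer_fn: Callable[[str, List[str]], str],
    grade_fn: Callable[[str, str, str], bool],
    config: EvalConfig,
) -> List[Dict[str, object]]:
    """Answer each question from retrieved chunks, then grade it against the reference."""
    chunks = split_text(docs, config.chunk_size, config.chunk_overlap)
    results = []
    for question, reference in eval_set:
        context = retrieve(question, chunks, config.num_neighbors)
        response = answer_fn(question, context)
        results.append({
            "question": question,
            "response": response,
            "correct": grade_fn(question, reference, response),
        })
    return results


def stub_answer(question: str, context: List[str]) -> str:
    """Stand-in for the LLM answer step: return the top retrieved chunk."""
    return context[0] if context else ""


def stub_grade(question: str, reference: str, response: str) -> bool:
    """Stand-in for LLM self-grading: check the reference appears in the response."""
    return reference.lower() in response.lower()


if __name__ == "__main__":
    docs = "Paris is the capital of France. Berlin is the capital of Germany."
    eval_set = [("What is the capital of France?", "Paris")]
    config = EvalConfig(chunk_size=32, chunk_overlap=0, num_neighbors=1)
    results = run_eval(docs, eval_set, stub_answer, stub_grade, config)
    accuracy = sum(r["correct"] for r in results) / len(results)
    print(f"accuracy: {accuracy:.2f}")
```

Passing the answer and grading steps in as functions keeps the loop independent of any particular model, so different chain configurations can be compared by changing only `EvalConfig` and the two callables.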
| Metric | Value |
|---|---|
| Stars | 1,089 |
| Forks | 93 |
| Language | Python |
| Category | Agent Tool |
| Quality Score | 53.1/100 |
| Open Issues | 3 |
| Last Updated | 2023-05-10 |
| Created | 2023-04-14 |
| Platforms | python |
| Est. Tokens | ~2967k |
Looking for an auto-evaluator alternative? If you're comparing auto-evaluator with other agent tools, these 6 projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.
The secure, validated skill registry for professional AI coding agents. Extend Antigravity, Claude Code, Curso
Raptor turns Claude Code into a general-purpose AI offensive/defensive security agent. By using Claude.md and
This repository contains a collection of Agent Skills developed by GudaStudio, enabling seamless collaboration
Supercharge Claude Code with 11 AI agents, 36 commands & 15 skills — the claude-code plugin framework inspired
Skill to give Claude Code (and any coding agent) the ability to generate beautiful and practical Excalidraw di
A collection of Agent skills and Claude Code plugins for HashiCorp products.
auto-evaluator is an evaluation tool for LLM QA chains. It is categorized as an Agent Tool and has 1.1k GitHub stars.
auto-evaluator is primarily written in Python.
You can find installation instructions and usage details in the auto-evaluator GitHub repository at github.com/rlancemartin/auto-evaluator. The project has 1.1k stars and 93 forks.
The top alternatives to auto-evaluator on Agent Skills Hub include agent-skills, raptor, and skills. Each offers a different approach to the same problem space; compare them side by side by stars, quality score, and community activity.