by ivanfioravanti · LLM Plugin · ★ 67
LLM Context Benchmarks

Benchmark prompt-processing and generation throughput across context sizes (0.5k–128k tokens) for many inference engines: Ollama (API & CLI), MLX, MLX Distributed, MLX-VLM, llama.cpp, LM Studio, Exo, Apple Foundation Models Serve, vMLX, oMLX, Paroquant, and any OpenAI-compatible endpoint. Optimized for Apple Silicon but works anywhere Python runs.

Installation

Engine-specific setup:

(Optional) pre-commit hooks for Black + isort:

Running Benchmarks

List engines:
    uv run benchmark --list-engines

Generate test files (only needed once):
    uv run generate-context-files prideandprejudice.txt

Run a benchmark (engine + model):
    uv run benchmark mlx mlx-community/Qwen3-4B-Instruct-2507-4bit
    uv run
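To make the two reported metrics concrete, here is a minimal sketch of how prompt-processing and generation throughput could be derived from token counts and timings. It is not taken from the repository: `RunTiming`, `tokens_per_second`, and `summarize` are hypothetical names, and the numbers in the example are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunTiming:
    """Hypothetical record of one benchmark run (not the repository's own type)."""
    prompt_tokens: int         # tokens in the input context
    generated_tokens: int      # tokens produced by the model
    prompt_seconds: float      # time spent processing the prompt
    generation_seconds: float  # time spent generating the output


def tokens_per_second(tokens: int, seconds: float) -> float:
    """Return throughput in tokens/s, or 0.0 when no time was recorded."""
    if seconds <= 0:
        return 0.0
    return tokens / seconds


def summarize(run: RunTiming) -> dict:
    """Compute both throughput figures for a single run."""
    return {
        "prompt_tps": tokens_per_second(run.prompt_tokens, run.prompt_seconds),
        "generation_tps": tokens_per_second(
            run.generated_tokens, run.generation_seconds
        ),
    }


# Illustrative values only: a 2048-token prompt and 128 generated tokens.
run = RunTiming(
    prompt_tokens=2048,
    generated_tokens=128,
    prompt_seconds=4.0,
    generation_seconds=8.0,
)
summary = summarize(run)
print(summary)
```

Keeping the two rates separate matters because prompt processing and generation usually scale differently as the context grows, so a single combined figure would hide the effect the benchmark is meant to show.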
| Field | Value |
| --- | --- |
| Stars | 67 |
| Forks | 9 |
| Language | Python |
| Category | LLM Plugin |
| License | Apache-2.0 |
| Quality Score | 55.668/100 |
| Open Issues | 4 |
| Last Updated | 2026-06-13 |
| Created | 2025-08-06 |
| Platforms | cli, python |
| Est. Tokens | ~15k |
Looking for an llm_context_benchmarks alternative? If you're comparing llm_context_benchmarks with other LLM Plugin tools, these 6 projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.
Web app for teams of 20+ members. In-built connections to major LLMs via API. Share chats, prompts, and agents
Python client library for improving your LLM app accuracy
Automatic LLM-based video generation using the manim library. Usage of a code-writer and code-reviewer feedbac
Moonshot - A simple and modular tool to evaluate and red-team any LLM application.
Real-time behavioral enforcement for Claude Code. Monitors AI actions, detects violations, and interrupts misb
Bayesian Optimization as a Coverage Tool for Evaluating LLMs. Accurate evaluation (benchmarking) that's 10 tim
llm_context_benchmarks (LLM Context Benchmarks) is a comprehensive benchmarking tool for testing LLMs with varying context sizes using Ollama. It features dual benchmark modes (API/CLI) and automatic hardware detection (optimized…). It is categorized as an LLM Plugin with 67 GitHub stars.
llm_context_benchmarks is primarily written in Python. It covers topics such as ai, benchmarking, llms.
You can find installation instructions and usage details in the llm_context_benchmarks GitHub repository at github.com/ivanfioravanti/llm_context_benchmarks. The project has 67 stars and 9 forks, indicating an active community.
llm_context_benchmarks is released under the Apache-2.0 license, making it free to use and modify according to the license terms.
The top alternatives to llm_context_benchmarks on Agent Skills Hub include weam, log10, and manim-generator. Each offers a different approach to the same problem space; compare them side by side by stars, quality score, and community activity.