by jjang-ai · MCP Server · ★ 701
Last updated: · Indexed by AgentSkillsHub · Auto-synced every 8h
MLX Inference Server for Apple Silicon Self-hosted inference server for LLMs, VLMs, and image generation on Apple Silicon. OpenAI + Anthropic + Ollama compatible HTTP API. Self-hosted; no third-party API keys required. Native MTP artifact detection and family-specific cache policy gates keep speculative/cache settings explicit and model-safe. Looking for a native Swift macOS app or Swift inference engine? See osaurus.ai. <img src="htt
| Stars | 701 |
| Forks | 73 |
| Language | Python |
| Category | MCP Server |
| License | Apache-2.0 |
| Quality Score | 52.616/100 |
| Open Issues | 36 |
| Last Updated | 2026-06-19 |
| Created | 2026-02-18 |
| Platforms | mcp, python |
| Est. Tokens | ~21k |
These tools work well together with vmlx for enhanced workflows:
Looking for a vmlx alternative? If you're comparing vmlx with other mcp server tools, these 6 projects are the closest alternatives on Agent Skills Hub — ranked by topic overlap, star count, and community traction.
Native LLM inference server for Apple Silicon. OpenAI + Anthropic API compatible. No Python. Includes MLX Core
gemini drawing MCP & skill through browser, can be used in openclaw or any agent that supports MCP. Gemini画图 M
Overture is an open-source, locally running web interface delivered as an MCP (Model Context Protocol) server
Local-first persistent agentic memory powered by Recursive Memory Harness (RMH). Open source must win.
'afm' command cli: macOS server and single prompt mode that exposes Apple's Foundation and MLX Models and othe
A persistent local memory for AI, LLMs, or Copilot in VS Code.
Explore other popular mcp server tools:
vmlx is vMLX - JANGTQ Uber Compressed MLX Models - L2 Disk Cache (survives restart) + L1 Paged (super fast ttft) + Hybrid SSM Scheduler + Cont Batching + etc!. It is categorized as a MCP Server with 701 GitHub stars.
vmlx is primarily written in Python. It covers topics such as anthropic-api, kvcache-compression, kvcache-optimization.
You can find installation instructions and usage details in the vmlx GitHub repository at github.com/jjang-ai/vmlx. The project has 701 stars and 73 forks, indicating an active community.
vmlx is released under the Apache-2.0 license, making it free to use and modify according to the license terms.
The top alternatives to vmlx on Agent Skills Hub include mlx-serve, gemini-skill, Overture. Each offers a different approach to the same problem space — compare them side-by-side by stars, quality score, and community activity.