vmlx — MCP Server by jjang-ai

by jjang-ai · MCP Server · ★ 701

Last updated: · Indexed by AgentSkillsHub · Auto-synced every 8h

About vmlx

MLX Inference Server for Apple Silicon Self-hosted inference server for LLMs, VLMs, and image generation on Apple Silicon. OpenAI + Anthropic + Ollama compatible HTTP API. Self-hosted; no third-party API keys required. Native MTP artifact detection and family-specific cache policy gates keep speculative/cache settings explicit and model-safe. Looking for a native Swift macOS app or Swift inference engine? See osaurus.ai. <img src="htt

anthropic-apikvcache-compressionkvcache-optimizationkvcache-reusellmlmstudiomacbookmcp-servermlxmlxllm

Quick Facts

Stars701
Forks73
LanguagePython
CategoryMCP Server
LicenseApache-2.0
Quality Score52.616/100
Open Issues36
Last Updated2026-06-19
Created2026-02-18
Platformsmcp, python
Est. Tokens~21k

Compatible Skills

These tools work well together with vmlx for enhanced workflows:

  • mlx-omni-server — semantic(0.35)+complementary+shared_fw(openai)+rare_topics+same_lang+shared_platform (60%)

vmlx alternative? Top 6 similar tools

Looking for a vmlx alternative? If you're comparing vmlx with other mcp server tools, these 6 projects are the closest alternatives on Agent Skills Hub — ranked by topic overlap, star count, and community traction.

  • mlx-serve by ddalcu · ⭐ 164

    Native LLM inference server for Apple Silicon. OpenAI + Anthropic API compatible. No Python. Includes MLX Core

  • gemini-skill by WJZ-P · ⭐ 823

    gemini drawing MCP & skill through browser, can be used in openclaw or any agent that supports MCP. Gemini画图 M

  • Overture by SixHq · ⭐ 592

    Overture is an open-source, locally running web interface delivered as an MCP (Model Context Protocol) server

  • Ori-Mnemos by aayoawoyemi · ⭐ 311

    Local-first persistent agentic memory powered by Recursive Memory Harness (RMH). Open source must win.

  • maclocal-api by scouzi1966 · ⭐ 302

    'afm' command cli: macOS server and single prompt mode that exposes Apple's Foundation and MLX Models and othe

  • persistent-ai-memory by savantskie · ⭐ 226

    A persistent local memory for AI, LLMs, or Copilot in VS Code.

More MCP Server Tools

Explore other popular mcp server tools:

View all MCP Server tools →

Popular Python Agent Tools

Frequently Asked Questions

What is vmlx?

vmlx is vMLX - JANGTQ Uber Compressed MLX Models - L2 Disk Cache (survives restart) + L1 Paged (super fast ttft) + Hybrid SSM Scheduler + Cont Batching + etc!. It is categorized as a MCP Server with 701 GitHub stars.

What programming language is vmlx written in?

vmlx is primarily written in Python. It covers topics such as anthropic-api, kvcache-compression, kvcache-optimization.

How do I install or use vmlx?

You can find installation instructions and usage details in the vmlx GitHub repository at github.com/jjang-ai/vmlx. The project has 701 stars and 73 forks, indicating an active community.

What license does vmlx use?

vmlx is released under the Apache-2.0 license, making it free to use and modify according to the license terms.

What are the best alternatives to vmlx?

The top alternatives to vmlx on Agent Skills Hub include mlx-serve, gemini-skill, Overture. Each offers a different approach to the same problem space — compare them side-by-side by stars, quality score, and community activity.

View on GitHub → Browse MCP Server tools