by waybarrios · MCP Server · ★ 1.1k
Indexed by AgentSkillsHub · Auto-synced every 8h
vLLM-MLX: vLLM-like inference for Apple Silicon, with GPU-accelerated text, image, video, and audio on Mac.

**Overview.** vllm-mlx brings native Apple Silicon GPU acceleration to vLLM by integrating:

- MLX: Apple's ML framework with unified memory and Metal kernels
- mlx-lm: optimized LLM inference with KV cache and quantization
- mlx-vlm: vision-language models for multimodal inference
- mlx-audio: speech-to-text and text-to-speech with native voices
- mlx-embeddings: text embeddings for semantic search and RAG

**Features.**

- Multimodal: text, image, video, and audio in one platform
- Native GPU acceleration on Apple Silicon (M1, M2, M3, M4)
- Native TTS voices: Spanish, French, Chinese, and Japanese, plus five more languages
- OpenAI API compatible: a drop-in replacement for the OpenAI client
- Anthropic Messages API: native `/v1/messages` endpoint
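Because the server advertises OpenAI API compatibility, the standard OpenAI Python client should work against it unchanged. The following is a minimal sketch under stated assumptions: the local port (8000), the placeholder API key, and the model id are illustrative guesses, not values confirmed by the project's documentation.

```python
# Minimal sketch: querying a locally running vllm-mlx server through the
# official OpenAI Python client. The base_url, port, and model id below
# are assumptions for illustration, not taken from the project docs.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local endpoint
    api_key="not-needed",                 # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="mlx-community/Qwen2.5-7B-Instruct-4bit",  # hypothetical MLX model id
    messages=[{"role": "user", "content": "Explain MLX in one sentence."}],
)
print(response.choices[0].message.content)
```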
| Field | Value |
| --- | --- |
| Stars | 1,075 |
| Forks | 154 |
| Language | Python |
| Category | MCP Server |
| License | Apache-2.0 |
| Quality Score | 52.57 / 100 |
| Open Issues | 41 |
| Last Updated | 2026-05-02 |
| Created | 2025-12-06 |
| Platforms | claude-code, mcp, python |
| Est. Tokens | ~782k |
Looking for a vllm-mlx alternative? If you're comparing vllm-mlx with other MCP server tools, these six projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction:
- Run Claude Code 100% on-device with local AI on Apple Silicon. MLX-native Anthropic-API server, 65 tok/s Qwen
- Supercharge Claude Code with 11 AI agents, 36 commands & 15 skills — the claude-code plugin framework inspired
- The self-hosted AI gateway for production RAG across LLMs, databases, APIs, and files.
- Own your AI. The native macOS harness for AI agents -- any model, persistent memory, autonomous execution, cry
- Voice-to-text dictation app with local (Nvidia Parakeet/Whisper) and cloud models (BYOK). Privacy-first and av
- Communicate with an LLM provider using a single interface
vllm-mlx is an OpenAI- and Anthropic-compatible inference server for Apple Silicon. It runs LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support on a native MLX backend. It is categorized as an MCP Server with 1.1k GitHub stars.
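As a sketch of the multimodal path, the OpenAI client's standard vision message format can carry an image alongside text in a single chat request. Again, the endpoint, model id, and file name here are assumptions for illustration, not values confirmed by the project.

```python
# Minimal multimodal sketch: sending an image to an assumed vision-language
# model through the OpenAI-compatible chat endpoint. All identifiers here
# (port, model id, file name) are hypothetical.
import base64

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

# Encode a local image as a base64 data URL, the portable way to inline
# image bytes in an OpenAI-style chat request.
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="mlx-community/Qwen2-VL-7B-Instruct-4bit",  # hypothetical model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in one sentence."},
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
            },
        ],
    }],
)
print(response.choices[0].message.content)
```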
vllm-mlx is primarily written in Python. It covers topics such as anthropic, apple-silicon, and audio-processing.
You can find installation instructions and usage details in the vllm-mlx GitHub repository at github.com/waybarrios/vllm-mlx. The project has 1.1k stars and 154 forks, indicating an active community.
vllm-mlx is released under the Apache-2.0 license, making it free to use and modify according to the license terms.
The top alternatives to vllm-mlx on Agent Skills Hub include claude-code-local, claude-forge, and orbit. Each offers a different approach to the same problem space; compare them side by side by stars, quality score, and community activity.