vllm-mlx — MCP Server by waybarrios

by waybarrios · MCP Server · ★ 1.1k

Indexed by AgentSkillsHub · Auto-synced every 8h

About vllm-mlx

vLLM-MLX: vLLM-like inference for Apple Silicon, with GPU-accelerated text, image, video, and audio on Mac.

Overview

vllm-mlx brings native Apple Silicon GPU acceleration to vLLM by integrating:

  • MLX: Apple's ML framework with unified memory and Metal kernels
  • mlx-lm: optimized LLM inference with KV cache and quantization
  • mlx-vlm: vision-language models for multimodal inference
  • mlx-audio: speech-to-text and text-to-speech with native voices
  • mlx-embeddings: text embeddings for semantic search and RAG

Features

  • Multimodal: text, image, video, and audio in one platform
  • Native GPU acceleration on Apple Silicon (M1, M2, M3, M4)
  • Native TTS voices: Spanish, French, Chinese, Japanese + 5 more languages
  • OpenAI API compatible: drop-in replacement for the OpenAI client
  • Anthropic Messages API: native /v1/me…
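Since the project advertises OpenAI API compatibility, a standard chat-completions request body should work against it. A minimal sketch of that payload shape follows; the model id is a placeholder for illustration and is not taken from the project's docs:

```python
import json

def build_chat_request(model: str, prompt: str) -> dict:
    """Build a chat-completions body in the standard OpenAI REST shape.

    An OpenAI-compatible server is expected to accept this payload on
    POST /v1/chat/completions; the model id used below is a placeholder.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

body = build_chat_request("mlx-community/Llama-3.2-3B-Instruct-4bit", "Hello!")
print(json.dumps(body, indent=2))
```

With the official `openai` Python client, the same request is typically made by constructing the client with a `base_url` pointing at the local server instead of api.openai.com.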

Tags: anthropic, apple-silicon, audio-processing, claude-code, computer-vision, image-understanding, inference, llm, machine-learning, macos

Quick Facts

Stars: 1,075
Forks: 154
Language: Python
Category: MCP Server
License: Apache-2.0
Quality Score: 52.57/100
Open Issues: 41
Last Updated: 2026-05-02
Created: 2025-12-06
Platforms: claude-code, mcp, python
Est. Tokens: ~782k

Compatible Skills

These tools work well together with vllm-mlx for enhanced workflows:

  • mlx-omni-server — 78% match (semantic 0.33; complementary; shared framework: openai; rare topics; same language; similar popularity; shared platform)
  • Toolio — 64% match (semantic 0.29; complementary; rare topics; same language; similar popularity; shared platform)
  • claude-stt — 62% match (semantic 0.24; complementary; rare topics; same language; similar popularity; shared platform)
  • mlx-llm — 62% match (semantic 0.36; complementary; rare topics; same language; similar popularity; shared platform)
  • PyVision — 62% match (semantic 0.23; complementary; rare topics; same language; similar popularity; shared platform)

vllm-mlx alternative? Top 6 similar tools

Looking for a vllm-mlx alternative? If you're comparing vllm-mlx with other MCP server tools, these 6 projects are the closest alternatives on Agent Skills Hub — ranked by topic overlap, star count, and community traction.

  • claude-code-local by nicedreamzapp · ⭐ 2.4k

    Run Claude Code 100% on-device with local AI on Apple Silicon. MLX-native Anthropic-API server, 65 tok/s Qwen…

  • claude-forge by sangrokjung · ⭐ 681

    Supercharge Claude Code with 11 AI agents, 36 commands & 15 skills — the claude-code plugin framework inspired…

  • orbit by schmitech · ⭐ 253

    The self-hosted AI gateway for production RAG across LLMs, databases, APIs, and files.

  • osaurus by osaurus-ai · ⭐ 5.2k

    Own your AI. The native macOS harness for AI agents -- any model, persistent memory, autonomous execution, cry…

  • openwhispr by OpenWhispr · ⭐ 2.9k

    Voice-to-text dictation app with local (Nvidia Parakeet/Whisper) and cloud models (BYOK). Privacy-first and av…

  • any-llm by mozilla-ai · ⭐ 1.9k

    Communicate with an LLM provider using a single interface


Frequently Asked Questions

What is vllm-mlx?

vllm-mlx is an OpenAI- and Anthropic-compatible server for Apple Silicon. It runs LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX bac… It is categorized as an MCP Server with 1.1k GitHub stars.
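The answer above mentions Anthropic compatibility alongside OpenAI. For orientation, the Anthropic Messages API expects a slightly different body than OpenAI chat completions — notably, `max_tokens` is required. A sketch of that standard shape, with a placeholder model id (not from the project's docs):

```python
import json

def build_messages_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build a request body in the Anthropic Messages API shape.

    Unlike OpenAI chat completions, max_tokens is a required field.
    The model id passed in below is a placeholder for illustration.
    """
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

body = build_messages_request("local-mlx-model", "Hello!")
print(json.dumps(body, indent=2))
```

A client that already targets Anthropic's `/v1/messages` endpoint would send this body unchanged to a compatible local server.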

What programming language is vllm-mlx written in?

vllm-mlx is primarily written in Python. It covers topics such as anthropic, apple-silicon, and audio-processing.

How do I install or use vllm-mlx?

You can find installation instructions and usage details in the vllm-mlx GitHub repository at github.com/waybarrios/vllm-mlx. The project has 1.1k stars and 154 forks, indicating an active community.
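Beyond the README, a quick way to verify a locally running OpenAI-compatible server is to query the standard `/v1/models` endpoint. A standard-library-only sketch; the host and port are assumptions, so adjust them to your server's settings:

```python
import json
import urllib.request

# Host and port are assumptions -- adjust to match your local server.
BASE_URL = "http://localhost:8000/v1"

def list_models(base_url: str = BASE_URL, timeout: float = 5.0) -> list:
    """Return model ids reported by an OpenAI-compatible /v1/models endpoint."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        payload = json.load(resp)
    return [m["id"] for m in payload.get("data", [])]

if __name__ == "__main__":
    # Requires the server to be running locally.
    print(list_models())
```

If this returns a non-empty list, the server is up and the chat endpoints should be reachable at the same base URL.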

What license does vllm-mlx use?

vllm-mlx is released under the Apache-2.0 license, making it free to use and modify according to the license terms.

What are the best alternatives to vllm-mlx?

The top alternatives to vllm-mlx on Agent Skills Hub include claude-code-local, claude-forge, and orbit. Each offers a different approach to the same problem space — compare them side-by-side by stars, quality score, and community activity.

View on GitHub → · Browse MCP Server tools →