AI meeting stack — real-time transcription, automatic minutes, action item extraction, multilingual captions, audio capture, Lark/Feishu/Zoom integration. Every standup, sales call, podcast covered.
Meeting and transcription tools are AI-powered software that help developers and teams capture, transcribe, and summarize meetings and other audio more efficiently. They are typically published as open-source projects on GitHub and can be integrated into existing workflows via MCP (Model Context Protocol), Claude Skills, or standalone agent frameworks. Agent Skills Hub indexes 25 quality-scored tools in this category, written in languages including Rust, TypeScript, and Python.
In 2026, the AI agent ecosystem is maturing rapidly. Meeting and transcription tools can boost efficiency by automating repetitive tasks such as note-taking and action-item extraction, reducing human error, and providing intelligent suggestions. The 25 listed tools average about 1,452 GitHub stars, a figure pulled up by FunASR at 16.6k; the three featured tools (minutes, project-raven, call.md) have roughly 1.2k, 415, and 316 stars respectively. 22 of the listed tools declare an open-source license; the other three show none in the table, so check the repository before using or modifying them.
When choosing a meeting or transcription tool, consider these factors: 1) Community activity: GitHub stars and recent commit frequency indicate reliability; 2) Integration method: check whether it supports MCP, Claude, or your preferred agent framework; 3) Language compatibility: the most common language in this list is Python (13 of 25 tools), followed by TypeScript (6) and Rust (3); 4) Quality score: Agent Skills Hub's composite score evaluates code quality, documentation completeness, and maintenance activity. Our recommendation: start with minutes, the top-ranked meeting-focused tool on this list. It does not lead on raw numbers, though: FunASR has the most stars (16.6k) and FunClip the highest quality score (53).
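The selection factors above can be sketched as a small filter-and-rank step. This is only an illustration: the `Tool` record and the `shortlist` function are hypothetical names, the rows are a handful copied from the comparison table on this page (star counts as rounded there), and the ranking key (score first, then stars) is an assumption rather than Agent Skills Hub's actual formula.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tool:
    name: str
    stars: int                 # approximate, from the rounded table values
    language: Optional[str]    # None when the table shows no language
    license: Optional[str]     # None when the table shows no license
    score: int                 # Agent Skills Hub quality score


# A few rows copied from the comparison table on this page.
TOOLS = [
    Tool("minutes", 1200, "Rust", "MIT", 44),
    Tool("project-raven", 415, "TypeScript", "MIT", 48),
    Tool("call.md", 316, "TypeScript", None, 38),
    Tool("FunASR", 16600, "Python", "MIT", 48),
    Tool("FunClip", 5600, "Python", "MIT", 53),
    Tool("mcp-server-whisper", 52, "Python", "MIT", 42),
]


def shortlist(
    tools: List[Tool],
    language: Optional[str] = None,
    require_license: bool = True,
    min_score: int = 0,
) -> List[Tool]:
    """Keep tools matching the language, license, and score filters,
    then rank by quality score and break ties with star count."""
    picked = [
        t for t in tools
        if (language is None or t.language == language)
        and (not require_license or t.license is not None)
        and t.score >= min_score
    ]
    return sorted(picked, key=lambda t: (t.score, t.stars), reverse=True)


python_picks = [t.name for t in shortlist(TOOLS, language="Python", min_score=40)]
print(python_picks)  # ['FunClip', 'FunASR', 'mcp-server-whisper']
```

Filtering on your team's language first and ranking second mirrors the advice to match your existing stack before comparing popularity.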
Every meeting, every idea, every voice note — searchable by your AI. Open-source, privacy-first conversation memory layer.
Open-source AI meeting copilot - real-time transcription, echo cancellation, and AI assistance. Captures system audio + mic, cancels echo via WebRTC AEC3, transcribes with Deepgram, and gives you Claude/OpenAI help during meetings. Runs locally on macOS and Windows.
Turn meetings into live agent loops. Record, transcribe, and analyze meetings with real-time AI intelligence — before, during, and after calls.
System audio capture + multi-provider ASR + local-first AI review workspace. Floating live captions, 12 ASR backends, 60+ languages, AI summary/chat/mindmap, Open API, MCP server, and Agent Skill.
Speech-to-text input for Claude Code with live streaming dictation
Apple Podcasts transcription with OpenAI's Whisper
Input0 — A macOS voice input tool: hold a hotkey to record, release to transcribe locally via STT, refine with LLM, and auto-paste into the active text field.
AI agent skill: read Lark meeting transcripts, extract action items, and actually get them done
Industrial-grade speech recognition toolkit: 170x realtime, 50+ languages, speaker diarization, emotion detection, streaming, and OpenAI-compatible API.
🎙️ AI Dictation App - Open Source and Local-first ⚡ Type 3x faster, no keyboard needed. 🆓 Powered by open source models, works offline, fast and accurate.
Voice-to-text dictation app with local (Nvidia Parakeet/Whisper) and cloud models (BYOK). Privacy-first and available cross-platform.
hns is a speech-to-text CLI tool to transcribe your voice from your microphone directly to clipboard. Integrate hns with Claude Code, Ollama, LLM, and more CLI tools for powerful workflows.
AgentCall lets AI Agents join meetings with voice, video & screen-share to build together. Supports Google Meet, Teams, Zoom (Beta)
OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.
Voice-driven writing, input, and cross-app work for your desktop.
Open-source, accurate and easy-to-use video speech recognition & clipping tool. LLM-based AI clipping integrated.
SirChatalot is a Telegram bot leveraging ChatGPT, Claude or YandexGPT. It uses Whisper for speech-to-text and DALL-E, Stability AI or YandexART for image creation. It can use vision capabilities, tools and semantic search in vector DB.
An offline AI-powered video analysis tool with object detection (YOLO), image captioning (BLIP), speech transcription (Whisper), audio event detection (PANNs), and AI-generated summaries (LLMs via Ollama). It ensures privacy and offline use with a user-friendly GUI.
13 Claude Code skills for video production (transcribe / translate / dub / multicam / subtitles / reframe) + WeChat publishing. Compatible with Claude Code, OpenAI Codex CLI, Cursor, Gemini.
An MCP Server for audio transcription using OpenAI
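Enterprise-grade Java management platform for Xiaozhi ESP32, providing an integrated front-end, back-end, and server solution for device monitoring, voice customization, role switching, and conversation-record management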
Give Claude the ability to watch and understand videos — Claude Code plugin with frame extraction and multimodal audio analysis
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.
| Tool | Stars | Language | License | Score |
|---|---|---|---|---|
| minutes | ★ 1.2k | Rust | MIT | 44 |
| project-raven | ★ 415 | TypeScript | MIT | 48 |
| call.md | ★ 316 | TypeScript | — | 38 |
| DeLive | ★ 174 | TypeScript | Apache-2.0 | 37 |
| claude-stt | ★ 363 | Python | MIT | 39 |
| whisper-subtitles | ★ 348 | Jupyter Notebook | MIT | 29 |
| input0 | ★ 260 | Rust | — | 35 |
| lark-minutes-tasks | ★ 56 | — | MIT | 31 |
| FunASR | ★ 16.6k | Python | MIT | 48 |
| amical | ★ 1.3k | TypeScript | MIT | 39 |
| openwhispr | ★ 3.5k | TypeScript | MIT | 44 |
| hns | ★ 87 | Python | MIT | 33 |
| agentcall | ★ 72 | Python | MIT | 44 |
| vllm-mlx | ★ 1.3k | Python | Apache-2.0 | 47 |
| AriaType | ★ 72 | Rust | AGPL-3.0 | 34 |
| FunClip | ★ 5.6k | Python | MIT | 53 |
| voicemode | ★ 1.2k | Python | MIT | 41 |
| SirChatalot | ★ 72 | Python | GPL-3.0 | 31 |
| ai-powered-video-analyzer | ★ 58 | Python | — | 36 |
| claude-skills | ★ 64 | Python | MIT | 40 |
| mcp-server-whisper | ★ 52 | Python | MIT | 42 |
| xiaozhi-esp32-server-java | ★ 1.3k | Java | MIT | 43 |
| claude-video-vision | ★ 640 | TypeScript | MIT | 46 |
| mlx-omni-server | ★ 714 | Python | MIT | 40 |
| bolna | ★ 657 | Python | MIT | 39 |
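For readers who want to check summary figures against the table, here is a minimal sketch that parses the displayed star counts and tallies languages. The helper name `parse_stars` is hypothetical, and because the table rounds large counts (for example `1.2k`), the averages it prints are approximations of the live data, not exact values.

```python
from collections import Counter

# (tool, stars as displayed, language) copied from the table above.
# None marks a cell the table shows as a dash.
ROWS = [
    ("minutes", "1.2k", "Rust"),
    ("project-raven", "415", "TypeScript"),
    ("call.md", "316", "TypeScript"),
    ("DeLive", "174", "TypeScript"),
    ("claude-stt", "363", "Python"),
    ("whisper-subtitles", "348", "Jupyter Notebook"),
    ("input0", "260", "Rust"),
    ("lark-minutes-tasks", "56", None),
    ("FunASR", "16.6k", "Python"),
    ("amical", "1.3k", "TypeScript"),
    ("openwhispr", "3.5k", "TypeScript"),
    ("hns", "87", "Python"),
    ("agentcall", "72", "Python"),
    ("vllm-mlx", "1.3k", "Python"),
    ("AriaType", "72", "Rust"),
    ("FunClip", "5.6k", "Python"),
    ("voicemode", "1.2k", "Python"),
    ("SirChatalot", "72", "Python"),
    ("ai-powered-video-analyzer", "58", "Python"),
    ("claude-skills", "64", "Python"),
    ("mcp-server-whisper", "52", "Python"),
    ("xiaozhi-esp32-server-java", "1.3k", "Java"),
    ("claude-video-vision", "640", "TypeScript"),
    ("mlx-omni-server", "714", "Python"),
    ("bolna", "657", "Python"),
]


def parse_stars(text: str) -> int:
    """Convert a display value such as '1.2k' or '415' to an approximate integer."""
    text = text.strip().lower()
    if text.endswith("k"):
        return round(float(text[:-1]) * 1000)
    return int(text)


stars = {name: parse_stars(shown) for name, shown, _ in ROWS}
mean_all = sum(stars.values()) / len(stars)
mean_first_three = sum(list(stars.values())[:3]) / 3
languages = Counter(lang for _, _, lang in ROWS)

print(f"mean stars, all {len(stars)} tools: {mean_all:.0f}")      # 1457
print(f"mean stars, first three rows: {mean_first_three:.0f}")   # 644
print(languages.most_common(3))  # [('Python', 13), ('TypeScript', 6), ('Rust', 3)]
```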
The top meeting and transcription tools in 2026 are minutes, project-raven, and call.md. Agent Skills Hub ranks 25 options by GitHub stars, quality score (6 dimensions including completeness, examples, and agent readiness), and recent activity. The list is rebuilt every 8 hours from live GitHub data.
minutes (1.2k stars, written in Rust) is the top-ranked choice for general meeting and transcription workflows. project-raven (415 stars) is a strong alternative written in TypeScript. Pick by your existing stack: matching the language and runtime your team already uses minimizes integration cost. If unsure, start with minutes, which has the most stars among the meeting-focused tools listed here.
Avoid pre-built meeting and transcription tools when (1) your use case requires deep customization that the tool's plugin system doesn't support, (2) you have strict compliance requirements that ban third-party dependencies, (3) the tool's maintenance is inactive (last commit more than 6 months ago), or (4) your data volume is small enough that a 50-line custom script is cheaper than learning the tool. For most production workflows above 100 requests/day, the time savings from a maintained tool outweigh the customization loss.
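The four conditions above can be written down as a simple checklist function. This is a rough sketch under stated assumptions: `Situation` and `reasons_to_build_your_own` are hypothetical names, "6 months" is approximated as 183 days, and treating anything at or below 100 requests/day as "small volume" is an interpretation of the paragraph rather than a rule it states.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Thresholds taken from the paragraph above; how they are encoded is an assumption.
INACTIVE_AFTER = timedelta(days=183)   # "last commit more than 6 months ago"
HIGH_VOLUME_REQUESTS_PER_DAY = 100     # "above 100 requests/day"


@dataclass
class Situation:
    needs_unsupported_customization: bool
    bans_third_party_dependencies: bool
    last_commit: date          # last commit date of the candidate tool
    requests_per_day: int      # expected workload


def reasons_to_build_your_own(s: Situation, today: date) -> List[str]:
    """Return the conditions from the checklist that apply; an empty list
    suggests adopting the pre-built tool."""
    reasons = []
    if s.needs_unsupported_customization:
        reasons.append("customization beyond the plugin system")
    if s.bans_third_party_dependencies:
        reasons.append("compliance bans third-party dependencies")
    if today - s.last_commit > INACTIVE_AFTER:
        reasons.append("tool looks unmaintained")
    if s.requests_per_day <= HIGH_VOLUME_REQUESTS_PER_DAY:
        reasons.append("volume small enough for a short custom script")
    return reasons


example = Situation(
    needs_unsupported_customization=False,
    bans_third_party_dependencies=False,
    last_commit=date(2026, 1, 10),
    requests_per_day=500,
)
print(reasons_to_build_your_own(example, today=date(2026, 3, 1)))  # []
```

Returning the list of reasons rather than a single yes/no keeps the trade-off visible, since one weak reason (for example, low volume) may not justify a custom build on its own.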
Meeting Tools & Transcription focuses specifically on the AI meeting stack: real-time transcription, automatic minutes, action item extraction, multilingual captions, audio capture, and Lark/Feishu/Zoom integration, covering standups, sales calls, and podcasts. Voice Agents is a related but distinct category; see https://agentskillshub.top/best/voice-agent/ for those tools. The two often appear in the same agent pipeline but solve different problems: choose Meeting Tools & Transcription when your primary goal is capturing, transcribing, or summarizing speech, and Voice Agents when the workflow is broader.
For most teams, adopting an existing tool is better than building your own. minutes has 1.2k stars' worth of community testing, handles edge cases you haven't thought of, and ships with documentation. Build your own only when (1) your requirements are deeply non-standard, (2) you have a security or compliance reason to avoid OSS dependencies, or (3) the maintenance burden is small enough (under 200 lines of code) that you'll save time long-term. The break-even point is usually around 2-3 weeks of dev time saved.
Most of the meeting and transcription tools listed are open source under permissive licenses (MIT, Apache 2.0). A handful offer paid managed or cloud versions on top of a free self-hosted core. Always check the LICENSE file in each tool's GitHub repository before commercial use: some use copyleft licenses (in the table, AriaType is AGPL-3.0 and SirChatalot is GPL-3.0) or non-commercial restrictions that may not fit your deployment model, and three entries list no license at all.
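As a first-pass triage before reading each LICENSE file, the license column can be bucketed as below. This is a sketch only: `license_bucket` is a hypothetical helper, the permissive/copyleft sets cover just the identifiers that appear in the table on this page, and the tally was counted by hand from that table. It is not legal advice and does not replace reading the actual license text.

```python
from collections import Counter
from typing import Optional

# Only the license identifiers that appear in the table on this page.
PERMISSIVE = {"MIT", "Apache-2.0"}
COPYLEFT = {"GPL-3.0", "AGPL-3.0"}


def license_bucket(spdx_id: Optional[str]) -> str:
    """Classify a license identifier for a quick first-pass review."""
    if spdx_id is None:
        return "unspecified"       # table shows a dash: check the repository
    if spdx_id in PERMISSIVE:
        return "permissive"
    if spdx_id in COPYLEFT:
        return "copyleft"
    return "review manually"       # anything this sketch does not recognise


# License column of the table, tallied by hand; None marks a dash cell.
TABLE_LICENSES = (
    ["MIT"] * 18 + ["Apache-2.0"] * 2 + ["AGPL-3.0", "GPL-3.0"] + [None] * 3
)

buckets = Counter(license_bucket(x) for x in TABLE_LICENSES)
print(dict(buckets))  # {'permissive': 20, 'copyleft': 2, 'unspecified': 3}
```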