by am-will · Agent Tool · ★ 53
Indexed by AgentSkillsHub · Auto-synced every 8h
Snag

Beta - This tool is in active development. If you encounter any issues, please report them.

Screenshot-to-text CLI tool powered by vision AI (Google Gemini, OpenRouter, or Z.AI). Capture any region of your screen and instantly get a markdown description in your clipboard - ready to paste into an LLM, document, or anywhere else.

Features
- Region selection - Click and drag to capture any part of your screen
- Multi-monitor support - Works across all your displays
- Smart transcription - Handles text, code, diagrams, charts, UI elements, and images
- Instant clipboard - Results copied automatically, ready to paste
- Multiple providers - Google Gemini, OpenRouter, or Z.AI (GLM-4.6V)
- Cross-platform - Linux (X11/Wayland), Windows, macOS

Installation

Linux Dependencies
X11:
Wayland:

macOS Permissions
On first run, macOS will prompt for Screen Recording permissions:
- Grant permission when prompted
- Restart the app (required for permissions to take effect)
On second ru
| Field | Value |
| --- | --- |
| Stars | 53 |
| Forks | 7 |
| Language | Python |
| Category | Agent Tool |
| License | MIT |
| Quality Score | 71.4/100 |
| Last Updated | 2026-01-11 |
| Created | 2026-01-09 |
| Platforms | cli, gemini, python |
| Est. Tokens | ~7k |
snag is a screenshot-to-text CLI tool powered by vision AI (Google Gemini, OpenRouter, or Z.AI). It is categorized as an Agent Tool and has 53 GitHub stars.
snag is primarily written in Python.
You can find installation instructions and usage details in the snag GitHub repository at github.com/am-will/snag. The project has 53 stars and 7 forks.
snag is released under the MIT license, making it free to use and modify according to the license terms.