VisualAgentBench — Agent Tool by THUDM

by THUDM · Agent Tool · ★ 256

Indexed by AgentSkillsHub · Auto-synced every 8h

About VisualAgentBench

VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents

VisualAgentBench (VAB) is the first benchmark designed to systematically evaluate and develop large multimodal models (LMMs) as visual foundation agents. It comprises 5 distinct environments across 3 types of representative visual agent tasks (Embodied, GUI, and Visual Design):

  • VAB-OmniGibson (Embodied)
  • VAB-Minecraft (Embodied)
  • VAB-Mobile (GUI)
  • VAB-WebArena-Lite (GUI, based on WebArena and VisualWebArena)
  • VAB-CSS (Visual Design)

https://github.com/user-attachments/assets/4a1a5980-48f9-4a70-a900-e5f58ded69b4

Compared to its predecessor AgentBench, VAB highlights visual inputs and enables the development of foundation agent capabilities by training open LLMs/LMMs on trajectories.

Topics: gpt, llm-agent, multimodal-large-language-models

Quick Facts

Stars: 256
Forks: 9
Language: Python
Category: Agent Tool
License: Apache-2.0
Quality Score: 39.2/100
Open Issues: 16
Last Updated: 2025-04-24
Created: 2024-08-08
Platforms: python
Est. Tokens: ~378k

Compatible Skills

These tools work well together with VisualAgentBench for enhanced workflows:

  • multimind-sdk — semantic(0.31)+complementary+rare_topics+same_lang+similar_pop+shared_platform (60%)
  • MLLM-Tool — semantic(0.34)+complementary+same_lang+similar_pop+shared_platform (57%)
  • OpenAdapt — semantic(0.34)+complementary+same_lang+similar_pop+shared_platform (57%)
  • SimplerLLM — semantic(0.30)+complementary+same_lang+similar_pop+shared_platform (56%)
  • multimodal-agents-course — semantic(0.27)+complementary+same_lang+similar_pop+shared_platform (54%)

VisualAgentBench alternative? Top 6 similar tools

Looking for a VisualAgentBench alternative? If you're comparing VisualAgentBench with other agent tools, these 6 projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.

  • tribe by StreetLamb · ⭐ 1.1k

    Low code tool to rapidly build and coordinate multi-agent teams

  • awesome-openclaw by SamurAIGPT · ⭐ 957

    A curated list of OpenClaw resources, tools, skills, tutorials & articles. OpenClaw (formerly Moltbot / Clawdb

  • claude-delegator by jarrodwatts · ⭐ 893

    Delegate tasks to Codex and Gemini directly from within Claude Code.

  • wcgw by rusiaaman · ⭐ 655

    Shell and coding agent on mcp clients

  • Deep-Research-skills by Weizhena · ⭐ 640

    Structured deep research skill for Claude Code/Open Code/Codex with human-in-the-loop control

  • llama-cpp-agent by Maximilian-Winter · ⭐ 620

    The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allow

Frequently Asked Questions

What is VisualAgentBench?

VisualAgentBench is a benchmark for evaluating and developing large multimodal models (LMMs) as visual foundation agents. It is categorized as an Agent Tool and has 256 GitHub stars.

What programming language is VisualAgentBench written in?

VisualAgentBench is primarily written in Python. It covers topics such as gpt, llm-agent, multimodal-large-language-models.

How do I install or use VisualAgentBench?

You can find installation instructions and usage details in the VisualAgentBench GitHub repository at github.com/THUDM/VisualAgentBench. The project has 256 stars, 9 forks, and 16 open issues.

What license does VisualAgentBench use?

VisualAgentBench is released under the Apache-2.0 license, making it free to use and modify according to the license terms.

What are the best alternatives to VisualAgentBench?

The top alternatives to VisualAgentBench on Agent Skills Hub include tribe, awesome-openclaw, and claude-delegator. Each takes a different approach to building or working with agents; compare them side by side by stars, quality score, and community activity.
