by allenai · Agent Tool · ★ 108
WildGuard: Open One-stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
Authors: Seungju Han ⭐, Kavel Rao ⭐, Allyson Ettinger ☀️, Liwei Jiang ☀️, Yuchen Lin, Nathan Lambert, Yejin Choi, Nouha Dziri
⭐ Co-first authors, ☀️ co-second authors
🌟 WildGuard will appear at NeurIPS 2024 Datasets & Benchmarks! 🌟
WildGuard is a safety classification model for user-model chat exchanges. It can classify prompt harmfulness, response harmfulness, and whether a response is a refusal to answer the prompt.
Please see our companion repository Safety-Eval for the details of evaluations run in the WildGuard paper.
Installation
Quick Start
    from wildguard import load_wildguard
    if __name__ == '__main__':
        # Load the
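The three classification tasks described above (prompt harmfulness, response harmfulness, and response refusal) could feed a simple moderation gate. The sketch below is a minimal illustration, not the wildguard API: the `ModerationResult` shape, its field names, and the `decide_action` helper are hypothetical, and the real library's output format may differ.

```python
from typing import Optional, TypedDict


class ModerationResult(TypedDict):
    """Hypothetical result shape (an assumption, not wildguard's real output)."""
    prompt_harmful: bool
    response_harmful: Optional[bool]  # None when there is no response to judge
    response_refusal: Optional[bool]  # None when there is no response to judge


def decide_action(result: ModerationResult) -> str:
    """Map the three classification outcomes to one of: block, review, allow."""
    # A harmful response is blocked regardless of the prompt label.
    if result["response_harmful"]:
        return "block"
    # A harmful prompt that the model already refused needs no further action.
    if result["prompt_harmful"] and result["response_refusal"]:
        return "allow"
    # A harmful prompt that was answered (or has no response yet) gets a human look.
    if result["prompt_harmful"]:
        return "review"
    return "allow"


if __name__ == "__main__":
    example: ModerationResult = {
        "prompt_harmful": True,
        "response_harmful": False,
        "response_refusal": True,
    }
    print(decide_action(example))
```

Keeping the decision logic separate from the classifier call means the policy (what to block, what to review) can change without touching the model-loading code.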
| Field | Value |
| --- | --- |
| Stars | 108 |
| Forks | 11 |
| Language | Python |
| Category | Agent Tool |
| Quality Score | 67.5/100 |
| Open Issues | 3 |
| Last Updated | 2024-12-02 |
| Created | 2024-06-13 |
| Platforms | python |
| Est. Tokens | ~7k |
Looking for a wildguard alternative? If you're comparing wildguard with other agent tools, these 5 projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.
Lightweight registry to discover, install, and manage all public Claude plugins and agent skills for your favo
Claude Code Skills for software engineering workflows - Git automation, testing, and code review
A Claude Code skill that turns PDFs, docs, and codebases into Obsidian study vaults
86 product management skills from Lenny's Podcast for Claude Code and AI agents. Hiring, user research, strate
Power rename/refactor tool (now with agent skill support!)
wildguard provides open one-stop moderation tools for safety risks, jailbreaks, and refusals of LLMs. It is categorized as an Agent Tool and has 108 GitHub stars.
wildguard is primarily written in Python.
You can find installation instructions and usage details in the wildguard GitHub repository at github.com/allenai/wildguard. The project has 108 stars and 11 forks.
The top alternatives to wildguard on Agent Skills Hub include claude-plugins, claude-skills-marketplace, and tutor-skills. Each takes a different approach; compare them side by side by stars, quality score, and community activity.