wildguard — Agent Tool by allenai

by allenai · Agent Tool · ★ 108

About wildguard

WildGuard: Open One-stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

Authors: Seungju Han ⭐, Kavel Rao ⭐, Allyson Ettinger ☀️, Liwei Jiang ☀️, Yuchen Lin, Nathan Lambert, Yejin Choi, Nouha Dziri

⭐ Co-first authors, ☀️ co-second authors

🌟 WildGuard will appear at NeurIPS 2024 Datasets & Benchmarks! 🌟

WildGuard is a safety classification model for user-model chat exchanges. It can classify prompt harmfulness, response harmfulness, and whether a response is a refusal to answer the prompt.

Please see our companion repository Safety-Eval for the details of evaluations run in the WildGuard paper.

Installation

Quick Start

from wildguard import load_wildguard
if __name__ == '__main__':
    # Load the

Quick Facts

Stars: 108
Forks: 11
Language: Python
Category: Agent Tool
Quality Score: 67.48/100
Open Issues: 3
Last Updated: 2024-12-02
Created: 2024-06-13
Platforms: python
Est. Tokens: ~7k

wildguard alternatives: top 5 similar tools

If you are comparing wildguard with other agent tools, these 5 projects are the closest alternatives on Agent Skills Hub, ranked by topic overlap, star count, and community traction.

  • claude-plugins by Kamalnrf · ⭐ 522

    Lightweight registry to discover, install, and manage all public Claude plugins and agent skills for your favo

  • claude-skills-marketplace by mhattingpete · ⭐ 442

    Claude Code Skills for software engineering workflows - Git automation, testing, and code review

  • tutor-skills by RoundTable02 · ⭐ 400

    A Claude Code skill that turns PDFs, docs, and codebases into Obsidian study vaults

  • lenny-skills by RefoundAI · ⭐ 382

    86 product management skills from Lenny's Podcast for Claude Code and AI agents. Hiring, user research, strate

  • repren by jlevy · ⭐ 371

    Power rename/refactor tool (now with agent skill support!)

Frequently Asked Questions

What is wildguard?

wildguard is a set of open, one-stop moderation tools for safety risks, jailbreaks, and refusals of LLMs. It is categorized as an Agent Tool and has 108 GitHub stars.

What programming language is wildguard written in?

wildguard is primarily written in Python.

How do I install or use wildguard?

You can find installation instructions and usage details in the wildguard GitHub repository at github.com/allenai/wildguard. The project has 108 stars and 11 forks.
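As a rough illustration of how the three judgments described in the About section (prompt harmfulness, response harmfulness, and refusal) might be consumed downstream, here is a minimal sketch. It does not call wildguard itself: the `ModerationLabels` container, the `route_exchange` function, and the action names are all hypothetical and exist only for this example. The real output format should be taken from the repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModerationLabels:
    """Hypothetical container for the three judgments (not wildguard's own type)."""
    prompt_harmful: bool
    response_harmful: Optional[bool]  # None when no response was classified
    response_refusal: Optional[bool]  # None when no response was classified


def route_exchange(labels: ModerationLabels) -> str:
    """Map the three labels to an illustrative action name (assumed policy)."""
    if labels.response_harmful:
        return "block_response"
    if labels.prompt_harmful and labels.response_refusal:
        return "allow_refusal"  # harmful prompt, model declined
    if labels.prompt_harmful:
        return "flag_for_review"  # harmful prompt, no refusal recorded
    if labels.response_refusal:
        return "log_over_refusal"  # benign prompt was refused
    return "allow"
```

Keeping the three labels separate, rather than collapsing them into one "safe/unsafe" flag, lets a policy like this distinguish a correct refusal from an unnecessary one.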

What are the best alternatives to wildguard?

The top alternatives to wildguard on Agent Skills Hub include claude-plugins, claude-skills-marketplace, and tutor-skills. You can compare them side by side by stars, quality score, and community activity.
