How We Secure 43,000+ AI Agent Tools: A Solo Developer's Security Journey
When you run a directory of 43,000+ AI agent tools, every new repository is a potential attack vector. This is the story of how we went from zero security to a multi-layered defense system — built by one developer, inspired by the open-source community.
The Wake-Up Call
Agent Skills Hub started as a simple idea: index every AI agent tool on GitHub and help developers find the right one. We built the crawler, the scoring engine, the frontend. By March 2026, we had 25,000+ repositories indexed. Life was good.
Then we started noticing things.
A "Claude Code skill" that piped environment variables to an external server via curl. An "MCP server" that silently installed a cron job to download and execute remote scripts. A "productivity tool" whose README contained base64-encoded payloads.
We were indexing these tools. Linking to them. Recommending them in our scenario pages. And we had zero security scanning.
That was the wake-up call.
The Timeline
Five commits over eleven days took us from zero — a handful of regexes for curl|bash, rm -rf, and credential access, graded safe/caution/unsafe — to a multi-layered security system.
Phase 1: The Naive Scanner
The first version was embarrassingly simple. A handful of regex patterns looking for obvious red flags:
```js
// V1: "Is this obviously malicious?"
if (readme.match(/curl.*\|.*bash/)) grade = "unsafe";
if (readme.match(/rm\s+-rf\s+\//)) grade = "unsafe";
if (readme.match(/\.env/)) grade = "caution";
```
It caught the obvious stuff. But it also flagged every legitimate tool that had curl | bash install instructions (which is... most of them). The false positive rate was terrible.
We needed something smarter.
Phase 2: LLM as Second Opinion
The idea was simple: let the regex scanner do the fast first pass, then send flagged repos to an LLM for semantic analysis. The LLM could understand context — a curl | bash installing Homebrew is fine; a curl | bash downloading from a random IP is not.
We built a system prompt that explicitly lists common false positives:
```text
Common FALSE POSITIVES to watch for:
- curl|bash install scripts for well-known package managers
- API key configuration instructions (OPENAI_API_KEY, etc.)
- sudo usage in Docker setup or system package installation
- eval() in legitimate template engines or REPL tools
```
Initially we used the Anthropic API, but switched to MiniMax's OpenAI-compatible endpoint for cost efficiency. The LLM returns structured JSON with grade, confidence score, and specific findings with mitigations.
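The shape of that structured response can be sketched in TypeScript. The field names below are illustrative, not the exact production schema; since LLMs occasionally return malformed JSON, the sketch parses defensively and falls back to the rule engine's grade:

```typescript
// Illustrative shape of the LLM's verdict (assumed field names).
interface Finding {
  category: string;   // one of the red-flag categories
  evidence: string;   // the matched snippet
  mitigation: string; // suggested fix or context
}

interface ScanVerdict {
  grade: "safe" | "caution" | "unsafe";
  confidence: number; // 0..1
  findings: Finding[];
}

// Parse the raw LLM output; on any failure, fall back to the
// rule engine's grade rather than trusting a broken response.
function parseVerdict(raw: string, fallback: ScanVerdict["grade"]): ScanVerdict {
  try {
    const v = JSON.parse(raw) as ScanVerdict;
    if (!["safe", "caution", "unsafe"].includes(v.grade)) throw new Error("bad grade");
    return v;
  } catch {
    return { grade: fallback, confidence: 0, findings: [] };
  }
}
```

Falling back to the regex grade means a flaky LLM response can never silently upgrade a repo to "safe".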
This cut false positives significantly. But the rule engine itself was still too primitive.
Phase 3: The SlowMist Rewrite
Everything changed when we discovered slowmist/slowmist-agent-security (302 stars).
SlowMist is a well-known blockchain security team. Their Agent Security Framework defines 11 categories of red flags specifically for AI agent tools. Not generic security patterns — agent-specific threats like memory theft, agent config exfiltration, and skill supply chain attacks.
We rewrote the entire scanner around their framework.
The 11 Red-Flag Categories
| # | Category | What It Catches | Example Pattern |
|---|---|---|---|
| 1 | Data Exfiltration | Sending local data to external servers | `curl -d $(cat ~/.ssh/id_rsa)` |
| 2 | Credential Harvest | Extracting API keys from environment | `env \| grep -i key` |
| 3 | Sensitive Dir Access | Reading SSH keys, AWS credentials | `cat ~/.aws/credentials` |
| 4 | Agent Memory Theft | Stealing agent memory/identity files | `cat MEMORY.md \| curl` |
| 5 | Dynamic Code Exec | Running obfuscated/dynamic code | `base64 -d \| bash` |
| 6 | Privilege Escalation | Gaining root access | `chmod 777`, `chown root` |
| 7 | Persistence | Surviving reboots via cron/bashrc | `crontab`, `>> ~/.bashrc` |
| 8 | Reverse Shell | Opening backdoor connections | `nc -e /bin/sh` |
| 9 | Destructive | Wiping filesystem | `rm -rf /` |
| 10 | Obfuscation | Hiding malicious intent | hex-encoded payloads, rot13+eval |
| 11 | Supply Chain | Install-and-execute attacks | `npm install x && node x` |
Category #4 — Agent Memory Theft — was the eye-opener. Traditional security scanners don't look for MEMORY.md or SOUL.md exfiltration. But in the AI agent world, these files contain your entire context, preferences, and work history. Stealing them is the agent equivalent of identity theft.
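Internally, each category boils down to regex patterns tagged with a severity. A minimal sketch of how such rules can be encoded — the category names come from the table above, but the type names and this pattern subset are illustrative, not the production rule set:

```typescript
type Severity = "critical" | "high" | "medium";

interface Rule {
  category: string;
  severity: Severity;
  pattern: RegExp;
}

// Illustrative subset — not the real 27-rule table.
const rules: Rule[] = [
  { category: "Data Exfiltration", severity: "critical", pattern: /curl\s+-d\s+\$\(cat\s+~\/\.ssh/ },
  { category: "Agent Memory Theft", severity: "high", pattern: /cat\s+(MEMORY|SOUL)\.md\s*\|\s*curl/ },
  { category: "Dynamic Code Exec", severity: "high", pattern: /base64\s+-d\s*\|\s*(ba)?sh/ },
  { category: "Persistence", severity: "medium", pattern: />>\s*~\/\.bashrc/ },
];

// Pure string matching, no API calls — this is the fast first pass.
function scan(text: string): Rule[] {
  return rules.filter((r) => r.pattern.test(text));
}
```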
The Trust Hierarchy
Not all red flags are equal. A sudo command in an Anthropic official repo is very different from a sudo command in a zero-star repo with no license.
We implemented a 5-tier trust hierarchy that adjusts severity based on the source:
| Tier | Label | Criteria | Effect on Grading |
|---|---|---|---|
| 1 | Official Org | anthropics, openai, google, microsoft, nvidia... | High flags → caution (not unsafe) |
| 2 | Known Security Team | slowmist, trailofbits, openzeppelin... | High flags → caution |
| 3 | High-Star + Licensed | ≥1,000 stars + open-source license | High flags → caution |
| 4 | Moderate Trust | ≥100 stars + license | Single high flag → caution; multiple → unsafe |
| 5 | Unknown Source | Everything else | Any high flag → unsafe |
The key insight: trust is not binary. A graduated system catches real threats while giving established projects the benefit of the doubt.
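The graduated logic in the table can be sketched as two small functions. The org lists here are abbreviated for illustration, and the function names are assumptions, not the production code:

```typescript
type Grade = "safe" | "caution" | "unsafe";

interface RepoMeta {
  owner: string;
  stars: number;
  hasLicense: boolean;
}

// Abbreviated org lists for illustration; the real lists are longer.
const OFFICIAL_ORGS = new Set(["anthropics", "openai", "google", "microsoft", "nvidia"]);
const SECURITY_TEAMS = new Set(["slowmist", "trailofbits", "openzeppelin"]);

function trustTier(repo: RepoMeta): 1 | 2 | 3 | 4 | 5 {
  if (OFFICIAL_ORGS.has(repo.owner)) return 1;
  if (SECURITY_TEAMS.has(repo.owner)) return 2;
  if (repo.stars >= 1000 && repo.hasLicense) return 3;
  if (repo.stars >= 100 && repo.hasLicense) return 4;
  return 5;
}

// Map high-severity flag count + tier to a final grade, per the table above.
function gradeHighFlags(highFlags: number, tier: number): Grade {
  if (highFlags === 0) return "safe";
  if (tier <= 3) return "caution";                          // reputable source: downgrade severity
  if (tier === 4) return highFlags === 1 ? "caution" : "unsafe";
  return "unsafe";                                          // unknown source: any high flag is unsafe
}
```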
Code Block Awareness
One subtle but critical feature: the scanner checks if a pattern match occurs inside a Markdown code block. A curl | bash in a README's prose is suspicious. The same pattern inside a fenced code block (teaching users how to install) is usually legitimate.
````js
function isInCodeBlock(text, pos) {
  // Odd number of ``` fences before this position means we are inside one.
  const before = text.slice(0, pos);
  return (before.split("```").length - 1) % 2 === 1;
}
````
This single function eliminated about 40% of false positives.
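In use, the check wraps the rule engine's matches. A sketch (function repeated in TypeScript so the example is self-contained; `proseMatches` is an assumed helper name, not the production API):

````typescript
function isInCodeBlock(text: string, pos: number): boolean {
  const before = text.slice(0, pos);
  return (before.split("```").length - 1) % 2 === 1;
}

// Keep only matches that occur in prose, outside fenced code blocks.
function proseMatches(text: string, pattern: RegExp): number[] {
  // matchAll requires the global flag, so add it if missing.
  const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
  const re = new RegExp(pattern.source, flags);
  return [...text.matchAll(re)]
    .map((m) => m.index!)
    .filter((pos) => !isInCodeBlock(text, pos));
}
````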
Phase 4: Database Security
While building the content security layer, we discovered something worse: our database was wide open.
The subscribers table — containing email addresses of 58 newsletter subscribers — had permissive Row Level Security (RLS) policies. The Supabase anon key (which is public, embedded in frontend JavaScript) could:
- SELECT all rows — anyone could dump all subscriber emails
- UPDATE any row — anyone could modify verification status
- DELETE any row — anyone could unsubscribe other users
The fix: remove all direct table access and move everything to SECURITY DEFINER RPC functions:
```sql
-- Before: direct table access (DANGEROUS)
INSERT INTO subscribers (email) VALUES ('user@example.com');

-- After: controlled RPC (SAFE)
SELECT subscribe('user@example.com');
-- Function handles validation, token generation, duplicate check
-- Runs with elevated privileges, returns only status string
```
Three functions — subscribe(), verify_email(), unsubscribe() — each with strict input validation and minimal return data. The anon key can call these RPCs but cannot touch the table directly.
A gotcha we hit: PostgreSQL's gen_random_bytes() lives in the extensions schema. Our initial migration failed because we set search_path = public without including extensions. A subtle bug that only showed up in production.
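The fix is to include `extensions` in the function's `search_path`. A sketch of the corrected definition — the function body here is illustrative, not the production code:

```sql
-- Include the extensions schema so pgcrypto's gen_random_bytes() resolves.
-- (Body is an illustrative sketch, not the production definition.)
CREATE OR REPLACE FUNCTION subscribe(p_email text)
RETURNS text
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = public, extensions  -- omitting "extensions" breaks gen_random_bytes()
AS $$
BEGIN
  -- validation and duplicate check elided
  RETURN encode(gen_random_bytes(16), 'hex');  -- verification token
END;
$$;
```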
The Architecture Today
Layer 1: Rule Engine
27 regex patterns across 3 severity levels (critical/high/medium). Pure string matching, zero API calls. Runs in <10ms per repo. Available both server-side (Python) and client-side (TypeScript).
Layer 2: Trust Hierarchy
5-tier source reputation system. Adjusts severity grades based on author org, star count, and license presence. Prevents false positives on established projects.
Layer 3: LLM Analysis
Semantic deep-dive for flagged repos. Understands context, identifies false positives, returns structured findings with confidence scores.
Layer 4: Database RLS
SECURITY DEFINER RPC functions for all write operations. Zero direct table access from frontend. Anon key is truly read-only.
Numbers
- 27 security detection rules (2 critical + 16 high + 9 medium)
- 5 trust tiers with graduated severity adjustment
- 11 red-flag categories from SlowMist framework
- 3 SECURITY DEFINER RPC functions protecting subscriber data
- <10ms per-repo scan time (pure regex, no API calls)
- ~40% false positive reduction from code block awareness
- 5 commits over 11 days to build the entire system
Projects That Inspired Us
Lessons Learned
1. Agent-specific threats are real
Traditional security tools miss agent-specific attack vectors. Nobody was scanning for MEMORY.md exfiltration or .claude/sessions theft before SlowMist published their framework. If you're building in the agent ecosystem, you need agent-aware security.
2. Trust hierarchy beats binary classification
Early versions classified everything as safe or unsafe. This was useless — too many false positives on legitimate tools, and users stopped paying attention. The 5-tier trust system lets us say "this pattern is concerning, but the source is reputable, so proceed with caution" instead of just "UNSAFE".
3. Client-side scanning is underrated
Our scanner runs entirely in the browser (TypeScript). No API calls, no backend dependency. Users can analyze any GitHub repo in real-time without us ever seeing their query. Privacy by architecture.
4. RLS is not security by default
Supabase's Row Level Security creates a false sense of safety. Enabling RLS without writing proper policies is worse than no RLS — it makes you think you're protected when you're not. Always audit your policies with the anon key.
5. The community is the best security team
Every project listed above was found through our own directory. We index 43,000+ tools, and the security tools in our own index taught us how to secure the directory itself. That's the beauty of open source.
What's Next
- Automated scanning in sync pipeline — flag new repos during ingestion, not just on-demand
- Security badges on skill pages — show scan results directly on each tool's detail page
- Community reporting — let users flag suspicious tools and contribute detection rules
- Supply chain graph — track dependency relationships between agent tools
The AI agent ecosystem is growing at 500+ new repos per week. Security can't be an afterthought. If you're building agent tools, consider running our scanner on your own README — or better yet, contribute to SlowMist's framework.
Try the scanner: Visit Agent Skills Hub and click "Analyzer" to scan any GitHub repo in real-time.
Source code: Our scanner is open-source at github.com/ZhuYansen/agent-skills-hub.