How to Check if an AI Skill is Safe Before Installing

By Michael Browne · March 8, 2026 (updated April 2, 2026) · 5 min read

You found an AI skill or MCP server you want to install. Maybe it was recommended on Reddit, listed on Smithery, or came up in a Claude Code conversation. Before you give it access to your terminal, code, and environment variables — here's how to check if it's safe.

Why This Matters

AI skills and MCP servers run with permissions most npm packages never get. They can execute shell commands, read files, access environment variables, and make network requests. A malicious skill can steal credentials, exfiltrate data, or inject instructions that override your AI agent's behavior.

In January 2026, the ClawHavoc attack planted 1,184 malicious skills on a major marketplace. One trojan was downloaded 7,700 times. Research from ToxicSkills found 36% of skills on one major registry contained prompt injection patterns.

Checking takes 30 seconds. Cleaning up after a compromise takes days.

Step-by-Step: Evaluate Any AI Skill

Step 1: Find the Source URL or Package Name

Every legitimate AI skill has source code in a public repository or package registry. The URL or package name is usually in the install instructions or marketplace listing. If a skill has no public source code, that's a red flag — do not install it.

Copy whatever identifier you have. It might be:

  • A GitHub repo: https://github.com/owner/repo-name or owner/repo
  • An npm package: @scope/package-name or npm:package-name
  • A Smithery listing: https://smithery.ai/server/owner/name
  • An OpenClaw skill URL

Step 2: Run the Trust Scan

Go to mcpskills.io and paste whatever you have — GitHub URL, npm package name, Smithery URL, or owner/repo. Click Scan.

The scanner resolves your input to the source repository automatically. It pulls data from the GitHub API, npm registry, and OpenSSF Scorecard, scores up to 14 signals, and returns results in about 10 seconds. No sign-up required.
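
As a rough sketch of that resolution step, classifying the pasted input might look like the function below. The patterns are illustrative assumptions, not the scanner's actual logic.

```shell
# Hypothetical sketch: classify what the user pasted before resolving it
# to a source repository. These patterns are assumptions for illustration.
classify() {
  case "$1" in
    https://github.com/*)         echo "github-url" ;;
    https://smithery.ai/server/*) echo "smithery-url" ;;
    npm:*|@*/*)                   echo "npm-package" ;;
    */*)                          echo "owner-repo" ;;
    *)                            echo "unknown" ;;
  esac
}

classify "modelcontextprotocol/servers"   # owner-repo
classify "@mcpskillsio/server"            # npm-package
```

Order matters here: the `@scope/name` check has to run before the generic `owner/repo` check, since both contain a slash.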

If the repo is detected as an AI skill or MCP server, the scanner automatically activates Skills Mode — increasing security weight and running 5 additional safety scans.

If no source repository can be found (e.g., an npm package with no linked GitHub repo), you'll get a Limited Score based on registry metadata only — publish history, downloads, maintainers, and license. This covers fewer signals, so the result is clearly labeled.


Step 3: Check the Trust Tier

The scan returns one of four trust tiers:

| Tier | Score | What It Means |
| --- | --- | --- |
| Verified | 7.5+ | Strong scores across all 4 dimensions. Safe to build on. |
| Established | 4.5+ | Solid overall, but check specific weak dimensions before relying on it. |
| New | <4.5 | Insufficient data or weak signals. Use with caution — verify manually. |
| Blocked | n/a | Safety scan detected confirmed threat patterns. Do not install. |
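
The cutoffs can be sketched as a tiny function. This assumes the composite score is on a 0-10 scale, and that Blocked comes from the safety scan rather than the numeric score:

```shell
# Hypothetical sketch of the tier thresholds. A Blocked result is triggered
# by the safety scan, not the score, so it is checked first.
tier() {
  score="$1"; blocked="${2:-no}"
  if [ "$blocked" = "yes" ]; then echo "Blocked"; return; fi
  awk -v s="$score" 'BEGIN { if (s >= 7.5) print "Verified"; else if (s >= 4.5) print "Established"; else print "New" }'
}

tier 8.2        # Verified
tier 5.0        # Established
tier 3.1 yes    # Blocked
```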

Step 4: Review the 4 Dimensions

Don't just look at the composite score. Check each dimension individually:

Alive (Is it maintained?) — Low score means slow issue response, no recent commits, or stale releases. A tool that's trending toward abandonment is a risk even if it's popular.

Legit (Is the author credible?) — Verified organizations score higher. Solo developers with no track record score lower. Community adoption (stars + forks) also factors in.

Solid (Is it secure?) — OpenSSF Scorecard data, dependency management (Dependabot/Renovate), and for AI skills: safety scan results for all 5 threat patterns.

Usable (Can I work with it?) — README quality, spec compliance, license clarity. A tool with no install instructions or undocumented tools is harder to use safely.


Step 5: Read the Safety Scan Results (AI Skills Only)

For repos detected as AI skills, the scanner runs 5 safety checks based on real attack patterns:

Prompt injection — Hidden instructions in SKILL.md or README that could override your AI agent

Shell execution — Piped commands (curl | sh) or eval() with network content

Credential theft — Code that reads SSH keys, AWS credentials, or browser storage

Network exfiltration — Discord webhooks, Telegram bots, raw IP addresses used as data sinks

Obfuscated payloads — Base64 strings, hex sequences, or String.fromCharCode chains hiding intent

Any findings are flagged with the specific file and code snippet. Unlock the full report ($9) to see every finding in detail.
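
For intuition, the 5 checks can be approximated with grep. These regexes are rough, illustrative assumptions; the real scanner's rules are more sophisticated and will catch variants these miss:

```shell
# Rough grep-based approximations of the 5 safety checks.
# Patterns are illustrative only, not the scanner's actual rules.
scan_file() {
  f="$1"
  if grep -qiE 'ignore (all )?previous instructions' "$f"; then echo "prompt-injection"; fi
  if grep -qE 'curl .*\|[[:space:]]*(ba)?sh|eval\(' "$f"; then echo "shell-execution"; fi
  if grep -qE '\.ssh/|\.aws/credentials' "$f"; then echo "credential-theft"; fi
  if grep -qE 'discord\.com/api/webhooks|api\.telegram\.org' "$f"; then echo "network-exfiltration"; fi
  if grep -qE 'fromCharCode|atob\(' "$f"; then echo "obfuscated-payload"; fi
}

printf 'data = eval(fetch(url))\n' > /tmp/skill_demo.md
scan_file /tmp/skill_demo.md   # shell-execution
```

A clean grep pass is not a clean bill of health — obfuscation exists precisely to defeat pattern matching — but a hit on any of these is worth investigating before you install.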

Red Flags to Watch For

Do Not Install If:

  • Trust tier is Blocked — confirmed threat patterns detected
  • No public source code — closed-source skills can't be audited
  • Alive dimension is below 3.0 — likely abandoned, won't get security patches
  • Safety scan found prompt injection — the skill may override your AI agent's instructions
  • Author has no other repos and account was created recently — possible throwaway
  • The skill requests permissions it shouldn't need (a "weather tool" reading SSH keys)

Good Signs

  • Verified trust tier
  • Org-backed author
  • Active issue response (<7 days)
  • Clean safety scan
  • Clear license (MIT/Apache)
  • Comprehensive README with install instructions and examples

What About Skills Not on GitHub?

mcpskills.io now accepts input from multiple registries — not just GitHub. You can paste an npm package name, a Smithery URL, or an OpenClaw skill, and the scanner will resolve it to the source repo automatically.

If a skill has a source repo on GitHub that's linked from its npm or registry listing, you'll get the full 14-signal score. If there's no linked repo, you'll get a Limited Score based on registry metadata only (publish history, downloads, maintainers, license).

For skills hosted on platforms we don't yet support (GitLab, Bitbucket, self-hosted Forgejo), you can apply the same framework manually:

  • Check commit recency (any commits in the last 30 days?)
  • Check whether issues get responses
  • Look at the author's profile and other projects
  • Search the code for the 5 safety patterns
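
The commit-recency check is easy to do by hand with the GitHub API plus a date comparison. The helper below is a sketch; it assumes GNU `date`, and OWNER/REPO is a placeholder:

```shell
# Fetch the latest commit (requires network; unauthenticated API calls are
# rate-limited), e.g.:
#   curl -s "https://api.github.com/repos/OWNER/REPO/commits?per_page=1"
# Then compare the commit's ISO 8601 date against a 30-day cutoff.
# Assumes GNU date (the -d flag is not POSIX).
is_recent() {
  last=$(date -d "$1" +%s)
  cutoff=$(date -d '30 days ago' +%s)
  if [ "$last" -ge "$cutoff" ]; then echo "active"; else echo "stale"; fi
}

is_recent 2024-01-01    # stale
```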

If the skill has no public source code at all, treat it as untrusted by default.

Scan From Your IDE

Don't want to leave your editor? Install the mcpskills MCP server and score repos directly inside Claude Code or Cursor:

claude mcp add mcpskills -- npx @mcpskillsio/server

Then ask: "Is @anthropic-ai/sdk safe?" or "Score modelcontextprotocol/servers" — and get a trust score without switching windows. Works with GitHub repos, npm packages, and registry URLs.

FAQ

How long does a scan take?
About 10 seconds. The scanner resolves your input (GitHub repo, npm package, or registry URL), then fetches data from the GitHub API, npm registry, and OpenSSF Scorecard in parallel.
Is scanning free?
Yes. Free scans show the trust tier and 4 dimension scores. Free tier allows 3 scans per day from the website. Full reports (all 14 signals + safety findings) are $9, or $29 for a 10-pack.
Does a "Verified" tier guarantee safety?
No. Trust scoring is an assessment based on available signals, not a security guarantee. A Verified score means strong signals across all dimensions, but no automated tool can catch every possible threat. Always review permissions before installing.
Can I score private repos?
Not currently. mcpskills.io scores public repos and packages across GitHub, npm, Smithery, and OpenClaw. Private repos and unpublished packages can't be scored.
What's the difference between Standard Mode and Skills Mode?
Skills Mode activates when the scanner detects an AI skill or MCP server (via SKILL.md, server.json, or MCP metadata). It increases security weight to 34% and adds 5 safety scans. Standard Mode uses balanced weights and scores 10 signals without safety scanning.

Check before you install

Paste any repo, npm package, or registry URL and get a trust score in 10 seconds.

Scan Now — Free