The MCP Pre-Install Audit

By Michael Browne · May 4, 2026 · 9 min read

Before you run a local MCP process, treat it like a package install with shell, filesystem, and network access — because that's what it is. The protocol sandboxes none of it. OX Security recently submitted a benign proof-of-concept malicious MCP server to 11 public registries. Nine — including LobeHub and Cursor Directory — published it with no security review.

Here's the audit you should run before installing one. And here's how to automate most of it.

Why this matters now

The MCP ecosystem in 2026 looks a lot like npm in 2015: explosive growth, minimal vetting, and a trust model built on the assumption that everything published is benign. Except MCP servers have broader access than npm packages — your filesystem, shell, network — without a sandbox.

Three numbers from independent third-party research, all from the last six months:

Ten minutes of auditing prevents seconds of compromise. Here's the audit. Then we'll show you the one-call version.

The 8-step audit

Each step describes the manual check, the criterion for "good," and what mcpskills automates. Honest scope notes for the steps mcpskills doesn't fully replace are flagged inline.

Step 1 — Verify the publisher

Manual: Confirm the npm scope and GitHub org are vendor-owned. Check that the README, package metadata, and vendor docs all cross-reference each other. Official MCP servers live under @modelcontextprotocol/, maintained by the Agentic AI Foundation; vendors like GitHub, Brave, Sentry, and Cloudflare maintain their own official servers under their own scopes.

Good: The npm scope matches the GitHub org. The README links back to the vendor's docs, and the vendor's docs link to this repo. If the publisher is some random npm account named brave-search-mcp-server-v2, that's a red flag — typosquatting works because the name looks right but the publisher is wrong.

What mcpskills does: the author_credibility signal (weight 0.10) cross-checks npm and GitHub maintainer history, account age, and release cadence. The community_adoption signal weighs stars, downloads, and dependents.

Step 2 — Read the source

Manual: Most MCP servers are 200 to 500 lines. Grep for what matters: child_process, eval, dynamic require, filesystem writes outside the server's own directory, postinstall scripts, and outbound HTTP destinations.

Good: The server's stated purpose matches what the code actually does. A search MCP makes outbound calls only to its search API. If you see child_process.exec in something advertised as a search tool, something is wrong.

What mcpskills does: the tool_safety signal (weight 0.12) runs seven sub-checks across 20 source files — credential access, shell execution, network exfiltration, obfuscation, prompt injection, and as of v3.2.0: public network binding and risky npm lifecycle scripts. Findings include file path, line context, and severity.

Step 3 — Run npm audit

Manual: Run npm audit and prioritize HIGH and CRITICAL vulnerabilities in runtime dependencies. Moderate vulns in transitive dev dependencies can usually wait; HIGH or CRITICAL in a runtime path that handles user input is a blocker until patched.

Good: Zero HIGH or CRITICAL in runtime deps. npm audit returns "found 0 vulnerabilities."

What mcpskills does: the known_vulnerabilities signal (weight 0.06, v3.0.0) queries OSV.dev (which unifies GHSA, npm, PyPA, Go, and RustSec advisories), CISA KEV (the federal catalog of vulnerabilities with confirmed in-the-wild exploitation), and FIRST.org EPSS (the 30-day exploit-probability scoring system). Any unpatched critical OR any CVE on CISA KEV triggers the CRITICAL_CVE disqualifier and hard-gates the trust tier to "blocked." This is strictly more information than npm audit surfaces — KEV and EPSS are not in npm audit's output.

Step 4 — Inspect tool descriptions for prompt injection

Manual: Read every MCP tool's description field. Tool descriptions are instructions the LLM reads and follows; a poisoned description can hijack agent behavior. Flag cross-tool references, file-access instructions outside the tool's purpose, hidden Unicode characters (zero-width joiners, RTL overrides), or unusually long encoded content.

Good: Each description matches the tool's stated purpose. No imperative instructions targeting the LLM ("when asked X, do Y"). No mentions of files outside the tool's scope. No base64, hex, or zero-width sequences. Cursor's MCPoison flaw (CVE-2025-54136) — disclosed by Check Point in August 2025 — let an attacker swap a previously-approved MCP config for a malicious one without re-triggering the approval prompt; that's the class of attack a tool-description audit is built to surface early.

What mcpskills does: tool_safety covers prompt-injection patterns including hidden HTML comments, zero-width joiners, RTL overrides, and "ignore previous instructions" patterns at critical severity. Honest scope: we read static source today; live tool-manifest inspection (start the server, fetch the manifest, hash it for rug-pull detection) is on the roadmap, not shipped. Use a runtime scanner alongside.

Step 5 — Check transport and binding

Manual: STDIO transport executes as a child process with your full user permissions — no network surface to worry about. For HTTP or SSE transports, check what address the server binds to: 127.0.0.1 is correct (localhost only); 0.0.0.0 exposes the server to your entire network and to DNS-rebinding attacks through the browser.

Good: STDIO transport, or HTTP bound to 127.0.0.1. 0.0.0.0 can be expected inside Docker or container deployments — flag it for review, not as malice, and confirm there's a real reason.

What mcpskills does: the unsafeBinding sub-check (added in v3.2.0, medium severity) flags HTTP servers with 0.0.0.0 bindings via Express .listen(), Python uvicorn host=, FastAPI host kwargs, CLI --host flags, and HOST= environment variable patterns. It surfaces the binding for review without hard-blocking — containers legitimately use it.

Step 6 — Run a manifest scanner

Manual: Tools like mcp-scan walk every MCP server in your config, hash the tool manifests, and check for prompt injection, tool poisoning, and schema changes between versions. The tool-pinning feature is the important one: on first scan, mcp-scan hashes each tool's description; if the description changes on a future scan (the "rug pull" pattern — Day 1 clean, Day 7 malicious), it alerts you.

Good: Scan passes; tool manifests unchanged from the last known-good scan.

Pick whichever scanner you prefer for runtime defense. mcpskills runs upstream of that — it tells you which servers are worth installing in the first place. mcpskills doesn't run a live manifest scan today; instead, our daily monitoring re-scores watched packages every 24 hours and alerts on score deltas of 0.3 or more. That's a different rug-pull detector, applied at the package level rather than the tool-manifest level. Use both.

Step 7 — Pin the version

Manual: Hardcode the exact version in your config. npx @scope/server floats; npx @scope/server@2.4.2 doesn't. When you want to update, re-run the audit on the new version first.

Good: All install snippets in your config and your team's docs reference an exact version, not latest or a major-version range.

Honest scope: mcpskills doesn't enforce version pinning — that's an operational discipline you have to apply yourself. Our release_cadence and commit_recency signals show whether a project's update tempo matches the version you've decided to pin.

Step 8 — Sandbox the first run

Manual: Test the server in an isolated environment before adding it to your real config — Docker with --network none, or a fresh VM. If the server crashes because it can't phone home to an unexpected endpoint, that's information you want before it has access to your real environment.

Good: The server functions correctly under restricted permissions. Any outbound network call it makes is one you expected.

Honest scope: mcpskills doesn't replace sandboxing. We score the package's static properties; runtime sandboxing is a separate control you should still apply for high-privilege servers, especially those that need filesystem or shell access.

What mcpskills does NOT replace

Pre-install trust scoring is a probability assessment from static signals. Three controls remain your responsibility:

  1. Sandboxing. Static analysis catches obvious dangerous patterns; it can't model multi-step tool-call dynamics. For high-privilege servers, run them in a sandbox first.
  2. Live tool-manifest hashing. Manifest scanners like mcp-scan inspect what a configured server actually exposes at runtime, and detect changes between scans. mcpskills doesn't do live manifest inspection today (it's on our roadmap). Use a scanner alongside.
  3. Version pinning. mcpskills monitors packages we know about; you should still pin the exact version you've decided to run. Floating versions defeat both pre-install scoring and runtime scanning.

The trust score is a strong prior, not a verdict. These three controls turn the prior into a defensible install decision.

The one-call version

Most of the 8-step audit collapses into a single API call or MCP tool invocation. mcpskills' auto_gate tool returns a proceed: true/false decision, the trust tier, the top contributing signals, and any disqualifiers — in roughly the time it takes to run npm audit.

Three real examples from our score cache:

Example A — A Verified package: openai/openai-node

Composite score 8.97. All four dimensions strong (Alive 9.1, Legit 9.4, Solid 8.5, Usable 9.2). No disqualifiers. auto_gate returns:

{ "proceed": true, "tier": "verified", "score": 8.97,
  "reason": "15 signals strong; certified",
  "topSignals": ["author_credibility 9.5", "community_adoption 9.8"] }

Translation: the 8-step audit collapses to "yes." Install with reasonable confidence; pin a version; sandbox if you're paranoid.

Example B — An Established package with caveats: modelcontextprotocol/servers

Composite 6.85. The canonical MCP servers reference repo from the Agentic AI Foundation. It's Established, not Verified — a real teachable moment. The reason it doesn't clear the Verified bar: scoring against the most recent npm-published version surfaces release-cadence and dependency-health signals that haven't fully caught up. auto_gate returns proceed: true with notes; the package is fine, but worth the conscious read rather than the autopilot install.

Example C — A Blocked package: iflytek/skillhub

Composite 6.06, but tier is Blocked because of a disqualifier. auto_gate returns proceed: false; the recommendation is to look at alternative agent-skill registries in the same category. This is what catching a problem at the trust-layer looks like — before the package ever lands in your config.

mcpskills is the trust layer for AI skills and MCP servers — it scores publishers, source code, dependencies, vulnerabilities, and supply-chain risk so you can decide what to install before running anything locally. Full algorithm: /methodology.

We audited our own MCP server

Before publishing this article, we ran the same 8-step audit on @mcpskillsio/server v2.4.1. The full artifact lives in our research directory; the short version: 4 PASS, 1 INFO, 2 REVIEW, 1 FAIL → PATCHED.

The FAIL was on Step 3: npm audit against v2.4.1 found one HIGH and two moderate vulnerabilities, all transitive through @modelcontextprotocol/sdk's express/hono dependency chain (HIGH was path-to-regexp 8.0.0–8.3.0 ReDoS; moderates were @hono/node-server middleware bypass and seven hono advisories). None were reachable through our code path, but the right response was to patch anyway: npm audit fix was non-breaking, all 20 mcp-server tests pass on the upgrade, and we published @mcpskillsio/server v2.4.2 the same day. npm audit on v2.4.2 returns "found 0 vulnerabilities."

The two REVIEW items are documentation gaps in our README (no version-pinning guidance, no runtime trust-boundary disclosure) and are queued for the next release. The INFO item is mcp-scan; we're adding it to CI on the next release branch.

We hold ourselves to the standard we're asking other maintainers to meet. If you want to verify the v2.4.2 audit yourself, install the package and run npm audit against it: npm install @mcpskillsio/server@2.4.2 && cd node_modules/@mcpskillsio/server && npm audit should return "found 0 vulnerabilities."

Two ways to use this

Free: paste any MCP server URL, npm package, or registry link at mcpskills.io for a trust score. Three free scans per day.

In your agent: install the mcpskills MCP server and call auto_gate from inside Claude Code, Cursor, or any MCP client:

claude mcp add mcpskills -- npx @mcpskillsio/server@2.4.2

Then ask: "Should I install @modelcontextprotocol/server-github?" — and get a trust-scored decision, with the 8-step audit collapsed into a single tool call. Maintainers who want a monitored gold trust badge for their README can apply at /badges.

FAQ

Does a Verified score guarantee safety?
No. Trust scoring is a probability assessment based on signals available before runtime. A Verified score means strong signals across 15 dimensions and 7 safety pattern checks, but no automated tool can model every possible runtime threat. The three controls in the "What mcpskills does NOT replace" section above remain your responsibility.
How is this different from mcp-scan?
Pick whichever scanner you prefer for runtime defense. mcpskills runs upstream of that — it tells you which servers are worth installing in the first place. mcp-scan inspects MCP servers you've already configured; mcpskills is the trust layer that decides what should run before configuration happens. They complement each other.
What does mcpskills check that npm audit doesn't?
CISA KEV (vulnerabilities with confirmed in-the-wild exploitation) and FIRST.org EPSS (30-day exploit probability) are not in npm audit output. Any CVE on KEV hard-gates the trust tier to "blocked." mcpskills also runs static safety pattern scans (prompt injection, shell execution, credential access, network exfiltration, obfuscation, public network binding, and risky npm lifecycle scripts) that npm audit doesn't perform at all.
Can I score private repos?
Not currently. mcpskills.io scores public repos and packages across GitHub, npm, Smithery, and OpenClaw. Private repos and unpublished packages can't be scored.
What changed in algorithm v3.2.0?
Two new sub-checks under tool_safety: unsafeBinding (HTTP servers binding to 0.0.0.0, medium severity) and packageScriptRisk (risky npm lifecycle scripts — high for curl|sh / base64|sh / child_process; medium for opaque local scripts; none for vanilla tsc/build/husky install). Calibration on real-world packages showed 0% false-positive rate across 83 sampled npm packages. Full glossary: /glossary.

Run the audit in one call

Paste any MCP server, npm package, or registry URL and get a trust-scored install decision in 10 seconds.

Scan Now — Free