Two months ago I scored 202 servers from the official MCP Registry and found that 83% carried a disqualifier flag. That was a sample. This is the population: 2,233 MCP servers, AI skills, and supporting packages, scored through the same production engine, pulled from four registries at once — the MCP Registry, ClawHub, npm, and GitHub. Every score is in the public dataset at /data/latest.json (CC BY 4.0), so you can check my arithmetic.
The headline isn't a scandal. There's no new ClawHavoc in this data. The headline is a distribution: the MCP ecosystem has no top end. Out of 2,233 projects, not one scores above 9 out of 10 — and the best-resourced official SDKs on earth top out at 8.97. Below them, two-thirds of everything published clusters in a flat, undifferentiated 5-to-7 band. I've started calling it the trust middle, and once you see it you can't unsee it.
TL;DR
Mean composite: 5.78/10. Median: 5.71. Tiers: 227 Verified (10.2%), 1,689 Established (75.6%), 253 New (11.3%), 64 Blocked (2.9%). The dataset skews GitHub-backed (96.1%) with a thin npm-only slice (3.9%). The single most important cut in the whole report is the one between general developer tools that happen to speak MCP and purpose-built MCP servers and skills — they behave like two different ecosystems.
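The tier percentages follow directly from the counts above. Here is a minimal sketch for checking them, with the counts hard-coded from this report rather than read from the dataset (the names `TIER_COUNTS` and `tier_shares` are illustrative only):

```python
# Tier counts as reported in this edition (hard-coded, not read from the dataset).
TIER_COUNTS = {"Verified": 227, "Established": 1689, "New": 253, "Blocked": 64}


def tier_shares(counts):
    """Return each tier's share of the total as a percentage, rounded to one decimal."""
    total = sum(counts.values())
    return {tier: round(100 * n / total, 1) for tier, n in counts.items()}


total = sum(TIER_COUNTS.values())
shares = tier_shares(TIER_COUNTS)
print(total, shares)
```

The four counts sum to 2,233, and the rounded shares match the figures quoted in the TL;DR.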
The Ceiling Nobody Passes
Start with the distribution, because it's the finding everything else hangs off. Here's where all 2,233 composite scores land, bucketed into one-point bands:
Two empty edges and a giant hump. Nothing scores below 3, because the genuinely dangerous projects don't get a low grade — they get hard-gated to Blocked and removed from the curve entirely. And nothing scores above 9, which is the part that surprised me. I expected the top of the MCP world to have a few near-perfect entries. It doesn't.
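If you want to rebuild the banding yourself, one-point bucketing might look like the sketch below. The `sample` scores are made up for illustration, and treating each band as a half-open interval is my assumption, not something the engine documents here.

```python
import math
from collections import Counter


def band_counts(scores):
    """Count scores per one-point band.

    Band b covers [b, b + 1); a perfect 10.0 is folded into band 9.
    That boundary convention is an assumption for this sketch.
    """
    counts = Counter(min(math.floor(s), 9) for s in scores)
    return {band: counts.get(band, 0) for band in range(10)}


# Illustrative scores only; the real values live in the public dataset.
sample = [3.2, 5.1, 5.71, 5.78, 6.4, 6.9, 7.3, 8.97]
hist = band_counts(sample)
for band, n in hist.items():
    print(f"{band}-{band + 1}: {'#' * n}")
```

Run against the full dataset, the same function should show empty bands at both edges if the distribution is as described.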
Look at who sits at the actual ceiling:
| Project | Score | Tier | Stars |
|---|---|---|---|
| openai/openai-node | 8.97 | Verified | 10,852 |
| auth0/nextjs-auth0 | 8.81 | Verified | 2,296 |
| prisma/prisma | 8.75 | Verified | 45,849 |
| anthropics/anthropic-sdk-typescript | 8.53 | Verified | 1,817 |
| supabase/supabase | 8.48 | Verified | 99,959 |
| stripe/stripe-node | 8.47 | Verified | 4,384 |
| ollama/ollama | 8.22 | Verified | 166,624 |
| langchain-ai/langchainjs | 8.19 | Verified | 17,411 |
These are the most mature, best-funded, most-audited projects that touch this ecosystem: OpenAI's official SDK, Stripe, Prisma, Supabase, and Ollama with more than 166,000 stars. They are doing everything right, and they land between 8.19 and 8.97. The remaining gap to 9+ isn't laziness; it's the set of signals that even great open-source projects rarely max out at once: published security policies, signed releases, OpenSSF Scorecard adoption, branch protection, deep contributor diversity, fast issue response, and a clean dependency tree. The 9–10 band isn't unreachable. It's just empty today, which tells you how much headroom the whole ecosystem still has.
The Fat Middle
1,489 of 2,233 projects — 66.7% — score between 5.0 and 7.0. This is the band where you cannot tell quality from a glance, a star count, or a registry badge. A 5.4 and a 6.8 look identical on a marketplace listing. Both install with one command. Neither is broken, neither is excellent, and the difference between them is exactly the thing a human shopper never checks: license clarity, bus factor, whether the CI workflow leaks secrets, whether a tool quietly shells out.
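The 66.7% figure is plain division, and the same check generalizes to any band. This is a rough sketch: whether the report counts the 5.0 and 7.0 endpoints as inside the band is an assumption here, and `share_in_band` is a name I made up.

```python
def share_in_band(scores, low=5.0, high=7.0):
    """Fraction of scores inside [low, high], endpoints included (an assumption)."""
    if not scores:
        return 0.0
    inside = sum(1 for s in scores if low <= s <= high)
    return inside / len(scores)


# Check the reported share from the counts quoted in the text.
reported_pct = round(100 * 1489 / 2233, 1)
print(reported_pct)

# Toy example: three of these five scores fall inside the band.
toy_share = share_in_band([4.9, 5.0, 6.0, 7.0, 7.1])
print(toy_share)
```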
The middle is the actual problem the ecosystem has. Outright malicious skills are rare and, increasingly, caught. What's everywhere is the merely-mediocre — projects good enough to publish, popular enough to install, and unproven enough that you're taking on risk you can't see. The trust middle is where install decisions go to be guessed.
Two Ecosystems Wearing One Name
Purpose-built MCP servers and AI skills are markedly less trustworthy than the general developer tools that merely speak MCP. The mcpskills engine runs in two modes: Standard Mode for general repositories, and Skills Mode, which auto-activates for purpose-built MCP servers and AI skills (detected via server.json, SKILL.md, and MCP keywords) and turns on the safety scanner plus skill-spec compliance. Split the 2,233 projects by mode and the averages tell two completely different stories:
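To make the mode split concrete, here is a hypothetical sketch of marker-based detection. The file names `server.json` and `SKILL.md` come from the description above; the keyword list, the function name, and the "any one signal is enough" rule are assumptions, not the engine's actual logic.

```python
# Hypothetical keyword list; the engine's real list is not published in this report.
MCP_KEYWORDS = {"mcp", "mcp-server", "model-context-protocol"}


def detect_mode(file_paths, keywords):
    """Return "skills" if the repo looks purpose-built for MCP, else "standard".

    Assumes a single marker file or a single matching keyword is sufficient.
    """
    file_names = {path.rsplit("/", 1)[-1] for path in file_paths}
    has_marker = "server.json" in file_names or "SKILL.md" in file_names
    has_keyword = any(k.lower() in MCP_KEYWORDS for k in keywords)
    return "skills" if has_marker or has_keyword else "standard"


print(detect_mode(["src/index.ts", "server.json"], []))
print(detect_mode(["README.md", "package.json"], ["sdk", "payments"]))
```

A general SDK with no marker files and no MCP keywords stays in Standard Mode under this sketch, which is the split the averages above rely on.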
A general developer tool in this dataset is three times more likely to earn Verified than a purpose-built MCP server, and the MCP-native projects are the only ones tripping the Blocked tier at all (64 of them, or 3.5% of Skills Mode projects, versus 2.9% of the whole dataset). The stuff written specifically to be plugged into your agent is, on average, the least trustworthy stuff in the catalog.
That's not a slur on MCP authors; it's the maturity curve. The Stripes and Prismas have a decade of governance baked in — multiple maintainers, security policies, release signing — and they grew MCP-compatible. A purpose-built MCP server is usually someone's recent project: one author, a handful of stars, no second contributor, a LICENSE file that may or may not have been added. The ecosystem that exists because of MCP is younger and thinner than the ecosystem that merely speaks MCP, and the trust scores draw that line cleanly.
The Floor: 64 Blocked, Nothing Below 3
Sixty-four projects (2.9%) are hard-gated to Blocked. A Blocked tier doesn't mean confirmed-malicious — the safety scanner flags patterns, not intent — but it means at least one disqualifier fired: static analysis caught a dangerous pattern (shell execution on tool arguments, credential-path reads), a CI workflow checked out untrusted PR code, a critical CVE sits unpatched at the installable version, or the project is a single author with effectively no adoption. Every one of the 64 is Skills Mode. None are general SDKs.
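The key property of a hard gate is that it overrides the composite score instead of lowering it. Below is a minimal sketch of that idea, using the four disqualifier families named above. The field names and the `gate` function are hypothetical, not the engine's API.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    """One flag per disqualifier family described in the text (names are hypothetical)."""
    dangerous_pattern: bool = False          # e.g. shell execution on tool arguments
    untrusted_pr_checkout: bool = False      # CI workflow checks out untrusted PR code
    unpatched_critical_cve: bool = False     # critical CVE at the installable version
    single_author_no_adoption: bool = False  # one author, effectively no adoption


def fired_disqualifiers(findings):
    """List the names of every disqualifier that fired."""
    return [name for name, fired in vars(findings).items() if fired]


def gate(findings, scored_tier):
    """Any fired disqualifier forces Blocked; otherwise keep the score-based tier."""
    return "Blocked" if fired_disqualifiers(findings) else scored_tier


print(gate(Findings(untrusted_pr_checkout=True), "Established"))
print(gate(Findings(), "Verified"))
```

Because the gate ignores the composite entirely, a Blocked project never appears as a low point on the curve, which is why the floor of the distribution sits at 3.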
The reassuring half of this: the floor is clean. Zero projects scored below 3.0, because the disqualifier system pulls the dangerous ones out of the distribution rather than letting them sit as "low scores." The unreassuring half: a marketplace shows you none of this. All 64 install with the same one-liner as the 8.97.
Licensing Hasn't Moved
Roughly 490 of the 2,233 scored projects (21.9%) carry no clear license — a figure essentially unchanged from the 21% measured in April's smaller sample.
Those 490 projects either have no license at all or carry a NOASSERTION, meaning GitHub couldn't resolve the license to a real SPDX identifier. MIT covers 54.2% of the field and Apache-2.0 another 16.7%, but the no-license fifth has held steady even as the dataset grew tenfold. It's the most fixable disqualifier in the whole system (one file), and it remains the most common one. Publishing speed is still outrunning governance.
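The "no clear license" test reduces to two cases: nothing detected, or a `NOASSERTION` placeholder. Here is a small sketch, assuming the input is the SPDX identifier string GitHub reports for a repository (or `None` when it reports nothing); the function name is illustrative.

```python
def has_clear_license(spdx_id):
    """True when the detected license resolves to a real SPDX identifier.

    None, an empty string, and the NOASSERTION placeholder all count as unclear.
    """
    if spdx_id is None:
        return False
    cleaned = spdx_id.strip()
    return cleaned != "" and cleaned.upper() != "NOASSERTION"


# Share of projects without a clear license, from the approximate count quoted above.
unclear_pct = round(100 * 490 / 2233, 1)

print(has_clear_license("MIT"), has_clear_license("NOASSERTION"), has_clear_license(None))
print(unclear_pct)
```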
What This Means for You
What This Data Doesn't Tell You
The dataset is everything currently in the public score cache (2,233 entries as of June 9, 2026), assembled by the nightly crawler across the MCP Registry, GitHub topic and keyword search, npm, and prior ClawHub batch scans. It is not a clean random sample of a defined frame — it's the working catalog — so read the percentages as the shape of "what's discoverable and scoreable," which is also exactly the shape a developer browsing for an MCP server actually encounters.
The Baseline, and What Comes Next
This is edition one. The version archive is now watching all 2,233.
Every project in this dataset is now tracked for version-level change — new tools slipped into a server, install scripts that appear overnight, maintainer flips, same-version republishes, newly added network endpoints. This report is the baseline; future editions will track the deltas the moment they appear. That's the part no point-in-time scanner can do, and it ships weekly as The Trust Diff.
Methodology
Scoring: every project ran through the production mcpskills.io engine — the same 15-signal algorithm available at mcpskills.io, across four dimensions (Alive, Legit, Solid, Usable). Skills Mode auto-detects MCP servers and AI skills via server.json / SKILL.md / MCP keywords and adds the static safety scanner, skill-spec compliance, and tool-safety checks. The known_vulnerabilities signal queries OSV.dev (unified GHSA + npm + PyPA + Go + RustSec), CISA KEV (actively-exploited), and FIRST.org EPSS (exploit probability) at the currently-installable version.
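For readers who want to reproduce the vulnerability lookup, a version-pinned OSV.dev query can be sketched as below. The endpoint and request shape follow OSV's public query API. The package name and version are placeholders, and nothing here claims to mirror how the engine batches or caches its calls.

```python
import json

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_query(name, version, ecosystem="npm"):
    """Build the JSON body for a single package-at-version OSV query."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}


def vulnerability_ids(response):
    """Pull advisory IDs out of an OSV response; an empty object means no known vulns."""
    return [vuln["id"] for vuln in response.get("vulns", [])]


# Placeholder package and version, for illustration only.
body = json.dumps(build_osv_query("example-mcp-server", "1.2.3")).encode("utf-8")

# To send it, POST `body` to OSV_QUERY_URL with Content-Type: application/json,
# for example with urllib.request.Request(OSV_QUERY_URL, data=body, headers={...}).
print(body.decode("utf-8"))
```

Pinning the query to a specific version is what makes the result describe the currently installable release rather than the project's whole history.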
Dataset: 2,233 entries from the public score cache, generated 2026-06-09, published at /data/latest.json under CC BY 4.0 — also available as a Hugging Face dataset. Full algorithm: /methodology.
Companion reports: State of MCP Security — April 2026 · State of ClawHub Trust — April 2026.
Data sources
Every score in this report is reproducible from public data. The trust algorithm is an opinionated combination; the inputs are not.
- MCP Registry API — official catalog of Model Context Protocol servers. registry.modelcontextprotocol.io
- GitHub REST API — repository metadata, contributor graph, commit cadence, releases, issue responsiveness, license detection, file tree (for SKILL.md/server.json detection and source scanning). docs.github.com/en/rest
- OpenSSF Scorecard — security-posture signals (branch protection, signed releases, dependency-update tooling, dangerous workflow patterns). scorecard.dev
- OSV.dev — unified vulnerability database queried at the currently-installable version. osv.dev
- CISA Known Exploited Vulnerabilities (KEV) — confirmed in-the-wild exploitation; any KEV CVE hard-gates the tier to blocked. cisa.gov
- FIRST.org EPSS — 30-day exploit probability, used to weight non-KEV vulnerabilities. first.org/epss
- npm Registry — package metadata, weekly downloads, maintainer graph; used for partial scoring of npm-published projects without GitHub source. docs.npmjs.com
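The KEV and EPSS entries above describe two different roles: one is a hard gate, the other a weight. Here is a hedged sketch of how they could combine. Taking the maximum EPSS probability as the weight is my assumption, the CVE identifiers are placeholders, and `assess_vulnerabilities` is not the engine's real function.

```python
def assess_vulnerabilities(cve_ids, kev_ids, epss_scores):
    """Return (hard_gate, weight) for a project's known CVEs.

    hard_gate is True when any CVE appears in the KEV catalog.
    weight is the highest EPSS probability among the non-KEV CVEs
    (using the maximum is an assumption for this sketch).
    """
    hard_gate = any(cve in kev_ids for cve in cve_ids)
    weight = max(
        (epss_scores.get(cve, 0.0) for cve in cve_ids if cve not in kev_ids),
        default=0.0,
    )
    return hard_gate, weight


# Placeholder identifiers, not real CVEs.
result = assess_vulnerabilities(
    ["CVE-0000-0001", "CVE-0000-0002"],
    kev_ids={"CVE-0000-0001"},
    epss_scores={"CVE-0000-0002": 0.42},
)
print(result)
```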
Prior research that frames this work
- Trail of Bits — "ClawHavoc" (Jan 2026): 1,184 malicious AI skills on a major marketplace; registry presence is not a trust signal. blog.trailofbits.com
- OX Security (Apr 2026): 9 of 11 public MCP registries published a benign malicious-PoC server with no review. ox.security
- Snyk — "ToxicSkills" (Apr 2025): 36.82% of sampled skills had at least one security flaw. arxiv.org/abs/2504.03767
Score your own MCP server
Free trust report — paste any GitHub repo, npm package name, or registry URL.