Is CLAUDE.md a security risk?

It can be. CLAUDE.md and similar instruction files (.cursorrules, AGENTS.md, Copilot instructions) are auto-loaded into the model's context at launch and treated as trusted — so a malicious one in a cloned repo can carry hidden instructions the model follows. Anthropic's docs note CLAUDE.md is 'context, not enforced configuration.' Related config like .claude/settings.json hooks is worse: it can auto-run shell commands when you open the repo, which is how the June 2026 Miasma worm stole credentials from developers who opened compromised Microsoft repositories.

What was the Miasma worm / Microsoft GitHub attack in June 2026?

Around June 5–8, 2026, GitHub auto-disabled 70+ (reported 73) Microsoft repositories across four organizations after a malicious commit to Azure/durabletask. Security firms attribute it to a self-replicating worm called Miasma. It planted agent-config files — a .claude/settings.json SessionStart hook, a .gemini/settings.json hook, a .cursor/rules prompt injection, and a .vscode/tasks.json folderOpen task — that auto-ran a credential-stealing payload the moment a developer opened the repo in Claude Code, Gemini CLI, Cursor, or VS Code. Exact details were still emerging at publication.

How do I protect my AI coding agent from instruction-file attacks?

Review the instruction and config files (CLAUDE.md, .claude/settings.json, .cursor/rules, .vscode/tasks.json) in any repo before opening it in an agent — treat hooks like untrusted shell scripts. Don't use --dangerously-skip-permissions outside a sandbox. Respect first-run and external-@import approval dialogs (note trust verification is off under the -p headless flag). Vet and pin MCP servers' tool descriptions. Keep reference memory read-only. And enforce critical rules with a PreToolUse hook, which runs as code rather than relying on the model obeying CLAUDE.md.

Your CLAUDE.md is an attack surface — what the Miasma worm just proved

For two years the AI-coding security conversation was theoretical: could someone hide instructions in a file your agent reads? This week it stopped being theoretical. The files your coding agent auto-loads and trusts — CLAUDE.md, .cursorrules, settings hooks, MCP tool descriptions, accumulated memory — are an attack surface, and a self-replicating worm just demonstrated it against Microsoft in the open.

What happened: the Miasma worm

Around June 5–8, 2026, GitHub’s automated systems disabled 70+ Microsoft repositories (reported as 73 by security firms) across four organizations (Azure, Azure-Samples, Microsoft, and MicrosoftDocs) in what The Register clocked as a 105-second, two-wave takedown. Microsoft’s only on-record line on the repo takedown so far, via TechCrunch, is that it “temporarily removed some repositories as we investigated potential malicious content.” 404 Media broke the story and is candid that “the exact contours of the breach are unclear.” This isn’t a phantom, though: Microsoft’s own Defender team had documented the broader Miasma campaign on June 2. That earlier prong hit Red Hat’s npm packages and “harvest[ed] credentials from GitHub, npm, Amazon Web Service (AWS), Azure, Google Cloud Platform (GCP), HashiCorp Vault, Kubernetes, and developer systems.” Same campaign, a new delivery channel.

What is corroborated — by forensic write-ups from StepSecurity and others, and reported by The Hacker News — is the mechanism, and it is the whole point of this article. Starting from a malicious commit to Azure/durabletask — pushed, per the forensics, through a previously compromised contributor account that also ties this to the May 19 trojanized-PyPI attack — the worm (dubbed Miasma, assessed as a variant of the Mini Shai-Hulud worm) committed agent-config files designed to fire the instant a developer opened the repo in their coding tool:

.claude/settings.json — a Claude Code SessionStart hook that runs a payload on session start
.gemini/settings.json — the same trick for Gemini CLI
.cursor/rules/setup.mdc — a Cursor rules file carrying a prompt injection (alwaysApply: true)
.vscode/tasks.json — a VS Code task with "runOn": "folderOpen"

The payload, per StepSecurity, was a ~4.6 MB .github/setup.js that “executes automatically without user interaction” and harvests credentials. Open the repo, lose your secrets.

Two honest caveats, because this is still developing. First, the exact repo count and the full scope of stolen credentials aren’t nailed down (404 Media’s “contours unclear” is the right frame, and there’s no comprehensive Microsoft post-mortem yet). Second, this was a hybrid attack: three of the four vectors were auto-run config hooks or tasks, and only the Cursor .mdc was a true model-facing prompt injection. That nuance matters, and it makes the lesson worse, not better: you don’t even need to trick the model. You just need a developer to open the folder in their tool.
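The four planted files listed above are easy to look for before a fresh clone is ever opened in an editor or agent. What follows is a minimal sketch of such a pre-open check, not a vetted scanner. The function name find_autorun_configs is hypothetical, the paths are the ones from the list above, and the check assumes that both settings files keep their hooks under a top-level "hooks" key.

```python
import json
import re
from pathlib import Path

# Paths taken from the list above; anything else a given tool auto-loads
# would have to be added by hand.
HOOK_SETTINGS = (".claude/settings.json", ".gemini/settings.json")
VSCODE_TASKS = ".vscode/tasks.json"
CURSOR_RULES_DIR = ".cursor/rules"


def find_autorun_configs(repo_root):
    """Return (relative_path, reason) pairs for files worth reading first."""
    root = Path(repo_root)
    findings = []

    for rel in HOOK_SETTINGS:
        path = root / rel
        if not path.is_file():
            continue
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (ValueError, OSError):
            findings.append((rel, "settings file could not be parsed; read it by hand"))
            continue
        # Assumption: hooks are declared under a top-level "hooks" key.
        if isinstance(data, dict) and data.get("hooks"):
            findings.append((rel, "defines hooks that can run shell commands"))

    tasks = root / VSCODE_TASKS
    if tasks.is_file():
        # tasks.json may contain comments, so match on text instead of parsing.
        text = tasks.read_text(encoding="utf-8", errors="replace")
        if re.search(r'"runOn"\s*:\s*"folderOpen"', text):
            findings.append((VSCODE_TASKS, 'task with "runOn": "folderOpen"'))

    rules_dir = root / CURSOR_RULES_DIR
    if rules_dir.is_dir():
        for rule in sorted(rules_dir.rglob("*.mdc")):
            text = rule.read_text(encoding="utf-8", errors="replace")
            if re.search(r"^\s*alwaysApply\s*:\s*true\s*$", text, re.MULTILINE):
                rel = rule.relative_to(root).as_posix()
                findings.append((rel, "rule applied to every request (alwaysApply: true)"))

    return findings
```

Run it against the clone from a plain terminal, before any tool that honors these files touches the directory. An empty result means only that these four patterns are absent. It says nothing about the rest of the repo.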

The thesis: auto-loaded means auto-trusted

Every coding agent has the same soft underbelly. To be useful, it reads files you didn’t write — and acts on some of them without asking. That set is bigger than people think:

Instruction files loaded at launch: CLAUDE.md, AGENTS.md, .cursorrules, Copilot instructions. Anthropic’s own docs note that CLAUDE.md files “are loaded in full at launch.”
Config hooks that run shell commands on lifecycle events — the exact vector Miasma used.
MCP tool descriptions, which are model-visible instructions you never see in full.
Agent memory, which persists across sessions and is re-read as fact.

The defining property is that all of it is auto-loaded and auto-trusted. A human reviews a pull request; almost nobody reviews the CLAUDE.md in a repo they just cloned, or the description fields of an MCP server they just added.

The research was already there

Miasma is the in-the-wild proof, but the vulnerability class has been documented for over a year:

Rules File Backdoor (Pillar Security, March 2025). Researchers showed you could plant hidden instructions in Cursor and GitHub Copilot rule files using “zero-width joiners, bidirectional text markers, and other invisible characters” (readable by the model, invisible to a human reviewer) so the agent silently emits backdoored code that “propagates through the codebase, with no trace in the chat history.” Telling detail: Cursor’s position was that it’s “not a vulnerability on their side,” and GitHub concluded users are “responsible for reviewing and accepting suggestions.” Translation: the instruction file is your problem to vet. CLAUDE.md and AGENTS.md are the same class of file: Pillar demonstrated the technique on Cursor and Copilot rule files, and Miasma used a .cursor/rules injection in a live attack.
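Invisible characters are invisible to a reviewer, not to a script. The sketch below flags every character in Unicode's "format" category (Cf), which includes zero-width joiners and bidirectional markers. The function name find_invisible_chars is hypothetical, and this is one simple heuristic rather than the method Pillar used.

```python
import unicodedata


def find_invisible_chars(text):
    """Return (line, column, name) for every Unicode format character in text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            # Category "Cf" (format) covers zero-width joiners and spaces,
            # bidirectional overrides and isolates, and the byte-order mark.
            if unicodedata.category(ch) == "Cf":
                name = unicodedata.name(ch, f"U+{ord(ch):04X}")
                hits.append((line_no, col, name))
    return hits
```

Feed it the text of a rules or instruction file. Plain ASCII returns an empty list. Legitimate content can trip it too, since emoji sequences and right-to-left text use some of the same characters, so treat a hit as a reason to look closer, not as a verdict.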

OWASP LLM01 (Prompt Injection). The #1 entry on the OWASP LLM Top 10 names the property that makes hidden-unicode payloads work: prompt injections “do not need to be human-visible/readable, as long as the content is parsed by the model,” and indirect injection happens “when an LLM accepts input from external sources, such as websites or files.”

MCP Tool Poisoning (Invariant Labs, April 2025). A tool-poisoning attack embeds “malicious instructions within MCP tool descriptions that are invisible to users but visible to AI models.” Their proof-of-concept made an agent read the user’s SSH keys and exfiltrate them — because “AI models see the complete tool descriptions, including hidden instructions, while users typically only see simplified versions.”
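Because the poisoned text lives in a description the user rarely reads in full, one way to notice a change is to fingerprint each tool definition when you approve it and compare later. This is a rough sketch of that idea, not Invariant's tooling. It assumes you already have the server's tool list as a list of dicts, each with a "name" key. How you obtain that list is not shown, and the function names are hypothetical.

```python
import hashlib
import json


def fingerprint_tools(tools):
    """Map each tool name to a SHA-256 digest of its full definition."""
    digests = {}
    for tool in tools:
        # sort_keys gives a stable serialization, so key order cannot change the hash.
        canonical = json.dumps(tool, sort_keys=True, ensure_ascii=False)
        digests[tool["name"]] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return digests


def diff_tools(pinned, current):
    """Compare two fingerprint maps; return names added, removed, or changed."""
    return {
        "added": sorted(set(current) - set(pinned)),
        "removed": sorted(set(pinned) - set(current)),
        "changed": sorted(
            name for name in pinned
            if name in current and pinned[name] != current[name]
        ),
    }
```

Store the output of fingerprint_tools at approval time, recompute it at the start of a session, and refuse to proceed if diff_tools reports anything. Hashing the whole definition rather than only the description means a changed parameter schema is caught as well.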

Anthropic says the quiet part in its own docs

To its credit, Anthropic documents this directly — which is also the best evidence that it’s real. On agent memory: “a successful prompt injection could write malicious content into the store. Later sessions then read that content as trusted memory.” (This is exactly why we treated the “Dreams” memory-consolidation feature as a review-before-trust tool, not a self-improvement magic trick.)

And the single most important sentence for anyone relying on CLAUDE.md to enforce anything: Claude Code “treats them as context, not enforced configuration. To block an action regardless of what Claude decides, use a PreToolUse hook instead.” Your CLAUDE.md rule “never touch production” is a suggestion. Worse, if an attacker controls that file, it’s their suggestion.

Anthropic is also blunt about MCP: it “does not security-audit or manage any MCP server,” and about the ceiling of any defense: “no system is completely immune to all attacks.”

What actually defends against it

The good news: every meaningful mitigation is something you can do today, and most are documented behaviors you may have been clicking past.

Review the instruction and config files in any repo before you open it in an agent. This is the direct lesson of Miasma. A fresh clone’s .claude/settings.json, .cursor/rules/, .vscode/tasks.json, and CLAUDE.md deserve the same suspicion as a random shell script — because that’s what a hook is. Anthropic’s security docs say it plainly: “review suggested commands before approval” and run untrusted work in a VM.
Don’t bypass the permission prompts. Claude Code “uses strict read-only permissions by default” and gates sensitive operations on approval; --dangerously-skip-permissions exists, and the name is the warning. Outside a throwaway sandbox, don’t.
Respect the trust gates that already exist. The first time a project uses an external @import, Claude Code “shows an approval dialog listing the files”; first-time codebases and new MCP servers require trust verification. One sharp edge worth knowing: trust verification is disabled when running non-interactively with -p — so headless/CI runs need their own guardrails.
Scope and vet MCP servers; pin their tool definitions. Prefer servers you wrote or trust, snapshot the tool list at approval, and re-check it each session (Invariant ships mcp-scan for exactly this). A server that’s benign today can ship a poisoned description tomorrow.
Treat memory and rules as review-before-trust; make reference stores read-only. Auto memory is “plain markdown you can edit or delete at any time,” so audit it. Anthropic’s guidance: use read_only for any store the agent doesn’t need to write to.
Enforce the things that matter with hooks, not prose. If a rule must hold regardless of what the model decides — or what a poisoned CLAUDE.md says — it belongs in a PreToolUse hook, which runs as code outside the model’s discretion. That’s the line between “context” and “control.”
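To make the last item concrete, here is a sketch of what a PreToolUse hook script might look like. Several things in it are assumptions to verify against the current Claude Code hooks reference: that the hook receives a JSON event on stdin with tool_name and tool_input fields, that shell calls arrive as tool_name "Bash" with the command in tool_input.command, and that exit status 2 blocks the call. The patterns in BLOCKED_PATTERNS are a hypothetical policy, not a recommended one.

```python
import json
import re
import sys

# Hypothetical policy: patterns a shell command must never match.
BLOCKED_PATTERNS = [
    re.compile(r"\bkubectl\b.*\bprod"),
    re.compile(r"\bgit\s+push\b.*--force"),
]

# Assumed convention: exit status 2 blocks the tool call, 0 lets it through.
BLOCK = 2
ALLOW = 0


def decide(event):
    """Return (exit_code, message) for one PreToolUse event."""
    if not isinstance(event, dict):
        return BLOCK, "unexpected hook input"
    # Assumed field names: tool_name, and tool_input.command for shell calls.
    if event.get("tool_name") != "Bash":
        return ALLOW, ""
    tool_input = event.get("tool_input")
    command = str(tool_input.get("command", "")) if isinstance(tool_input, dict) else ""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(command):
            return BLOCK, f"blocked by policy: {pattern.pattern}"
    return ALLOW, ""


def main(stdin=sys.stdin, stderr=sys.stderr):
    """Read one event as JSON, print the reason if blocked, return the exit code."""
    try:
        event = json.load(stdin)
    except ValueError:
        # Fail closed: an event we cannot parse is not one we can approve.
        print("could not parse hook input", file=stderr)
        return BLOCK
    code, message = decide(event)
    if message:
        print(message, file=stderr)
    return code

# When installed as a hook script, end the file with: raise SystemExit(main())
```

The point of the design is that decide never consults the model or CLAUDE.md, so a poisoned instruction file cannot talk it out of the rule. Pattern matching on shell commands is easy to evade, though, so a real policy is better written as a narrow allowlist than as a list of things to block.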

The uncomfortable summary: the same auto-loading that makes coding agents feel like magic is the attack surface. The community has been quietly building tools around the product’s seams; this is the seam that bites hardest. Miasma didn’t find a new vulnerability. It weaponized one that has been documented since early 2025 and aimed it at some of the most widely cloned repos on GitHub. The fix isn’t a patch you wait for. It’s treating every file your agent auto-reads as untrusted input, because it is.

Companion reading

CLAUDE.md vs AGENTS.md — the auto-loaded instruction files at the center of this.
Claude’s “Dreams,” accurately — why memory consolidation is a review-before-trust tool, not magic.
Claude Code hooks — the PreToolUse hook is how you enforce a rule the model can’t be talked out of.
What Claude Code’s accessory ecosystem reveals — the seams the community keeps patching.
Claude Code’s “China backdoor,” explained — a real case of invisible, undisclosed behavior shipped into the tool itself.
GhostApproval: the symlink flaw in six coding agents — a malicious repo bending the approval prompt into writing outside your workspace.