Microsoft's AI Framework Has Been Broken Three Times in a Row. That's Not Bad Luck.
Two confirmed critical RCEs in Semantic Kernel, then a six-bypass full-chain disclosure weeks after the patch. The same structural mistake keeps shipping in agent frameworks. Here's the chain - and what to actually do about it.

Microsoft Semantic Kernel now has two confirmed critical remote code execution vulnerabilities on record. A third disclosure landed on April 28, 2026, detailing a full-chain RCE built from six bypasses of mitigations that shipped just weeks earlier.
The framework is used to build and deploy AI agents across enterprise stacks, Copilot tooling, and multi-agent systems. And the same class of flaw has now appeared in Semantic Kernel, LangChain, and LlamaIndex. Three teams, three codebases, the same structural mistake. That is not a coincidence.
What Actually Broke
CVE-2026-25592 was the first confirmed critical: a path traversal flaw in the .NET SDK’s SessionsPythonPlugin, specifically in the DownloadFileAsync and UploadFileAsync functions. The plugin accepted whatever file path the LLM’s output handed it, including traversal strings like ../../etc/passwd, with no check on whether the path stayed inside an intended directory. It was fixed in Microsoft.SemanticKernel.Core version 1.71.0. The GitHub security advisory lists the workaround: a Function Invocation Filter that allowlists the localFilePath argument before it reaches the file system.
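To make the missing check concrete, here is a minimal Python sketch of a containment test for a model-supplied path. It is not Semantic Kernel code and not the advisory's filter: `ALLOWED_ROOT` and `resolve_inside` are hypothetical names, and the sandbox directory is an assumption for illustration. It needs Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path

# Hypothetical sandbox directory; not a Semantic Kernel setting.
ALLOWED_ROOT = Path("/srv/agent-files")


def resolve_inside(root: Path, untrusted: str) -> Path:
    """Return the resolved path if it stays under root, else raise ValueError."""
    root = root.resolve()
    # Joining an absolute path replaces the root entirely, so the same
    # check covers both "../" traversal and absolute paths like /etc/passwd.
    candidate = (root / untrusted).resolve()
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes sandbox: {untrusted!r}")
    return candidate
```

Resolving before comparing is the point: a prefix check on the raw string would accept `reports/../../etc/passwd`. This sketch does not address races between the check and the later file operation, so it is a starting point rather than a complete defense.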
The second was CVE-2026-26030, in the Python SDK: eval injection inside InMemoryVectorStore’s filter functionality. User-supplied filter strings reached a Python code execution surface with no sanitization. It was fixed in python-1.39.4. The NVD advisory is blunt: avoid using InMemoryVectorStore in production until you are on the patched version.
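The general alternative to evaluating filter strings is to accept filters as data and map them onto a fixed set of operations. The sketch below illustrates that pattern only; it is not how the patched InMemoryVectorStore works, and `build_filter` and the operator names are hypothetical.

```python
import operator
from typing import Any, Callable

# Allowlisted comparison operators; anything else is rejected.
_OPS: dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "gt": operator.gt,
}


def build_filter(field: str, op: str, value: Any) -> Callable[[dict], bool]:
    """Build a record predicate from data; no string is ever evaluated as code."""
    if op not in _OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    if not field.isidentifier():
        raise ValueError(f"invalid field name: {field!r}")
    compare = _OPS[op]

    def predicate(record: dict) -> bool:
        # Records missing the field simply do not match.
        return field in record and compare(record[field], value)

    return predicate


records = [{"id": 1, "lang": "en"}, {"id": 2, "lang": "de"}]
is_english = build_filter("lang", "eq", "en")
english = [r for r in records if is_english(r)]
```

Because the operator is looked up in a dictionary rather than parsed, an attacker-controlled string such as `__import__('os').system('id')` fails the allowlist check instead of reaching an interpreter.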
Now a researcher has dropped a full-chain RCE disclosure on r/netsec showing six bypasses of the April 2026 mitigations across .NET v1.74 and Agent Framework 1.x. The chain combines the confused deputy pattern, where an AI agent is tricked into using its own granted permissions against the system that trusts it, with AutoInvoke behavior that executes plugin functions based on LLM-generated output without an explicit confirmation step.
The Pattern Nobody Wants to Name
The confused deputy problem in AI agents is documented and has been for over a year. Quarkslab published a detailed breakdown in January 2026. HashiCorp wrote about it in May 2025. The core of it: an AI agent operates with elevated permissions to do its job. An attacker injects malicious instructions into data the agent processes, like a document, a retrieved memory, or a tool response. The agent reads those instructions as legitimate and acts on them using the permissions it was given. The host system trusted the agent. The agent got played.
The reason this keeps reappearing across frameworks is structural. Developers building AI agents are thinking about what the agent should do. They are not thinking about what happens when the input the agent receives is adversarial. The LLM is not a trusted execution environment. It is a text transformer. If you wire tool-calling to its outputs without a verification layer, you have built a confused deputy by default.
Semantic Kernel’s AutoInvoke behavior is the exact mechanism that makes this exploitable at scale. When AutoInvoke is enabled, the framework will call registered plugin functions based on what the LLM says to call, with the arguments the LLM says to pass. That is a feature. It is also a direct path from prompt injection to arbitrary code execution if the plugin has file system or shell access and the arguments are not validated.
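What an explicit confirmation step could look like, in framework-agnostic Python: the model's proposed call is treated as a request, and a separate approval callback decides whether it runs. `ToolCall`, `GatedDispatcher`, and `policy` are hypothetical names for this sketch and are not part of the Semantic Kernel API.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class ToolCall:
    """A tool invocation proposed by the model (hypothetical shape)."""
    name: str
    arguments: dict[str, Any]


@dataclass
class GatedDispatcher:
    """Runs a proposed call only after an approval callback accepts it."""
    tools: dict[str, Callable[..., Any]]
    approve: Callable[[ToolCall], bool]

    def dispatch(self, call: ToolCall) -> Any:
        if call.name not in self.tools:
            raise KeyError(f"unknown tool: {call.name!r}")
        if not self.approve(call):
            raise PermissionError(f"call to {call.name!r} was not approved")
        return self.tools[call.name](**call.arguments)


# Example policy: read-only tools pass; everything else is refused here.
# A real deployment might route refused calls to a human reviewer instead.
READ_ONLY = {"get_weather"}


def policy(call: ToolCall) -> bool:
    return call.name in READ_ONLY


dispatcher = GatedDispatcher(
    tools={
        "get_weather": lambda city: f"sunny in {city}",
        "write_file": lambda path, data: len(data),
    },
    approve=policy,
)
```

The design choice is that approval lives outside the model's reach: injected text can change what the model asks for, but not what `approve` returns.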
What the Industry Is Getting Wrong
Most teams treating these as patch-and-move-on CVEs are misreading the situation. Two critical RCEs in three months, followed by a bypass chain weeks after the patch, is not a quality control problem. It is a signal that the security model underneath AI agent frameworks was not designed for adversarial input.
The attack surface for an AI agent is not the same as the attack surface for a traditional API. A traditional API receives structured input from a client you authenticated. An AI agent receives natural language from a model that processed documents, web content, retrieved memories, and prior tool outputs. Any of those surfaces can carry injected instructions. The agent has no way to distinguish a legitimate instruction from an injected one unless that verification is explicitly built into the architecture.
Nobody shipping agent frameworks is making that the default. The safe defaults are opt-in, when they exist at all.
What to Actually Do If You Are Building on Semantic Kernel
If you are on the .NET SDK, patch to Microsoft.SemanticKernel.Core >= 1.71.0 for CVE-2026-25592. If you are on the Python SDK, upgrade to python-1.39.4 or higher for CVE-2026-26030. Do not use InMemoryVectorStore in production on any version below the patch.
Beyond the patches, the architecture questions matter more than the version numbers:
- Disable AutoInvoke by default. Require explicit confirmation before any plugin function executes in response to LLM output. The productivity cost is real. So is the attack surface you close.
- Validate all file path arguments before they reach the file system. An allowlist is not optional. Treat any path value coming from LLM output as untrusted input, because it is.
- Treat retrieved content as untrusted. Documents, memory stores, web search results, and tool responses are all potential injection surfaces. If your agent can act on content it retrieves, that content can instruct it.
- Scope permissions to the minimum the agent needs. The confused deputy attack only works because the agent has permissions worth stealing. A file write plugin with no path restriction and AutoInvoke enabled is a loaded gun sitting next to a language model.
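The last point, scoping permissions, can be sketched as handing each agent only the tools it was explicitly granted rather than the full registry. This is an illustrative pattern in plain Python; `scope_tools`, `REGISTRY`, and the tool names are all hypothetical and do not correspond to Semantic Kernel's plugin registration API.

```python
from typing import Any, Callable

Tool = Callable[..., Any]


def scope_tools(registry: dict[str, Tool], granted: set[str]) -> dict[str, Tool]:
    """Return only the tools this agent was explicitly granted."""
    unknown = granted - set(registry)
    if unknown:
        raise KeyError(f"grant names unknown tools: {sorted(unknown)}")
    return {name: registry[name] for name in granted}


# Placeholder tools standing in for real implementations.
REGISTRY: dict[str, Tool] = {
    "search_docs": lambda query: [],
    "read_file": lambda path: "",
    "write_file": lambda path, data: None,
    "run_shell": lambda cmd: None,
}

# A summarizer agent needs to search and read, nothing more.
summarizer_tools = scope_tools(REGISTRY, {"search_docs", "read_file"})
```

An agent built from `summarizer_tools` cannot be steered into `write_file` or `run_shell` by injected text, because those functions were never registered with it.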
The researcher’s bypass chain on the April mitigations is the part that should be uncomfortable. Microsoft patched. The patches shipped. Then six bypasses landed inside a month. That is the actual tempo of this problem.
The frameworks are shipping fast because the market demands it. The security model is catching up in public, on production systems, with real permissions attached. That is the situation. Build accordingly.
Written by Nirav Joshi · Fullstack and Blockchain Developer