Path & Payload

The Exploit With a Hallucinated CVSS Score: Breaking Down the First Confirmed AI-Developed Zero-Day

An AI model had been trained on enough security content to recognize that exploit documentation sometimes includes severity ratings, and so it generated one, confidently, that doesn't correspond to any real entry. It also wrote educational docstrings explaining the exploit's behavior, structured the code like a textbook Python exercise, and annotated everything with the kind of inline commentary that's found in training data but not in operational tooling. Those artifacts, collectively inconsistent with how human developers write exploit code, are what tipped off Google Threat Intelligence Group.

This is the story of what GTIG found, what the May 2026 AI Threat Tracker report documents beyond the headline incident, and why the finding matters beyond its "first confirmed AI-developed zero-day" label: it shows that the race on both sides has been underway longer than most defenders realize.

The Incident

The analysis here draws primarily from the GTIG AI Threat Tracker report released today on the Google Cloud blog. Supporting sources include CyberScoop's interview with GTIG chief analyst John Hultquist and same-day coverage from BleepingComputer, CSO Online, SecurityWeek, and The Register.

Google withheld substantial detail: the threat group is unnamed, the vulnerability is unidentified, and the targeted software is described only as a popular open-source, web-based system administration tool. The stated reason is to protect the vendor and to preserve the disruption operation.

What is confirmed is that a prominent cybercrime group planned a mass exploitation campaign. The vulnerability lay in a Python script and enabled a 2FA bypass on the unnamed administration tool (though Google notes that exploitation still required valid user credentials). A patch exists and has been applied. GTIG has high confidence that an AI model was used to both discover and weaponize the vulnerability, though the confidence is slightly stronger on weaponization than on discovery. Google is confident the model was not Gemini or Anthropic's Mythos.

The Chain: AI Across the Exploit Development Lifecycle

Stage 1: Vulnerability Discovery

ATT&CK: T1595.002 (Active Scanning: Vulnerability Scanning), T1588.006 (Obtain Capabilities: Vulnerabilities)

GTIG has high confidence AI was involved in identifying the vulnerability, but this is the part of the chain where certainty drops slightly. The report states AI "supported the discovery and weaponization," with discovery qualified more carefully than weaponization. How the model was directed to search, what scope it was given, and whether it was operating autonomously or as an interactive tool for a human researcher are all unknown from available reporting.

What can be inferred: AI-assisted vulnerability research doesn't require a novel technique. The GTIG report makes clear that the North Korean group APT45 has been sending thousands of repetitive prompts to AI models to analyze known CVEs and validate proof-of-concept exploits. The approach is not creative, but the advantage is scale. The same principle likely applies here.
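
Mechanically, that scale play is unremarkable. A minimal sketch of what "thousands of repetitive prompts" amounts to in code, assuming a generic chat-completion endpoint; the URL, model name, and prompt template are placeholders, not details from the report:

```python
# Minimal sketch of scale-driven CVE triage prompting. The endpoint, model
# name, and prompt template are hypothetical placeholders.
import json
import urllib.request

API_URL = "https://api.example-llm-provider.com/v1/chat"  # placeholder
PROMPT = "Summarize the root cause and exploitation preconditions of {cve}."

def triage(cve_ids):
    """Send one analysis prompt per CVE; the advantage is volume, not novelty."""
    results = {}
    for cve in cve_ids:
        payload = json.dumps({
            "model": "generic-model",
            "messages": [{"role": "user", "content": PROMPT.format(cve=cve)}],
        }).encode()
        request = urllib.request.Request(
            API_URL, data=payload, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request) as response:
            results[cve] = json.load(response)
    return results
```

A loop like this, run thousands of times, is the entire technique: repetitive and uncreative, effective purely because of throughput.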

Stage 2: Exploit Development and Weaponization

ATT&CK: T1588.005 (Obtain Capabilities: Exploits), T1587.004 (Develop Capabilities: Exploits)

This is where the forensic evidence is clearest. The exploit code contained a suite of artifacts inconsistent with human authorship, including educational docstrings describing the exploit's behavior, heavily annotated code, textbook Python formatting characteristic of LLM training data, and a hallucinated CVSS score: a severity rating the model generated by pattern-matching familiar documentation formats, not by reference to any real CVSS entry.

This is significant for two reasons. First, it's the primary basis for GTIG's confidence assessment. They're not inferring AI involvement from behavioral metadata or telemetry; they found it in the code itself. Second, it suggests the model wasn't consulted lightly. The volume and distribution of AI-characteristic artifacts across the exploit indicate heavy involvement, not spot assistance with a specific function. The threat group appears to have delegated substantial portions of the development process to the model.

To a human analyst, the annotated code likely read as unusually thorough for an exploit: the kind of inline explanation a model produces to look complete and professional, not something written under operational conditions.
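
None of the actual exploit code is public, so the snippet below is a hypothetical, defanged illustration of those artifact classes. The CVE identifier and CVSS score are invented, and the function is a harmless reachability check rather than an exploit:

```python
# Hypothetical, defanged illustration of AI-characteristic artifacts.
# The CVE ID and CVSS score are deliberately fictitious.
import urllib.request

def check_session_replay(token: str, target_url: str) -> bool:
    """
    Checks whether the target is affected by CVE-2026-99999 (CVSS 9.8).

    This function demonstrates how a previously issued session token can be
    replayed against the administration endpoint. It is intended for
    educational purposes and authorized testing only.
    """
    # Step 1: Construct the request with the replayed bearer token.
    request = urllib.request.Request(
        target_url, headers={"Authorization": f"Bearer {token}"}
    )
    # Step 2: Send the request and capture the HTTP status code.
    with urllib.request.urlopen(request) as response:
        status = response.status
    # Step 3: A 200 response without a challenge suggests the replay worked.
    return status == 200
```

The step-numbered comments, the embedded severity score, and the "educational purposes" boilerplate are the tells: common in tutorials and vulnerability write-ups, rare in code written for a live campaign.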

Stage 3: Planned Mass Exploitation β€” Disrupted

ATT&CK: T1190 (Exploit Public-Facing Application), T1556.006 (Modify Authentication Process: Multi-Factor Authentication)

The threat group's intent was mass exploitation: not a targeted intrusion but a broad campaign against any vulnerable instance of the unnamed tool. GTIG's proactive counter-discovery, the mechanics of which haven't been detailed, identified the campaign in development and allowed the vendor to patch before the operation launched.

The disruption may have had an additional assist from the exploit itself. Reporting suggests that implementation errors in the code likely interfered with successful use. This is worth pausing on: the AI-generated exploit had a flaw that may have blunted its own effectiveness. The model produced something that looked complete (annotated, scored, formatted), but operational quality didn't match documentation quality. At this stage of AI-assisted exploit development, the output can be sophisticated in form while remaining unreliable in function.

But this asymmetry probably won't last.

The Pattern Behind the Incident

The zero-day finding is the headline, but the GTIG report documents something larger: AI is now operating across multiple threat actor categories simultaneously.

Nation-state actors:

APT45 (North Korea) has been sending thousands of repetitive prompts to AI models to recursively analyze CVEs and validate proof-of-concept exploits at a scale that would be unmanageable through manual research. The GTIG report describes this as producing a substantially more robust exploit arsenal than would be achievable otherwise.

UNC2814, a China-linked group with a history of targeting telecommunications and government entities across more than 42 countries, used persona-driven jailbreaking on Gemini to conduct vulnerability research on TP-Link firmware and Odette File Transfer Protocol implementations. They instructed the model to operate as a senior security auditor, then directed it to analyze firmware for pre-authentication remote code execution flaws.

Russian-nexus actors used AI-generated decoy code to obfuscate malware. Two specific families, CANFAIL and LONGSTREAM, contained AI-generated logic embedded specifically to complicate analysis, not to perform any operational function. The decoy code is defense evasion by noise; a sketch of what that noise can look like follows the ATT&CK mapping below.

ATT&CK: T1588.006 (APT45), T1595.002 + T1588.006 (UNC2814), T1027 (Russia/CANFAIL, LONGSTREAM)
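
GTIG hasn't published the decoy logic itself, so the sketch below is a hypothetical illustration of the technique: coherent, plausible-looking functions that nothing ever calls, present only to burn analyst time. The names are invented, not taken from the CANFAIL or LONGSTREAM samples.

```python
# Hypothetical illustration of AI-generated decoy logic: dead code that reads
# as operational. Nothing in the program ever calls these functions.
import hashlib

def rotate_session_keys(seed: bytes, rounds: int = 16) -> list:
    """Derives a chain of session keys. Plausible, coherent, and never used."""
    keys = [hashlib.sha256(seed).digest()]
    for _ in range(rounds - 1):
        keys.append(hashlib.sha256(keys[-1]).digest())
    return keys

def sync_telemetry_buffer(buffer: list) -> int:
    """Pretends to flush a telemetry queue. Also dead code."""
    return sum(len(str(entry)) for entry in buffer)

# An analyst has to read and rule out every function like these before
# reaching the code paths that matter. The wasted reading time is the point.
```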

Criminal actors and supply chain targeting:

A separate finding targets the other side of the equation: an adversary-usable skill plugin designed to be loaded into Claude Code (Anthropic's terminal-based agentic coding tool), containing vulnerability knowledge distilled from 85,000 real-world cases collected by a Chinese bug bounty platform. The plugin is a purpose-built capability distribution mechanism. Rather than training a model from scratch, an adversary packages high-fidelity vulnerability knowledge into a reusable configuration that any user of the agentic tool can load. AI capability as supply chain artifact.

Another finding concerns how threat actors are maintaining access to frontier AI models. Rather than using individual accounts, actors are building industrialized access infrastructure: automated account creation, proxy relay networks, and account-pooling systems that distribute usage across identities to avoid rate limiting and detection. The mechanics mirror botnet infrastructure applied to a different purpose, but the operational maturity they imply is significant.

Autonomous malware:

GTIG identified a malware family called PROMPTSPY that represents a qualitative shift from AI-assisted development to AI-assisted execution. Rather than a tool built with AI help and then deployed, PROMPTSPY uses an AI model at runtime to interpret system states and dynamically generate commands, meaning the malware's behavior is determined partly by what the model decides to do after observing the victim environment. The GTIG report describes this as a move toward autonomous attack orchestration. It's early-stage, but it belongs to a different risk category than what most defender frameworks are built to address (a detection corollary is sketched after the framework mapping below).

MITRE ATLAS: AML.T0103 (Deploy AI Agent), AML.T0102 (Generate Malicious Commands), AML.T0040 (AI Model Inference API Access); related ATT&CK behavior includes command-and-control activity.
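
One defender-relevant corollary: malware that consults a model at runtime has to reach an inference API over the network. A rough hunting sketch along those lines, assuming a CSV proxy log with src_ip and dest_host columns; the hostnames and allowlist are illustrative values, not indicators from the report:

```python
# Rough egress-hunting sketch: flag hosts talking to model-inference APIs
# that have no business doing so. Log format and host lists are assumptions.
import csv

AI_API_HOSTS = {
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
}
APPROVED_SOURCES = {"10.0.5.12"}  # hosts sanctioned to call model APIs

def find_unexpected_model_calls(proxy_log_path: str) -> list:
    """Scan a CSV proxy log for model-API egress from unapproved hosts."""
    hits = []
    with open(proxy_log_path, newline="") as fh:
        for row in csv.DictReader(fh):
            if row["dest_host"] in AI_API_HOSTS and row["src_ip"] not in APPROVED_SOURCES:
                hits.append(row)
    return hits
```

This won't catch malware that proxies its model traffic elsewhere, but it makes the cheap variant of runtime model use visible.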

Where Defenders Had a Window

1. Proactive counter-discovery. GTIG's detection of the campaign before it launched is the most important defensive data point in this incident, and the one with the fewest transferable lessons for most organizations. GTIG has capabilities, intelligence relationships, and visibility that don't generalize. It does show, however, that the intelligence layer is crucial. Sharing threat intelligence with vendors and security community partners, rather than treating detection as a purely internal capability, creates the conditions for preemption.

2. The implementation error window. The exploit had a flaw that limited its effectiveness, but defenders shouldn't assume that limitation will carry over to future variants. The same AI capability trajectory that produced this exploit will close the error gap. Relying on attacker implementation errors as a buffer is not a defensive posture.

3. Multi-factor authentication architecture. The targeted vulnerability was a 2FA bypass in a web-based administration tool. The existence of a bypass doesn't make MFA worthless; it means implementation quality matters as much as implementation presence. Phishing-resistant MFA (hardware keys, passkeys) and privileged access controls on administration interfaces reduce the viable attack surface regardless of whether a bypass technique was AI-generated or not.

4. AI artifact detection. GTIG identified this exploit partly because of AI-characteristic code patterns. Hallucinated metadata, educational docstrings, and textbook LLM formatting in exploit code are detectable: not reliably at scale yet, but the artifact classes are identifiable. Security teams conducting malware analysis or red team review are in a position to develop heuristics around these patterns now, rather than after the artifacts get refined out.
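
The artifact classes are concrete enough to screen for mechanically. A minimal sketch of that kind of triage heuristic, assuming plain-text source samples; the patterns and phrase list are illustrative starting points, not validated detection logic:

```python
# Minimal heuristic screen for AI-characteristic artifacts in a code sample.
# Patterns and phrases are illustrative guesses, not validated detections.
import re

CVSS_PATTERN = re.compile(r"CVSS[:\s]*([0-9]{1,2}\.[0-9])", re.IGNORECASE)
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,7}")

def ai_artifact_signals(source: str) -> dict:
    """Collect the artifact signals GTIG describes from one source file."""
    lines = source.splitlines() or [""]
    comment_lines = sum(1 for l in lines if l.lstrip().startswith("#"))
    return {
        # Unusual documentation density for operational tooling.
        "comment_ratio": round(comment_lines / len(lines), 2),
        "has_docstrings": '"""' in source or "'''" in source,
        # Severity ratings embedded in code are rare outside documentation;
        # verify any hits against NVD by hand.
        "cvss_mentions": CVSS_PATTERN.findall(source),
        "cve_mentions": CVE_PATTERN.findall(source),
        "educational_phrases": [
            p for p in ("educational purposes", "for demonstration")
            if p in source.lower()
        ],
    }
```

None of these signals is conclusive on its own, but a hallucinated CVSS score is cheap to confirm: the referenced CVE either exists with that score or it doesn't.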

Technique in Focus: AI-Assisted Exploit Development and the Symmetry Problem

T1587.004: Develop Capabilities: Exploits

Most coverage of this report outside security circles treats the AI-developed zero-day as a warning about an emerging threat. That framing is already out of date.

In late 2024, Google's Big Sleep AI agent, a collaboration between DeepMind and Project Zero, found its first real-world zero-day vulnerability in SQLite. In July 2025, Big Sleep was used to identify a separate SQLite vulnerability that was about to be weaponized, allowing Google to cut off the attack before deployment. Defensive use of AI for vulnerability discovery has been operational for over a year, and attackers were watching. The GTIG report confirms the offense side is now running the same play.

The "first confirmed" framing in the report and all downstream coverage is precise in one way and misleading in another. It's the first time GTIG found forensic evidence of an AI-developed exploit in a real-world campaign. But Hultquist told CyberScoop explicitly that it's probably not the first time this has happened β€” it's just the first time they could prove it. The implication for defenders is significant: if your posture has been treating AI-assisted exploitation as something to plan for, the planning horizon has passed. You're inside it now, operating in an environment where some of the exploits targeting your systems may already have been built by models β€” and the ones that don't include a hallucinated CVSS score won't announce themselves the way this one did.

Takeaway for Defenders

The patch exists. If you're running a popular open-source, web-based administration tool with a 2FA implementation, Google and the vendor coordinated a fix before this was published. Whether that describes your environment is something your team will need to verify.

The GTIG report makes clear that AI is now operational across the attack lifecycle: from vulnerability discovery to weaponization, from malware obfuscation to autonomous execution. The response posture this demands isn't different in kind from sound security hygiene, but it changes the timeline assumptions underlying every prioritization decision. Exploit development cycles are compressing. The gap between vulnerability disclosure and weaponized exploit will continue to shrink as model capabilities grow.

For administration interfaces specifically: assess which are internet-facing, what authentication mechanisms are in place, and whether privileged access is appropriately scoped. Web-based administration tools with 2FA are confirmed targets of interest. Phishing-resistant MFA and network-level access controls on administration surfaces reduce the viable attack surface regardless of how the specific exploit was built.
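
As a first pass at that assessment, a small stdlib-only sketch that probes an inventory of administration URLs from outside the perimeter and notes which answer without an authentication challenge; the URLs are placeholders for your own inventory:

```python
# Quick exposure check for administration interfaces: does each URL answer,
# and does it demand authentication? Run from outside the perimeter to
# approximate an attacker's view. The URLs below are placeholders.
import urllib.error
import urllib.request

ADMIN_URLS = [
    "https://admin.example.com:10000/",
    "https://intranet.example.com/manage/",
]

def probe(url: str) -> str:
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return f"{url}: reachable, HTTP {response.status} (no auth challenge?)"
    except urllib.error.HTTPError as exc:
        tag = "auth required" if exc.code in (401, 403) else f"HTTP {exc.code}"
        return f"{url}: reachable, {tag}"
    except (urllib.error.URLError, OSError):
        return f"{url}: not reachable from this vantage point"

for result in map(probe, ADMIN_URLS):
    print(result)
```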

On AI artifact detection: if your team does malware analysis or red team review, developing familiarity with the artifact classes GTIG identified is worth the time now. Hallucinated metadata, unusual documentation density, and LLM-characteristic code structure are not definitive indicators, but they're identifiable. In an environment where AI-assisted exploit development is confirmed operational, having that pattern recognition is more useful than not having it.

