AUSTIN, Texas, March 11, 2026 (GLOBE NEWSWIRE) -- DryRun Security, the industry’s first AI-native code security intelligence company, today released The Agentic Coding Security Report, new research examining how leading AI coding agents perform when building real applications.

The study found that while AI coding agents significantly accelerate software development, they also consistently introduce security vulnerabilities during the development process. Among the agents evaluated, Anthropic’s Claude produced the highest number of unresolved high-severity security flaws in the final applications.

DryRun evaluated three leading coding agents (Claude, Codex, and Gemini) as they developed two full applications through sequential pull requests, mirroring how real engineering teams implement features over time.

Across the study:

26 of 30 pull requests (87%) introduced at least one vulnerability

143 security issues were identified across 38 security scans

The same vulnerability classes appeared repeatedly across all agents

None of the agents produced a fully secure application

“AI coding agents can produce working software at incredible speed, but security isn’t part of their default thinking,” said James Wickett, CEO of DryRun Security. “In our usage and experience, AI coding agents often missed adding security components or created authentication logic flaws. These mistakes and gaps are exactly where attackers win.”

Claude Produced the Most Unresolved High-Severity Vulnerabilities

While all three agents introduced security flaws during development, the study showed clear differences in their final security posture.

Claude produced the highest number of unresolved high-severity vulnerabilities in the final applications.

Codex ultimately finished with the fewest vulnerabilities and demonstrated stronger remediation behavior during development.

Gemini introduced multiple issues early in its work and, as development continued, removed some of them through later modifications. However, it still ended with several high-severity findings.



Despite these differences, no agent produced a fully secure application.

Recurring Security Failures Appeared Across Every Codebase

Several vulnerability classes appeared consistently across both applications and all agents, many of which align with the OWASP Top 10.

Four weaknesses appeared in every final codebase, all related to authentication:

Insecure JWT verification and management

Lack of application-level brute force protections

Exposure to token replay attacks

Insecure defaults for refresh token cookie configurations

In multiple cases, agents implemented security mechanisms but failed to apply them consistently across the system. For example, authentication middleware was created for REST APIs but never applied to WebSocket endpoints, leaving parts of the application exposed.
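One way to avoid this kind of gap is to enforce authentication at a single dispatch point shared by every transport, so that a route is protected unless it explicitly opts out. The sketch below illustrates that idea only; the Route and Router classes, the token_auth function, and the example paths are hypothetical and do not come from the applications in the study or from any particular web framework.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Handler = Callable[[dict], str]


@dataclass
class Route:
    transport: str          # "http" or "websocket"
    path: str
    handler: Handler
    public: bool = False    # routes are protected unless explicitly opted out


@dataclass
class Router:
    # authenticate returns a user name, or None when the request is not trusted.
    authenticate: Callable[[dict], Optional[str]]
    routes: Dict[Tuple[str, str], Route] = field(default_factory=dict)

    def add(self, transport: str, path: str, handler: Handler,
            public: bool = False) -> None:
        self.routes[(transport, path)] = Route(transport, path, handler, public)

    def dispatch(self, transport: str, path: str, request: dict) -> str:
        route = self.routes[(transport, path)]
        # The auth check lives here, not in per-transport middleware,
        # so a WebSocket endpoint cannot silently skip it.
        if not route.public:
            user = self.authenticate(request)
            if user is None:
                return "401 Unauthorized"
            request = {**request, "user": user}
        return route.handler(request)

    def public_routes(self) -> List[Tuple[str, str]]:
        # Audit helper: list every route that bypasses authentication.
        return sorted(key for key, r in self.routes.items() if r.public)


def token_auth(request: dict) -> Optional[str]:
    # Placeholder check for illustration; a real system would verify a signed token.
    return "alice" if request.get("token") == "valid-token" else None


router = Router(authenticate=token_auth)
router.add("http", "/api/profile", lambda req: f"profile for {req['user']}")
router.add("websocket", "/ws/updates", lambda req: f"stream for {req['user']}")
router.add("http", "/health", lambda req: "ok", public=True)
```

With this layout, an unauthenticated request to the hypothetical /ws/updates endpoint is rejected in the same way as one to /api/profile, and public_routes() gives reviewers a short list of deliberate exceptions to inspect.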

Agentic Development Requires Continuous Security Review

For the study, DryRun designed two applications, a web app to track family allergies and a browser-based racing game, and had each agent build features incrementally through pull requests, much like real-life agentic development.

Each change was analyzed with DryRun Security before the next feature was implemented, followed by a full DeepScan of the final codebases. The results show that security risk accumulates quickly during agent-driven development if code is not reviewed continuously and remediated as part of the process.

DryRun Security’s Contextual Security Analysis evaluates how applications behave in context, allowing teams to identify the systemic security gaps introduced by AI-generated code.

The full report can be downloaded here.

