How can Aona AI help with AI governance and security?

Aona AI provides enterprise-grade AI governance, shadow AI discovery, and security guardrails. It helps security and compliance teams see every AI tool in use, enforce usage policies, and prevent data leaks across the organisation.

What is Shadow AI and why is it a risk?

Shadow AI refers to AI tools and applications used by employees without IT or security team awareness or approval. It poses risks including data leakage, compliance violations, and unvetted AI outputs influencing business decisions.

How does Aona AI enforce AI usage policies?

Aona AI enforces AI usage policies by monitoring AI interactions in real time, blocking or alerting on policy violations, and providing visibility into how AI tools are being used across the organisation — all without disrupting employee productivity.

Is Aona AI suitable for regulated industries like finance and healthcare?

Yes. Aona AI is designed for regulated industries where data protection and compliance are critical. It supports GDPR, ISO 27001, and sector-specific requirements by providing audit trails, policy enforcement, and risk controls for AI usage.

When Your AI Coding Agent Becomes the Attacker: The Prompt Injection Threat Enterprises Are Ignoring

Q: What is the main takeaway from "When Your AI Coding Agent Becomes the Attacker: The Prompt Injection Threat Enterprises Are Ignoring"?

This article covers key insights and best practices related to When Your AI Coding Agent Becomes the Attacker: The Prompt Injection Threat Enterprises Are Ignoring. Read the full post for detailed guidance, actionable steps, and expert recommendations from the Aona AI team.

Security researchers just hijacked three of the most widely-deployed AI coding agents in enterprise environments. They didn't exploit a traditional software vulnerability. They didn't brute force any credentials. They wrote a pull request title.

That's it. A malicious PR title. Claude Code, Google's Gemini CLI Action, and Microsoft's GitHub Copilot — all compromised, all used to exfiltrate API keys and access tokens. And here's the part that should keep your security team up at night: not one of the affected vendors issued a CVE or published a public security advisory.

Researchers from Johns Hopkins University disclosed these findings this week, and they're calling the attack pattern "comment-and-control" — a deliberate play on command-and-control, because the entire attack infrastructure lives inside GitHub itself. No external servers. No malware. Just AI agents doing exactly what they were designed to do: read GitHub data and take action on it.

How the Attack Actually Works

AI coding agents like Claude Code Security Review and Gemini's CLI Action follow a predictable flow. They read GitHub content — PR titles, issue bodies, comments — treat that content as trusted context, and then execute tasks based on what they find. That design assumption is the vulnerability.

Researcher Aonan Guan from Johns Hopkins realised that if he could inject instructions into the data the AI was reading, he could redirect what the AI did. So he submitted a pull request and embedded malicious instructions directly into the PR title — essentially telling Claude to run bash commands and return the output as a "security finding."

Claude ran the commands. Then it posted the results, including environment credentials, as a PR comment. Attacker writes the title. Attacker reads the bot's review comment. Done.

For Gemini, the researchers escalated the injection through comment chains, ultimately tricking Gemini into publishing its own API key as a GitHub issue comment. GitHub paid $1,337 for that one — a cheeky nod from the security community.

GitHub Copilot's agent was harder. GitHub had added three runtime-level defenses: environment filtering, secret scanning, and a network firewall. Guan bypassed all three by hiding the malicious prompt inside an HTML comment — invisible to human reviewers in GitHub's rendered Markdown, but fully visible to the AI processing the raw content. The victim assigns the issue to Copilot to fix. Copilot leaks the secrets. The attacker cleans up their tracks by editing the issue title and deleting the bot's response.

The Quiet Disclosure Problem

Here's where this gets worse for enterprises. Anthropic was initially notified of the Claude Code flaw in October. They paid a $100 bug bounty — the lowest of the three — upgraded the severity to 9.4 (critical), and added a footnote to their documentation saying the action "is not hardened against prompt injection attacks."

No CVE. No public advisory. No notification to users running vulnerable versions.

Google paid $1,337 and credited the research team. Still no CVE. Microsoft, after initially saying it was a "known issue" they "were unable to reproduce," eventually paid $500 in March. No public disclosure.

Guan told The Register: "I know for sure that some of the users are pinned to a vulnerable version. If they don't publish an advisory, those users may never know they are vulnerable — or under attack."

That sentence describes the current state of enterprise AI security perfectly. Your team deploys an AI agent to help with code review. Researchers find it can be hijacked to steal every secret in your GitHub Actions environment. The vendor patches it quietly. You never know.

This Isn't an Edge Case

What makes this particularly uncomfortable is that the attack surface isn't exotic. GitHub Actions is used by hundreds of thousands of organisations. AI coding agents — Claude Code, Copilot, Gemini integrations — are being deployed at pace across engineering teams everywhere. Most organisations have no visibility into which AI tools their developers are actively using in CI/CD pipelines, let alone whether those tools are running patched versions.

This lands squarely in shadow AI territory. Not the kind where an employee opens ChatGPT in a browser tab — the kind where AI agents are embedded in core development workflows with access to production secrets, and nobody in security has a full inventory of what's running where.

And this week brought a second related story: a design flaw in Anthropic's Model Context Protocol (MCP), the open standard that lets AI models connect to external tools and data sources. The Ox security team found that MCP's use of STDIO as a local transport mechanism effectively allows arbitrary OS command execution — and the vulnerability pattern cascades through every project using Anthropic's official MCP SDK, covering 150 million package downloads across Python, TypeScript, Java, and Rust.

Anthropic declined to patch the protocol itself, calling the behavior "expected." Ten high and critical severity CVEs have since been issued for individual MCP-dependent projects. The root cause remains unfixed.

What Enterprises Need to Do Now

A few practical steps worth actioning this week:

Audit your GitHub Actions for AI agent integrations. Any workflow that invokes Claude Code, Gemini CLI Action, or GitHub Copilot should be reviewed. Check which version is pinned. Confirm whether it's been updated since late 2025.

Restrict external PR access to AI agent workflows. All three vendors recommend requiring maintainer approval before AI agents process external contributions. This won't prevent insider attacks, but it closes the most obvious external attack vector.

Treat PR and issue content as untrusted input. If an AI agent in your pipeline reads user-supplied content and takes action on it, that's a prompt injection surface. Organisations are only beginning to build the security models needed to handle this correctly.

Get visibility before you govern. The harder problem is that most security teams don't have a reliable inventory of which AI tools are actively running in their environment — especially in developer workflows. You can't patch what you don't know about, and you can't enforce policy on AI agents you haven't discovered.

This last point is where Aona's [Shadow AI Discovery](https://aona.ai/product) fits in. Getting a full picture of AI tool usage across the enterprise — including AI agents embedded in developer tooling — is the prerequisite for any meaningful governance. The attack described this week doesn't require an employee to deliberately take a risk. It exploits AI tools that IT already approved, running in pipelines IT owns, doing exactly what they were designed to do.

The threat model for enterprise AI security just got meaningfully more complex. The good news is that the attack is understood, the mitigations are documented, and none of this requires waiting for vendors to act. The question is whether your security team has the visibility to know where to start.

When Your AI Coding Agent Becomes the Attacker: The Prompt Injection Threat Enterprises Are Ignoring

In This Article

Key Takeaways

How the Attack Actually Works

The Quiet Disclosure Problem

This Isn't an Edge Case

What Enterprises Need to Do Now

Ready to Secure Your AI Adoption?

Explore more insights on AI governance & security

Shadow Agents: Your Employees Aren't Just Using Shadow AI Anymore — They're Building It

An AI Agent Hacked a Hardened OS in Four Hours. Here's What That Means for Enterprise Security.

Microsoft Just Added Shadow AI Controls to Edge. Here's Why That's Not Enough.

Product

Solutions

Resources

Compare

Compliance

Company

Contact