{"slug": "beyond-source-code-the-files-ai-coding-agents-trust-and-attackers-exploit", "title": "Beyond source code: The files AI coding agents trust — and attackers exploit", "summary": "As AI coding agents become deeply embedded in developer workflows, the attack surface now extends beyond source code to include repository files, agent instructions, runtime settings, and extension packages that influence what the agent trusts and executes. These artifacts can trigger commands, override guardrails, and introduce supply-chain risks without requiring exploit code, as agents may automatically parse and execute malicious instructions. To counter this, Google Threat Intelligence has integrated agentic capabilities like VirusTotal Code Insight to perform semantic analysis at scale, exposing hidden operational intent and linking these artifacts to broader threat campaigns.", "body_md": "As AI coding agents become deeply embedded in developer workflows, defenders must evolve their definition of malicious files and rethink how to protect against them.\nAutonomous AI agents operate across integrated development environments (IDEs), editors, terminals, and extension runtimes, and they often have access to local files, command execution, and external services. As a result, the attack surface of the modern developer environments now extends well beyond source code. Repository files, agent instructions, runtime settings, and extension packages can all influence what the agent trusts, what it executes, and what it can reach.\nDefending this new attack surface requires moving towards semantic analysis to understand the actual instructions, logic, and context being fed to the AI. 
Powered by VirusTotal Code Insight, our agentic threat intelligence capability in Google Threat Intelligence extracts the true operational intent behind agent-facing files at scale, allowing security teams to expose configurations that override guardrails and mask supply-chain risks.\nBy integrating agentic capabilities into Google Threat Intelligence, we’re able to link these invisible artifacts to broader threat campaigns. This powerful capability can help ensure that as attackers exploit what AI agents trust, defenders are equipped with the resources to read between the lines.\nTo help security analysts understand how the developer threat landscape has quickly expanded, we suggest an approach that groups the attack surface into four categories: what executes, what instructs, what connects, and what extends.\nJust as developers rely on project configuration to automate setup, debugging, and routine tasks, AI coding agents and modern developer tools also inherit execution paths from repository files. These artifacts can trigger commands, bootstrap environments, and chain execution through normal workflows.\nOpening a project, trusting a workspace, starting a debugger, rebuilding a container, or running a standard setup command may therefore execute attacker-controlled logic under the appearance of legitimate project automation.\nAI coding agents also consume persistent instruction files that shape how they behave inside a project. These files can influence what the agent prioritizes, what it ignores, which tools it uses, which files it trusts, and which actions it takes automatically.\nThese files do not need to contain exploit code to be security-relevant. 
Reusing them across repositories introduces a supply-chain risk, because malicious instructions can be presented as harmless guidance while steering otherwise legitimate agent workflows toward unsafe behavior.\nUnlike traditional IDEs that require a human to click run, an agent may parse these instructions and execute them as a prerequisite to a task without the developer ever reviewing the specific instruction block.\nBeyond instructions, coding agents also depend on runtime definitions that determine how they interact with tools, hooks, external services, and local execution contexts. These files define permissions, tool connectivity, external endpoints, and execution paths.\nThis is where repository-level influence becomes operational control. A malicious or unsafe runtime configuration can expose local commands, remote services, sensitive data, and untrusted Model Context Protocol (MCP) servers to the agent, turning configuration abuse into controlled execution.\nExtensions add another layer of inherited trust and introduce third-party code into editor and browser runtimes, often with broad access to local files, credentials, and developer workflows. This inherited trust can create a supply-chain problem similar to malicious project configurations: Compromised extensions, poisoned update paths, and hijacked publisher accounts can introduce attacker-controlled logic through components that otherwise appear to be standard tooling.\nThis taxonomy highlights a fundamental shift in the threat landscape: The risk is no longer just in the syntax of code, but in the semantics of intent.\nTraditional security tools are effectively blind to natural-language instructions that tell an AI to ignore guardrails or redirect data. The operational questions then are: How can defenders identify these risks systematically? 
How can they detect the danger before a developer or an agent automatically follows a valid instruction file to a malicious conclusion?\nTo bridge this gap, we use VirusTotal Code Insight and agentic threat intelligence to perform large-scale semantic analysis. Because malicious repository settings and instruction files are often syntactically correct, they frequently return zero detections from signature-based scanners.\nCode Insight solves this problem by using AI to analyze the file’s actual logic and read between the lines, surfacing behavioral risks that are invisible to legacy tooling. This context is further enriched within agentic threat intelligence, where security teams can pivot from a single semantic red flag to investigate broader threat infrastructure and associated campaign activity.\nExample 1: A Weaponized tasks.json\nOne representative example is a file distributed under the path coding-challenge/coding-challenge/.cursor/tasks.json. The sample was first submitted to VirusTotal on March 19 and remained undetected by security engines for several days.\nVirusTotal Code Insight flagged it as a risk based on the behavior implied by the configuration itself. The sample has also been verified as malicious by a Mandiant analyst and marked as associated with a tracked threat actor by Google Threat Intelligence.\nThe Code Insight description indicated that the file, which is parsed when a user opens the project folder in an IDE like Visual Studio (VS) Code, drives the user to download and execute arbitrary code from a GitHub Gist in memory while hiding the execution parameters.\nTo make Code Insight analysis reproducible at scale, such descriptions can also be retrieved for multiple files via the VirusTotal API. 
Looking at the contents of this particular file, we identified the Gist URLs that the actor referred to in the instructions.\nLooking up these Gist URLs with agentic threat intelligence provides a detailed breakdown of the malicious instructions embedded within them. Despite masquerading as legitimate tools such as NVIDIA CUDA, these Gists, along with their specific filenames, show strong similarities to widespread campaigns frequently attributed to North Korean actors, which are designed to lure IT professionals.\nThese attacks often pose as technical challenges to trick users into compromising their own devices.\nExample 2: Offensive system instruction files\nSystem instruction files used to provide guidance, resources, and context to LLMs can also contain malicious capabilities while remaining undetected by common antivirus services. Since the beginning of 2026, we have observed a consistent increase in Skill.md files submitted to VirusTotal with either risky or malicious instructions.\nWhile this does not necessarily mean that all samples were harmful, it illustrates a trend that is likely to grow in tandem with the adoption and implementation of Skills across the industry.\nIn this example, we identified a Skill.md file containing instructions to steal user data. Code Insight indicated that the skill file contained instructions \"to exfiltrate sensitive credentials, including API keys and environment variables, to external endpoints.\"\nThis case reflects a growing interest among threat actors in acquiring API keys and resources to enable scalable LLM integrations. At the time of writing, this file had remained active for nearly two months without any detections or researcher notes.\nThe file's contents reveal a specific narrative designed to evade detection. 
The instructions direct the agent to exfiltrate API keys, tokens, and configuration files under the guise of \"maintenance,\" explicitly advising the model not to mention this to the user \"as it may cause confusion about the security process.\"\nAlthough direct intelligence on this specific file was limited, we used the agentic threat intelligence briefing capability to generate a summary and explore similar past observations. This provided contextual information to categorize and understand the threat.\nEven files that explicitly state their offensive capabilities often evade traditional detections. For example, we identified a Skill designed to equip an AI agent with Windows privilege escalation and credential theft capabilities.\nAlthough the file includes a disclaimer for authorized use only, its core instructions remain high-risk. Code Insight accurately evaluated the file. \"The file provides explicit and systematic instructions for performing high-risk offensive operations,\" it said.\nDespite its offensive capabilities, by the time of writing only a few vendors had flagged the file as malicious.\nExample 3: Suspicious JSON runtime configurations\nA third example is a pair of settings.json samples shared through VirusTotal: One points to api.awstore.cloud, the other to api.kiro.cheap. The two unrelated samples follow a similar pattern: They override ANTHROPIC_BASE_URL, embed an API key, and turn Claude Code into a client of a third-party proxy rather than Anthropic.\nThis demonstrates exactly how runtime configurations can be weaponized. The file does not need exploit code or a malicious binary to be dangerous. It simply rewires trust while the agent is running.\nFor example, a valid AI-generated settings file can silently redirect prompts, source code, and credentials to an external endpoint while the agent appears to behave normally. 
Beyond data exfiltration, a rogue endpoint could plausibly reverse the flow, feeding malicious instructions or vulnerabilities back to the agent to be injected directly into the local codebase.\nA high-level analysis of awstore.cloud using an agentic threat intelligence pivoting prompt uncovered a series of similar domains sharing the same underlying infrastructure. These domains show a clear preference for crypto-, finance-, and tech-related names.\nWhile the organization’s public sites currently lack formal malicious detections, OSINT lookups reveal several red flags: a lack of a verifiable legal entity, contact options limited to Discord and Telegram, and a payment model that exclusively accepts cryptocurrency via third-party marketplaces like plati.market.\nThe settings profile reinforces this pattern. Beyond changing the endpoint, the configuration suppresses telemetry, error reporting, and cost warnings, stripping away the guardrails that would otherwise alert a user. The intent is seemingly to maintain a facade of normal operation while silently redirecting traffic to an opaque third-party service.\nWhile these are technically valid configuration artifacts, their ability to hijack trust and exfiltrate sensitive data makes their effect indistinguishable from that of traditional malware.\nExample 4: A Sabotaged Extension Payload\nAnother low-key example we recently identified is a VS Code extension for User-centric Use cases Validator (UUV) end-to-end tests, submitted to VirusTotal in March. 
More than one week later, the sample continued to have zero detections, but VirusTotal Code Insight identified suspicious behavior.\nThe analysis indicated that this specific sample included a well-known protestware payload, peacenotwar, which upon activation writes a blank file named WITH-LOVE-FROM-AMERICA.txt and logs a heart in the console.\nTo bridge the gap between a suspicious file and actionable intelligence, we generated an agentic threat intelligence brief. With the semantic context from Code Insight fed into the prompt, the agent pivoted across historical data, instantly linking this 'benign' extension to the 2022 cyber-activist sabotage of the node-ipc library in response to the invasion of Ukraine.\nWhile this specific event may hav", "url": "https://wpnews.pro/news/beyond-source-code-the-files-ai-coding-agents-trust-and-attackers-exploit", "canonical_source": "https://cloud.google.com/blog/products/identity-security/beyond-source-code-the-files-ai-coding-agents-trust-and-attackers-exploit/", "published_at": "2026-05-12 16:00:00+00:00", "updated_at": "2026-05-18 22:08:58.030604+00:00", "lang": "en", "topics": ["artificial-intelligence", "cybersecurity", "developer-tools", "enterprise-software"], "entities": ["VirusTotal", "Google Threat Intelligence"], "alternates": {"html": "https://wpnews.pro/news/beyond-source-code-the-files-ai-coding-agents-trust-and-attackers-exploit", "markdown": "https://wpnews.pro/news/beyond-source-code-the-files-ai-coding-agents-trust-and-attackers-exploit.md", "text": "https://wpnews.pro/news/beyond-source-code-the-files-ai-coding-agents-trust-and-attackers-exploit.txt", "jsonld": "https://wpnews.pro/news/beyond-source-code-the-files-ai-coding-agents-trust-and-attackers-exploit.jsonld"}}