# Article: Virtual panel: Security in the Machine Age: Expert Insights on AI Threat Evolution

> Source: <https://www.infoq.com/articles/security-ai-threat-evolution/?utm_campaign=infoq_content&utm_source=infoq&utm_medium=feed&utm_term=global>
> Published: 2026-06-29 11:00:00+00:00

### Key Takeaways

- Security engineers must evolve from securing deterministic software to defending probabilistic systems; understanding AI threat vectors like prompt injection, data poisoning, model drift, and RAG abuse is now essential, not optional.
- The most destructive AI-based attacks exploit the boundaries between components, where untrusted input meets system instructions, external data enters training pipelines, and AI systems connect to automation and privileged access.
- AI systems must be treated as unpredictable, goal-driven actors rather than as trusted software components, requiring continuous behavioral validation, action-level controls, and supervision rather than static security rules.
- Traditional security skills remain foundational, but must be extended with AI-specific capabilities, including AI threat modelling, adversarial testing, behavioral monitoring, data governance, and the ability to translate research into production defenses.
- Success in AI security depends on building resilience and visibility, not pursuing perfection. Organizations must invest in specialized monitoring, cross-functional collaboration between security and ML teams, and incident response capabilities designed for systems that learn and adapt.

This article is part of the "
|

## Introduction

Threat actors are leveraging AI to create a new generation of sophisticated attacks. We're observing adversarial machine learning techniques that change how models behave. In addition, there are highly personalized social engineering campaigns driven by generative AI. Lastly, we have adaptive threats that evolve in real time. AI-driven attack methods work faster and on a larger scale than traditional defense systems. This increasing threat challenges basic ideas about how we detect and prevent threats.

These developments force a fundamental rethinking of cybersecurity strategy. Traditional incident response playbooks often fall short with AI systems. These systems can behave unpredictably and show new, unexpected behaviors. Organizations must now create new forensic methods, special monitoring tools, and flexible response strategies. These methods are needed to handle the changing nature of AI-driven security incidents.

As organizations quickly adopt AI systems, security engineers must transform their roles. This change requires new skills, methods, and strategic thinking. This expert panel brings together leading practitioners to discuss how AI threats are changing. They will also examine the limits of traditional security methods and the key skills needed to protect against future attacks.

### The panelists:

- Elham Arshad – Cybersecurity Expert | Expertise in AI-Powered Security Solutions at Trentino Digitale
- Sabri Allani – PhD | AI and Cybersecurity Consultant at Expleo Group
- Vijay Dilwale – Principal Security Consultant at UltraViolet Cyber
- Igor Maljkovic – PhD Researcher in AI Security at the University of Genova, Italy

**InfoQ: What are the skills to develop to be a security engineer in the AI era?**

Elham Arshad: Security is really just about keeping our data and systems safe from attackers, and it’s always been a tug-of-war between those trying to break in and those trying to keep them out. There’s a saying: "To beat a lion, you must be a lion". The same idea applies here: Defending modern systems requires tools and expertise that are just as advanced as the threats targeting them.In the world of AI, this increased need for defense is even more true. AI systems are built on data, and traditional security revolves around safeguarding data through the CIA triad: Confidentiality, Integrity, and Availability. But once AI enters the picture, the security landscape becomes more complex. Models can be influenced, tricked, or misused in ways that go far beyond conventional software vulnerabilities.

That’s why, in the AI era, being a security engineer requires more than just having a foundation in cybersecurity. You also need a practical understanding of how machine learning works, how models are trained, how LLMs behave, how prompt tuning or fine-tuning prompts affect outputs, and how data pipelines shape model performance. Skills in programming, machine learning algorithms, and data science concepts are becoming essential.

In addition to these, modern AI security specialists also benefit from understanding:

- Model behavior and failure modes, such as how AI systems can drift, hallucinate, or be misled.
- Data governance to ensure training data is safe, clean, and free from malicious manipulation.
- AI-specific threats, such as model extraction, data poisoning, and prompt manipulation.

Sabri Allani: Security engineers need to expand from "code and infrastructure security" into "data, model, and agent security". The most important skills are:

- AI threat modeling (model, data, prompt/agent surfaces, supply chain, and runtime behaviors).
- Data security and integrity (provenance, poisoning resistance, secure labeling pipelines, access control, auditability).
- LLM/agent attack literacy (prompt injection, tool abuse, indirect injection via documents, model extraction risks, jailbreak patterns).
- Secure-by-design for RAG/agents (least privilege for tools, retrieval scoping, policy enforcement, safe tool execution patterns).
- Evaluation and red teaming tailored to AI systems (safety/security test suites, adversarial testing, regression testing for behavior).
- Observability and forensics for AI services (prompt/response telemetry, tool-call logs, retrieval traces, drift detection).
- Governance and risk management (NIST AI RMF / ISO/IEC 42001 mindset, measurable controls, accountability).

Vijay Dilwale: The biggest shift is moving from securing deterministic software to securing probabilistic systems. You don’t need to be a machine learning (ML) expert, but you do need to understand how AI systems fail in practice with things like prompt injection, indirect prompt injection, data poisoning, model drift, Retrieval-Augmented Generation (RAG) abuse, and unsafe tool or agent access.This understanding isn’t theoretical. It requires breaking the system on purpose, experimenting with prompts, seeding malicious content into retrieval sources, and observing how behavior changes when context or permissions shift. For example, with a RAG-based system, it is not enough to ask, "Is the model safe?", you must ask what happens when an attacker controls or influences the documents being retrieved, and whether the model treats that content as instructions or facts.

In most real-world deployments, the model itself isn’t the problem. The real risk comes from how the model is wired into data, identity, and automation, and how easily those connections can be abused over time. The most important skill isn’t a new framework, it’s learning to think adversarially about systems that learn and change over time, not just applications that execute fixed logic. That mindset shift is the foundation of AI security.

Igor Maljkovic: First, the fundamentals still matter. AI security engineers must develop a strong cybersecurity mindset. The core logic is unchanged: Attackers look for holes in systems to bypass controls and achieve their goals, while defenders must think proactively and adversarially, anticipate abuse, and adapt as technologies evolve. In that sense, AI security inherits much of the blue-team and red-team thinking from traditional security.At the same time, AI systems introduce new failure modes. This increase in failure modes makes end-to-end system understanding essential, understanding not just the model, but how data is collected, how the model is trained, how it is deployed, how the system is monitored, and how humans interact with it over time. AI security issues often emerge from how these pieces are connected rather than from a single component in isolation. So, most AI security failures occur at the boundaries between components, where assumptions about trust and responsibility tend to break down. For example, prompt injection is not a flaw in the model itself, but a failure at the boundary between untrusted user input and system-level instructions. When natural language input from users is treated as trusted context, attackers can craft inputs that override intended behavior, manipulate reasoning, or redirect the model toward unintended actions. Similarly, data poisoning typically occurs outside the model, at the interface between external data sources and the training pipeline, where manipulated data is ingested without sufficient validation.

Engineering skills are equally critical. An AI security engineer must be able to translate abstract risks into concrete solutions, moving from problem formulation to implementation. This need for multiple skills includes strong coding skills, familiarity with production environments, and the ability to design, build, deploy, and maintain security mechanisms under real-world constraints.

Another key skill is the ability to read, interpret, and reason about state-of-the-art scientific literature. AI security evolves rapidly, and engineers must be able to extract practical insights from research on new attacks, defenses, and evaluation methods, and assess what is mature enough to apply in practice.

Soft skills are really important. Clear communication and active listening are critical for explaining complex risks, understanding system and business requirements, collaborating with ML engineers and product teams, and making security trade-offs explicit to decision-makers.

Finally, a deep understanding of AI and its underlying mathematical foundations is essential. Security engineers cannot effectively secure AI systems they do not understand. For example, defending large language models without understanding the transformer architecture, attention mechanisms, or training dynamics makes it difficult to reason about failure modes such as hallucinations, prompt sensitivity, or information leakage. While this knowledge is often framed as "theory", in practice it is a critical skill because it enables engineers to anticipate risks, interpret model behavior, evaluate defenses, and distinguish between architectural limitations and implementation flaws.

**InfoQ: What is the most destructive AI based attack currently?**

Elham Arshad: We can think about AI-related attacks in two simple ways.First, considering security in AI, as many organizations adapt to AI for automation and decision-making, AI introduces new attack surfaces and becomes an easy target for adversaries because there are no specific policies or guidelines to make AI systems more secure.

Second, with AI-powered attacks, attackers now use AI to make their attacks smarter, faster, and harder to detect. AI can automate phishing, making modern cyber threats, and find zero-day vulnerabilities.

When comparing the two categories, the first group is currently more widespread and better documented in real-world environments. There is a growing body of research, case studies, and security reports showing how vulnerable AI models can become when their inputs, training data, or surrounding context are manipulated. These attacks directly threaten the confidentiality and integrity of the data that AI systems rely on, which can have serious consequences for any organization that integrates AI into its workflows.

If we look at specific examples, prompt manipulation, prompt injection, and jailbreaking stand out as some of the most common and concerning attacks, and target both confidentiality and integrity. They are relatively easy for attackers to perform, often requiring nothing more than cleverly crafted text or content. Unlike more traditional cyberattacks that depend on exploiting software bugs, these AI-focused attacks exploit the model’s behavior, reasoning patterns, and trust in input data. Because AI tools frequently interact with external content, such as web pages, documents, and user messages, they can be exposed to malicious instructions without anyone realizing it.

The likelihood of these attacks is higher than many other AI-related threats simply because the barrier to entry is low and the attack surface is broad. The potential impact can be much more devastating. Once an attacker influences the model’s responses or decisions, that effect can cascade into downstream systems, automated pipelines, or human decision-making processes.

Sabri Allani: When we talk about "the most destructive AI-based attack" today, I think the most damaging category is AI-augmented social engineering at scale. The reason is simple: AI-augmented social engineering consistently breaks the human layer, and AI makes it both highly personalized and massively scalable. Attackers can combine open-source intelligence (OSINT) and leaked data (e.g., org charts, vendor relationships, role changes, internal terminology, past email style, and public documents) to craft messages that feel legitimate to a technical audience because they contain the right context. This new form of attack is not just "better phishing", it is persuasion industrialized. With generative AI, the attacker can run thousands of tailored outreach attempts and maintain realistic follow-up conversations that respond convincingly to objections, urgency, and verification questions. The impact is often direct: credential theft, fraudulent payments, and workflow manipulation (e.g., changing supplier banking details and pushing urgent approvals). Even when organizations enforce multi-factor authentication (MFA), these campaigns increasingly target the surrounding processes: session/token hijacking, helpdesk and identity recovery workflows, or business processes that are "legitimate" but socially coerced. In short, AI makes social engineering more credible, faster, and harder to distinguish from normal business communication, which is exactly why it is so destructive.On the more technical side, the emerging class I’m most concerned about is indirect prompt injection combined with tool or connector abuse in agentic systems. As soon as an AI assistant is connected to enterprise tools (e.g., email, tickets, file shares, customer relationship management (CRM), continuous integration and continuous delivery (CI/CD), cloud APIs, and internal knowledge bases), the main risk shifts from "what the model says" to "what the model can do" with legitimate permissions. Indirect prompt injection is powerful because the attacker does not need to compromise the model itself. They only need to place malicious instructions inside untrusted content that the model will ingest during a normal workflow, such as a web page, a PDF, an email thread, a ticket description, or a shared document. If a user asks the assistant to summarize, draft, or act on that content, those embedded instructions can hijack the agent’s behavior by exfiltrating data through approved connectors, sending an email to an external address, creating a ticket that triggers a reset or access change, or retrieving and forwarding sensitive documents. The destructive nature here comes from actions that look legitimate: The agent is using authorized tools from an approved identity and executing operations that resemble normal automation. That appearance of legitimacy is why classic controls can struggle: The "attack" is not a suspicious exploit, it is a trusted system being manipulated into misusing its own access.

These two categories are especially damaging today because they attack the two hardest surfaces to secure: humans and automation operating under valid permissions. AI increases both the scale and the plausibility of the attack, and it shortens the time from initial contact to real operational impact.

Vijay Dilwale: It is not some clever model exploit. It is AI making social engineering work far better than it ever did before. We are seeing phishing emails that reference real projects, real coworkers, and current company events. We are seeing voice deepfakes that sound close enough to an executive to push a payment through or reset an account, especially when timed around urgency or pressure.What has changed is not the idea of these attacks, but the economics. Attackers can generate thousands of convincing, personalized messages, test what works, and adjust almost instantly. The damage comes from speed and realism, not from any breakthrough in AI itself. AI did not invent deception, it removed the friction that once limited how often and how well these attacks could be carried out.

Igor Maljkovic: There is no single "most destructive" AI-based attack that applies uniformly across all AI systems. The impact of an attack is tightly coupled to the model architecture, its role within a system, and the level of trust and autonomy it is granted.For example, prompt injection attacks can be highly destructive for large language models, especially when LLMs are embedded in agentic systems or connected to tools, data stores, or decision-making workflows. In such settings, an attacker can manipulate model behavior, override system instructions, or trigger unintended actions. The same attack, however, is meaningless for architectures such as convolutional neural networks used in computer vision, which do not process natural language.

Public perception often gravitates toward deepfakes and election interference as the most destructive AI-driven threats. In contrast, an AI Security lead would likely point to zero‑click agentic remote code execution, a scenario where an AI agent, given too much autonomy and insufficient guardrails, executes remote actions without direct user interaction. These attacks exploit the combination of high‑privilege automation and weak control boundaries.

What makes an AI attack truly destructive is context-dependent. The most destructive AI attacks are not defined by a single technique, but by a mismatch between how an AI system behaves and how much trust is placed in it. Different attacks exploit different properties (e.g., learning from data, instruction-following, generalization, or automation), but damage escalates when systems are deployed with assumptions about reliability or control that do not hold in practice. Attacks become truly damaging when AI systems are integrated into security-critical workflows without architecture-aware threat modeling, sufficient safeguards, validation, and continuous oversight.

In one concrete example playing out today, attackers may inject vulnerabilities into open-source code repositories. AI coding assistants trained on this poisoned data then suggest these same vulnerabilities to developers, who trust and adopt the AI-generated code. A single malicious training example can propagate across thousands of production systems. The destructiveness comes not from the attack's complexity, but from the trusted role AI tools now play in development workflows.

**InfoQ: What do you think the AI based attacks will evolve into, and what might be the next most destructive attack?**

Elham Arshad: Today, more and more organizations are adopting AI and building new APIs around it, and even search engines now depend on AI to deliver smarter, more helpful results. As this trend accelerates, most digital workflows and eventually most APIs will rely heavily on AI systems working together behind the scenes. But that also means a single failure can ripple through everything connected to it. If one model goes offline or starts producing malicious outputs, every tool or service built on top of it can be thrown off, slowing entire workflows or even causing complete system breakdowns.In environments where AI systems depend on one another, these failures can scale quickly, become difficult to trace, and impact many organizations at once. An autonomous AI-driven attack that targets AI systems themselves, spreads through supply chains, and uses social engineering to stay undetected could amplify this risk even further. Because these systems increasingly control real-world operations, the effects wouldn’t stop at digital disruption. Imagine this maliciousness happening in a critical sector like energy. If an AI-powered system that a utility relies on suddenly becomes unreliable, it could trigger operational interruptions or even contribute to large-scale outages. This risk is why it’s essential to design AI systems that are resilient, secure, and capable of failing safely, because as AI becomes the backbone of modern infrastructure, the stakes only get higher.

Sabri Allani: AI attacks will shift from "content generation" to autonomous, goal-driven operations: recon, phishing, lateral movement, and persistence optimized by feedback loops. The next major destructive step is agentic intrusion chains where an attacker influences an organization’s AI agents to:

- exfiltrate sensitive data via legitimate connectors
- execute harmful actions through automation tools
- maintain stealth by adapting to detections
In other words, an AI-enabled attacker exploits your AI-enabled workforce.

Vijay Dilwale: AI-based attacks are going to move away from obvious one-time exploits and toward slow, persistent manipulation.Instead of trying to break a model in a single interaction, attackers will influence what the system sees over time: The data it pulls in, the context it trusts, and the actions it is allowed to take. The goal is not to make the AI fail loudly but to make it behave slightly wrong again and again.

For example, imagine an AI agent that helps process invoices or approve expense reimbursements. If an attacker can influence the data the agent learns from or retrieve vendor records, past approvals, or policy documents, they do not need to break authentication or exploit code. Over time, the agent will start approving payments it should not, prioritizing the wrong vendors, or skipping review steps because it has learned that this is what "normal" looks like.

That behavior is what makes these attacks dangerous. There is no obvious breach or alert. The system is doing what it was designed to do, just with its decision-making quietly pushed in the wrong direction. At scale, that kind of manipulation can be far more damaging than a traditional exploit.

Igor Maljkovic: AI-based attacks are evolving from model-specific exploits into systemic attacks that target how AI systems are integrated, trusted, and allowed to act within larger environments. Rather than focusing solely on breaking a model directly, attackers are increasingly exploiting the inputs, context, and surrounding components that shape an AI system’s behavior.As AI systems become more autonomous and connected to tools and workflows, attacks will increasingly target the interfaces between models, data sources, decision logic, and execution layers. Manipulating these boundaries can cause AI systems to take harmful actions as part of their normal operation, without requiring continuous attacker interaction.

One emerging category is cross-system prompt manipulation, where attackers craft inputs that appear benign to individual AI systems but could orchestrate harmful outcomes when multiple AI agents interact and pass information between one another. As AI-to-AI communication becomes more prevalent and less supervised, the attack surface expands: Adversaries exploit trust assumptions between systems, using one AI system's output as a vector to influence another's behavior. The concern is not any single technique, but the compounding risk as autonomous agents increasingly operate in chains or networks with minimal human oversight.

In essence, future damage will come not from a specific exploit, but from misaligned autonomy. AI systems that are granted authority, integration, or operational responsibility that exceeds their robustness, allowing small manipulations to cascade into large-scale, real-world impact.

**InfoQ: When an AI system is compromised or behaves unexpectedly, traditional incident response playbooks may not apply. How are organizations adapting their Information Retrieval (IR) processes for AI-related incidents, and what new skills do security teams need?**

Elham Arshad: Organizations are starting to rethink their incident-response strategies because AI failures don’t look like traditional security problems. Instead of dealing only with malware, misconfigurations, or breached servers, teams now have to handle behavioral issues, things like prompt injection, poisoned training data, stolen model weights, or an LLM suddenly behaving in strange or unsafe ways.To keep up, companies are building cross-functional AI incident response teams. Security analysts now work side by side with ML engineers, data scientists, and software teams to create secure-by-design AI systems. They’re also expanding monitoring to cover prompts, model outputs, data pipelines, and anything that could signal that an AI system is drifting, being manipulated, or breaking down.

Finally, organizations are embedding AI governance into every stage of the incident-response lifecycle. After an incident, they don’t just ask "What broke?", they also ask, "Why did the model behave this way?". Answering these questions requires reviewing training data, model assumptions, and guardrails to understand whether the AI itself contributed to the problem and how to prevent it next time.

Sabri Allani: Organizations are building AI-specific incident playbooks that treat the model as a living component with unique evidence sources. Key adaptations include:

- Evidence collection beyond logs to include prompts, retrieved documents, tool-call traces, model/version hashes, policy snapshots, and training data lineage.
- Containment tactics tailored to AI such as disabling high-risk tools/connectors, narrowing retrieval scopes, rolling back model/prompt versions, and enforcing stricter guardrails.
- Behavior regression testing post-incident to confirm the system returns to a known-good behavioral baseline.
New skills include AI forensics, understanding model failure modes, evaluation methodology, and the ability to reason about "what the model was influenced by" (e.g., data, retrieval, tools, and policies).

Vijay Dilwale: This is still a very much evolving space, and I do not think anyone can fully say that they have AI incident response completely figured out yet. Most existing playbooks were written for systems where behavior is predictable, and actions are easy to trace. When AI systems behave unexpectedly, especially those that rely on retrieval or tools, those assumptions start to fall apart.From what I have seen so far, teams are adapting as they go rather than following a mature, standardized approach. For example, when an AI assistant surfaces sensitive information or takes an unintended action, the first question is often not "Who accessed what?", but "Why did the system decide to do this at that moment?". Traditional logs rarely answer that on their own. Teams end up reconstructing what the system saw, what data it retrieved, what context or instructions influenced it, and how that led to the final outcome. Incident response starts to look like a mix of forensic analysis, model evaluation, and system debugging.

This new threat also changes the skills security teams need. It becomes less about following a predefined checklist and more about being able to reason through system behavior after the fact and work closely with the engineers who built the AI. Most organizations are still early adopters here, and a lot of the reactions to new problems are experimental, but this is the direction incident response seems to be moving as real AI-related incidents begin to show up.

Igor Maljkovic: As AI systems become more autonomous, organizations are reshaping incident response around continuous evaluation rather than post‑incident investigation. Traditional IR assumes deterministic systems with traceable logs, but AI incidents often involve emergent behavior, data‑driven shifts, or hidden prompt‑level manipulations.As a result, companies are adopting AI‑aware monitoring layers that track model inputs, outputs, behavioral drift, and decision pathways, alongside model‑forensic tooling capable of replaying prompts, reconstructing contexts, and identifying contamination points across data or training pipelines.

Security teams are also forming cross-functional incident response units, bringing together AI security engineers, ML engineers, data scientists, and AI safety specialists. This collaboration is essential to reason about model behavior, alignment failures, feedback loops, and emergent behaviors that do not fit traditional vulnerability models.

The new skills required focus on understanding how AI systems behave under stress, how errors and manipulations propagate through AI pipelines, and how systems can change over time in unexpected ways. Security teams must investigate incidents that involve shifts in system behavior rather than clear-cut software failures. This behavioral shift requires moving beyond traditional vulnerability analysis toward reasoning about trust, influence, and control in AI-driven systems.

**InfoQ: As we move toward more autonomous AI agents and AGI systems, what fundamental changes in security architecture and mindset will organizations need to prepare for? What should security leaders be doing today to prepare for tomorrow's AI security challenges?**

Elham Arshad: As organizations move toward more autonomous AI agents and eventually artificial general intelligence (AGI), security can’t just protect infrastructure anymore. It has to protect against goal-driven software that can act, make decisions, chain tools, and potentially cause real harm if misused or compromised. These vulnerabilities require treating AI agents like real identities, not just features. Each agent needs its own profile, permissions, credentials, and lifecycle, just like an employee. Instead of trusting agents by default, companies need to adopt a ZeroTrust (ZT) mindset where agents only get the narrow capabilities they absolutely need, and every sensitive action is checked against policy.

For highly autonomous systems, security also becomes about understanding and monitoring normal behavior: how an agent usually calls APIs, what data it typically accesses, and the tasks it performs. Companies like Obsidian Security are already pushing this behavioral approach as essential for keeping agents safe and predictable. In order to keep control firmly in place, organizations should rely on event-driven guardrails at the tool layer and rules that stop dangerous actions like deletes or configuration changes unless a human explicitly approves them.

To get ahead of these challenges, security leaders should begin preparing now: cataloging all AI systems, integrating agents into identity and access management, and designing environments that safely limit what agents can do in the event of an error. Additionally security leaders need to stand up AI-specific red teaming, update incident-response plans for AI failures, and build cross-functional teams that include security, ML, data, and legal experts. Organizations that invest early in these foundations will be far better positioned to scale advanced AI while minimizing risk as autonomy increases safely.

Sabri Allani: Two changes are fundamental:

- From perimeter/control-plane security to behavior- and permission-centric security, treat agents as semi-autonomous actors with continuous authorization and strict least privilege.
- From static systems to continuously evolving systems, security must assume drift, emergent behaviors, and changing risk profiles.
Practically, leaders should now invest in identity and policy enforcement for agents/tools, robust audit trails, secure data pipelines, continuous evaluation, and strong governance for model updates.

Vijay Dilwale: As we move toward more autonomous AI agents, the real shift I am seeing is from securing software that executes instructions to securing software that makes decisions and takes actions. Architecturally, this shift exposes how many systems were designed with the assumption that actions are always initiated by humans or tightly controlled services. That assumption no longer holds true once agents start operating across tools and workflows.From a design standpoint, identity and permissions become foundational. Agents need to operate as clearly defined actors, with explicit scope around what they can access and what actions they can take. Just as importantly, systems need to make it clear whether an action was taken by an AI agent or a human or a background process. Without that clarity, trust, governance, and control quickly erode as autonomy increases.

The mindset shift is accepting that surprises are part of the operating model. Security becomes less about trying to prevent every failure and more about designing systems that fail safely, visibly, and in ways the organization can recover from. Preparing for that transformation today means treating agent permissions, auditability, and human override as first-class security concerns, not future problems.

Igor Maljkovic: As organizations move toward more autonomous AI agents and eventually AGI‑level systems, security will have to shift from deterministic assumptions to protecting system behavior. Traditional security architectures assume systems do only what they are explicitly programmed to do. In contrast, autonomous AI systems operate through latent reasoning, tool use, and emergent strategies that cannot be fully predicted in advance.This requires a fundamental redesign of trust boundaries. It’s no longer enough to isolate where an AI system runs or how it is deployed. We must also isolate and constrain what it is allowed to do. Oversight must incorporate real-time behavioral monitoring, and controls must explicitly limit the impact of model-initiated actions. Security teams must define what an AI system can do, under which conditions, and with which downstream consequences.

One concrete step that must be taken is to establish comprehensive risk assessments before deploying any autonomous capability, explicitly mapping the maximum potential damage if that system is fully compromised or behaves adversarially. This requirement forces organizations to confront the trust they are implicitly granting.

Security leaders should begin preparing now by integrating AI‑aware threat modeling into every pipeline, establishing strong guardrails around autonomy, and building cross‑functional teams capable of understanding failure modes at the intersection of ML engineering and security engineering. Leaders must invest in monitoring infrastructures that observe model behavior over time, not just system state, and develop incident response capabilities suited for systems that learn, adapt, and evolve.

**InfoQ: Looking past specific tools and frameworks, what is the single most impactful action or mindset shift you believe security engineers must adopt today to effectively safeguard our increasingly AI-driven future?**

Elham Arshad: This single change unlocks everything else: "Treat AI systems as unpredictable, goal-driven actors, and not as software components; something that reasons, adapts, and can behave in ways that its designers never fully anticipated".AI security isn’t about chasing bugs, it’s about containing behavior, managing incentives, and protecting against failures in data, goals, and context rather than just code. This mindset transforms the playbook from trying to block corrupted inputs to actively preventing dangerous actions through sandboxing, action-level Zero Trust, strict capability controls, and continuous behavior monitoring. In short, AI security is less about fixing software and more about supervising and constraining a powerful digital actor, making verification, oversight, and containment core to how we secure the systems of the future.

Sabri Allani: Adopt the mindset that AI systems are both software and decision-making entities. In other words, security must focus on controlling actions and influences, not only vulnerabilities. Concretely we must enforce least privilege status for tools and data, require traceability of why an output happened (retrieval and tool traces), and continuously test behavior like you test security controls.

Vijay Dilwale: Looking past specific tools and frameworks, the biggest mindset shift security engineers need to make is thinking purely in terms of exploits and toward thinking in terms of system behavior and risk containment. In AI-driven systems, unexpected or "weird" behavior is not an edge case. It is part of the operating model.For example, it is not just about whether a model generates an inappropriate response, which can already carry reputational or legal risk. It is about what happens when an AI system retrieves the wrong data, misinterprets context, or takes an action it technically had permission to take but should not have taken in that situation. Can the team detect it quickly, understand why it happened, and limit the blast radius before the impact spreads?

The goal is not perfection or eliminating every failure mode, but building resilience, visibility, and control for when things inevitably go wrong.

Igor Maljkovic: The single most impactful shift security engineers must make is moving from static, assumption-based security to continuous validation of system behavior. In an AI-driven world, models behave probabilistically, change their behavior based on context, and can produce outcomes that were not explicitly anticipated by engineers at design time.As a result, security can no longer rely solely on predefined rules, fixed trust boundaries, or expectations of deterministic behavior. Instead, engineers must adopt a posture of continuous skepticism toward machine‑generated decisions, treating AI not as a trusted component but as an untrusted collaborator whose behavior must be monitored, constrained, and interrogated. This shift is what will define effective defense strategies in the years ahead.

## Conclusions

AI has profoundly reshaped the threat landscape and how security engineers operate. The shift from protecting static software to securing dynamic, adaptive systems marks a major evolution in security thinking. AI security now encompasses code, infrastructure, data sources, model behaviors, and their interactions. While traditional security frameworks remain valuable, they require enhancements. Incorporating AI threat modelling and behavioral monitoring can address emerging challenges.

As AI threats grow, security engineers are transitioning from mere gatekeepers to monitors of behavior. This approach views AI systems as unpredictable entities needing ongoing validation and control. The emphasis has moved from fixed assumptions to continuous behavior assessment. This transition demands advanced engineering skills and practical, research-based solutions.

Using AI systems comes with problems, like making it easier for hackers to get in and not knowing what AI will do next. Achieving perfection is impossible; instead, resilience and rapid response are key to effective defense. Metrics like behavioral stability and incident detection speed are more relevant than traditional measures.

One emerging category deserves particular attention: cross-system prompt manipulation. These exploits are different from traditional attacks on single systems. They create inputs that seem harmless on their own. However, they cause harm when multiple AI agents interact and share information. As AI-to-AI communication grows and gets less oversight, the attack surface widens greatly. Future damage will likely arise not from a specific exploit, but from misaligned autonomy. This widened attack surface occurs when AI systems are given authority, integration, or operational responsibility beyond their strength.

Organizations need to invest in monitoring tools. These tools should spot unusual patterns across system boundaries. Organizations must also ensure accountability during AI-to-AI interactions. Additionally, it's important to approach AI outputs with scepticism. Treat these outputs as possibly flawed inputs that need validation, not as trusted data. Organizations can thrive in this changing environment only by combining technical controls with careful monitoring.

This article is part of the "
|