AI already underpins detection, investigation and response across the security stack, and its role will only expand. The question for CISOs is no longer whether to adopt AI-enabled tools but which ones they can have confidence in. As adoption accelerates, leaders face a flood of claims about what each solution can detect, predict, or prevent.
The challenge now is separating marketing promises from proven resilience.
Validating traditional security tools has always been challenging, but validating AI-powered security tools introduces a completely different set of problems. Unlike deterministic systems, AI models behave probabilistically. Their outputs will vary depending on context, input structure and interaction history.
This creates a significant evaluation challenge. Tools that seem reliable in testing can become unpredictable when attackers actively try to evade them, or when they are confronted with imperfect real-world data. Capabilities demonstrated in isolation cannot be assumed to translate into resilience in complex, real-world environments.
For security leaders, this means familiar testing and validation approaches are no longer enough. The question is not just whether an AI-powered tool works, but how it behaves when it is stressed, manipulated, or forced to operate outside known conditions.
AI is being used on both sides of security operations. Offensive AI security tools support activities such as penetration testing, vulnerability discovery, attack simulation, and red teaming. They are designed to augment, not replace, the work security professionals already do, accelerating reconnaissance, exploring attack paths and identifying weaknesses at scale.
Defensive AI security tools support detection, investigation, prioritisation and response. They help analysts to manage complexity better and make faster, well-informed decisions.
These AI-powered tools may be purpose-built for security use cases, but that in itself does not make them inherently robust. Like any system operating in adversarial environments, they need to be evaluated for resilience, as well as capability.
Why adversarial testing matters
Offensive security research tells us that layered defences can work, but only if they are tested properly. In adversarial exercises against AI security tools, we have observed that basic safeguards are often bypassed quickly.
This matters because AI security tool failures are rarely benign. A manipulated model can leak sensitive data, misclassify critical events, or take unintended actions at scale. A minor weakness can escalate into systemic risk once AI is embedded in core security operations. For CISOs, this means AI security tools must be evaluated not only for whether they do what they promise, but also for how they fail.
Many AI security tools rely on guardrails, which are constraints designed to limit model behaviour and keep outputs within acceptable bounds. Guardrails are necessary, but they are not evidence of resilience. A model that stays “on-rails” in a demo may still fail unpredictably when confronted with novel inputs, chained attacks, or operational noise.
Resilience can only be demonstrated through testing that reflects real threat conditions. Without that evidence, confidence in an AI security tool will be premature.
Which AI security tool use cases deserve attention
The most promising applications of AI in cyber security are the tools that augment human decision-making rather than attempting to replace it. Use cases such as triage assistance, pattern recognition at scale, and investigative support can deliver measurable gains, as long as they are properly constrained, tested and evaluated. By contrast, fully autonomous decision-making or broad, unsupervised control can introduce risk faster than it reduces it.
What matters is not whether AI is present in the security stack but whether its limits are clearly defined and its behaviour under pressure is understood.
AI adoption is often framed as a matter of trust. But trust implies belief without verification. Confidence is different. Confidence is built through evidence. Security leaders do not need to be convinced that AI belongs in cyber security. They need assurance that the AI they deploy has been challenged, measured and benchmarked against realistic threats. That evidence will allow them to distinguish genuine capability from clever marketing.
Vendors who can demonstrate genuine resilience under adversarial scrutiny will stand apart from those relying on claims alone. AI is already reshaping cyber security, but reaching its full potential will depend on how well it withstands pressure and attack. For CISOs, the goal is to have evidence-backed confidence, not blind trust. In the end, it will be proof, not hype, that is the most valuable security control of all.
Haris Pylarinos is founder and CEO of Hack The Box.
