DDN’s new A1400X3M storage appliance is a 160 million IOPS per rack powerhouse, and the company has added Nvidia Dynamo-integrated KV Cache software to provide twin-track AI processing acceleration.
The news came at ISC 2026 in Hamburg and included security, observability, and infrastructure efficiency enhancements for large-scale AI environments. DDN says these developments are designed to provide a storage infrastructure attuned to enterprises’ need to deal with bottlenecks in the AI data pipeline, from data ingestion and preparation to training, inference, RAG, and agentic AI.
Alex Bouzari, CEO and Co-Founder at DDN, said: “AI infrastructure is no longer just about compute. The economic success of AI depends on how efficiently organizations move, manage, secure, and operationalize data across the entire AI lifecycle. At ISC 2026, DDN is introducing the next generation of AI data intelligence innovations designed to help customers maximize GPU utilization, reduce inference costs, accelerate time-to-token, and improve the overall economics of AI factories at massive scale.”
The A1400X3M, which runs DDN’s EXAScaler Lustre-based software, is the successor to DDN’s A1400X2 appliance. The 2 RU x 24-SSD chassis has moved from the previous Intel Xeon Ice Lake CPU to AMD’s Genoa processor, and from PCIe gen 4 to PCIe gen 5, which doubles per-lane bandwidth. The A1400X3M delivers up to 190 GBps read throughput, with sequential writes topping out at 110 GBps. That is up to 35 percent more read throughput than the A1400X2.
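The quoted throughput figures can be cross-checked with some quick arithmetic. The sketch below is our own back-of-envelope calculation, not a DDN-published number: it assumes the "up to 35 percent" uplift applies directly to the 190 GBps read figure, and the function and variable names are ours.

```python
# Back-of-envelope check on the quoted A1400X3M figures.
# Assumption: the "up to 35 percent" uplift applies to the 190 GBps read number.

def implied_baseline(new_value: float, uplift_percent: float) -> float:
    """Return the older value implied by a new value and a percentage uplift."""
    return new_value / (1 + uplift_percent / 100)

READ_GBPS = 190.0            # quoted peak read throughput
WRITE_GBPS = 110.0           # quoted peak sequential write throughput
READ_UPLIFT_PERCENT = 35.0   # quoted "up to" read improvement

baseline_read = implied_baseline(READ_GBPS, READ_UPLIFT_PERCENT)
write_to_read = WRITE_GBPS / READ_GBPS

print(f"Implied predecessor read throughput: {baseline_read:.1f} GBps")  # 140.7
print(f"Write-to-read throughput ratio: {write_to_read:.2f}")            # 0.58
```

Because both inputs are "up to" figures, the implied baseline of roughly 141 GBps is indicative only.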
DDN says a rack full of these chassis provides 160 million IOPS, which by our calculation means around 4 million IOPS per chassis. Here’s a table showing how these arrays have progressed over the past few years:
Dynamo-integrated KV Cache
A key-value cache (KV cache) stores the keys and values that a generative AI large language model’s (LLM’s) attention layers have already computed during inferencing. It allows the LLM to reuse these activations instead of recomputing them, improving performance. Nvidia supports KV cache extension with its Dynamo distributed inference framework software and CMX scheme. This extends a GPU’s KV cache into NVMe-based storage, with a four-tier hierarchy - HBM, DRAM, local SSD or BlueField-4-accessed shared NVMe SSD storage, and external NVMe SSD storage - making NVMe-resident KV cache part of the context memory address space and persistent across inference runs. It is backed by Nvidia’s NVMe storage partners, which include DDN.
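To make the tiering idea concrete, here is a minimal toy model of a four-level KV cache with least-recently-used demotion. It is purely illustrative and does not use or represent the Nvidia Dynamo, CMX, or DDN APIs: the `TieredKVCache` class, the tier capacities, and the promote-to-top policy are all our own assumptions.

```python
from collections import OrderedDict
from typing import Any, Callable, Hashable, List, Tuple

# Tier names follow the four-level hierarchy described above.
TIERS = ["HBM", "DRAM", "local NVMe", "external NVMe"]


class TieredKVCache:
    """Toy tiered cache: hits are promoted to the top tier, and the least
    recently used entry in a full tier is demoted one level down."""

    def __init__(self, capacities: List[int]) -> None:
        if len(capacities) != len(TIERS):
            raise ValueError("need one capacity per tier")
        self.capacities = capacities
        self.tiers = [OrderedDict() for _ in TIERS]

    def _put(self, level: int, key: Hashable, value: Any) -> None:
        tier = self.tiers[level]
        tier[key] = value
        tier.move_to_end(key)  # mark as most recently used
        if len(tier) > self.capacities[level]:
            old_key, old_value = tier.popitem(last=False)  # evict LRU entry
            if level + 1 < len(self.tiers):
                self._put(level + 1, old_key, old_value)   # demote one tier
            # Entries falling off the last tier are dropped and must be
            # recomputed on their next use.

    def get(self, key: Hashable, compute: Callable[[], Any]) -> Tuple[Any, str]:
        """Return (value, source), where source is a tier name or 'recomputed'."""
        for level, tier in enumerate(self.tiers):
            if key in tier:
                value = tier.pop(key)
                self._put(0, key, value)  # promote back to the fastest tier
                return value, TIERS[level]
        value = compute()  # cache miss: pay the recomputation cost
        self._put(0, key, value)
        return value, "recomputed"


# Hypothetical usage: keys stand in for prompt-prefix identifiers and
# values for the cached attention keys and values.
cache = TieredKVCache(capacities=[1, 1, 1, 1])
for prefix in ["a", "b", "a"]:
    _, source = cache.get(prefix, lambda p=prefix: f"kv-for-{p}")
    print(prefix, "->", source)
```

In this toy run the second request for prefix "a" is served from the DRAM tier rather than recomputed, which is the effect KV cache extension aims for. A real system also has to weigh transfer latency against recomputation cost, which this sketch ignores.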
DDN’s KV Cache acceleration is now available across its Infinia (object storage) and EXAScaler (Lustre storage) AI data platforms. DDN says it “accelerates large-scale AI inference by eliminating memory bottlenecks and enabling ultra-fast retrieval of model context directly from DDN’s AI-native data intelligence platform.” That platform is either EXAScaler or Infinia, and DDN positions the two as follows:
AI Inference - Infinia delivers AI-native object storage engineered specifically for modern inference and retrieval-intensive workloads, providing ultra-low latency metadata performance, massive concurrency, and high-speed object access required for enterprise-scale AI factories.
AI Training - EXAScaler provides industry-leading parallel file system performance for training and checkpointing.
DDN says the combined EXAScaler-Infinia system “enables organizations to unify AI data infrastructure across the entire AI lifecycle. This architecture allows customers to eliminate data silos, maintain consistently high GPU utilization, accelerate time-to-first-token, and optimize AI infrastructure economics at scale.”
Its KV Cache acceleration features:
Shared distributed KV Cache fabric optimized for large-scale inference environments
Ultra-low latency data access for large-context inference and faster token generation
Optimized support for agentic AI, reasoning models, RAG, and multi-step inference pipelines
Deep integration with Nvidia Dynamo, vLLM, and modern inference frameworks
Improved GPU utilization and reduced idle compute cycles
Up to 55x faster KV cache performance for large-scale inference workloads
Lower cost per token and improved AI factory ROI through more efficient GPU and infrastructure utilization
AI infrastructure developments
DDN also announced security and efficiency news designed to improve workload isolation, governance, visibility, and infrastructure efficiency across production AI environments:
Security
Bare-metal multi-tenancy
KMIP-based encryption and key management
VictoriaLogs integration for operational visibility
Multi-tenant APIs with and without CSI
Efficiency
Intelligent file pinning capabilities
NAND-accelerated Hot Pools to tier data from expensive all-flash drives to lower-cost HDDs
DDN said that, with Google Cloud Managed Lustre, powered by EXAScaler, Salesforce achieved 1.5x faster model training, a 75 percent reduction in I/O latency, and a 42 percent reduction in training costs.
Access an A1400X3M datasheet here. General A1400X3M availability is expected by the end of Q3 2026.