DDN launches faster array HW and KV Cache SW for AI

DDN launched the AI400X3M storage appliance, delivering up to 160 million IOPS per rack, and introduced KV Cache software integrated with Nvidia Dynamo to accelerate AI inference. The new hardware and software aim to reduce bottlenecks in AI data pipelines, improve GPU utilization, and lower inference costs for large-scale AI environments.

DDN’s (https://www.blocksandfiles.com/ai-ml/2026/06/12/ddn-wants-strategic-investors/5254665) new AI400X3M storage appliance is a 160 million IOPS per rack powerhouse, and the company has Nvidia Dynamo-integrated KV Cache software to provide twin-track AI processing acceleration.

The news came at ISC 2026 in Hamburg and included security, observability, and infrastructure efficiency enhancements for large-scale AI environments. DDN says these developments are designed to provide a storage infrastructure attuned to enterprises’ need to deal with bottlenecks in the AI data pipeline, from data ingestion and preparation to training, inference, RAG, and agentic AI.

Alex Bouzari, CEO and co-founder at DDN, said: “AI infrastructure is no longer just about compute. The economic success of AI depends on how efficiently organizations move, manage, secure, and operationalize data across the entire AI lifecycle. At ISC 2026, DDN is introducing the next generation of AI data intelligence innovations designed to help customers maximize GPU utilization, reduce inference costs, accelerate time-to-token, and improve the overall economics of AI factories at massive scale.”

The AI400X3M, which runs DDN’s EXAScaler Lustre-based software, is the successor to DDN’s AI400X2 (https://www.blocksandfiles.com/file/2021/11/10/ddn-doubles-performance-of-high-end-ai-array/1615525) appliance. The 2 RU x 24-SSD chassis has moved from the previous Intel Xeon Ice Lake CPU to AMD’s Genoa processor, and from PCIe Gen 4 to PCIe Gen 5, which doubles the per-lane speed. The AI400X3M delivers up to 190 GBps read throughput, with sequential writes topping out at 110 GBps. This is a read throughput increase of up to 35 percent over the AI400X2. DDN says a rack full of these chassis provides 160 million IOPS, which we calculate means around 4 million IOPS per chassis.

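
The per-chassis and generational figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in the article; the function names are illustrative, and the chassis count and predecessor throughput it prints are back-derived values, not figures DDN has published here.

```python
# Figures quoted in the article; nothing else is assumed.
RACK_IOPS = 160_000_000        # "160 million IOPS" per rack
CHASSIS_IOPS = 4_000_000       # "around 4 million IOPS per chassis"
READ_GBPS = 190                # new appliance, peak read throughput
READ_UPLIFT = 0.35             # "up to 35 percent" over the predecessor


def implied_chassis_per_rack(rack_iops: int, chassis_iops: int) -> float:
    """Chassis count that makes the rack and chassis IOPS figures agree."""
    return rack_iops / chassis_iops


def implied_baseline_gbps(new_gbps: float, uplift: float) -> float:
    """Predecessor throughput implied by a fractional uplift."""
    return new_gbps / (1.0 + uplift)


if __name__ == "__main__":
    chassis = implied_chassis_per_rack(RACK_IOPS, CHASSIS_IOPS)
    baseline = implied_baseline_gbps(READ_GBPS, READ_UPLIFT)
    print(f"Implied chassis per rack: {chassis:.0f}")
    print(f"Implied predecessor read throughput: {baseline:.1f} GBps")
```

With the quoted figures, this works out to 40 chassis per rack and a predecessor read throughput of roughly 140.7 GBps.
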
Here’s a table showing how these arrays have progressed over the past few years:

[Table not reproduced here]

Dynamo-integrated KV Cache

A key-value (KV) cache is a mechanism used to store the activations (keys and values) that a generative AI large language model’s (LLM’s) layers have already computed during inferencing. It allows LLMs to bypass recomputation of these activations, improving performance. Nvidia supports KV cache extension with its Dynamo distributed inference framework software and CMX scheme. This extends a GPU’s KV cache into NVMe-based storage with a four-tier hierarchy (HBM, DRAM, local SSD or BlueField-4-accessed shared NVMe SSD storage, and external NVMe SSD storage), making NVMe-resident KV cache part of the context memory address space and persistent across inference runs. It is backed by Nvidia’s NVMe storage partners, which include DDN.

DDN’s KV Cache acceleration is now available across its Infinia object storage and EXAScaler Lustre storage AI data platforms, with DDN saying it “accelerates large-scale AI inference by eliminating memory bottlenecks and enabling ultra-fast retrieval of model context directly from DDN’s AI-native data intelligence platform.” That platform is either EXAScaler or Infinia (https://www.blocksandfiles.com/data-management/2025/07/19/ddn-touts-infinia-storage-as-key-to-faster-cheaper-ai-inference/1603774), and DDN positions the two as follows:

- AI Inference: Infinia delivers AI-native object storage engineered specifically for modern inference and retrieval-intensive workloads, providing ultra-low latency metadata performance, massive concurrency, and high-speed object access required for enterprise-scale AI factories.
- AI Training: EXAScaler provides industry-leading parallel file system performance for training and checkpointing.

DDN says the combined EXAScaler-Infinia system “enables organizations to unify AI data infrastructure across the entire AI lifecycle.
This architecture allows customers to eliminate data silos, maintain consistently high GPU utilization, accelerate time-to-first-token, and optimize AI infrastructure economics at scale.”

Its KV Cache acceleration features:

- Shared distributed KV Cache fabric optimized for large-scale inference environments
- Ultra-low latency data access for large-context inference and faster token generation
- Optimized support for agentic AI, reasoning models, RAG, and multi-step inference pipelines
- Deep integration with Nvidia Dynamo, vLLM, and modern inference frameworks
- Improved GPU utilization and reduced idle compute cycles
- Up to 55x faster KV cache loading performance for large-scale inference workloads
- Lower cost per token and improved AI factory ROI through more efficient GPU and infrastructure utilization

AI infrastructure developments

DDN also announced security and efficiency enhancements designed to improve workload isolation, governance, visibility, and infrastructure efficiency across production AI environments:

Security

- Bare-metal multi-tenancy
- KMIP-based encryption and key management
- VictoriaLogs (https://docs.victoriametrics.com/victorialogs/) integration for operational visibility
- Multi-tenant APIs with and without CSI

Efficiency

- Intelligent file pinning capabilities
- NAND-accelerated Hot Pools to tier data from expensive all-flash drives to lower-cost HDDs

DDN said that, with Google Cloud Managed Lustre, powered by EXAScaler, Salesforce achieved 1.5x faster model training, a 75 percent reduction in I/O latency, and a 42 percent reduction in training costs.

Access an AI400X3M datasheet here (https://www.ddn.com/download/ai400x3m-datasheet/). General AI400X3M availability is expected by the end of Q3 2026.
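
For readers who want a feel for how a tiered KV cache avoids recomputation, here is a toy Python model of the four-tier hierarchy described above (HBM, DRAM, local or shared NVMe, external NVMe). It is a minimal sketch under stated assumptions, not Nvidia Dynamo’s or DDN’s API: the class name, method names, per-tier capacities, and the least-recently-used demotion policy are all hypothetical choices made for illustration.

```python
from collections import OrderedDict
from typing import Callable, List, Optional, Tuple

# Tier names follow the article's hierarchy, fastest first.
TIER_NAMES = ["HBM", "DRAM", "local/shared NVMe", "external NVMe"]


class TieredKVCache:
    """Toy tiered KV cache; every name and policy here is hypothetical."""

    def __init__(self, capacities: List[int]) -> None:
        # One LRU-ordered store per tier; capacities are entry counts.
        self.capacities = capacities
        self.tiers = [OrderedDict() for _ in capacities]

    def put(self, key: str, value: bytes, tier: int = 0) -> None:
        """Insert into a tier, demoting its least recently used entry if full."""
        if tier >= len(self.tiers):
            return  # demoted past the slowest tier: the entry is dropped
        store = self.tiers[tier]
        store[key] = value
        store.move_to_end(key)
        if len(store) > self.capacities[tier]:
            old_key, old_value = store.popitem(last=False)
            self.put(old_key, old_value, tier + 1)

    def get(self, key: str) -> Optional[Tuple[bytes, str]]:
        """Return (value, tier it was found in), promoting the entry to tier 0."""
        for index, store in enumerate(self.tiers):
            if key in store:
                value = store.pop(key)
                self.put(key, value, 0)
                return value, TIER_NAMES[index]
        return None

    def get_or_compute(self, key: str, compute: Callable[[], bytes]) -> Tuple[bytes, str]:
        """Reuse cached keys/values when present; otherwise recompute and cache."""
        hit = self.get(key)
        if hit is not None:
            return hit
        value = compute()
        self.put(key, value)
        return value, "recomputed"


if __name__ == "__main__":
    cache = TieredKVCache([1, 1, 1, 1])   # tiny capacities to force demotion
    cache.put("prompt-a", b"kv-a")
    cache.put("prompt-b", b"kv-b")        # pushes prompt-a down one tier
    print(cache.get("prompt-a"))          # found in a slower tier, then promoted
    print(cache.get_or_compute("prompt-c", lambda: b"kv-c"))
```

The point of the sketch is the lookup order: a request only pays the recomputation cost when the context is absent from every tier, which is why extending the cache into shared and external NVMe storage can raise hit rates for long or repeated contexts.
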