Rebellions Bets on Memory-Centric Architecture as it Weighs IPO Options

wpnews.pro

South Korean AI silicon startup Rebellions is focusing its technology roadmap on memory as it seeks to take advantage of strategic connections with Korean semiconductor giants SK Hynix and Samsung Foundry. The company believes this strategy will be crucial as it explores options for an IPO, Rebellions CEO Sunghyun Park told EE Times.

“Rebellions, as a Korean startup, backed by Samsung and SK Hynix, is in a very good position to explore all the options for memory-centric architectures,” Park said.

As large-scale LLMs reach deployment, memory capacity and bandwidth are becoming critical to large AI inference accelerator designs like Rebellions’. The company’s second-generation AI accelerator, Rebel, was announced in 2024. Rebel is a scaled-up version of Rebellions’ first-gen CGRA-based accelerator, with four compute chiplets offering 1 PFLOPS of FP16 compute and 144 GB of HBM4e in a 300-W power envelope.

Memory is therefore of extreme strategic importance, in terms of both supply chain and economics. The industry is moving away from commodity memories in a number of directions, Park said. Huge KV caches will require a combination of HBM and HBF (high-bandwidth Flash) for capacity, while scale-up and scale-out solutions will require specialized memory architectures and memory pooling.
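To give a rough sense of why KV caches push on memory capacity, the sketch below estimates KV cache size for a single long-context request. The model dimensions (80 layers, 8 KV heads, head dimension 128, FP16 values, 131,072-token context) are hypothetical and are not figures from Rebellions or any model named in this article; the only number taken from the article is Rebel's 144 GB of HBM. The estimate ignores model weights and activations, which would also need to fit.

```python
def kv_bytes_per_token(layers, kv_heads, head_dim, bytes_per_value):
    """Bytes of KV cache added per token.

    The factor of 2 covers one key vector and one value vector
    for every layer and every KV head.
    """
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def kv_cache_bytes(layers, kv_heads, head_dim, bytes_per_value,
                   context_tokens, sequences=1):
    """Total KV cache bytes for `sequences` requests of `context_tokens` each."""
    per_token = kv_bytes_per_token(layers, kv_heads, head_dim, bytes_per_value)
    return per_token * context_tokens * sequences


# Hypothetical model shape (an assumption for illustration only).
LAYERS, KV_HEADS, HEAD_DIM, FP16_BYTES = 80, 8, 128, 2
CONTEXT_TOKENS = 131_072

# 144 GB is the HBM capacity quoted for Rebel in this article.
HBM_CAPACITY_BYTES = 144 * 10**9

per_token = kv_bytes_per_token(LAYERS, KV_HEADS, HEAD_DIM, FP16_BYTES)
per_sequence = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, FP16_BYTES,
                              CONTEXT_TOKENS)
# Upper bound: pretends the whole HBM is available for KV cache.
max_sequences = HBM_CAPACITY_BYTES // per_sequence

print(f"KV cache per token:    {per_token / 1024:.0f} KiB")
print(f"KV cache per sequence: {per_sequence / 2**30:.0f} GiB")
print(f"Full-context sequences that fit in 144 GB: {max_sequences}")
```

Under these assumptions, a single full-context request takes about 40 GiB of KV cache, so only three such requests would fit in 144 GB even before weights are counted. That is the kind of arithmetic behind the interest in adding a higher-capacity tier behind HBM.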

The industry is also exploring custom HBM implementations. The company had planned a 3D SRAM stack for its next-generation architecture, but has switched to 3D-stacked DRAM as part of a collaboration with both SK Hynix and Samsung. Park said Rebellions is working to co-design HBM and logic dies; custom HBM could include logic to handle fast token decoding, but the industry is still figuring out which logic to include.

“There’s no de facto standard solution yet, so it’s a good time to explore what options we can have in the base die for custom HBM,” he said.

Customer base

Rebel is commercialized primarily in South Korea and the Middle East today, Park said. Having secured its memory supply chain, Rebellions is receiving significant attention in the Kingdom of Saudi Arabia (KSA) in particular.

“Everyone talks about technology, but the most important thing right now is to secure the supply chain,” he said. “The beauty of Rebellions is we can secure all the memory.”

Turbulence in the Middle East of late hasn’t dampened the region’s AI infrastructure ambitions, Park said.

“Humain is still the same, Aramco is still the same,” he said. “They believe [AI] is not just a trend. It’s their ambition for 2030, and I’m proud to be part of the ecosystem here in Saudi Arabia.”

Groq was the main player in this region until recently, Park noted, and changes there have brought the Middle East closer to the Korean ecosystem in search of other hardware candidates.

“Sovereign AI [in KSA] means a heterogeneous compute platform where Nvidia and non-Nvidia hardware co-exist, and where U.S. and non-U.S. hardware are installed together,” he said. “Training and inference are not locked in by Nvidia products here, and [we have] a very compelling story.”

As well as sovereign deployments, telecoms is the other key market for Rebellions, both in the Middle East and South Korea.

“The telecoms industry has money, and they know how to do capex,” Park said.

The biggest deployment of Rebellions chips to date is at SK Telecom, where a multi-rack first-gen Rebellions cluster partially powers Adot, SK Telecom’s proprietary AI assistant, which provides Korea-specific services like summarizing phone calls. Adot is the biggest user of tokens in South Korea, Park said, with up to 50 million API calls per day. The companies are currently exploring options for scale-up and scale-out of this cluster.

“I’m proud that we have end users for this service; it’s not just infrastructure, it’s a real, live service in Korea,” he said.

Rebellions hardware is also deployed in NPU-as-a-service infrastructure by Korea Telecom (KT). Rebellions’ open-source software stack, optimized for Red Hat, is popular with potential U.S. customers, but this is still a growing market for the Korean company, Park said.

Chiplet architecture

AI silicon startup Cerebras’ huge IPO last month has put new price tags on companies like Rebellions, Park said, and brought investor and customer focus to low-latency inference, which is further intensifying focus on memory technologies and supply chain. While the first big exits in this domain, Groq and Cerebras, both have SRAM-based architectures, the next big winners will be those who use 3D DRAM stacking, Park said.

“One year ago we focused on chiplets; chiplet was the magic word,” he said. “Today, the magic word is memory and memory-centric architectures. That’s why [we are getting traction] with financial investors, because we’re uniquely positioned here.”

Rebellions recently taped out CXL and Ethernet I/O dies, but plans to sell compute chiplets are still evolving. The industry landscape around chiplets is still moving, Park said, so it’s too soon to decide whether chiplets are a valid go-to-market option for a startup.

“Who’s my friend and who’s my enemy in this field?” he said. “Even Nvidia is trying to build its own chiplet ecosystem. Right now, I don’t know what direction [we’ll go in]. I’d like us to be the chiplet player for XPU, but we need to find the right partners and the right packaging partner. It’s very important to understand what’s going on in this ecosystem.”

The company collaborates with Marvell on system-level technologies, including optical scale-up. Rebellions is considering co-packaged optics for future generations, Park said, to keep up with customer demands for bigger scale-up domains.

Both of the AI chip companies that recently had big exits had moved further up the stack, building out substantial cloud deployments of their own. Would Rebellions consider doing something similar?

“It’s an option,” Park said. “API service is good because we can hide all the numbers by abstracting the customer [further] from our silicon, but frankly, we don’t have a specific direction yet because we are still figuring out the tokenomics.”

The Korean government has encouraged Rebellions to build its own sovereign data centers, but that would require additional investment, Park said.

The trend toward disaggregated inference, where inference workloads are split across different types of more specialized chips, has been amplified by the Nvidia-Groq deal. Park said that while the industry doesn’t yet have a standard approach to disaggregation, Rebellions is working with Arm and SK Telecom on a disaggregation project. In this specific setup, Rebellions’ hardware accelerates the decode stage; Rebellions’ compute chiplets have significant SRAM, in some ways similar to the Groq chip in Nvidia’s disaggregated architecture.

“It’s an interesting idea, and the collaboration between Arm, SK Telecom, and Rebellions is working well, but I’m not sure whether disaggregation is the right overall direction in the future,” he said.

Rebellions’ current-generation chip, Rebel, has HBM, Park added, so it can also easily handle prefill.

Rebellions closed a $400 million pre-IPO round in March, bringing total funding raised to $850 million. The company is talking with its bankers, Park said, but no concrete plans have been made for the IPO just yet. It is exploring both Nasdaq and domestic listing options, and another strategic pre-IPO funding round is also possible.
