Hippocratic AI

mentions: 7 | type: Organization

// recent coverage 7 mentions

00:00

2026-06-12

modular.com

large-language-models

Modular: Day Zero: MiniMax M3 Open Weights on Modular Cloud

MiniMax released the open-weights MiniMax M3 model on Modular Cloud, featuring a new Sparse Attention operation that achieves up to 15.6x speedup on decode while maintaining a 1 million token context …

00:00

2026-06-05

modular.com

ai-infrastructure

Modular: Why LLM Inference Needs a New Kind of Router - Part 3

Modular has introduced a five-stage composable routing system for large language model inference, replacing traditional fixed algorithms like round-robin and consistent hashing. The system, detailed i…

22:53

2026-05-29

modular.com

machine-learning

Three Trends from MLSys 2026

At MLSys 2026, Modular identified three key trends in AI inference, highlighted by keynotes on agentic kernel development and the need for "zero trust" verification to prevent benchmark cheating. Lido…

00:00

2026-05-21

modular.com

large-language-models

Modular: Why LLM Inference Needs a New Kind of Router - Part 2

Modular has built a new data layer for LLM inference routing that solves the problem of querying cached blocks across hundreds of pods in microseconds. The company's architecture uses a specialized da…

00:00

2026-05-18

modular.com

artificial-intelligence

Modular: Hippocratic AI partners with Modular to power flexible, high-quality inference for real-time patient conversations

Hippocratic AI partnered with Modular to integrate the MAX framework into its inference pipelines, achieving sub-500ms mean time to first token and approximately 30% faster P99 end-to-end latency for …

00:00

2026-05-13

modular.com

ai-agents

Modular: Translating to Mojo via AI Agents

Modular released AI agent skills for its Mojo language that enable coding assistants to translate existing GPU kernels from CUDA and Triton into Mojo code, addressing the challenge that large language…

00:00

2026-05-08

modular.com

ai-infrastructure

Modular: Why LLM Inference Needs a New Kind of Router - Part 1

Modular argues that traditional HTTP-era load-balancing algorithms such as round-robin, consistent hashing, and least-connections are inadequate for large language model inference because GPU pods are…

// co-occurs with top 8 entities

Modular (7), MiniMax (4), DeepSeek (4), FLUX (4), Kimi (4), Wan (3), DeepSeek V4 Pro (2), FLUX.2 Klein 9B (2)