AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support

AMD released Lemonade SDK version 10.7, adding NVIDIA CUDA support to its local AI server solution that previously only supported AMD hardware, Apple Metal GPUs, and AArch64 CPUs. The update integrates Llama.cpp's CUDA back-end on Windows and Linux, along with stable-diffusion.cpp CUDA and Vulkan support, enabling the same local AI experience across competitor GPUs. The move expands Lemonade's cross-platform compatibility for developers running local AI models compliant with OpenAI, Anthropic, and Ollama APIs.

AMD's Lemonade SDK For Local AI Adds NVIDIA CUDA Support Lemonade, the local AI server solution developed by AMD that is designed to work across their CPUs, GPUs, and NPUs, is out with a new version today that also adds NVIDIA CUDA support. THe Lemonade SDK provides local AI server capabilities in an API-compliant manner with OpenAI, Anthropic, and Ollama APIs. Lemonade builds off FastFlowLM, vLLM, Llama.cpp, and other components for a rich, open-source local AI experience. Beyond supporting their own wares of AMD Ryzen AI NPUs, Radeon/Instinct GPU accelerators, and x86 64 CPUs, they have also supported Apple Metal GPUs and AArch64 CPU support too. Interestingly, with Lemonade 10.7 they have also now added NVIDIA CUDA support for allowing the same local AI server experience on their competitor's GPUs. Lemonae 10.7 now properly integrates Llama.cpp's CUDA back-end on Windows and Linux with proper NVIDIA GPU detection in Lemonade and other integration bits. The stable-diffusion.cpp CUDA back-end is also added for Linux. Additionally, this release brings stable-diffusion.cpp Vulkan support on both Windows and Linux for broader cross-vendor GPU support. Lemonade 10.7 also adds support for LMX-Omni models, a native Prometheus end-point for real time stats monitoring, and other enhancements. Exciting me with Lemonade 10.7 is adding the Lemonade 10.7 downloads and more details on this open-source feature release via THe Lemonade SDK provides local AI server capabilities in an API-compliant manner with OpenAI, Anthropic, and Ollama APIs. Lemonade builds off FastFlowLM, vLLM, Llama.cpp, and other components for a rich, open-source local AI experience. Beyond supporting their own wares of AMD Ryzen AI NPUs, Radeon/Instinct GPU accelerators, and x86 64 CPUs, they have also supported Apple Metal GPUs and AArch64 CPU support too. Interestingly, with Lemonade 10.7 they have also now added NVIDIA CUDA support for allowing the same local AI server experience on their competitor's GPUs. Lemonae 10.7 now properly integrates Llama.cpp's CUDA back-end on Windows and Linux with proper NVIDIA GPU detection in Lemonade and other integration bits. The stable-diffusion.cpp CUDA back-end is also added for Linux. Additionally, this release brings stable-diffusion.cpp Vulkan support on both Windows and Linux for broader cross-vendor GPU support. Lemonade 10.7 also adds support for LMX-Omni models, a native Prometheus end-point for real time stats monitoring, and other enhancements. Exciting me with Lemonade 10.7 is adding the lemonade bench command that is focused on apples-to-apples LLM benchmarking across Llama.cpp, FastFlowLM, vLLM, and Ryzen AI software. I'll be checking out the lemonade bench to see how its benchmarking works out and hopefully using it in future articles on Phoronix.Lemonade 10.7 downloads and more details on this open-source feature release via