{"slug": "introducing-kerasformers-transformers-for-keras-3", "title": "Introducing KerasFormers: \"Transformers\" for Keras 3!", "summary": "KerasFormers, an open-source library built entirely in Keras 3, launches with over 100 transformer models spanning vision, language, multimodal, and speech, supporting seamless execution on TensorFlow, JAX, and PyTorch. The library provides pre-trained weights, one-line loading from Hugging Face, and modern architectures including Llama, Qwen, DeepSeek, and Gemma, making state-of-the-art models accessible through a unified API.", "body_md": "After months of dedicated development, I’m excited to share KerasFormers: an open-source library bringing modern transformer architectures with pre-trained weights, built entirely in Keras 3. What started as a vision model collection has grown into a unified ecosystem spanning vision, language, multimodal, and speech, all through a single API that runs seamlessly on TensorFlow, JAX, and PyTorch. One API. Any backend.\n\nKey Features\n\n• 100+ models across vision, language, multimodal & speech under one unified API\n\n• Modern LLM architectures, including dense, Mixture-of-Experts (MoE) & Multi-head Latent Attention (MLA) designs: Llama 2/3/4, Qwen 2/3/3.5, DeepSeek V2/V3/V4, Gemma, Mistral, Mixtral, Cohere2, GLM-4, MiniMax, GPT-OSS\n\n• Vision-Language Models: Qwen-VL, Qwen2.5-VL, Qwen3-VL, InternVL3, Janus-Pro, Gemma 3, GLM-4V & more\n\n• Full computer vision suite: classification, detection, segmentation, depth & self-supervised learning\n\n• One-line pre-trained loading from Hugging Face & timm: model = Model.from_weights(\"hf:…\")\n\n• Fast, compiled .generate() with KV caching\n\n• Native Keras 3 with full multi-backend compatibility\n\nKerasFormers makes state-of-the-art models accessible through a consistent, backend-agnostic interface, so you can move seamlessly across frameworks.\n\npip install -U kerasformers", "url": "https://wpnews.pro/news/introducing-kerasformers-transformers-for-keras-3", "canonical_source": 
"https://discuss.huggingface.co/t/introducing-kerasformers-transformers-for-keras-3/176899#post_1", "published_at": "2026-06-17 23:26:54+00:00", "updated_at": "2026-06-17 23:28:07.984823+00:00", "lang": "en", "topics": ["large-language-models", "computer-vision", "natural-language-processing", "ai-tools", "ai-infrastructure"], "entities": ["KerasFormers", "Keras 3", "TensorFlow", "JAX", "PyTorch", "Hugging Face", "Llama", "Qwen"], "alternates": {"html": "https://wpnews.pro/news/introducing-kerasformers-transformers-for-keras-3", "markdown": "https://wpnews.pro/news/introducing-kerasformers-transformers-for-keras-3.md", "text": "https://wpnews.pro/news/introducing-kerasformers-transformers-for-keras-3.txt", "jsonld": "https://wpnews.pro/news/introducing-kerasformers-transformers-for-keras-3.jsonld"}}