Introducing KerasFormers: "Transformers" for Keras 3!


[ARTICLE · art-31863] src=discuss.huggingface.co ↗ pub=2026-06-17T23:26Z topic=large-language-models verified=true sentiment=↑ positive

KerasFormers, an open-source library built entirely in Keras 3, launches with over 100 transformer models spanning vision, language, multimodal, and speech, supporting seamless execution on TensorFlow, JAX, and PyTorch. The library provides pre-trained weights, one-line loading from Hugging Face, and modern architectures including Llama, Qwen, DeepSeek, and Gemma, making state-of-the-art models accessible through a unified API.

1 min read · 32 views · published Jun 17, 2026

After months of dedicated development, I’m excited to share KerasFormers: an open-source library bringing modern transformer architectures with pre-trained weights, built entirely in Keras 3. What started as a vision model collection has grown into a unified ecosystem spanning vision, language, multimodal, and speech, all through a single API that runs seamlessly on TensorFlow, JAX, and PyTorch. One API. Any backend.
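As a rough illustration of the "One API. Any backend." idea: Keras 3 itself chooses its backend from the KERAS_BACKEND environment variable, which has to be set before keras is first imported. The sketch below covers only that Keras 3 mechanism. The helper name select_backend and the SUPPORTED_BACKENDS tuple are hypothetical names made up for this example, not part of KerasFormers, and the tuple lists only the three backends named in the announcement.

```python
import os

# The three backends named in the announcement, spelled the way Keras 3
# expects them in the KERAS_BACKEND environment variable.
SUPPORTED_BACKENDS = ("tensorflow", "jax", "torch")


def select_backend(name: str) -> str:
    """Set KERAS_BACKEND (hypothetical helper, not a KerasFormers API).

    Keras 3 reads this variable when `keras` is first imported, so this
    must be called before that import.
    """
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    os.environ["KERAS_BACKEND"] = name
    return name


select_backend("jax")

# A later `import keras` in this process would now pick up the JAX backend.
# Swapping "jax" for "torch" or "tensorflow" is the only change required here.
```

The same variable can be set in the shell instead (for example, exporting KERAS_BACKEND=torch before starting Python), which keeps the model code free of backend-specific lines.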

Key Features

• 100+ models across vision, language, multimodal & speech under one unified API

• Modern LLM architectures, covering dense, Mixture-of-Experts (MoE) and Multi-head Latent Attention (MLA) designs: Llama 2/3/4, Qwen 2/3/3.5, DeepSeek V2/V3/V4, Gemma, Mistral, Mixtral, Cohere2, GLM-4, MiniMax, GPT-OSS

• Vision-Language Models: Qwen-VL, Qwen2.5-VL, Qwen3-VL, InternVL3, Janus-Pro, Gemma 3, GLM-4V & more

• Full computer vision suite: classification, detection, segmentation, depth & self-supervised learning

• One-line loading of pre-trained weights from Hugging Face & timm: model = Model.from_weights("hf:…")

• Fast, compiled .generate() with KV caching

• Native Keras 3 with full multi-backend compatibility

KerasFormers makes state-of-the-art models accessible through a consistent, backend-agnostic interface, so you can move seamlessly across frameworks.
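To make the "KV caching" feature concrete, here is a minimal sketch of the general technique in plain Python. It is not KerasFormers' implementation, and every name in it (decode, attend, the toy weights) is an assumption made for illustration. During autoregressive decoding, each new token attends to all earlier tokens. Caching their key and value projections means only the newest token has to be projected at each step.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    return [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]


def decode(tokens, w_q, w_k, w_v, use_cache):
    """Return per-token causal attention outputs and the K/V projection count."""
    cached_keys, cached_values = [], []
    outputs, projections = [], 0
    for t, token in enumerate(tokens):
        if use_cache:
            # Project only the newest token and append it to the cache.
            cached_keys.append(matvec(w_k, token))
            cached_values.append(matvec(w_v, token))
            projections += 1
            keys, values = cached_keys, cached_values
        else:
            # Recompute keys and values for the whole prefix at every step.
            keys = [matvec(w_k, x) for x in tokens[: t + 1]]
            values = [matvec(w_v, x) for x in tokens[: t + 1]]
            projections += t + 1
        outputs.append(attend(matvec(w_q, token), keys, values))
    return outputs, projections


# Toy inputs: four 2-dimensional token embeddings and arbitrary weights.
TOKENS = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [1.0, 1.0]]
W_Q = [[1.0, 0.0], [0.0, 1.0]]
W_K = [[0.5, 0.1], [0.2, 0.7]]
W_V = [[1.0, 2.0], [3.0, 4.0]]

cached_out, cached_count = decode(TOKENS, W_Q, W_K, W_V, use_cache=True)
full_out, full_count = decode(TOKENS, W_Q, W_K, W_V, use_cache=False)
```

Both calls yield the same attention outputs, but the cached run projects each token once, while the uncached run repeats the work for every prefix. That cost grows quadratically with sequence length, which is why generation loops typically keep such a cache.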

pip install -U kerasformers

source & further reading

discuss.huggingface.co (original article)

Read original on discuss.huggingface.co → discuss.huggingface.co/t/introducing-kerasformer…
