{"slug": "announcing-kotlin-bindings-for-nobodywho", "title": "Announcing Kotlin bindings for NobodyWho", "summary": "NobodyWho released Kotlin bindings for its on-device LLM inference engine, allowing Android and JVM developers to add the library as a Gradle dependency and run models entirely on users' devices without API keys or cloud costs. The bindings support streaming chat via Kotlin Flow, tool calling with automatic parameter extraction through Kotlin reflection, and features like embeddings and cross-encoder for RAG, joining existing bindings for Godot, Rust, Python, Flutter, React Native, and Swift.", "body_md": "# Announcing Kotlin bindings for NobodyWho\n\nYou can now add NobodyWho as a Gradle dependency and ship an LLM that runs entirely on your users' devices. No API keys, no servers to babysit, no per-token bill at the end of the month — just a `.gguf` file on the device and a chat loop in your app.\n\n## Why on-device?\n\nMost AI features in mobile apps today route every request through a hosted API. 
Running the model directly on the user's device is a different shape of product, and it brings real benefits:\n\n- **Privacy by design** — user data never leaves the device\n- **Works offline** — no internet connection required\n- **Low latency** — no network round trip on every interaction\n- **No cloud costs** — inference is free, no per-token billing\n\nThe tradeoff is raw capability — local models are smaller than frontier cloud models — but for chat, summarization, classification, and many agentic workflows they're more than enough.\n\n## What you get\n\nThe Kotlin bindings expose the same core API our Godot, Rust, Python, Flutter, Swift, and React Native users already know, including:\n\n- Streaming chat with full token-by-token output via Kotlin `Flow`\n- Tool calling with automatic parameter extraction via Kotlin reflection\n- Sampling controls (temperature, constrained/JSON output, ...)\n- Embeddings and a cross-encoder for RAG\n- Image and audio inputs fed directly to your LLM\n- Any model in `.gguf` format, powered by [llama.cpp](https://github.com/ggerganov/llama.cpp) under the hood\n\nIt works on Android and anywhere else the JVM runs.\n\n## Getting started\n\nA minimal chat looks like this:\n\n```kotlin\nimport ai.nobodywho.Chat\nimport kotlinx.coroutines.runBlocking\n\nfun main() = runBlocking {\n    val chat = Chat.fromPath(modelPath = \"./model.gguf\")\n    val response = chat.ask(\"Is water wet?\").completed()\n    println(response)\n}\n```\n\nFor streaming, use `asFlow()`:\n\n```kotlin\nchat.ask(\"Is water wet?\").asFlow().collect { token ->\n    print(token)\n}\n```\n\nFor the full setup — picking a model, getting the `.gguf` onto the device, wiring up a streaming chat UI — see the [Kotlin documentation](https://docs.nobodywho.ooo/kotlin/).\n\n## Tool calling with reflection\n\nOne of the nicest things about the Kotlin bindings is tool calling. 
Kotlin's reflection API lets NobodyWho automatically extract parameter names and types from your function, so you don't have to declare them twice.\n\n```kotlin\nfun getWeather(city: String, unit: String): String {\n    return \"\"\"{\"temp\": 22, \"unit\": \"$unit\"}\"\"\"\n}\n\nval weatherTool = Tool(\n    name = \"get_weather\",\n    description = \"Get the current weather for a city\",\n    function = ::getWeather\n)\n```\n\nThis is similar to how our Flutter bindings work — both use runtime reflection to derive the tool schema directly from the function signature. No redundant schema definitions needed.\n\n## One core, many languages\n\nKotlin joins a growing list of first-class NobodyWho targets. The same Rust core — wrapping llama.cpp — now powers bindings across:\n\n- **Godot** — drop-in nodes for game dialogue, NPCs, and tooling\n- **Rust** — the native API the rest are built on\n- **Python** — for scripting, prototyping, and ML workflows\n- **Flutter** — cross-platform mobile and desktop apps\n- **React Native** — the JavaScript/TypeScript mobile ecosystem\n- **Swift** — native iOS, macOS, watchOS, and visionOS apps\n- **Kotlin** — Android and JVM apps\n\nThat's the whole point of NobodyWho: one well-maintained inference core, with idiomatic bindings for whichever language or framework you actually want to ship in. Every binding gets the same feature set — streaming, tool calling, sampling, embeddings, RAG — so you don't have to give up capabilities to use the language you prefer.\n\n## Join the community\n\nWe'd love to hear what you build with the new Kotlin bindings — and meet the people building with NobodyWho across all the other languages too.\n\n- [Discord](https://discord.gg/qhaMc2qCYB) — the best place to ask questions, share what you're working on, and chat with the team and other NobodyWho users.\n- [GitHub](https://github.com/nobodywho-ooo/nobodywho) — open an issue if you hit a bug, or a discussion if you have an idea. And if you like what we're building, give us a star!\n\nHappy hacking!\n\nPublished Jun 3, 2026", "url": "https://wpnews.pro/news/announcing-kotlin-bindings-for-nobodywho", "canonical_source": "https://nobodywho.ooo/posts/announcing-kotlin-bindings/", "published_at": "2026-06-03 00:00:00+00:00", "updated_at": "2026-06-04 13:17:39.540295+00:00", "lang": "en", "topics": ["artificial-intelligence", "machine-learning", "large-language-models", "ai-products", "ai-tools"], "entities": ["NobodyWho", "Kotlin", "Gradle", "llama.cpp", "Android", "Godot", "Rust", "Python"], "alternates": {"html": "https://wpnews.pro/news/announcing-kotlin-bindings-for-nobodywho", "markdown": "https://wpnews.pro/news/announcing-kotlin-bindings-for-nobodywho.md", "text": "https://wpnews.pro/news/announcing-kotlin-bindings-for-nobodywho.txt", "jsonld": "https://wpnews.pro/news/announcing-kotlin-bindings-for-nobodywho.jsonld"}}