# Llama.cpp now has an official website: llama.app

> Source: <https://llama.app/>
> Published: 2026-05-29 16:58:26+00:00

[llama.app](./)

[GitHub 112.2K](https://github.com/ggml-org/llama.cpp)

`curl -LsSf https://llama.app/install.sh | sh`

Prefer Brew or Winget?

[Package managers](https://github.com/ggml-org/llama.cpp/blob/master/docs/install.md)Rather build from source?[Follow instructions](https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md)## AI that lives on your computer.

Open-source, private, always local.

Run frontier AI entirely on your machine. No API keys, no telemetry, no limits. Take AI back.

```
# 1. Serve a model
llama serve

# 2. Install the pi-llama plugin
pi install git:github.com/huggingface/pi-llama

# 3. Run Pi, everything is set
pi
```

## Pair it with a local coding agent.

Run `llama serve`

, then launch [Pi](https://github.com/badlogic/pi-mono). It auto-discovers your local model. No config, no API keys. Files stay on your machine,
requests never leave it.

## Optimized for any hardware.

From your laptop to a cluster, llama.cpp runs on whatever you have. Same binary, same models, same hand-tuned kernels for every GPU and CPU.

Apple Silicon M Ultra RTX 5090

H100 MI300 RTX 4090

M Max A100 DGX Spark T4

Jetson B200 Intel Arc

CPU Radeon RX M Pro RTX 3090
