# Show HN: wavecat – a fully local personal agent that watches your screen

> Source: <https://wavecat.ai/>
> Published: 2026-06-29 00:00:32+00:00

#
A fully local personal agent

that watches your screen

a super cool project dev by [Samuel Yuan](https://www.mit.edu/~sdkyuan/)

No data centers;

Private & entirely on your computer

[Download](#download)

wavecat constantly watches your screen to understand you.

All models run locally, so no personal data ever leaves your device.

Hopefully the future of personal AI is local.

It's free. The context is always there. And privacy & sovereignty are always ensured.

Usage

Using wavecat is as easy as installing it. Once the app is installed, you will be guided through the vision and language model installation process. These models will take roughly 19 GB of disk space (even when heavily quantized) since they contain billions of parameters.

You will also be guided to allow wavecat to view your screen. With this, wavecat will develop a rich understanding of your activity and goals. Ideally it will be able to anticipate your needs, before you even ask.

Again, don't worry, no personal data will leave your device. All the data is stored locally and all the processing is done locally on your device. wavecat will never send any of your personal screen data to the cloud; you can even turn off your internet and wavecat will still work.

Hardware Requirements

For Mac users, at least 24 GB of unified memory is necessary for smooth background model running, with 32+ GB recommended. wavecat only supports Apple Silicon Macs.

For Windows and Linux users, wavecat supports Vulkan, plus CUDA on Windows. A dedicated GPU with at least 12 GB of VRAM or unified memory device with at least 24 GB of RAM is recommended. More is always better.

While no hardware requirement is strictly enforced, you will not have an enjoyable time with wavecat unless your device meets these requirements.

But hopefully model improvements, hardware advances, and inference system optimizations will allow local personal agents to run on much more inexpensive hardware in the near future!

Technical Details

wavecat uses [llama.cpp](https://github.com/ggml-org/llama.cpp) as the primary backend inference engine. It's great.

Qwen3.6 35B A3B serves as the primary language model "engine." A way to connect your own (more powerful) open-source model as a backend is currently in development at [github.com/sdkyuanpanda/wavecat-sdk](https://github.com/sdkyuanpanda/wavecat-sdk).

On a M5 Pro with 48 GB of RAM, wavecat runs comfortably at roughly 70-90 tok/s depending on the task. Better speculative decoding methods, first-class MLX support, and other techniques rolled out in later updates will hopefully allow wavecat to run even faster soon.

Misc

Right now, English is the only language supported by wavecat. While you can interact with wavecat in other languages, performance will likely be impaired.

Integrations with other apps and tools are in development and should be available soon! I'll also be working on vastly improving the SDK so you can add your own plugins to wavecat.

If you have any questions, please read the [FAQ](/faq).

If you have any unanswered questions, feedback, or advice, please feel free to email me at sdkyuan [at] mit [dot] edu.