# I Gave Each of My AI Agents a Personality — Here's Why My Workflow Actually Improved

> Source: <https://dev.to/samhartley_dev/i-gave-each-of-my-ai-agents-a-personality-heres-why-my-workflow-actually-improved-2dl9>
> Published: 2026-06-19 08:01:30+00:00

I used to have one AI assistant. It did everything — coded, wrote docs, answered questions, monitored my inbox.

It was fine. But "fine" isn't the same as "good." One model trying to be a generalist meant it was mediocre at everything. Context bloat. Conflicting instructions. The coding advice was too cautious. The writing was too robotic. The inbox monitoring missed nuance because the model was busy trying to remember my entire codebase.

So I split it into three. Each with a different personality, different model, different job. And it actually works better.

When you have one AI doing everything, you run into three problems fast:

**1. Context pollution.** The coding instructions leak into the writing tone. The writing style bleeds into the code suggestions.

**2. Wrong tool for the job.** A 70B parameter model is overkill for "check my calendar." A 7B model is underpowered for "refactor this 500-line function."

**3. No specialization.** My coding agent doesn't need to know my grocery list. My writing agent doesn't need to know my API keys. But when there's only one context window, everything is in there.

I now run three distinct agents, each with its own model, personality, and scope:

| Agent | Model | Personality | Job |
|---|---|---|---|
Celebi |
Qwen 3.5 9B (Mac Mini) | Generalist, casual, resourceful | Orchestration, daily checks, notifications, routing |
ProgrammierMinna |
Qwen 3 Coder 30B (RTX 3060) | Precise, technical, no fluff | Code generation, debugging, refactoring, PR review |
DocMinna |
Granite 3.2 8B (Mac Mini) | Formal, structured, thorough | Documentation, technical writing, READMEs, specs |

**Celebi runs on the Mac Mini (M4)** because it's always on, low power, and handles simple tasks instantly. Qwen 3.5 9B is perfect for "check my email, summarize it, tell me if it's urgent."

**ProgrammierMinna runs on the RTX 3060** because coding tasks need a bigger model. Qwen 3 Coder 30B actually understands large codebases, suggests proper refactors, and catches edge cases the 9B misses. Response time is 10-15 seconds — fine for code, too slow for "what's the weather."

**DocMinna also runs on the Mac Mini** with Granite 3.2 8B. It's smaller because documentation doesn't need frontier reasoning. It just needs to be structured, consistent, and technically accurate. The smaller model is faster and cheaper.

This was the hard part. I didn't want three separate chat windows. I wanted one interface (Telegram) where I message Celebi, and Celebi delegates to the right specialist.

Here's how it works:

```
User (Telegram): "Refactor the auth module in project X"
  → Celebi receives message
  → Classifies: "coding task, complex"
  → Routes to ProgrammierMinna
  → ProgrammierMinna generates refactored code
  → Returns to Celebi
  → Celebi formats response and sends back to user
```

The user never talks directly to ProgrammierMinna or DocMinna. Celebi is the router. This means:

I don't mean "quirky chatbot with a backstory." I mean three things:

**Celebi:** "You're a resourceful assistant. Be concise. Don't ask clarifying questions unless critical. Default to action."

**ProgrammierMinna:** "You're a senior software engineer. Write clean, maintainable code. Add error handling. Consider edge cases. Explain your reasoning briefly."

**DocMinna:** "You're a technical writer. Structure docs with clear headings. Include code examples. Write for an intermediate developer. Be thorough but not verbose."

These aren't decorations — they fundamentally change the output. The same request to all three produces completely different results.

Celebi sees my calendar, emails, weather, and general notes. It knows I'm in Turkey, that I have a meeting at 3 PM, that it's hot outside.

ProgrammierMinna sees my Git repos, code patterns, and project structure. It knows I prefer Go over Python for CLI tools, that I use specific naming conventions, that I hate nested callbacks.

DocMinna sees my documentation templates, style guides, and existing docs. It knows I write in Markdown, that I include a "Quick Start" section, that I don't use emojis in technical docs.

Each agent's context is **filtered.** Celebi doesn't get the Git repos. ProgrammierMinna doesn't get my grocery list. This alone cut my token usage by ~40%.

Ask all three to "explain Docker":

**Celebi:** "Docker packages apps into containers so they run the same everywhere. Think of it as a shipping container for software — standardized, portable, isolated. Need help with a specific setup?"

**ProgrammierMinna:** "Docker uses OS-level virtualization to package applications with their dependencies. Key concepts: images (read-only templates), containers (runtime instances), and Dockerfiles (build instructions). For multi-container apps, use Docker Compose. Here's a minimal example..."

**DocMinna:** "Docker is a platform for developing, shipping, and running applications in containers. This guide covers installation, core concepts (images, containers, volumes), and best practices for production deployments..."

Same facts. Completely different delivery. And that's the point — you pick the right voice for the situation.

Celebi decides who handles what. The rules are simple but effective:

| Input Signal | Route To | Example |
|---|---|---|
| Contains code snippets, "refactor," "debug," "function" | ProgrammierMinna | "Fix this Go error" |
| Contains "document," "README," "spec," "guide" | DocMinna | "Write API docs for this endpoint" |
| General question, scheduling, notification | Celebi (self) | "What's on my calendar?" |
| Mixed task (code + docs) | Both, combined | "Build a tool and document it" |

The routing is a lightweight classifier — just a few-shot prompt to Qwen 3.5 9B. It gets it right ~95% of the time. The 5% that are wrong? I correct it, and the model learns from the feedback (stored in memory files).

**Code quality:** ProgrammierMinna suggests better abstractions because it doesn't have to also remember my dentist appointment. Cleaner context = better reasoning.

**Documentation speed:** DocMinna writes docs in 30 seconds that used to take me 20 minutes. And they're consistent with my existing style.

**Response time:** Simple queries stay on the Mac Mini (instant). Complex ones go to the GPU (acceptable delay). No more "one size fits none" latency.

**Token costs:** Splitting context means each agent sees only what it needs. My monthly API bill dropped from ~$45 to ~$15 because 80% of tasks stay local.

**Less context-switching for me:** I say what I want in Telegram. The system figures out who should handle it. I don't think about "which model should I use for this."

**Setup complexity:** Three agents means three configurations, three model endpoints, three context files to manage. It's not "install and go."

**Routing mistakes:** Sometimes Celebi sends a coding task to DocMinna, and I get a beautifully written document instead of working code. I fix the routing rule, and it improves.

**Cross-agent memory gaps:** ProgrammierMinna doesn't know that DocMinna just wrote the API spec. If I'm building a tool and documenting it simultaneously, I have to manually sync context.

**Hardware footprint:** Three models loaded means more RAM and VRAM usage. On my setup (Mac Mini + RTX 3060), it's manageable. On a single machine with 8GB RAM, you'd struggle.

Probably, if you're just using ChatGPT for occasional questions.

But if you:

...then splitting into personalities is worth trying. You don't need three agents on day one. Start with two: one for general tasks, one for your most common specialized task (usually coding or writing).

`curl -fsSL https://ollama.com/install.sh | sh`

`ollama pull qwen3.5:9b`

(general) + `qwen3-coder:30b`

(coding)The router is the only custom code you need. Everything else is off-the-shelf.

I'm experimenting with two additions:

**Memory sharing** — A shared context file that all agents can read (but not write) for cross-cutting concerns like "current project" or "my tech stack."

**Agent spawning** — When a task is genuinely new, Celebi spawns a temporary agent with a custom prompt, runs the task, then discards it. No permanent bloat.

The goal isn't to build AGI. It's to build a team of specialists that costs less than one generalist and produces better work.

We build custom multi-agent systems tailored to your workflow:

→ [Custom AI Agent Setup on Fiverr](http://www.fiverr.com/s/XLyg)

→ [Follow the build process on Telegram](https://t.me/celebibot_en)

*I write about running AI locally, building weird automation, and occasionally making money from side projects. If this was useful, feel free to follow.*
