TL;DR:I spent weeks tuning DeepSeek V4 to feel native inside Claude Code. The result: a one-command setup with 9 agents, 7 rules, security hooks, OCR, and auto-backup. Clone β ./init.sh β you're shipping.
I wanted DeepSeek's 1M context window inside Claude Code's interface. What I got was two weeks of fighting API configs, debugging token limits, and discovering that most "just swap the model" advice is missing half the puzzle.
So I packaged everything into a single repo.
git clone https://github.com/YuhaoLin2005/deepseek-claude-code-starter.git
cd deepseek-claude-code-starter
./init.sh
Three commands. Here's what lands in your ~/.claude/
:
| Component | What It Does |
|---|---|
| 9 Custom Agents | |
| Code review, security audit, TDD guide, architecture, build-fix β each tuned for DeepSeek's reasoning style | |
| 7 Behavior Rules | |
| Code quality, security, testing discipline, YAGNI enforcement, commit standards | |
| Security Hook | |
| PreToolUse guard β blocks sensitive file access and dangerous commands before they execute | |
| Auto-Backup | |
| Pre-edit snapshots (keeps 5) + session-start git commit. You'll thank me the first time you roll back | |
| Local OCR | |
| RapidOCR on ONNX β lets DeepSeek "see" screenshots without sending them to a cloud API | |
| Status Line | |
| Compaction counter β warns you at 5+ compactions to start fresh ( | |
The single biggest quality-of-life improvement: main agent gets the Pro model, sub-agents get Flash.
DeepSeek V4 Pro handles architecture, debugging, and complex reasoning. But reading files? Searching? Running tests? Those don't need a 1M-context reasoning beast. Flash is faster, cheaper, and completely adequate for the grunt work.
This one decision doubled my effective throughput. The main agent never waits behind a queue of file-reads.
One rule I'm unreasonably proud of: the 6-level decision ladder.
Level 0: stdlib can do it β don't write code
Level 1: one-liner β don't write fifty lines
Level 2: existing tool β don't build a replacement
Level 3: simple script β don't build a framework
Level 4: library β don't build from scratch
Level 5: only then, build
It's YAGNI compiled into a decision tree. When every token costs money, "just in case" code is a bill you pay every session.
My self-model protocol handles identity across sessions. But identity is useless if the tools aren't right. This starter kit is the companion piece β the "body" to the protocol's "mind."
Both are MIT licensed. Both took weeks of trial and error to get right. Both are now one git clone
away.
Have you tried running non-Claude models inside Claude Code? What broke first β the API, the prompts, or your patience? Drop your war stories in the comments.
Related: LLM compaction isn't linear β’ Self-model protocol for AI identity persistence