[ARTICLE · art-24739] src=github.com pub= topic=ai-tools verified=true sentiment=↑ positive

Show HN: MandoCode – local-first AI coding agent (.NET and Ollama)

MandoCode, a new open-source AI coding agent built on .NET and Ollama, launched today as a local-first alternative to cloud-based assistants. The tool, which requires no API keys and runs entirely in the terminal, can read, write, search, and plan across entire codebases while supporting any file type. Its release addresses developer concerns about privacy and cost by offering a fully local AI coding experience that can optionally connect to cloud models for more powerful capabilities.

20 min read · published Jun 12, 2026

Your AI coding assistant — run locally or in the cloud with Ollama.

No API keys required. Just you and your code.

MandoCode is an AI coding assistant built on RazorConsole, powered by Semantic Kernel and Ollama. RazorConsole makes the entire terminal UI possible — Razor components, a virtual DOM, and Spectre.Console rendering all running in the console.

Run locally or connect to Ollama cloud — no API keys required for anything, including web search (an optional free Tavily key upgrades search reliability). It gives you Claude-Code-style project awareness — reading, writing, searching, planning, and web browsing across your entire codebase — without ever leaving your terminal. It understands any file type: C#, JavaScript, TypeScript, Python, CSS, HTML, JSON, config files, and more.

  • .NET 8 SDK: dotnet.microsoft.com/download/dotnet/8.0 (SDK includes the runtime; install only the SDK)
  • Ollama: ollama.com/download (MandoCode walks you through setup on first run)

dotnet tool install -g MandoCode
mandocode

First run launches a guided wizard: it detects Ollama, offers to start it, walks you through cloud sign-in if you'd like more powerful models, and auto-pulls a sensible default. You can re-run it any time with /setup.

mandocode --doctor

Prints your runtime version, Ollama status, models pulled, and cloud sign-in state.

Using cloud models (:cloud tags)? Skip this section. Cloud model context is managed on Ollama's servers and set to the model's maximum by default; nothing on your machine affects it, including the desktop app's slider.

If you use local models and see responses cut off, the model "forgetting" earlier conversation, edits failing repeatedly on files it just wrote, or this message:

⚠ Response was cut off because the model's CONTEXT WINDOW filled …

…your Ollama context window is almost certainly too small. The context window is how much conversation + code the model can see at once — and Ollama defaults it to ~4k tokens, which an agentic session fills almost immediately. When it overflows, the oldest content (including the system prompt — the model's instructions!) is silently dropped.

If you use the Ollama desktop app (the tray icon), the app's Settings → Context length slider controls this, and it overrides everything else, including MandoCode's config.

There's no universally right slider position — it's a trade between how much the model can see and fitting in your GPU's memory (every 8k of window costs roughly 0.5–1.5 GB of VRAM depending on the model):

  • Too low (the 4k default): the symptoms above. The model's own instructions silently fall out of the window and it stops behaving.
  • Too high for your GPU: the model spills into system RAM, tokens/sec craters, and turns crawl or look hung.

Starting points: 16k for most GPUs, 32k with 8 GB+ VRAM. Only raise it if you're seeing the symptoms above; step back down a notch if generation slows badly after raising it.
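To get a feel for the trade-off, the rule of thumb above (roughly 0.5 to 1.5 GB of VRAM per 8k of window) can be turned into a quick estimate. This is only a back-of-the-envelope sketch; the function name is made up for illustration and real usage depends on the model and quantization.

```python
def extra_vram_gb(context_tokens: int) -> tuple[float, float]:
    """Rough (low, high) VRAM cost of a context window, in GB.

    Applies the rule of thumb from the text: every 8k tokens of window
    costs roughly 0.5 to 1.5 GB, depending on the model.
    """
    blocks = context_tokens / 8192  # number of 8k blocks in the window
    return (blocks * 0.5, blocks * 1.5)


for window in (4096, 16384, 32768):
    low, high = extra_vram_gb(window)
    print(f"{window:>6} tokens: about {low:.2f} to {high:.2f} GB")
```

By this estimate a 16k window adds somewhere between 1 and 3 GB on top of the model weights, which is why the 32k starting point is suggested only for cards with 8 GB or more.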

If you run ollama serve yourself (no desktop app), MandoCode handles it: it sets OLLAMA_CONTEXT_LENGTH from your contextLength config when it starts the daemon, and auto-sizes it to the hardware tier of the model you pick in /setup or /model. Tune it manually with:

mandocode --config set contextLength 16384

Verify what your daemon is actually using with ollama ps (look at the CONTEXT column). Run /learn inside MandoCode for a friendly explainer.

The context window's evil twin, and unlike the slider above, this one applies to every model, cloud included. If the model announces work and then just stops ("I'll create the game…" and the turn ends with no plan, no files, and no error), your maxTokens is too low. It caps a single reply (NumPredict), and reasoning models spend output tokens thinking before they emit a tool call, so a low cap cuts them off before they ever act.

Fresh installs default to 32k and never notice it. But if your config predates v0.11, or you once lowered maxTokens thinking it was the context window (they're different knobs: this caps what the model says, the context window caps what it sees), check it:

mandocode --config show                  # look at "Max Tokens"
mandocode --config set maxTokens 32768

The telltale sign: token tracking shows output pinned at exactly your cap, turn after turn (e.g. 2k out every time). Note that a running session keeps the config it loaded at startup; restart MandoCode (or use /config set in-app) for the change to take effect.
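The telltale sign can be checked mechanically if you jot down the per-response output counts from token tracking. The helper below is a hypothetical sketch, not part of MandoCode; the three-turn threshold is an assumption.

```python
def output_pinned_at_cap(output_tokens: list[int], cap: int, min_turns: int = 3) -> bool:
    """Return True if the last `min_turns` replies all used exactly `cap` output tokens."""
    if len(output_tokens) < min_turns:
        return False  # not enough turns to call it a pattern
    return all(count == cap for count in output_tokens[-min_turns:])


# Output counts copied by hand from the per-response token display.
print(output_pinned_at_cap([2048, 2048, 2048], cap=2048))  # replies keep hitting the cap
print(output_pinned_at_cap([512, 2048, 1300], cap=2048))   # a single capped reply is not a pattern
```

If the first case matches what you see, raising maxTokens is the likely fix; a larger context window will not help.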

git clone https://github.com/DevMando/MandoCode.git
cd MandoCode
dotnet build src/MandoCode/MandoCode.csproj
dotnet run --project src/MandoCode/MandoCode.csproj -- /path/to/your/project

  • Every file write and delete is intercepted with a color-coded diff. You approve, deny, or redirect; nothing touches disk without your say-so.
  • Complex requests are automatically broken into step-by-step plans. Review the plan, then watch each step execute with progress tracking.
  • The AI can search the web and read webpages to find documentation, tutorials, or answers, with no API keys needed.
  • Lofi and synthwave tracks bundled right in. A waveform visualizer runs in the corner while you code. Because vibes matter.
  • If Ollama isn't running, MandoCode shows setup guidance inline instead of a bare error.

| Category | Feature | Description |
| --- | --- | --- |
| AI | Project-aware assistant | Reads, writes, deletes, and searches your entire codebase |
| AI | Web search & fetch | Web search and webpage reading, keyless via DuckDuckGo, or Tavily with a free API key |
| AI | MCP server support | Connect to any Model Context Protocol server (stdio or remote HTTP), with Claude-Desktop-compatible config |
| AI | Streaming responses | Real-time output with animated spinners |
| AI | Task planner | Auto-detects complex requests and breaks them into steps |
| AI | Fallback function parsing | Handles models that output tool calls as raw JSON |
| UI | Diff approvals | Color-coded diffs with approve / deny / redirect |
| UI | Markdown rendering | Rich terminal output: headers, tables, code blocks, quotes |
| UI | Syntax highlighting | C#, Python, JavaScript/TypeScript, Bash |
| UI | Clickable file links | OSC 8 hyperlinks for file paths |
| UI | Terminal theme detection | Auto-adapts colors for light and dark terminals |
| UI | Taskbar progress | Windows Terminal integration during task execution |
| Input | / command autocomplete | Slash commands with dropdown navigation |
| Input | @ file references | Attach file content to any prompt |
| Input | ! shell escape | Run shell commands inline (!git status, !ls) |
| Input | /copy and /copy-code | Copy responses or code blocks to clipboard |
| Music | Lofi + synthwave | Bundled tracks with volume, genre switching, waveform visualizer |
| Config | Configuration wizard | Guided setup with model selection and connection testing |
| Config | Config validation | Auto-clamps invalid settings to safe ranges |
| Reliability | Retry + deduplication | Exponential backoff and duplicate call prevention |
| Education | /learn command | LLM education guide with optional AI educator chat |

Type / to see the autocomplete dropdown, or ! to run a shell command.

| Command | What it does |
| --- | --- |
| /help | Show commands and usage examples |
| /setup | Guided wizard: reconnect to Ollama, install/sign in, or pick a different model |
| /model | Quick switch: pick a different model (context window auto-sized for local tiers) |
| /config | Adjust settings with a guided wizard |
| /config set <key> <value> | Set one setting inline without leaving the session (e.g. /config set modelResponseTimeout 300); no args lists all keys + current values |
| /retry | Retry Ollama connection |
| /learn | Interactive guide to LLMs and local AI |
| /copy | Copy last AI response to clipboard |
| /copy-code | Copy code blocks from last response |
| /command <cmd> | Run a shell command |
| /music | Start playing music |
| /music-stop | Stop playback |
| /music- | / resume |
| /music-next | Next track |
| /music-vol <0-100> | Set volume |
| /music-lofi | Switch to lofi |
| /music-synthwave | Switch to synthwave |
| /music-list | List available tracks |
| /mcp | List configured MCP servers with status and tool counts |
| /mcp add | Interactively add a new MCP server to config |
| /mcp remove <name> | Remove an MCP server from config |
| /mcp tools <server> | List tools exposed by connected MCP servers (server optional) |
| /mcp-reload | Restart all MCP servers and re-register their tools |
| /clear | Clear conversation history |
| /exit | Exit MandoCode |
| !<cmd> | Shell escape (e.g., !git status) |
| !cd <path> | Change project root directory |

  • /setup: first-run wizard, guided. Detects Ollama, offers to install it, walks you through cloud sign-in, picks a model with hardware-aware tiers, auto-pulls a sensible default. Use when something's broken or you're a newcomer.
  • /model: quick switch. Pick a model from your pulled list and go; local picks get a context window sized to their hardware tier automatically. Use when you just want to swap models.
  • /config: adjust settings. Full configuration form covering temperature, timeouts, ignore dirs, etc. Use when you know exactly what knob you want to turn.

mandocode --doctor          # preflight check: .NET runtime, Ollama status, models, sign-in
mandocode --config show     # print current config
mandocode --config init     # create a default config file
mandocode --config set <key> <value>   # set a single value (e.g. set model qwen3.5:9b)
mandocode --config path     # show config file location

Run mandocode --doctor any time chat is misbehaving. It exits 0 if everything's green, 1 if anything's missing, with a clear summary of what's wrong.

  You type a prompt
        |
  MandoCode adds project context (@files, system prompt)
        |
  Semantic Kernel sends to Ollama (local or cloud model)
        |
  AI responds with text + function calls
        |
  File operations go through diff approval
  Web searches and fetches run directly
        |
  Rich markdown rendered in your terminal

The AI has sandboxed access to your project through a FileSystemPlugin (9 functions: list files, glob search, read, write, delete files/folders, text search, path resolution) and a WebSearchPlugin (web search via Tavily or DuckDuckGo, webpage fetching — works without any API key). All file operations are locked to your project root — path traversal is blocked.

Models with tool/function calling support work best with MandoCode. The first-run wizard offers exactly the models below — auto-pulls the cloud default, or lets you pick a local tier matched to your hardware.

Cloud (no GPU required; runs on Ollama's servers, free with ollama signin):

| Model | Notes |
| --- | --- |
| minimax-m2.7:cloud | Default, auto-pulled by /setup when you pick Cloud |

Local (fully offline, runs on your hardware):

| Model | Size | Hardware |
| --- | --- | --- |
| qwen3.5:0.8b | ~1.0 GB | CPU-only / integrated GPU: fast on any laptop, light reasoning |
| qwen3.5:2b | ~2.7 GB | Modern CPU or 4 GB+ GPU: quick Q&A, simple code edits |
| qwen3.5:4b | ~3.4 GB | Mid-range GPU (4-6 GB VRAM) or 16 GB RAM: balanced day-to-day use |
| qwen3.5:9b | ~6.6 GB | Dedicated GPU (8+ GB VRAM): best local quality, multi-file refactors |

MandoCode validates model compatibility on startup. Run /learn for a detailed guide on model sizes and hardware requirements, or /setup to switch between tiers any time.

Located at ~/.mandocode/config.json

{
  "ollamaEndpoint": "http://localhost:11434",
  "modelName": "minimax-m2.7:cloud",
  "modelPath": null,
  "temperature": 0.7,
  "maxTokens": 32768,
  "ignoreDirectories": [],
  "enableDiffApprovals": true,
  "enableTaskPlanning": true,
  "enableTokenTracking": true,
  "enableThemeCustomization": true,
  "enableFallbackFunctionParsing": true,
  "functionDeduplicationWindowSeconds": 5,
  "maxRetryAttempts": 2,
  "music": {
    "volume": 0.5,
    "genre": "lofi",
    "autoPlay": false
  }
}
| Key | Default | Description |
| --- | --- | --- |
| ollamaEndpoint | http://localhost:11434 | Ollama server URL |
| modelName | minimax-m2.7:cloud | Model to use |
| modelPath | null | Optional path to a local GGUF model file |
| temperature | 0.7 | Response creativity (0.0 = focused, 1.0 = creative) |
| maxTokens | 32768 | Cap on a single reply (NumPredict): a runaway-generation safety ceiling, not the context window. If the model announces work then stops without acting, this is too low (see Troubleshooting) |
| contextLength | 8192 | Context window (num_ctx / KV-cache size) for local models, set via OLLAMA_CONTEXT_LENGTH when MandoCode starts the Ollama daemon. 0 = leave Ollama's default (~4k). Bigger window = more VRAM. Cloud models manage context server-side |
| ignoreDirectories | [] | Additional directories to exclude from file scanning |
| enableDiffApprovals | true | Show diffs and prompt for approval before file writes/deletes |
| enableTaskPlanning | true | Enable automatic task planning for complex requests |
| enableTokenTracking | true | Show session token totals and per-response token costs |
| enableThemeCustomization | true | Detect terminal theme and apply a curated ANSI palette |
| enableFallbackFunctionParsing | true | Parse function calls from text output |
| functionDeduplicationWindowSeconds | 5 | Time window to prevent duplicate function calls |
| maxRetryAttempts | 2 | Max retry attempts for transient errors |
| music.volume | 0.5 | Music volume (0.0 - 1.0) |
| music.genre | lofi | Default genre (lofi or synthwave) |
| music.autoPlay | false | Auto-start music on launch |
mandocode --config show              # Display current configuration
mandocode --config init              # Create default configuration file
mandocode --config set <key> <value> # Set a configuration value
mandocode --config path              # Show configuration file location
mandocode --config --help            # Show help
| Variable | Overrides |
| --- | --- |
| OLLAMA_ENDPOINT | ollamaEndpoint in config |
| OLLAMA_MODEL | modelName in config |
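The override order can be pictured as a small merge step: load the config file, then let the two environment variables win. The sketch below is an illustration only, not MandoCode's actual loader; treating an empty variable as unset is an assumption.

```python
import os

# Environment variable -> config key it overrides (from the table above).
ENV_OVERRIDES = {
    "OLLAMA_ENDPOINT": "ollamaEndpoint",
    "OLLAMA_MODEL": "modelName",
}


def apply_env_overrides(config: dict, env=None) -> dict:
    """Return a copy of `config` with environment overrides applied."""
    env = os.environ if env is None else env
    resolved = dict(config)  # leave the caller's dict untouched
    for variable, key in ENV_OVERRIDES.items():
        value = env.get(variable)
        if value:  # assumption: an empty variable counts as unset
            resolved[key] = value
    return resolved


file_config = {"ollamaEndpoint": "http://localhost:11434", "modelName": "minimax-m2.7:cloud"}
print(apply_env_overrides(file_config, {"OLLAMA_MODEL": "qwen3.5:9b"}))
```

Passing `env` explicitly keeps the function easy to try out without touching your real shell environment.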

When the AI writes or deletes a file, MandoCode intercepts the operation and shows a color-coded diff before applying changes.

  • Red lines: content being removed
  • Light blue lines: content being added
  • Dim lines: unchanged context (3 lines around each change)
  • Long unchanged sections are collapsed with a summary
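For readers curious how such a diff is produced, here is a minimal sketch using Python's standard difflib with three lines of context. It labels lines instead of coloring them, and it is only an illustration of the idea; MandoCode's own .NET diff service may work differently.

```python
import difflib


def label_diff(old_text: str, new_text: str, context: int = 3) -> list[tuple[str, str]]:
    """Label each diff line as 'removed', 'added', or 'context'."""
    lines = list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(), n=context, lineterm=""
    ))
    labelled = []
    for line in lines[2:]:  # the first two lines are the ---/+++ file headers
        if line.startswith("@@"):
            continue  # hunk header; unchanged stretches between hunks are omitted
        kind = {"-": "removed", "+": "added"}.get(line[0], "context")
        labelled.append((kind, line[1:]))
    return labelled


old = "\n".join("abcdefghi")
new = old.replace("e", "E")
for kind, text in label_diff(old, new):
    print(f"{kind:>8}: {text}")
```

Only the changed line and its three neighbors on each side come back; the untouched lines at the start and end of the file are left out, which mirrors the collapsing of long unchanged sections.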

| Option | Behavior |
| --- | --- |
| Approve | Apply this change |
| Approve - Don't ask again | Auto-approve future changes to this file (per-file), or all files (global) |
| Deny | Reject the change; the AI is told it was denied |
| Provide new instructions | Redirect the AI with custom feedback |

For new files, "don't ask again" sets a global bypass — all future writes and deletes are auto-approved for the session. For existing files, the bypass is per-file.

Even when auto-approved, diffs are still rendered so you can follow along.

File deletions show all existing content as red removals with a deletion warning. The same approval options apply.

mandocode --config set diffApprovals false

Type @ anywhere in your input (after a space or at position 0) to trigger file autocomplete. A dropdown appears showing your project files, filtered as you type.

  • Type your prompt and hit @: a file dropdown appears
  • Type a partial name to filter (e.g., Conf): matches narrow down
  • Use arrow keys to navigate, Tab or Enter to select
  • The selected path is inserted (e.g., @src/MandoCode/Models/MandoCodeConfig.cs)
  • Continue typing and press Enter to submit
  • MandoCode reads the referenced file(s) and injects the content as context for the AI

explain @src/MandoCode/Services/AIService.cs to me
what does the ProcessFileReferences method do in @src/MandoCode/Components/App.razor
refactor @src/MandoCode/Models/Messages.cs to use fewer spinners

Multiple @ references in one prompt are supported. Files over 10,000 characters are automatically truncated.
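The two rules described here (an @ only counts at position 0 or after whitespace, and attached content is cut at 10,000 characters) are easy to sketch. The regex and helper names below are illustrative assumptions, not MandoCode's implementation.

```python
import re

MAX_CHARS = 10_000  # truncation limit stated in the text

# "@" starts a reference only at position 0 or right after whitespace,
# so an email address such as user@example.com is not picked up.
REFERENCE = re.compile(r"(?:^|(?<=\s))@(\S+)")


def find_references(prompt: str) -> list[str]:
    """Return every @path mentioned in the prompt, in order."""
    return REFERENCE.findall(prompt)


def truncate(content: str, limit: int = MAX_CHARS) -> str:
    """Cut file content down to the attachment limit."""
    return content if len(content) <= limit else content[:limit]


print(find_references("explain @src/MandoCode/Services/AIService.cs to me"))
print(find_references("mail user@example.com about @README.md"))
```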

| Key | Action |
| --- | --- |
| @ | Open file dropdown |
| Type | Filter files by name |
| Up/Down | Navigate dropdown |
| Tab/Enter | Insert selected file path (does not submit) |
| Escape | Close dropdown, keep text |
| Backspace | Re-filter, or close if you delete past @ |

MandoCode automatically detects complex requests and offers to break them into a step-by-step plan before execution.

The planner activates for requests like:

  • Create a REST API service with authentication and rate limiting for the user module (12+ words with imperative verb and scope indicator)
  • Build an application that handles user registration and sends email confirmations
  • Numbered lists with 3+ items
  • Requests over 400 characters

Simple questions, short prompts, and single-action operations (delete, remove, read, show, list, find, search, rename) bypass planning automatically.
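The stated triggers can be approximated in a few lines. This is a rough sketch under loud assumptions: the verb and scope word lists are invented for illustration (the text only says "imperative verb" and "scope indicator"), and the order in which rules are checked is a guess.

```python
import re

# Hypothetical word lists, chosen only to make the example work.
IMPERATIVE_VERBS = {"create", "build", "implement", "add", "refactor"}
SCOPE_WORDS = {"service", "application", "module", "api", "system"}
# Single-action verbs that the text says bypass planning.
BYPASS_VERBS = {"delete", "remove", "read", "show", "list", "find", "search", "rename"}


def should_plan(prompt: str) -> bool:
    """Guess whether a prompt is complex enough to deserve a plan."""
    text = prompt.strip()
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words:
        return False
    numbered_items = re.findall(r"^\s*\d+[.)]\s+", text, flags=re.MULTILINE)
    if len(numbered_items) >= 3 or len(text) > 400:
        return True
    if words[0] in BYPASS_VERBS:
        return False
    has_verb = any(word in IMPERATIVE_VERBS for word in words)
    has_scope = any(word in SCOPE_WORDS for word in words)
    return len(words) >= 12 and has_verb and has_scope


print(should_plan("Create a REST API service with authentication and rate limiting for the user module"))
print(should_plan("delete the old log file"))
```

Note that the second example request in the list above is only 11 words long, so the real detector presumably weighs more signals than this sketch does.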

  • Detection: heuristics identify complex requests
  • Plan generation: AI creates numbered steps
  • User approval: review the plan table, then choose: execute, skip planning, or cancel
  • Step-by-step execution: each step runs with progress tracking
  • Error handling: skip failed steps or cancel the entire plan

See Task Planner Documentation for full technical details.

The /learn command helps new users understand local LLMs and get set up.

| Scenario | What happens |
| --- | --- |
| Startup, no Ollama detected | Automatically displays the educational guide instead of a bare error |
| /learn typed, no model running | Displays the static educational guide |
| /learn typed, model is running | Shows the guide, then offers to enter AI educator chat mode |

  • What are Open-Weight LLMs? Free, private, offline models vs. cloud AI
  • Model Sizes & Hardware: parameters, quantization, VRAM requirements
  • Cloud vs Local Models: Ollama cloud models (no GPU) vs local models
  • Recommended Models: table of cloud and local options
  • Getting Started: step-by-step setup instructions

When Ollama is running, /learn offers an interactive chat mode where the AI explains LLM concepts using beginner-friendly language. Type /clear to return to normal mode.

MandoCode speaks the Model Context Protocol as a client, which means you can plug in any published MCP server — filesystem, database, GitHub, Linear, Slack, whatever — and its tools show up to the model alongside MandoCode's built-in plugins.

Two ways:

  • /mcp add inside MandoCode: an interactive wizard that prompts through name, transport, URL/command, and optional headers/env vars, previews the JSON, and saves + reloads automatically.
  • Hand-edit ~/.mandocode/config.json: useful when copy-pasting an mcpServers block from a server's README. Run /mcp-reload after saving.

The mcpServers block mirrors Claude Desktop's schema, so you can copy-paste any server's README installation snippet directly into ~/.mandocode/config.json:

{
  "enableMcp": true,
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allow"]
    },
    "solana": {
      "url": "https://mcp.solana.com/mcp",
      "transport": "http"
    },
    "github": {
      "url": "https://api.githubcopilot.com/mcp/",
      "headers": { "Authorization": "Bearer ghp_your_token_here" },
      "autoApprove": ["list_issues", "get_pr"]
    }
  }
}

  • stdio: for local servers. Populate command + args + optional env. Works with any server published as an npm/pip/go binary.
  • HTTP / SSE: for remote servers. Populate url; the client auto-detects Streamable HTTP or SSE. Custom headers go in headers, most commonly Authorization: Bearer … for servers that accept static tokens.

No. MandoCode itself is pure .NET. But individual servers may need whatever runtime their command points at: Node for npx, Python for uvx, or nothing extra for standalone binaries. Same situation as Claude Desktop, Cursor, and VS Code.

Native OAuth is not in this release. For servers that require an OAuth flow (some hosted connectors like Google Drive), wrap them in stdio via the community mcp-remote proxy, which handles the browser dance itself:

"gdrive": {
  "command": "npx",
  "args": ["mcp-remote", "https://example.com/mcp"]
}

MandoCode cannot tell a read-only MCP tool from a destructive one by inspecting arguments, so the first call of each (server, tool) pair prompts you with Approve / Approve for session / Deny. Pre-trusted tools can be listed under autoApprove in a server's config entry to skip the prompt entirely.
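The approval flow amounts to a small gate keyed on (server, tool). The sketch below is one plausible reading, not MandoCode's code: it assumes a plain "Approve" covers a single call while "Approve for session" is remembered, and the class name, method names, and the delete_repo tool are hypothetical.

```python
from typing import Callable, Dict, Optional, Set, Tuple


class ApprovalGate:
    """Tracks which (server, tool) pairs may run without prompting."""

    def __init__(self, auto_approve: Optional[Dict[str, Set[str]]] = None):
        self.auto_approve = auto_approve or {}  # from each server's autoApprove list
        self.session_approved: Set[Tuple[str, str]] = set()

    def allow(self, server: str, tool: str, ask: Callable[[str, str], str]) -> bool:
        """Return True if the call may proceed; `ask` returns 'approve', 'session', or 'deny'."""
        if tool in self.auto_approve.get(server, set()):
            return True  # pre-trusted in config, never prompts
        if (server, tool) in self.session_approved:
            return True  # approved earlier for the whole session
        answer = ask(server, tool)
        if answer == "session":
            self.session_approved.add((server, tool))
        return answer in ("approve", "session")


gate = ApprovalGate({"github": {"list_issues", "get_pr"}})
print(gate.allow("github", "list_issues", ask=lambda s, t: "deny"))  # auto-approved, ask is never called
print(gate.allow("github", "delete_repo", ask=lambda s, t: "deny"))  # hypothetical tool, prompt says no
```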

  • /mcp: shows each configured server with its transport, connection status, and live tool count
  • /mcp add: interactive wizard for adding a new server without hand-editing JSON
  • /mcp remove <name>: remove a server from config (with confirm)
  • /mcp tools <server>: list every tool exposed by connected servers with descriptions (server arg optional; omit to list all)
  • /mcp-reload: tears down every MCP client, restarts them, and re-registers their tools on the kernel (useful when you edit the config mid-session)

mandocode --config set mcp false   # disable all MCP integration

Individual servers can be muted without deleting them: set "disabled": true on any entry in mcpServers.

The AI has sandboxed access to your project directory through these functions:

| Function | Description |
| --- | --- |
| list_all_project_files() | Recursively lists all project files, excluding ignored directories |
| list_files_match_glob_pattern(pattern) | Lists files matching a glob pattern (*.cs, src/**/*.ts) |
| read_file_contents(relativePath, startLine?, endLine?) | Reads file content with line count; large files page via startLine/endLine, and truncated output names the exact line to resume from |
| write_file(relativePath, content) | Writes/creates a file (creates directories as needed) |
| delete_file(relativePath) | Deletes a file |
| create_folder(relativePath) | Creates a new directory |
| delete_folder(relativePath) | Deletes a directory and all its contents |
| search_text_in_files(pattern, searchText) | Searches file contents for text, returns paths and line numbers |
| get_absolute_path(relativePath) | Converts a relative path to absolute |

Security: All operations are sandboxed to the project root. Path traversal is blocked with a separator-boundary check.
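A separator-boundary check matters because a plain prefix test would accept a sibling directory such as /proj-evil when the root is /proj. The sketch below shows the general technique in Python; it is an illustration of the idea, and the function name is hypothetical rather than taken from FileSystemPlugin.

```python
import os


def resolve_inside_root(root: str, relative_path: str) -> str:
    """Resolve a path and refuse anything that lands outside the project root."""
    root_abs = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(root_abs, relative_path))
    # Compare against the root plus a trailing separator so that
    # "/proj-evil" is not mistaken for a child of "/proj".
    boundary = root_abs.rstrip(os.sep) + os.sep
    if candidate != root_abs and not candidate.startswith(boundary):
        raise PermissionError(f"path escapes project root: {relative_path}")
    return candidate


project = os.path.join(os.getcwd(), "proj")
print(resolve_inside_root(project, "src/a.cs"))
for attempt in ("../outside.txt", "../proj-evil/x"):
    try:
        resolve_inside_root(project, attempt)
    except PermissionError as error:
        print("blocked:", error)
```

Resolving with realpath first means ".." segments and symlinks are collapsed before the comparison, so the check runs on the final location rather than on the string the model supplied.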

Ignored directories: .git, node_modules, bin, obj, .vs, .vscode, packages, dist, build, __pycache__, .idea, plus any custom directories from your config.

The AI can search the web and fetch page content — no API keys required.

| Function | Description |
| --- | --- |
| search_web(query, maxResults) | Searches the web and returns titles, URLs, and snippets (1–10 results) |
| fetch_webpage(url, maxCharacters) | Fetches a URL and extracts readable text content (500–15,000 chars) |

Out of the box, search uses DuckDuckGo's free HTML endpoint — which rate-limits and temporarily blocks IPs under heavy agentic use, so searches can randomly fail. For reliable, AI-optimized search, add a free Tavily API key (free tier ~1,000 searches/month):

/config set tavilyKey tvly-...        # in-app — verifies the key live against Tavily
mandocode --config set tavilyKey tvly-...   # or from the CLI

With a key set, search_web prefers Tavily and keeps DuckDuckGo as the fallback; clear it anytime with /config set tavilyKey clear. The key is stored locally in ~/.mandocode/config.json and only ever sent to Tavily. Set the TAVILY_API_KEY environment variable instead if you'd rather keep it out of the file. Fetched pages are cleaned of scripts, nav, and non-content elements via HtmlAgilityPack.

Transient errors (HTTP failures, timeouts, socket errors) are retried with exponential backoff:

Attempt 1 -> fail -> wait 500ms
Attempt 2 -> fail -> wait 1000ms
Attempt 3 -> fail -> throw
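The schedule above (500 ms, then 1000 ms, then give up) is plain exponential backoff. Here is a minimal sketch of that pattern; the function name, the choice of exception types, and the reading of three total attempts as one try plus two retries are assumptions made for illustration.

```python
import time


def retry_with_backoff(operation, max_attempts=3, base_delay=0.5,
                       sleep=time.sleep, transient=(ConnectionError, TimeoutError)):
    """Run `operation`, retrying transient failures with doubling delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except transient:
            if attempt == max_attempts:
                raise  # final attempt failed: surface the error
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5 s, then 1.0 s


delays = []  # record the waits instead of actually sleeping


def always_fails():
    raise ConnectionError("Ollama not reachable")


try:
    retry_with_backoff(always_fails, sleep=delays.append)
except ConnectionError:
    print("gave up after waits of", delays)
```

Injecting `sleep` keeps the example instant to run; in real use the default `time.sleep` applies the actual pauses.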
| Operation | Window | Matching |
| --- | --- | --- |
| Read operations | 2 seconds | Function name + arguments |
| Write operations | 5 seconds (configurable) | Function name + path + content hash (SHA256) |
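Write deduplication as described (same function, same path, same content hash within a time window) can be sketched as a small lookup table. The class below is a hypothetical illustration; refreshing the timestamp on every call is an assumption, and the real service may track calls differently.

```python
import hashlib
import time


class WriteDeduplicator:
    """Flags a write as duplicate if an identical one was seen within the window."""

    def __init__(self, window_seconds: float = 5.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self.last_seen = {}  # (function, path, sha256) -> time of last call

    def is_duplicate(self, function: str, path: str, content: str) -> bool:
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        key = (function, path, digest)
        now = self.clock()
        previous = self.last_seen.get(key)
        self.last_seen[key] = now  # assumption: every call refreshes the timestamp
        return previous is not None and now - previous < self.window


dedup = WriteDeduplicator(window_seconds=5.0)
print(dedup.is_duplicate("write_file", "src/App.cs", "class App {}"))  # first call
print(dedup.is_duplicate("write_file", "src/App.cs", "class App {}"))  # immediate repeat
```

Hashing the content means a second write to the same path with different text is treated as a new call rather than a repeat.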

Some local models output function calls as JSON text instead of proper tool calls. MandoCode detects and parses:

  • Standard: {"name": "func", "parameters": {...}}

  • OpenAI-style: {"function_call": {"name": "func", "arguments": {...}}}

  • Tool calls: {"tool_calls": [{"function": {"name": "func", "arguments": {...}}}]}
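A parser for these three shapes can be quite small. The sketch below assumes the model's whole reply is a single well-formed JSON object in one of the listed forms; it is an illustration of the idea, not MandoCode's parser, which likely handles messier output.

```python
import json


def extract_tool_calls(text: str) -> list[tuple[str, dict]]:
    """Return (function name, arguments) pairs found in a JSON reply, or []."""
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return []  # ordinary prose, not a tool call
    if not isinstance(payload, dict):
        return []
    if "tool_calls" in payload:  # tool-calls shape
        functions = [call.get("function", {}) for call in payload["tool_calls"]]
        return [(f["name"], f.get("arguments", {})) for f in functions if "name" in f]
    if "function_call" in payload:  # OpenAI-style shape
        call = payload["function_call"]
        return [(call["name"], call.get("arguments", {}))]
    if "name" in payload:  # standard shape
        return [(payload["name"], payload.get("parameters", {}))]
    return []


print(extract_tool_calls('{"name": "read_file_contents", "parameters": {"relativePath": "a.cs"}}'))
print(extract_tool_calls("Sure, I'll read that file for you."))
```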

AI responses are rendered as rich terminal output:

| Markdown | Rendered as |
| --- | --- |
| **bold** | Bold text |
| *italic* | Italic text |
| `code` | Cyan highlighted |
| Fenced code blocks | Bordered panels with syntax highlighting |
| Tables | Spectre.Console table widgets |
| # Headers | Bold yellow with horizontal rules |
| - lists | Indented bullet points |
| > quotes | Grey-bordered block quotes |
| URLs | Clickable OSC 8 hyperlinks |

Syntax highlighting supports C#, Python, JavaScript/TypeScript, and Bash with language-specific keyword coloring.

  • Per-response: [~1.2k in, 847 out] after each AI response
  • Session total: Total [4.2k tokens] above the prompt
  • File estimates: @file attachments show estimated token cost (chars/4)
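The chars/4 estimate and the compact "1.2k" style of the counters are simple enough to reproduce. In the sketch below, rounding down and switching to a "k" suffix at 1,000 are assumptions inferred from the sample values shown above.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate for an attachment: one token per four characters."""
    return len(text) // 4  # assumption: round down


def format_count(tokens: int) -> str:
    """Format a token count the way the examples read: 847, 1.2k, 4.2k."""
    if tokens < 1000:
        return str(tokens)
    return f"{tokens / 1000:.1f}k"


attachment = "x" * 4800  # stand-in for a file's contents
print(f"~{format_count(estimate_tokens(attachment))} tokens")
```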

Function executions use semaphore-based signaling, ensuring each task plan step fully completes before the next begins.

src/MandoCode/
  Components/        Razor UI (App, Banner, HelpDisplay, ConfigMenu, Prompt)
  Services/          Core logic (AI, markdown, syntax, tokens, music, diffs, input state machine)
  Models/            Data models, config, system prompts, educational content
  Plugins/           Semantic Kernel plugins (FileSystem, WebSearch)
  Audio/             Bundled lofi and synthwave MP3 tracks
  docs/              Feature and architecture documentation
  Program.cs         Entry point and DI registration
| Package | Version |
| --- | --- |
| Ollama Connector | 1.72.0-alpha |
| RazorConsole.Core | 0.5.0-alpha |
| Markdig | 1.0.0 |
| NAudio | 2.2.1 |
| HtmlAgilityPack | 1.11.72 |
| FileSystemGlobbing | 10.0.3 |

Most AI coding agents in the wild are built with Python, Rust, or TypeScript. .NET rarely gets mentioned, but it should.

Semantic Kernel is Microsoft's open-source SDK for building AI agents, and it's one of the most capable orchestration frameworks available: native plugin systems, function calling, structured planning, and first-class support for local models through connectors like Ollama. It runs cross-platform on Windows, Linux, and macOS.

MandoCode exists partly to prove the point: you can build a full-featured, agentic CLI tool on .NET and Semantic Kernel that stands alongside anything built in other ecosystems. The tooling is there. It's open source. It just doesn't get the attention it deserves.
