Configure providers, vector stores, file filters, tuning, and status checks for semantic code search.
Codebase indexing is back—here’s how to set it up.
This is the practical companion to the launch post. The launch post covers why indexing matters. This guide covers the mechanics: providers, vector stores, enablement scope, tuning, file filters, and verification.
The main rule: provider configuration is not enablement. You can add API keys and model settings without indexing anything. Kilo starts indexing only after you turn it on globally or for the current project.
What you need before you start #
Codebase indexing needs two pieces:
an embedding provider, which turns code chunks into vectors
a vector store, which saves those vectors and lets Kilo search them later
Kilo parses code locally with Tree-sitter, chunks it into semantic blocks like functions, classes, and methods, embeds those chunks, and stores the vectors. Once the index is ready, Kilo can use the semantic_search tool to answer conceptual questions like “where do we validate customer identity?” or “find the retry logic for failed API calls.”
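To make the search step concrete, here is a toy sketch of ranking stored chunks against a query by vector similarity. It is not Kilo's implementation: the three-dimensional vectors, the chunk labels, and the choice of cosine similarity are all assumptions made for illustration. Real embeddings come from the configured provider and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical index: chunk label -> embedding vector (made-up values).
index = {
    "auth/identity.ts: validateCustomer()": [0.9, 0.1, 0.0],
    "http/retry.ts: retryWithBackoff()": [0.1, 0.9, 0.2],
    "ui/theme.ts: applyTheme()": [0.0, 0.2, 0.9],
}

def search(query_vector, index):
    """Score every stored chunk against the query and return them best first."""
    scored = [(cosine_similarity(query_vector, vec), chunk) for chunk, vec in index.items()]
    return sorted(scored, reverse=True)

# Stand-in for the embedding of "find the retry logic for failed API calls".
results = search([0.2, 0.8, 0.1], index)
best_score, best_chunk = results[0]
```

The query never has to share a keyword with the code it finds. The match comes from the vectors pointing in a similar direction, which is why conceptual questions work where grep does not.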
This guide follows the current Kilo Code codebase indexing docs. Where the public docs do not document a config shape, this guide does not invent one.
Start in the VS Code settings UI #
For most users, the settings UI is the safest first path. This assumes you already have a vector store ready. If not, follow the “Choose a vector store” section below first.
Open Kilo Code in VS Code.
Go to Kilo Code Settings → Indexing.
Turn on Global Enable or Enable for This Project.
Choose an embedding provider.
Choose a vector store: Qdrant or LanceDB.
Adjust tuning only if you need to.
Save, then watch the indexing status indicator in the prompt input panel.
You can also click the indexing indicator at the bottom of the prompt input panel to open indexing setup.
The statuses are:
Disabled: indexing is off or not configured.
Initializing: indexing is starting up.
Standby: indexing is configured but not currently processing files.
In Progress: Kilo is scanning, chunking, embedding, or storing files. The UI shows progress (e.g. Indexed 123 / 250 files (49%)).
Complete: the index is up to date and ready for semantic search.
Error: indexing failed. Check the error message, provider credentials, and vector store connection.
Do not skip this check. An API key in config does not mean the repo has been indexed.
Project-level vs. global enablement #
Kilo has two enablement scopes.
Use Enabled Globally when you want Kilo to index every workspace you open, using your global indexing defaults.
Use Enable for this project when you want indexing only for the current repo. This is usually the better first test, especially for a large codebase or a hosted embedding provider.
The config shape is the same either way:
{
  "indexing": {
    "enabled": true
  }
}
The path determines the scope:
Global config: ~/.config/kilo/kilo.jsonc
Project config: ./kilo.jsonc in the repo
Use the global file for defaults you want everywhere. Use the project file when a repo needs its own provider, vector store, file filters, or tuning.
Again: setting provider, model, or an API key does not start indexing. indexing.enabled must be true at the scope you intend.
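The rule can be written as a small check. This is a minimal sketch that assumes the config file has already been parsed into a Python dict (stripping JSONC comments is not shown), and it looks at one file at a time. It does not model how Kilo combines global and project config. The function name is hypothetical.

```python
def indexing_enabled(config):
    """Return True only when indexing.enabled is literally true in this config."""
    indexing = config.get("indexing")
    if not isinstance(indexing, dict):
        return False
    return indexing.get("enabled") is True

# Provider settings alone: credentials are present, but nothing gets indexed.
provider_only = {"indexing": {"provider": "mistral", "mistral": {"apiKey": "<key>"}}}

# The same settings with the switch turned on.
switched_on = {"indexing": {"enabled": True, "provider": "mistral", "mistral": {"apiKey": "<key>"}}}
```

Running the check against each file separately tells you which scope, if any, has indexing turned on.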
Path 1: Kilo Gateway users #
If you already use Kilo tokens, check Kilo Code Settings → Indexing first. The current public indexing docs list supported embedding providers and provider config keys, but they do not currently document a Kilo Gateway-specific indexing provider or a Gateway embeddings endpoint.
That matters because provider is not a display label. It is a config key Kilo uses to load the provider. Do not guess a key like kilo-gateway unless your installed Kilo Code build writes it for you.
If your build shows Kilo Gateway as an embedding option, use the UI and let Kilo write the provider shape. A guarded example looks like this:
{
  "indexing": {
    "enabled": true,
    // Use the provider key written by your installed Kilo Code build.
    // The current public indexing docs do not document a Kilo Gateway
    // embedding provider key, so do not hand-write one from memory.
    "provider": "<kilo-gateway-provider-from-ui>",
    "model": "<embedding-model-from-ui>",
    "vectorStore": "lancedb",
    "lancedb": {}
  }
}
If the UI does not show Kilo Gateway for embeddings, use one of the documented direct provider paths below. The Gateway docs confirm Kilo’s OpenAI-compatible Gateway for chat, FIM, model listing, and provider listing. The indexing docs are the source of truth for indexing embedding providers.
Path 2: Mistral BYOK #
Use Mistral BYOK when you want to bring a Mistral API key from La Plateforme.
In the UI:
Open Kilo Code Settings → Indexing.
Enable indexing globally or for this project.
Choose mistral as the embedding provider.
Paste your Mistral API key.
Choose Qdrant or LanceDB.
Save.
The docs call out one easy mistake: Codestral-specific keys from the Mistral autocomplete setup guide are not interchangeable with regular Mistral API keys for indexing. Use an API key from La Plateforme.
Minimal Mistral BYOK with LanceDB:
{
  "indexing": {
    "enabled": true,
    "provider": "mistral",
    "model": "",
    "vectorStore": "lancedb",
    "mistral": {
      "apiKey": "<MISTRAL_API_KEY_FROM_LA_PLATEFORME>"
    },
    "lancedb": {}
  }
}
Leave model unset if you want the provider default. Set it only when you have a specific Mistral embedding model you want Kilo to use.
Path 3: Ollama + LanceDB for fully local indexing #
Use Ollama + LanceDB when you do not want indexing data to leave your machine.
Ollama runs the embedding model locally. LanceDB is embedded and file-based, so there is no vector database server to run. With this setup, parsing, embedding, and vector storage all happen locally.
Install and start Ollama, then pull an embedding model. The indexing docs list mxbai-embed-large, nomic-embed-text, and all-minilm as Ollama options.
ollama pull nomic-embed-text
Then configure Kilo:
{
  "indexing": {
    "enabled": true,
    "provider": "ollama",
    "model": "nomic-embed-text",
    "vectorStore": "lancedb",
    "ollama": {
      "baseUrl": "http://localhost:11434"
    },
    "lancedb": {}
  }
}
This is the simplest fully local setup: no hosted embedding API, no external vector database, and no external calls for indexing. If status moves to Error, confirm that Ollama is running, the model was pulled successfully, and baseUrl matches your local Ollama server.
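The first two checks can be scripted against Ollama's own HTTP API, which lists locally available models at /api/tags. The sketch below is a standalone helper, not part of Kilo, and the function names are hypothetical. It assumes that a model pulled without an explicit tag is listed as name:latest.

```python
import json
import urllib.request

def list_ollama_models(base_url="http://localhost:11434"):
    """Fetch the locally available models from Ollama's /api/tags endpoint."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/api/tags", timeout=5) as resp:
        return json.load(resp)

def model_is_pulled(tags, model):
    """Look for a model in a /api/tags response, treating a bare name as ':latest'."""
    wanted = model if ":" in model else f"{model}:latest"
    return any(entry.get("name") == wanted for entry in tags.get("models", []))

# Usage against a running server (raises OSError if Ollama is not reachable):
#   tags = list_ollama_models("http://localhost:11434")
#   print(model_is_pulled(tags, "nomic-embed-text"))
```

If the request fails outright, Ollama is not listening on that baseUrl. If it succeeds but the model is missing, run the pull command again.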
Path 4: OpenAI #
Use OpenAI when you want a hosted embedding model with a small config surface.
The docs list text-embedding-3-small as the default, text-embedding-3-large for higher accuracy, and text-embedding-ada-002 as legacy.
{
  "indexing": {
    "enabled": true,
    "provider": "openai",
    "model": "text-embedding-3-small",
    "vectorStore": "lancedb",
    "openai": {
      "apiKey": "<OPENAI_API_KEY>"
    },
    "lancedb": {}
  }
}
If you see rate-limit or batch errors during indexing, lower embeddingBatchSize before changing providers.
Other direct providers #
Kilo also supports these direct embedding provider shapes. These examples are intentionally brief: use them when you already know which provider and embedding model you want.
OpenAI-compatible endpoint:
{
  "indexing": {
    "provider": "openai-compatible",
    "model": "<embedding-model>",
    "openai-compatible": {
      "baseUrl": "https://...",
      "apiKey": "..."
    }
  }
}
Gemini:
{
  "indexing": {
    "provider": "gemini",
    "model": "<embedding-model>",
    "gemini": {
      "apiKey": "..."
    }
  }
}
Vercel AI Gateway:
{
  "indexing": {
    "provider": "vercel-ai-gateway",
    "model": "<embedding-model>",
    "vercel-ai-gateway": {
      "apiKey": "..."
    }
  }
}
AWS Bedrock:
{
  "indexing": {
    "provider": "bedrock",
    "model": "<embedding-model>",
    "bedrock": {
      "region": "us-east-1",
      "profile": "default"
    }
  }
}
OpenRouter:
{
  "indexing": {
    "provider": "openrouter",
    "model": "<embedding-model>",
    "openrouter": {
      "apiKey": "...",
      "specificProvider": "..."
    }
  }
}
Voyage:
{
  "indexing": {
    "provider": "voyage",
    "model": "voyage-code-3",
    "voyage": {
      "apiKey": "..."
    }
  }
}
For any of these, add enabled, vectorStore, and vector store settings when you want indexing to start:
{
  "indexing": {
    "enabled": true,
    "provider": "voyage",
    "model": "voyage-code-3",
    "vectorStore": "lancedb",
    "voyage": {
      "apiKey": "..."
    },
    "lancedb": {}
  }
}
Choose a vector store: Qdrant or LanceDB #
The vector store is where Kilo saves embeddings after it chunks your code.
Use LanceDB when you want the least moving parts. It is embedded and file-based. You do not need Docker, a server process, or a network connection. If you omit a directory, Kilo stores LanceDB data under the Kilo data directory by default.
{
  "indexing": {
    "vectorStore": "lancedb",
    "lancedb": {}
  }
}
Use Qdrant when you want an external vector database server. The docs list Qdrant as the default vector store and recommend it for larger codebases and team deployments. For production, use authentication.
Start Qdrant locally with Docker:
docker run -p 6333:6333 qdrant/qdrant
Then configure Kilo:
{
  "indexing": {
    "vectorStore": "qdrant",
    "qdrant": {
      "url": "http://localhost:6333",
      "apiKey": ""
    }
  }
}
If indexing fails with Qdrant selected, check that the server is running, the URL is reachable from Kilo, and the API key matches your Qdrant deployment.
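Those three checks can be done in one request against Qdrant's REST API. The sketch below calls GET /collections and sends the key in the api-key header, which is how Qdrant's REST API accepts it. The helper names are hypothetical, and this is a standalone check rather than anything Kilo runs for you.

```python
import json
import urllib.request

def build_qdrant_request(url, api_key=""):
    """Build a GET /collections request, adding the api-key header only if a key is set."""
    headers = {"api-key": api_key} if api_key else {}
    return urllib.request.Request(f"{url.rstrip('/')}/collections", headers=headers)

def qdrant_is_reachable(url, api_key=""):
    """Return True when Qdrant answers the request with status 'ok'."""
    try:
        with urllib.request.urlopen(build_qdrant_request(url, api_key), timeout=5) as resp:
            return json.load(resp).get("status") == "ok"
    except (OSError, ValueError):
        # Server unreachable, HTTP error such as a rejected key, or a non-JSON reply.
        return False

# Usage: qdrant_is_reachable("http://localhost:6333", api_key="")
```

Run it from the same machine and network context as Kilo. A URL that works from your laptop shell may not be reachable from a container or remote workspace.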
Configure indexing from the CLI #
The Kilo CLI includes an interactive indexing command when the indexing plugin is installed.
Open a Kilo TUI session in your repo and run:
/indexing
Aliases also work:
/index
/embedding
The dialog can toggle indexing, choose an embedding provider, set provider credentials, choose a model, set vector dimensions, choose Qdrant or LanceDB, configure vector store settings, and adjust tuning parameters. Changes are written to kilo.jsonc and take effect immediately.
A complete CLI-style config can look like this:
{
  "indexing": {
    "enabled": true,
    "provider": "voyage",
    "model": "voyage-code-3",
    "dimension": 1024,
    "vectorStore": "qdrant",
    "voyage": {
      "apiKey": "pa-..."
    },
    "qdrant": {
      "url": "http://localhost:6333",
      "apiKey": ""
    },
    "searchMinScore": 0.4,
    "searchMaxResults": 50,
    "embeddingBatchSize": 60,
    "scannerMaxBatchRetries": 3
  }
}
When indexing is enabled, the CLI shows an IDX badge at the bottom of the TUI: IDX In Progress 40% 120/300, IDX Complete, IDX Standby, or IDX Error <message>.
Tune the defaults only when you have a reason #
The defaults are a good starting point. Change them when you are solving a specific failure mode.
searchMinScore controls the minimum similarity score for returned results. The default is 0.4. Raise it if searches return too much loosely related code. Lower it if searches miss relevant results.
searchMaxResults controls how many results semantic search can return. The default is 50. Lower it if the agent receives too much context. Raise it if you are working in a large repo and relevant matches are being cut off.
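The two search settings act as a threshold and a cap. The sketch below shows that interaction on made-up data. The scores and file locations are hypothetical, and treating the threshold as inclusive is an assumption, not documented Kilo behavior.

```python
def select_results(scored, min_score=0.4, max_results=50):
    """Keep matches at or above min_score, best first, capped at max_results."""
    kept = [item for item in scored if item[0] >= min_score]
    kept.sort(key=lambda item: item[0], reverse=True)
    return kept[:max_results]

# Hypothetical (score, location) pairs returned by a vector store query.
scored = [
    (0.82, "src/auth/permissions.ts:40"),
    (0.47, "src/api/delete.ts:12"),
    (0.41, "docs/notes.md:3"),
    (0.22, "src/ui/button.tsx:8"),
]

default = select_results(scored)                    # drops only the 0.22 match
stricter = select_results(scored, min_score=0.45)   # also drops the 0.41 match
capped = select_results(scored, max_results=1)      # keeps only the best match
```

Raising the threshold removes weak matches regardless of how many there are. Lowering the cap trims the tail even when every match is strong.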
embeddingBatchSize controls how many code segments Kilo sends to the embedding provider per batch. The default is 60. Lower it if a hosted provider rate-limits you or if local embedding runs out of memory.
scannerMaxBatchRetries controls how many times Kilo retries a failed embedding batch. The default is 3. Raise it only if failures are transient and retrying is actually helping.
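The two indexing settings can be pictured as a batching loop with a retry budget. This sketch is not Kilo's scanner: the function names are hypothetical, RuntimeError stands in for a rate-limit or transient provider error, and counting the retries on top of the first attempt is an assumption about how the limit is applied.

```python
def batches(items, batch_size=60):
    """Split items into consecutive lists of at most batch_size elements."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

def embed_with_retries(batch, embed, max_retries=3):
    """Call embed(batch); after a failure, retry up to max_retries more times."""
    for attempt in range(max_retries + 1):
        try:
            return embed(batch)
        except RuntimeError:
            if attempt == max_retries:
                raise  # out of retries: surface the last error

segments = [f"segment-{i}" for i in range(130)]
sizes_default = [len(b) for b in batches(segments)]      # [60, 60, 10]
sizes_small = [len(b) for b in batches(segments, 25)]    # [25, 25, 25, 25, 25, 5]
```

A smaller batch size means more requests, each carrying less, which is why it helps with rate limits and memory pressure. More retries only help when the same batch can succeed on a later attempt.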
Example conservative hosted-provider tuning:
{
  "indexing": {
    "searchMinScore": 0.45,
    "searchMaxResults": 30,
    "embeddingBatchSize": 25,
    "scannerMaxBatchRetries": 3
  }
}
Control what gets indexed #
Kilo does not blindly embed every file in your repo.
By default, it excludes:
binary files and images
files larger than 1MB
.git directories
dependency folders such as node_modules and vendor
files ignored by .gitignore
files ignored by .kilocodeignore
Use .kilocodeignore when you want indexing-specific exclusions without changing Git ignore rules. Common examples include generated clients, build output, large snapshots, vendored SDKs, or private test fixtures that should not be sent to a hosted embedding provider.
Example .kilocodeignore:
generated/
**/*.generated.ts
fixtures/snapshots/
.env*
.local-notes/
If you are using a hosted provider, remember the privacy model: parsing happens locally, and Kilo sends small code snippets for embedding, not whole files. If nothing should leave the machine, use Ollama + LanceDB.
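To estimate what the built-in exclusions would drop before you index, the size and directory rules can be approximated in a few lines. This is a rough sketch, not Kilo's filter. The binary suffix list is illustrative, reading "1MB" as 1024 × 1024 bytes is an assumption, and .gitignore and .kilocodeignore matching is left out entirely.

```python
from pathlib import PurePosixPath

SKIPPED_DIRS = {".git", "node_modules", "vendor"}
MAX_BYTES = 1024 * 1024  # assumption: "1MB" interpreted as 1 MiB
BINARY_SUFFIXES = {".png", ".jpg", ".gif", ".ico", ".zip", ".exe"}  # illustrative only

def skipped_by_defaults(path, size_bytes):
    """Approximate the default exclusions for a repo-relative path and file size."""
    p = PurePosixPath(path)
    if SKIPPED_DIRS.intersection(p.parts[:-1]):
        return True  # the file lives inside an excluded directory
    if p.suffix.lower() in BINARY_SUFFIXES:
        return True  # looks like a binary or image file
    return size_bytes > MAX_BYTES
```

Anything this returns False for is still only a candidate. Your ignore files decide the rest.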
Verify semantic search works #
Wait for status to reach Complete. Then ask Kilo a conceptual question that would be annoying to grep for:
Where do we validate user permissions before deleting a resource?
A healthy semantic search result should include relevant snippets, file paths, line numbers, similarity scores, and enough surrounding context for the agent to decide what to read next.
If status shows Error:
Read the error message in the UI or the IDX Error badge.
Confirm indexing.enabled is true at the scope you intended.
Check provider credentials.
If using Qdrant, confirm the server is running and reachable.
If using Ollama, confirm the embedding model is pulled and Ollama is listening on the configured baseUrl.
Lower embeddingBatchSize if the provider is rate-limiting or local embedding is failing under load.
Check .kilocodeignore and .gitignore if files you expected to see are missing.
For local embedding failures involving batch or micro-batch settings, align the embedding model batch size and micro-batch size, restart the local server, and try again.
The setup checklist #
A working indexing setup has all of these pieces:
indexing.enabled is true globally or for the current project.
The embedding provider is documented and has valid credentials or a reachable local endpoint.
The model is an embedding model, not a chat-only model.
The vector store is configured and reachable.
File filters exclude the files you do not want indexed.
Status reaches Complete.
Kilo can answer a conceptual code question using semantic search.
Start with the simplest path that matches your constraints: Mistral BYOK or OpenAI for hosted embeddings, Ollama + LanceDB for fully local indexing, LanceDB when you do not want a database server, and Qdrant when you need an external vector store.
Once status is Complete, indexing is no longer a setup task. It becomes part of the agent workflow: ask Kilo for the concept, inspect the files it finds, and then make the change with the right context loaded.