The landing page shows the highlights. This is the complete list — code intelligence, the dashboard, and the security model that keeps all of it on infrastructure you control.
Code intelligence
More than keyword search: the retrieval engine knows how your code connects, keeps its answers honest as the code changes, and plugs into the AI tools you already use.
A symbol graph links definitions to their references, so "who calls this function?" is answered from the code's actual structure — not just text that looks similar.
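As a rough illustration of the idea, here is a minimal sketch of a definition-to-caller graph built with Python's standard `ast` module. It is not SourceVault's implementation: the function name `build_call_graph` and the sample source are hypothetical, and the sketch only resolves plain-name calls within a single file.

```python
import ast
from collections import defaultdict


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each called function name to the set of functions that call it.

    Hypothetical helper for illustration: handles plain-name calls only
    (no methods, imports, or cross-file resolution).
    """
    tree = ast.parse(source)
    callers: dict[str, set[str]] = defaultdict(set)
    for func in ast.walk(tree):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # Every call found inside this function body is an edge callee -> caller.
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                callers[node.func.id].add(func.name)
    return dict(callers)


SAMPLE = '''
def hash_password(pw):
    return pw[::-1]

def register(user, pw):
    return hash_password(pw)

def login(user, pw):
    return hash_password(pw) == user
'''

graph = build_call_graph(SAMPLE)
print(sorted(graph["hash_password"]))  # ['login', 'register']
```

A text search for "hash_password" would also match comments and strings; walking the syntax tree returns only real call sites, which is the distinction the symbol graph is drawing.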
If a cited file changed since it was indexed, the citation says so. The answer never quietly points you at lines that moved.
Only changed files re-embed, so keeping a large repository current is fast — and the nightly refresh stays cheap.
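Both stale-citation detection and incremental re-embedding can be built on one primitive: a content hash stored at index time. The sketch below assumes SHA-256 over file bytes and an in-memory dictionary as the index record; the function names and sample data are hypothetical, not the product's actual storage format.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Fingerprint of a file's bytes, recorded when the file is indexed."""
    return hashlib.sha256(data).hexdigest()


def changed_paths(indexed: dict[str, str], current: dict[str, bytes]) -> list[str]:
    """Paths that are new, or whose content differs from the indexed hash.

    Only these would need re-embedding.
    """
    return sorted(
        path
        for path, data in current.items()
        if indexed.get(path) != content_hash(data)
    )


def citation_is_stale(indexed: dict[str, str], path: str, data: bytes) -> bool:
    """True if the cited file no longer matches what was indexed."""
    return indexed.get(path) != content_hash(data)


# Hypothetical index state versus the working copy.
indexed = {
    "auth.py": content_hash(b"def login(): ..."),
    "db.py": content_hash(b"def connect(): ..."),
}
current = {
    "auth.py": b"def login(): ...",
    "db.py": b"def connect(timeout): ...",
    "new.py": b"x = 1",
}

print(changed_paths(indexed, current))  # ['db.py', 'new.py']
print(citation_is_stale(indexed, "db.py", current["db.py"]))  # True
```

Hashing content rather than comparing modification times means a checkout that touches a file without changing it does not trigger a re-embed.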
Rate answers, and a built-in eval harness replays them after any model or index change and reports drift. This is the backbone of ongoing retrieval tuning.
Ask one question across every indexed repository at once, with each citation tagged by repo — "how do the frontend and backend handle this?" in a single answer. Standard from Pro up.
Commit messages index alongside code, so "why was this changed?" and "when did this break?" get grounded answers — coverage a cloud indexer never sees, because it never sees your history.
A generated repo map and per-module summaries answer "how does auth work overall" with an overview instead of fragments — written by your local model, from your code, on your machine.
A local reranker re-reads the top candidates against your actual question before answering, a sensible default at sub-second cost. The measured file-hit jump on our Express benchmark came from a separate retrieval fix, not the reranker; the reranker's isolated effect was within noise on 30 questions. We keep it on by default and are re-measuring its benefit on a larger repository.
An MCP server exposes the same engine to Claude Code, OpenClaw, and any MCP client — bounded, cited context instead of re-reading files. The server itself is local-only: pair it with a local-model client and the loop stays zero-egress end to end; a cloud-backed client sends what it retrieves to its own vendor, by your choice.
The dashboard
Everything ships with a browser dashboard — connect your source control platforms, manage repositories and models, and ask questions about your code without touching a terminal.
Sign in to GitHub, GitLab, or Bitbucket once. Browse and autocomplete your repositories as you type, and clone private repos without per-clone credentials.
Repositories index automatically on import. Update, sync, or switch branches per repo — and a stale index is one click from fresh.
Literal and semantic search with file-type filters, plus Ask mode for grounded answers where every citation clicks open to its source. History and archive are built in.
Citations and search results open the full file in a syntax-highlighted viewer — cited lines marked and scrolled into view, 15 languages, selectable light and dark code themes.
Pull, select, and uninstall Ollama models from the UI. The embedding model that powers search is protected from accidental removal.
Pin a question as a standing check. After every reindex the question is re-asked automatically and you are alerted when the cited answer drifts, so "did the auth flow change this sprint?" answers itself.
Pin good answers and export them as a markdown knowledge base generated from your own code — every claim keeping its file-and-line citations.
Background polling keeps status, repositories, and models current without refresh buttons, and the health indicator flashes the moment anything needs attention.
Trust layer
The privacy model is architectural, not a policy: nothing leaves the machine, and every control below is there to keep it that way.
Embeddings run on local Ollama, vectors live in local ChromaDB, and answers come from local models.
Token-guarded sessions with a one-click Lock, a strict content-security policy, and a loopback guard keep the control plane local unless you explicitly unlock it.
Access tokens are generated server-side and rotated from the UI in one click — nobody types or chooses a credential, and rotation signs every other session out instantly.
.env files, lockfiles, and dependency directories are excluded automatically so credentials never become searchable vectors.
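An exclusion rule of this kind can be sketched as a path filter that runs before anything is embedded. The specific directory and lockfile names below are assumptions chosen for illustration; the actual exclusion list is not given here.

```python
from pathlib import PurePosixPath

# Hypothetical exclusion lists, for illustration only.
EXCLUDED_DIRS = {"node_modules", "vendor", ".venv", ".git"}
EXCLUDED_NAMES = {"package-lock.json", "yarn.lock", "poetry.lock", "Cargo.lock"}


def is_indexable(rel_path: str) -> bool:
    """Return False for paths that should never be embedded."""
    path = PurePosixPath(rel_path)
    # Skip anything that lives under a dependency directory.
    if any(part in EXCLUDED_DIRS for part in path.parts[:-1]):
        return False
    name = path.name
    # Skip .env and variants such as .env.production.
    if name == ".env" or name.startswith(".env."):
        return False
    return name not in EXCLUDED_NAMES


for candidate in [".env", "config/.env.production", "web/node_modules/react/index.js",
                  "package-lock.json", "src/auth/login.py"]:
    print(candidate, is_indexable(candidate))
```

Filtering at the path level, before any file is read, means an excluded file's contents never reach the embedding model at all.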
The retrieval engine is built natively on two auditable local services — ChromaDB for vectors, Ollama for models. No LangChain-style orchestration layer in between: a smaller attack surface, every line yours to audit.
Search, file read, and task endpoints require shared-secret signatures, with separate secrets so one leak does not expose the whole stack.
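One common way to implement shared-secret signatures is HMAC. The sketch below assumes HMAC-SHA256 and one secret per endpoint family; the algorithm, endpoint names, and secret values are all assumptions for illustration, since the exact scheme is not specified here.

```python
import hashlib
import hmac

# Hypothetical per-endpoint secrets; real values would be generated, not hard-coded.
SECRETS = {
    "search": b"example-search-secret",
    "file_read": b"example-file-read-secret",
    "task": b"example-task-secret",
}


def sign(endpoint: str, body: bytes) -> str:
    """Sign a request body with the secret that belongs to this endpoint."""
    return hmac.new(SECRETS[endpoint], body, hashlib.sha256).hexdigest()


def verify(endpoint: str, body: bytes, signature: str) -> bool:
    """Constant-time check that the signature matches this endpoint's secret."""
    return hmac.compare_digest(sign(endpoint, body), signature)


body = b'{"query": "who calls login?"}'
signature = sign("search", body)

print(verify("search", body, signature))     # True
print(verify("file_read", body, signature))  # False: a different secret
```

Because each endpoint verifies against its own secret, a leaked search secret cannot be used to forge a file-read or task request.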
Path-escape and symlink checks keep every read inside its repository, while file allowlists block binaries and unknown formats.
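A containment check along these lines can be sketched with `pathlib` (Python 3.9+). Resolving the path first follows symlinks and collapses `..` segments, so both escape routes are caught by one comparison. The function name and the allowlist contents are hypothetical.

```python
from pathlib import Path

# Hypothetical allowlist of readable file types, for illustration only.
ALLOWED_SUFFIXES = {".py", ".ts", ".go", ".md"}


def resolve_inside(repo_root: Path, requested: str) -> Path:
    """Resolve a requested path, refusing anything outside the repository."""
    root = repo_root.resolve()
    # resolve() follows symlinks and normalizes "..", so the comparison below
    # sees where the path really points.
    target = (root / requested).resolve()
    if not target.is_relative_to(root):
        raise PermissionError(f"{requested!r} escapes the repository")
    if target.suffix not in ALLOWED_SUFFIXES:
        raise PermissionError(f"{requested!r} is not an allowed file type")
    return target
```

A request such as `resolve_inside(repo, "../secret.md")` raises `PermissionError`, as does a symlink inside the repository that points to a file outside it.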
Removing a repository requires typed confirmation and cleans up the working copy, vectors, and metadata.