Changelog

What's new in SourceVault. Everything below runs on infrastructure you control — local models, zero source-code egress.

2026-07-03 — Production hardening: quieter errors, rate limits everywhere, and a current-generation stack

Server errors keep your internals private

Error responses no longer echo raw messages, which could carry absolute file paths or upstream response bodies, and a global handler replaces the HTML stack trace Express would otherwise emit. The specifics still land in the server log, where they belong. Crash guards round it out: on an unexpected exception the process exits cleanly so the service manager can respawn it, and a failure in a background task can't take the server down.
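As a rough illustration of the idea (not SourceVault's actual handler, which this changelog does not show), the sketch below maps any thrown value to a generic response and sends the specifics to a log function. The names toPublicError, PublicError, LogFn, and requestId are hypothetical.

```typescript
// Hypothetical shapes for illustration only.
interface PublicError {
  status: number;
  body: { error: string; requestId: string };
}

type LogFn = (line: string) => void;

// Map any thrown value to a generic client-facing response.
// The message and stack go to the server log, never to the client.
function toPublicError(err: unknown, requestId: string, log: LogFn): PublicError {
  const detail = err instanceof Error ? (err.stack ?? err.message) : String(err);
  log(`[${requestId}] ${detail}`);

  // Honor an explicit 4xx status if the error carries one; anything else is a 500.
  const claimed = (err as { status?: unknown } | null)?.status;
  const status =
    typeof claimed === "number" && claimed >= 400 && claimed < 500 ? claimed : 500;

  return {
    status,
    body: {
      error: status === 500 ? "Internal server error" : "Request failed",
      requestId,
    },
  };
}
```

In an Express app, a function like this would typically be called from a four-argument error middleware (err, req, res, next) registered after all routes. The request ID lets an operator find the matching log line without exposing anything to the client.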

Every API surface is rate-limited

The signed machine API and the task relay now sit behind rate limits mounted ahead of signature checks, so requests with invalid signatures count against the limit and brute-force attempts are throttled too. Limits are tunable per install; the defaults are generous for real use.
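The ordering matters more than the limiter itself. The sketch below uses a simple fixed-window counter (an assumption; the changelog does not say which algorithm SourceVault uses) and checks it before the signature, so a request with a bad signature still spends quota. FixedWindowLimiter and handle are hypothetical names.

```typescript
// Fixed-window counter keyed by client identifier (for example, an IP address).
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly hits = new Map<string, { windowStart: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if the client is over the limit.
  allow(key: string, now: number): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: now, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }
}

// The limiter runs first, so repeated guesses at a signature hit 429
// instead of getting an unlimited number of 401 responses.
function handle(
  limiter: FixedWindowLimiter,
  clientKey: string,
  signatureValid: boolean,
  now: number,
): 200 | 401 | 429 {
  if (!limiter.allow(clientKey, now)) return 429;
  if (!signatureValid) return 401;
  return 200;
}
```

With a limit of 3 per second, the fourth bad-signature request inside the same window is rejected with 429 before any signature work happens.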

Hardened install and container

The Docker image runs as a non-root user on a digest-pinned base, and the Linux installer now downloads the script, shows its checksum, and asks for confirmation before executing it, instead of piping straight from the network into a root shell. The Homebrew path is unchanged: brew upgrade sourcevault keeps working as before, and releases now publish to the tap automatically.

A current-generation stack underneath

The server now runs on Node 24 (Active LTS) and Express 5; the dashboard builds on React 19 and Vite 8 with a zero-warning build. Dependency updates are automated weekly, so security patches don't wait for a release cycle.

2026-06-18 — Indexing at scale, live progress, and a retrieval-accuracy jump

More accurate file finding

A natural-language question that named a method (for example, "how does res.sendFile send a file?") could anchor too hard on the literal name and surface where the method is used instead of where it's defined. Retrieval now folds exact-name matches into the hybrid ranking instead of letting them short-circuit it. On the public Express benchmark, the share of answers citing the exact ground-truth file rose from 60% to 83%.
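One common way to fold several signals into a single ranking is reciprocal rank fusion, sketched below. This is an assumption for illustration; the changelog does not say which fusion method SourceVault uses, and the file paths are made up.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per document,
// so no single list can decide the final order on its own.
function fuseRankings(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((doc, index) => {
      scores.set(doc, (scores.get(doc) ?? 0) + 1 / (k + index + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([doc]) => doc);
}

// Illustrative inputs: the exact-name list puts a call site first,
// while the other two signals favor the file that defines the method.
const exactName = ["src/routes/download.ts", "src/response.ts", "test/sendFile.test.ts"];
const semantic = ["src/response.ts", "src/utils.ts", "src/routes/download.ts"];
const keyword = ["src/response.ts", "test/sendFile.test.ts"];

const fused = fuseRankings([exactName, semantic, keyword]);
console.log(fused[0]); // the defining file, src/response.ts
```

If the exact-name list short-circuited the ranking, the call site would win. Treated as one vote among three, it is outweighed by the file that the other signals agree on.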

Large repositories index reliably

Indexing now runs as a background job instead of inside the request, so a large repository no longer times out mid-run. A single unreadable or malformed file is skipped instead of aborting the whole index, and text that a strict vector store would reject is sanitized automatically. This is the kind of edge case that used to stall a big repo at 80-something percent.
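A minimal sketch of the skip-and-sanitize behavior follows. It assumes the rejected text is NUL characters and unpaired UTF-16 surrogates, which is a guess at what a strict store might refuse; sanitizeText, indexAll, and IndexResult are hypothetical names.

```typescript
// Strip NUL characters and replace unpaired surrogates with U+FFFD.
// Valid surrogate pairs (for example, emoji) are left untouched.
function sanitizeText(text: string): string {
  return text
    .replace(/\u0000/g, "")
    .replace(
      /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/g,
      "\uFFFD",
    );
}

interface IndexResult {
  indexed: string[];
  skipped: string[];
}

// Index every file that can be read and stored; record failures and keep going.
function indexAll(
  paths: string[],
  read: (path: string) => string, // may throw on an unreadable file
  store: (path: string, text: string) => void, // may throw on rejected input
): IndexResult {
  const result: IndexResult = { indexed: [], skipped: [] };
  for (const path of paths) {
    try {
      store(path, sanitizeText(read(path)));
      result.indexed.push(path);
    } catch {
      // One bad file is recorded and skipped; the run continues.
      result.skipped.push(path);
    }
  }
  return result;
}
```

Returning the skipped list, rather than only logging it, makes it possible to surface those files to the user after the run.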

Live progress that survives a reload — and a restart

The dashboard shows scanning, indexing percentage, and an estimated time remaining instead of a bare spinner, and each repository keeps its own progress when several index at once. Reopen the page mid-index and the progress reattaches; restart the server and an interrupted index resumes where it left off, finishing only the remaining files.
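Resuming and reporting progress both fall out of one persisted record per repository. The sketch below assumes a checkpoint that lists completed files; the IndexCheckpoint shape and the function names are hypothetical, and the time estimate is a plain average-rate extrapolation.

```typescript
// Hypothetical per-repository checkpoint, persisted as indexing proceeds.
interface IndexCheckpoint {
  total: number; // files selected for this index run
  done: string[]; // paths already embedded and stored
}

// After a restart, only files missing from the checkpoint still need work.
function remainingFiles(allFiles: string[], checkpoint: IndexCheckpoint | null): string[] {
  const done = new Set(checkpoint?.done ?? []);
  return allFiles.filter((file) => !done.has(file));
}

// Whole-number percentage for the dashboard.
function percentComplete(checkpoint: IndexCheckpoint): number {
  if (checkpoint.total === 0) return 100;
  return Math.round((checkpoint.done.length / checkpoint.total) * 100);
}

// Estimate remaining time from the average time per completed file.
// Returns null until at least one file has finished.
function estimateRemainingMs(checkpoint: IndexCheckpoint, elapsedMs: number): number | null {
  const doneCount = checkpoint.done.length;
  if (doneCount === 0) return null;
  return Math.round((elapsedMs / doneCount) * (checkpoint.total - doneCount));
}
```

Keeping one checkpoint per repository is also what would let several repositories report independent progress while indexing at the same time.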

Faster indexing on large and vendored repositories

Files now embed and upsert through a bounded concurrent pool — roughly twice as fast on a full index — and vendored, generated, and build directories are filtered out by default, with a per-repository .sourcevaultignore for anything else. On a dependency-heavy repository that's about 90% fewer files to index, which means faster runs and more relevant answers.
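A bounded concurrent pool can be sketched in a few lines: a fixed number of runners pull the next item from a shared cursor until the list is exhausted. This is a generic pattern, not SourceVault's code, and mapWithLimit is a hypothetical name.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in input order.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // shared cursor; safe because JavaScript runs one callback at a time

  async function runner(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

The bound keeps memory use and pressure on the embedding backend predictable. If any worker call rejects, the returned promise rejects, so per-file error handling belongs inside the worker.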

Optional high-throughput embedding and reranking backends

For larger installs, search and reranking can now route through a dedicated Text Embeddings Inference server, with optional embedding-dimension truncation for a smaller, faster vector index. This is off by default: the local-only path is unchanged, and the built-in cross-encoder reranker remains the default. The reranker's isolated benefit is being re-measured on larger repositories.
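Dimension truncation usually means keeping the leading components of each vector and re-normalizing, as in the sketch below. It is illustrative only: truncateEmbedding is a hypothetical name, and the approach assumes an embedding model whose leading dimensions carry most of the signal, which not every model guarantees.

```typescript
// Keep the first `dims` components and rescale to unit length so that
// cosine and dot-product similarity behave consistently after truncation.
function truncateEmbedding(vector: number[], dims: number): number[] {
  if (!Number.isInteger(dims) || dims <= 0 || dims > vector.length) {
    throw new RangeError(`dims must be an integer in 1..${vector.length}`);
  }
  const head = vector.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // An all-zero head cannot be normalized; return it unchanged.
  return norm === 0 ? head : head.map((x) => x / norm);
}
```

Stored vectors and query vectors have to be truncated to the same length, so changing this setting on an existing install would normally require a re-index.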
