Dependi LSP Architecture
Audience: contributors. Read the Architecture section in README.md and Configuration first for user-facing context — this guide expands README’s high-level diagram into the full layered model.
Table of Contents
- Introduction
- Top-Level Architecture
- Request Lifecycle
- Core Data Structures
- Parsers
- Registries
- Cache Strategy
- Vulnerability Path
- Providers
- Authentication & Private Registries
- Key Design Decisions
- Glossary & Further Reading
1. Introduction
This document describes the internal architecture of the Dependi LSP server (dependi-lsp/): the Rust binary that implements the Language Server Protocol and provides dependency hints, vulnerability scanning, code actions and completions for the Zed editor.
Audience
Contributors who want to add an ecosystem, fix a bug, optimise a hot path, or simply build a mental model of how a request flows through the system.
What this document IS
- A guided tour of the LSP server’s internals.
- A high-fidelity sequence diagram of the request lifecycle.
- An explanation of the design decisions (caching strategy, concurrency, debouncing).
- A map of which module owns which responsibility.
What this document is NOT
- A user-facing how-to (see Installation and Configuration).
- An API reference for public types (that lives in cargo doc).
- A per-ecosystem parsing reference (see Languages).
System boundary
flowchart LR
Editor["Zed editor"]
LSP["dependi-lsp<br/>(Rust binary)"]
Reg["Package registries<br/>(crates.io, npm, PyPI, …)"]
OSV["OSV.dev<br/>(vulnerability database)"]
Cache[("Hybrid cache<br/>memory + SQLite")]
Editor <-->|stdio JSON-RPC| LSP
LSP -->|HTTPS| Reg
LSP -->|HTTPS| OSV
LSP <--> Cache
The LSP communicates with Zed over stdio JSON-RPC (initiated by dependi-zed/ — a thin WASM extension). It fans out to package registries (per ecosystem) and to OSV.dev for vulnerability data. All network responses are cached in a two-tier hybrid store.
2. Top-Level Architecture
The LSP is organised as five layers under dependi-lsp/src/. The Transport and Handlers layers are individual files; the Providers, Domain and Cache layers are subdirectories:
| Layer | Modules | Responsibility |
|---|---|---|
| Transport | main.rs, lib.rs | Bootstrap, CLI parsing (clap), tokio runtime, JSON-RPC stdio loop via tower-lsp. |
| Handlers | backend.rs | Implements tower_lsp::LanguageServer; owns shared state (DependiBackend); routes LSP requests to providers. |
| Providers | dependi-lsp/src/providers/ | Five plain functions producing LSP responses: inlay hints, diagnostics, code actions, completions, document links. |
| Domain | dependi-lsp/src/parsers/, dependi-lsp/src/registries/, dependi-lsp/src/auth/, dependi-lsp/src/vulnerabilities/ | Manifest parsing, registry HTTP clients, credential resolution, OSV vulnerability queries. |
| Cache | dependi-lsp/src/cache/ | Two-tier hybrid cache (memory + SQLite) for version info; separate caches for vulnerability seen-set and OSV advisories. |
Module map
| Path | Role |
|---|---|
| dependi-lsp/src/main.rs | Entry: tokio runtime, CLI subcommands, stdio LSP loop. |
| dependi-lsp/src/backend.rs | DependiBackend: central handler struct; debounce + dispatch. |
| dependi-lsp/src/config.rs | Workspace config (registries, auth, cache TTL, debounce ms). |
| dependi-lsp/src/document.rs | DocumentState per-open-document snapshot, plus text/document utilities. |
| dependi-lsp/src/file_types.rs | FileType enum + path-pattern → ecosystem detection. |
| dependi-lsp/src/reports.rs | JSON / Markdown vulnerability report generation. |
| dependi-lsp/src/settings_edit.rs | Programmatic edits to user settings (used by code actions). |
| dependi-lsp/src/utils.rs | Shared utilities. |
| dependi-lsp/src/auth/ | Token providers, cargo credentials, npmrc parsing. |
| dependi-lsp/src/parsers/ | Per-ecosystem parsers (cargo, npm, python, go, php, dart, csharp, ruby, maven) + lockfile resolvers (lockfile_resolver.rs, lockfile_graph.rs). |
| dependi-lsp/src/providers/ | Five LSP feature implementations. |
| dependi-lsp/src/registries/ | Registry HTTP clients (incl. maven_central.rs) + shared reqwest::Client. |
| dependi-lsp/src/vulnerabilities/ | OSV client + vulnerability seen-set cache. |
| dependi-lsp/src/cache/ | Hybrid cache, advisory cache. |
The strict directional rule: handlers call providers, providers call domain modules, domain modules read/write the cache. Providers do not call each other; domain modules do not call handlers.
Adding a new ecosystem? The parser (§5), registry (§6) and FileType entry (§2 above) are the three touch-points. A step-by-step walkthrough is in Adding a language.
3. Request Lifecycle
A typical edit-to-hint cycle illustrates how the layers interact. Below is the path of a textDocument/didChange notification on package.json.
sequenceDiagram
participant Z as Zed editor
participant B as DependiBackend<br/>(backend.rs)
participant D as Debounce task<br/>(tokio::spawn)
participant P as process_document
participant Pa as NpmParser
participant L as LockfileResolver
participant R as NpmRegistry
participant C as HybridCache
participant Pr as Providers
participant V as Vulnerability spawner
Z->>B: didChange(uri, text)
B->>B: abort prior debounce<br/>(JoinHandle::abort)
B->>B: increment debounce_generation
B->>D: spawn debounced task
D-->>D: sleep 200 ms
D->>P: process_document(uri, content)
P->>Pa: parse(content)
Pa-->>P: Vec<Dependency>
P->>L: resolve_versions_from_lockfile(deps)
L-->>P: Arc<LockfileGraph>
loop for each dep
P->>C: get(key)
alt cache miss
P->>R: get_version_info(name)
R-->>P: VersionInfo
P->>C: set(key, info)
end
end
P->>Pr: create_inlay_hint / create_diagnostics / …
Pr-->>P: hints + diagnostics
P->>Z: publish_diagnostics + inlay hints
P->>V: spawn vuln fetch (non-blocking)
V->>C: second-pass publish_diagnostics
V-->>Z: updated diagnostics
Key invariants
- Debounce (coalescing rapid keystrokes into one delayed action): every didChange writes the new content into pending_changes, aborts the previous JoinHandle, and spawns a fresh task. After sleeping 200 ms the task verifies its content is still the value in pending_changes (an equality check on the buffered text); if a newer keystroke arrived, that buffer was overwritten and this task exits without doing work. An Arc<AtomicU64> generation counter is used independently to clean up the per-URI debounce_tasks slot via remove_if(stored_gen == generation). Full rationale in §11 Key Design Decisions.
- Per-document bounded concurrency: each process_document call constructs its own tokio::sync::Semaphore::new(5) inline (backend.rs); it is not a field of ProcessingContext and does not cap global registry traffic. The cap is per-document: a single document is limited to 5 concurrent registry fetches; multiple documents being processed in parallel each get their own semaphore.
- Vulnerabilities never block hints: the vuln path is a separate tokio::spawn. Inlay hints publish first; vulnerability diagnostics arrive in a second pass.
- Document state lives in Arc<DashMap<Url, DocumentState>>. DashMap shards are independently locked, so two handlers reading different URIs never block each other. Handlers snapshot the fields they need into locals and drop the Ref (the DashMap shard reference) before any await, since shard references are not Send.
The did_open, did_save and did_close paths share the same skeleton minus the debounce: open and save run process_document immediately, close removes the entry from documents and any pending debounce.
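The content-equality gate from the debounce invariant can be modelled without any async machinery. The following is a minimal sketch, assuming a plain `HashMap` in place of the real `DashMap` and string URIs in place of `Url`; `PendingChanges`, `record` and `is_current` are hypothetical names chosen for illustration, not the backend's API.

```rust
use std::collections::HashMap;

/// Hypothetical, synchronous stand-in for the `pending_changes` buffer.
struct PendingChanges {
    latest: HashMap<String, String>,
}

impl PendingChanges {
    fn new() -> Self {
        Self { latest: HashMap::new() }
    }

    /// Called on every didChange: overwrite the buffered text for this URI.
    fn record(&mut self, uri: &str, text: &str) {
        self.latest.insert(uri.to_string(), text.to_string());
    }

    /// Called by a debounced task after its sleep: proceed only if the text
    /// this task captured is still the newest buffered value for the URI.
    fn is_current(&self, uri: &str, captured: &str) -> bool {
        self.latest.get(uri).map(String::as_str) == Some(captured)
    }
}
```

In the flow described above, a task whose captured text fails this check would simply return without calling process_document.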
4. Core Data Structures
Six types form the spine of the system.
DependiBackend (dependi-lsp/src/backend.rs)
The central handler struct that implements tower_lsp::LanguageServer. All shared state hangs off here as Arc<…> pointers so each LSP handler can hold a cheap clone:
| Field group | Purpose |
|---|---|
| client: tower_lsp::Client | Outbound channel to Zed (publish diagnostics, send requests). |
| config: Arc<RwLock<Config>> | Workspace configuration. |
| documents: Arc<DashMap<Url, DocumentState>> | Per-document parsed state. |
| <ecosystem>_parser (9 fields) | Per-ecosystem Arc<XxxParser>. |
| <ecosystem>_registry (8 fields) | Per-ecosystem registry clients. |
| cargo_custom_registries: Arc<DashMap<String, Arc<CargoSparseRegistry>>> | Alternative cargo registries keyed by name. |
| http_client: Arc<reqwest::Client> | Shared HTTP client (one connection pool). |
| token_manager: Arc<TokenProviderManager> | URL → auth-header resolution. |
| osv_client, vuln_cache, advisory_cache, negative_advisory_cache | Vulnerability subsystem. |
| transitive_vuln_data: Arc<DashMap<VulnCacheKey, Vec<Vulnerability>>> | Per-package transitive vuln payloads. |
| debounce_tasks: Arc<DashMap<Url, (u64, JoinHandle<()>)>>, debounce_generation: Arc<AtomicU64>, pending_changes: Arc<DashMap<Url, String>> | Debounce coordination. |
| version_cache: Arc<HybridCache> | Two-tier version-info cache. |
Dependency (dependi-lsp/src/parsers/mod.rs)
The output of every parser:
| Field | Type | Purpose |
|---|---|---|
| name | String | Package name as declared. |
| version | String | Version constraint as declared. |
| name_span, version_span | Span | LSP-friendly source ranges. |
| dev, optional | bool | Manifest flags. |
| registry | Option<String> | Alternative registry name (cargo-only at present). |
| resolved_version | Option<String> | Concrete version from a lockfile, populated post-parse. |
Span (dependi-lsp/src/parsers/mod.rs)
pub struct Span {
pub line: u32, // 0-indexed line number
pub line_start: u32, // byte offset, relative to start of line
pub line_end: u32, // byte offset (exclusive), relative to start of line
}
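Note the offsets are line-relative byte offsets, not file-relative. This makes conversion to LSP positions trivial: line is direct, and each character value is computed by counting UTF-16 code units in the line prefix that precedes the offset (line[..line_start] for the start position, line[..line_end] for the end position).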
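The conversion can be sketched in a few lines. This is a minimal illustration, assuming the caller already holds the text of the line in question; `Position`, `utf16_column` and `span_to_range` are hypothetical names here (the real code would use the LSP types from tower-lsp), and each column is the UTF-16 length of the prefix before the byte offset.

```rust
pub struct Span {
    pub line: u32,
    pub line_start: u32,
    pub line_end: u32,
}

/// LSP-style position: zero-based line and UTF-16 code-unit column.
#[derive(Debug, PartialEq)]
pub struct Position {
    pub line: u32,
    pub character: u32,
}

/// Count UTF-16 code units in the first `byte_offset` bytes of `line_text`.
/// Returns None if the offset is out of range or splits a UTF-8 character.
fn utf16_column(line_text: &str, byte_offset: u32) -> Option<u32> {
    let prefix = line_text.get(..byte_offset as usize)?;
    Some(prefix.encode_utf16().count() as u32)
}

/// Convert a line-relative byte span into (start, end) LSP positions.
pub fn span_to_range(line_text: &str, span: &Span) -> Option<(Position, Position)> {
    let start = Position {
        line: span.line,
        character: utf16_column(line_text, span.line_start)?,
    };
    let end = Position {
        line: span.line,
        character: utf16_column(line_text, span.line_end)?,
    };
    Some((start, end))
}
```

Byte offsets and UTF-16 columns only diverge on non-ASCII lines, which is why the prefix has to be re-encoded rather than used directly.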
VersionInfo (dependi-lsp/src/registries/mod.rs)
Returned by every registry client. Contains latest, latest_prerelease, versions: Vec<String>, description, homepage, repository, license, vulnerabilities, deprecated, yanked_versions, release_dates: HashMap<String, DateTime<Utc>>, transitive_vulnerabilities. Vulnerability fields are populated separately (after the OSV pass) — registry clients themselves never query OSV.
DocumentState (dependi-lsp/src/document.rs)
Per-open-document snapshot held in documents: DashMap<Url, DocumentState> on the backend. Stores parsed dependencies, file type, content and last-publish timestamps. Handlers read the fields they need into locals and then drop the Ref; it is never held across an await.
FileType (dependi-lsp/src/file_types.rs)
A Copy-able enum tagging the ecosystem of an open file: Cargo, Npm, Python (covers requirements.txt, constraints.txt, pyproject.toml, hatch.toml), Go, Php, Dart, Csharp, Ruby, Maven. Single source of truth used by handlers, parsers and providers when picking ecosystem-specific behaviour. Lockfiles do not get their own variants — the ecosystem variant identifies both manifest and lockfile, and lockfile_resolver.rs handles per-ecosystem dispatch.
5. Parsers
The Parser trait
Every manifest parser implements a deliberately tiny trait (dependi-lsp/src/parsers/mod.rs):
pub trait Parser: Send + Sync {
fn parse(&self, content: &str) -> Vec<Dependency>;
}
The trait is synchronous and infallible: malformed input silently yields an empty Vec. Rationale: the LSP calls parse on every keystroke (after debounce), and we cannot stop the world for a TOML error, so we degrade gracefully and let the editor’s native syntax checker flag invalid manifests.
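To make the contract concrete, here is a toy implementation. It is a sketch only: `PinnedLineParser` is hypothetical and handles nothing but `name==version` lines without whitespace, and the `Dependency` struct is trimmed to four fields. It shows the two properties that matter: unparseable lines produce no error, and spans are byte offsets relative to the start of the line.

```rust
#[derive(Debug, PartialEq)]
pub struct Span {
    pub line: u32,
    pub line_start: u32,
    pub line_end: u32,
}

/// Trimmed-down Dependency: only the fields this sketch needs.
#[derive(Debug, PartialEq)]
pub struct Dependency {
    pub name: String,
    pub version: String,
    pub name_span: Span,
    pub version_span: Span,
}

pub trait Parser: Send + Sync {
    fn parse(&self, content: &str) -> Vec<Dependency>;
}

/// Hypothetical toy parser for `name==version` lines.
pub struct PinnedLineParser;

impl Parser for PinnedLineParser {
    fn parse(&self, content: &str) -> Vec<Dependency> {
        content
            .lines()
            .enumerate()
            .filter_map(|(row, text)| {
                // Lines that do not match are skipped, never reported.
                let (name, version) = text.split_once("==")?;
                if name.is_empty() || version.is_empty() || name.contains(' ') {
                    return None;
                }
                let line = row as u32;
                let version_start = (name.len() + 2) as u32; // skip "=="
                Some(Dependency {
                    name: name.to_string(),
                    version: version.to_string(),
                    name_span: Span { line, line_start: 0, line_end: name.len() as u32 },
                    version_span: Span {
                        line,
                        line_start: version_start,
                        line_end: version_start + version.len() as u32,
                    },
                })
            })
            .collect()
    }
}
```

Input with no recognisable declarations yields an empty Vec, which matches the degrade-gracefully behaviour described above.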
Span model
Each Dependency carries name_span and version_span of type Span. Spans use byte offsets relative to the start of the line, not the start of the file. This was a conscious choice — see §11 Key Design Decisions.
File type routing
Routing is performed by FileType::detect in dependi-lsp/src/file_types.rs — pure path matching, no trait dispatch:
pub fn detect(uri: &Url) -> Option<Self> {
let path = uri.path();
if path.ends_with("Cargo.toml") { Some(FileType::Cargo) }
else if path.ends_with("package.json") { Some(FileType::Npm) }
// … etc.
else { None }
}
The downstream process_document switches on the resulting FileType to pick the right Arc<XxxParser> from DependiBackend. Static dispatch, zero virtual calls.
Lockfile resolution
Two cooperating modules turn Cargo.lock / package-lock.json / composer.lock / pubspec.lock / go.sum / Gemfile.lock / etc. into structured graphs:
- LockfileResolver trait (dependi-lsp/src/parsers/lockfile_resolver.rs): async trait with find_lockfile, parse_graph, normalize_name, resolve_version. select_resolver(file_type) returns the per-ecosystem implementation as Box<dyn LockfileResolver>. The free function resolve_versions_from_lockfile(deps: &mut [Dependency], resolver: Box<dyn LockfileResolver>, manifest_path: &Path) -> Option<Arc<LockfileGraph>> then locates the lockfile, parses the graph, and back-fills each dep.resolved_version in place; the returned graph feeds the vulnerability pass.
- LockfileGraph (dependi-lsp/src/parsers/lockfile_graph.rs): DFS algorithms over Vec<LockfilePackage>:
  - transitive_deps_of(name): cycle-safe DFS, multi-version aware.
  - transitives_only(direct): reachability filter for “what is brought in by this top-level dep”.
  - reverse_index(): transitive → direct attribution map (used to point a vulnerability in lodash back to the user’s gulp declaration).
The graph is consumed downstream for vulnerability attribution — version selection itself is the resolver’s job, not the graph’s.
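The attribution idea can be illustrated with a simplified graph. This is a sketch under stated assumptions: the `Graph` alias (package name → dependency names) is hypothetical and ignores versions, whereas the real LockfileGraph is multi-version aware; the function names mirror the ones above but the signatures are invented for the example.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};

/// Hypothetical simplified graph: package name -> names it depends on.
type Graph = HashMap<String, Vec<String>>;

/// Cycle-safe DFS: every package reachable from `root`, excluding `root`.
fn transitive_deps_of(graph: &Graph, root: &str) -> BTreeSet<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut stack: Vec<&str> = vec![root];
    let mut out = BTreeSet::new();
    while let Some(name) = stack.pop() {
        if !seen.insert(name) {
            continue; // already visited: this is what breaks cycles
        }
        for dep in graph.get(name).into_iter().flatten() {
            if dep.as_str() != root {
                out.insert(dep.clone());
            }
            stack.push(dep.as_str());
        }
    }
    out
}

/// Map each transitive package to the direct dependencies that pull it in.
fn reverse_index(graph: &Graph, direct: &[&str]) -> BTreeMap<String, BTreeSet<String>> {
    let mut index: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for d in direct {
        for transitive in transitive_deps_of(graph, d) {
            index.entry(transitive).or_default().insert(d.to_string());
        }
    }
    index
}
```

With an index like this, a vulnerability reported for lodash can be attached to whichever declared dependencies (for example gulp) bring it into the tree.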
6. Registries
The Registry trait
Every registry client implements a small async trait (dependi-lsp/src/registries/mod.rs):
#[allow(async_fn_in_trait)]
pub trait Registry: Send + Sync {
async fn get_version_info(&self, package_name: &str) -> anyhow::Result<VersionInfo>;
fn http_client(&self) -> Arc<Client>;
}
Each implementation owns its registry-specific URL templates, response parsing and error mapping (404 → “not found”, 429 → retry-after honoured, 5xx → anyhow::Error).
Shared reqwest::Client
A single Arc<reqwest::Client> is created once at startup by create_shared_client in dependi-lsp/src/registries/http_client.rs and threaded into every registry via with_client(Arc<Client>). Important properties:
- pool_max_idle_per_host = 10
- tcp_keepalive = 60s
- pool_idle_timeout = 90s
- timeout = 10s, connect_timeout = 5s
Sharing the client means one TCP connection pool serves all 10 registry clients (crates_io, cargo_sparse, npm, pypi, go_proxy, packagist, pub_dev, nuget, rubygems, maven_central), dramatically reducing handshake overhead for the typical “user opened a polyglot monorepo” scenario.
There is no HTTP-level cache. Caching happens at the application layer (see §7 Cache Strategy) where we serialise full VersionInfo records, not raw HTTP bodies.
Bounded concurrency
process_document builds a tokio::sync::Semaphore::new(5) per request (created inline; not a field of ProcessingContext) and acquires a permit before each get_version_info call. Five in-flight requests is enough to saturate a slow-path lockfile while staying polite to public registries. For per-registry rate limits and HTTP API details see Registries.
Per-ecosystem clients
| Module | Registry | Notes |
|---|---|---|
| crates_io.rs | crates.io | Used for the canonical cargo registry. |
| cargo_sparse.rs | Alternative cargo registries (sparse index) | Per-registry instances stored in cargo_custom_registries: DashMap<String, Arc<…>>. |
| npm.rs | npm | Behind RwLock to allow runtime reconfiguration of registry URL via .npmrc. |
| pypi.rs | PyPI | JSON API. |
| go_proxy.rs | proxy.golang.org | Honours GOPROXY env. |
| packagist.rs | Packagist (PHP) | |
| pub_dev.rs | pub.dev (Dart) | |
| nuget.rs | NuGet | |
| rubygems.rs | RubyGems | |
| maven_central.rs | Maven Central | Behind RwLock for repository reconfiguration. |
Two tiny utility modules sit alongside: version_utils.rs (semver helpers shared across ecosystems) and url_sanitizer.rs (strips credentials / normalises registry URLs before logging).
7. Cache Strategy
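Three independent caches serve different access patterns and have different TTLs. HybridCache and HybridAdvisoryCache live in dependi-lsp/src/cache/; the VulnerabilityCache seen-set lives in dependi-lsp/src/vulnerabilities/cache.rs.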
flowchart TB
subgraph hybrid["HybridCache (version info)"]
L1[("Memory L1<br/>DashMap<String, CacheEntry><br/>TTL 1h, lazy expiry")]
L2[("SQLite L2<br/>r2d2 pool, WAL<br/>~/.cache/dependi/cache.db")]
L1 -->|miss → read-through| L2
L2 -->|backfill| L1
end
subgraph vuln["VulnerabilityCache (seen-set)"]
VS[("DashMap<VulnCacheKey, Instant><br/>TTL 6h, no payload")]
end
subgraph adv["HybridAdvisoryCache (positive + negative)"]
AP[("Positive: long TTL<br/>RUSTSEC payloads")]
AN[("Negative: short TTL<br/>404 markers")]
end
Cleanup["Background cleanup<br/>every 30 min"]
Cleanup --> L1
Cleanup --> L2
Cleanup --> AP
Cleanup --> AN
HybridCache — version info
Two-tier read-through, write-through cache (dependi-lsp/src/cache/mod.rs):
- L1 = MemoryCache: DashMap<String, CacheEntry>. Key like "crates.io:serde". Value = VersionInfo + (inserted_at, ttl). Default TTL 1 hour. Lazy expiry on get; bulk sweep every 30 min by a background tokio task.
- L2 = SqliteCache: r2d2::Pool<SqliteConnectionManager> over ~/.cache/dependi/cache.db. Schema: packages(key TEXT PRIMARY KEY, data TEXT, inserted_at INTEGER, ttl_secs INTEGER). WAL journal. Index on (inserted_at, ttl_secs) for bulk expiry deletes. All DB calls are offloaded to tokio::task::spawn_blocking so they never block the runtime.
Read path: check L1 → on miss, read L2 → on hit, write back to L1. Write path: write to both layers simultaneously.
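The read and write paths can be sketched with two in-memory maps standing in for the tiers. This is an illustration only, assuming string payloads and an explicit `now` argument so expiry is deterministic; `TwoTierCache` and `Entry` are hypothetical names, and the real L2 is SQLite rather than a second HashMap.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

#[derive(Clone)]
struct Entry {
    value: String,
    inserted_at: Instant,
    ttl: Duration,
}

impl Entry {
    fn is_fresh(&self, now: Instant) -> bool {
        now.saturating_duration_since(self.inserted_at) < self.ttl
    }
}

/// Hypothetical two-tier cache: `l1` models memory, `l2` models SQLite.
#[derive(Default)]
struct TwoTierCache {
    l1: HashMap<String, Entry>,
    l2: HashMap<String, Entry>,
}

impl TwoTierCache {
    /// Write path: write-through to both tiers.
    fn set(&mut self, key: &str, value: &str, ttl: Duration, now: Instant) {
        let entry = Entry { value: value.to_string(), inserted_at: now, ttl };
        self.l2.insert(key.to_string(), entry.clone());
        self.l1.insert(key.to_string(), entry);
    }

    /// Read path: L1 first; on a miss read L2 and backfill L1.
    fn get(&mut self, key: &str, now: Instant) -> Option<String> {
        let hit = self.l1.get(key).filter(|e| e.is_fresh(now)).map(|e| e.value.clone());
        if hit.is_some() {
            return hit;
        }
        self.l1.remove(key); // lazy expiry: drop a stale L1 entry, if any
        let entry = self.l2.get(key).filter(|e| e.is_fresh(now))?.clone();
        let value = entry.value.clone();
        self.l1.insert(key.to_string(), entry); // backfill keeps the original timestamp
        Some(value)
    }
}
```

Backfilling with the original inserted_at means a value restored from L2 does not get a fresh TTL in L1; whether the real implementation does the same is not stated above, so treat that detail as an assumption.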
VulnerabilityCache — seen-set
A side-cache (dependi-lsp/src/vulnerabilities/cache.rs) that does not store any payload. Key = VulnCacheKey { ecosystem, package_name, version }. Value = (inserted_at,) only. TTL 6 hours. Purpose: prevent redundant OSV API calls for packages already queried. The actual vulnerability data lives in VersionInfo inside the main HybridCache.
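A seen-set is small enough to sketch in full. The version below is a simplified model, assuming a plain `HashMap` and an explicit `now` argument; `SeenSet` and its `unseen` helper are hypothetical names, while the `VulnCacheKey` fields follow the description above.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct VulnCacheKey {
    ecosystem: String,
    package_name: String,
    version: String,
}

/// Records when a key was queried; stores no vulnerability payload.
struct SeenSet {
    ttl: Duration,
    seen: HashMap<VulnCacheKey, Instant>,
}

impl SeenSet {
    fn new(ttl: Duration) -> Self {
        Self { ttl, seen: HashMap::new() }
    }

    fn insert(&mut self, key: VulnCacheKey, now: Instant) {
        self.seen.insert(key, now);
    }

    /// True only while the recorded query is younger than the TTL.
    fn contains(&self, key: &VulnCacheKey, now: Instant) -> bool {
        self.seen
            .get(key)
            .map_or(false, |at| now.saturating_duration_since(*at) < self.ttl)
    }

    /// Keep only the keys that still need an OSV query.
    fn unseen<'a>(&self, keys: &'a [VulnCacheKey], now: Instant) -> Vec<&'a VulnCacheKey> {
        keys.iter().filter(|k| !self.contains(k, now)).collect()
    }
}
```

Because only timestamps are stored, memory use stays proportional to the number of distinct package versions queried, independent of how many advisories they have.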
HybridAdvisoryCache — RUSTSEC advisories
Two-tier cache (dependi-lsp/src/cache/advisory/) for individual OSV advisories looked up by ID. Key = advisory ID string. Value = CachedAdvisory { id, kind: AdvisoryKind::Found { summary, unmaintained } | NotFound, fetched_at }.
The cache is split into positive and negative variants for one reason: 404s should expire faster than 200 OKs (a not-yet-published advisory may appear within hours; a published advisory is immutable for days). Default policy: negative TTL ≪ positive TTL.
sqlite_manager.rs vs sqlite.rs
- sqlite_manager.rs: low-level r2d2::ManageConnection impl. Opens connections, applies PRAGMAs (busy_timeout, synchronous=NORMAL, cache_size).
- sqlite.rs: higher-level SqliteCache that owns the pool and implements the ReadCache / WriteCache traits.
8. Vulnerability Path
Vulnerability detection runs on a separate tokio::spawn so it never blocks inlay-hint or diagnostic publish.
sequenceDiagram
participant P as process_document
participant Spawner as Vuln spawner<br/>(tokio::spawn)
participant VC as VulnerabilityCache<br/>(seen-set)
participant O as OsvClient
participant API as OSV.dev API
participant AC as HybridAdvisoryCache
participant Z as Zed editor
P->>Spawner: spawn fetch_vulnerabilities_background(deps)
P-->>Z: publish_diagnostics (1st pass, no vuln data)
Spawner->>VC: contains(key) for each dep
Note over Spawner,VC: filter to unseen
Spawner->>O: query_batch(unseen)
O->>API: POST /v1/querybatch
API-->>O: BatchResponse
loop per RUSTSEC ID found
O->>AC: get(id) — positive then negative
alt cache hit
AC-->>O: CachedAdvisory
else cache miss
O->>API: GET /vulns/{id}
API-->>O: advisory
O->>AC: write to positive or negative
end
end
O-->>Spawner: Vec<Vulnerability> per dep
Spawner->>VC: insert seen keys
Spawner-->>Z: publish_diagnostics (2nd pass, with vulns)
Severity model
OSV returns CVSS scores. We map them to four severities (dependi-lsp/src/vulnerabilities/osv.rs):
| CVSS score | Severity | LSP diagnostic level |
|---|---|---|
| ≥ 9.0 | Critical | Error |
| ≥ 7.0 | High | Error |
| ≥ 4.0 | Medium | Warning |
| < 4.0 | Low | Hint |
Non-numeric CVSS strings default to Medium (defensive — do not silently swallow severity information). User-visible severity indicators are described in Security Scanning.
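The table and the fallback rule translate into a small pair of functions. This is a sketch, not the code in osv.rs: the enum and function names are hypothetical, and treating NaN or infinite values as non-numeric is an assumption made here so that they also fall back to Medium.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Severity {
    Critical,
    High,
    Medium,
    Low,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum DiagnosticLevel {
    Error,
    Warning,
    Hint,
}

/// Map a raw CVSS score string to a severity bucket.
fn severity_from_cvss(raw: &str) -> Severity {
    match raw.trim().parse::<f32>() {
        // "NaN" and "inf" parse as floats; treat them like unparseable input.
        Ok(score) if !score.is_finite() => Severity::Medium,
        Ok(score) if score >= 9.0 => Severity::Critical,
        Ok(score) if score >= 7.0 => Severity::High,
        Ok(score) if score >= 4.0 => Severity::Medium,
        Ok(_) => Severity::Low,
        // Non-numeric strings (e.g. a CVSS vector) default to Medium.
        Err(_) => Severity::Medium,
    }
}

/// Map a severity bucket to the LSP diagnostic level from the table.
fn diagnostic_level(severity: Severity) -> DiagnosticLevel {
    match severity {
        Severity::Critical | Severity::High => DiagnosticLevel::Error,
        Severity::Medium => DiagnosticLevel::Warning,
        Severity::Low => DiagnosticLevel::Hint,
    }
}
```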
Concurrency
OsvClient runs at most 5 concurrent /vulns/{id} lookups (RUSTSEC_ADVISORY_LOOKUP_CONCURRENCY). The batch query itself is a single POST.
Transitive vulnerability attribution
When a transitive dependency is vulnerable, we point the diagnostic at the direct dependency that pulled it in. The LockfileGraph::reverse_index from §5 produces this attribution. Resulting payloads land in transitive_vuln_data: Arc<DashMap<VulnCacheKey, Vec<Vulnerability>>> on DependiBackend, and the diagnostic provider consults this map alongside the per-package vulnerabilities in VersionInfo.
9. Providers
The five LSP feature providers are plain functions living in dependi-lsp/src/providers/. Each takes the inputs it needs and returns a fully-formed LSP response. There is no provider trait — see §11 Key Design Decisions for the rationale.
| Provider | Module | Signature (abridged) | When invoked |
|---|---|---|---|
| Inlay hints | inlay_hints.rs | fn create_inlay_hint(dep, version_info, file_type) -> InlayHint | inlayHint request, after process_document finishes. |
| Diagnostics | diagnostics.rs | async fn create_diagnostics(deps, cache, …, file_type, transitive_vulns, ignored) -> Vec<Diagnostic> | After process_document and again after vulnerability fetch. |
| Code actions | code_actions.rs | async fn create_code_actions(deps, cache, uri, range, file_type, …) -> Vec<CodeActionOrCommand> | codeAction request. |
| Completions | completion.rs | async fn get_completions(deps, position, cache, …) -> Option<Vec<CompletionItem>> | completion request when cursor is inside a version field. |
| Document links | document_links.rs | fn create_document_links(deps, file_type) -> Vec<DocumentLink> | documentLink request. |
Wiring pattern
Every LSP handler in backend.rs follows the same template:
- Look up DocumentState in documents (DashMap).
- Snapshot the relevant fields into local variables.
- Drop the DashMap guard before any await.
- Call the provider function directly, passing &self.version_cache (the HybridCache) as &impl ReadCache.
- Return the LSP response.
The provider functions are unit-testable in isolation — they accept any impl ReadCache so tests pass a MemoryCache populated with fixture data. No mocking framework needed.
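The testing pattern looks roughly like the sketch below. Everything in it is a stand-in: the `ReadCache` trait is reduced to a single hypothetical synchronous method, `FixtureCache` plays the role of a pre-populated MemoryCache, and `hint_label` is an invented provider whose label rules are assumptions (the actual rules are documented in Inlay Hints).

```rust
use std::collections::HashMap;

/// Hypothetical, synchronous stand-in for the crate's ReadCache trait.
trait ReadCache {
    fn latest(&self, key: &str) -> Option<String>;
}

/// Fixture cache for tests: cache key -> latest known version.
struct FixtureCache(HashMap<String, String>);

impl ReadCache for FixtureCache {
    fn latest(&self, key: &str) -> Option<String> {
        self.0.get(key).cloned()
    }
}

/// Invented provider-style function: a plain function that takes any
/// `impl ReadCache`, so tests can hand it fixture data directly.
fn hint_label(cache: &impl ReadCache, key: &str, declared: &str) -> String {
    match cache.latest(key) {
        Some(latest) if latest == declared => "✓".to_string(),
        Some(latest) => format!("→ {}", latest),
        None => "? Unknown".to_string(),
    }
}
```

Since the function depends only on the trait, swapping the fixture for the production cache requires no change at the call site.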
Inlay hint label vocabulary
The labels rendered next to versions are produced exclusively by create_inlay_hint. The full label vocabulary (✓, → X.Y.Z, ⚠ N vulns, ⚠ Deprecated, ⊘ Yanked, → Local, ? Unknown) and rendering rules are documented in Inlay Hints; from the architecture standpoint what matters is that create_inlay_hint is the single producer — no other call site emits these strings.
10. Authentication & Private Registries
Status: two parallel paths. Production auth flows through direct config injection today: tokens from LSP config and ~/.cargo/credentials.toml (parsed by cargo_credentials::parse_credentials_content, a plain pub fn) are fed into CargoSparseRegistry::with_client_and_config, and npm registry config tokens are fed into NpmRegistry::with_client_and_config; both build Authorization: Bearer … headers at construction time. The TokenProviderManager dynamic-dispatch path (the TokenProvider trait + longest-prefix lookup described below) is structurally complete and instantiated in DependiBackend, but TokenProviderManager::get_auth_headers and the .npmrc parsers (parse_token_from_content, parse_registry_from_content, extract_auth_token, resolve_env_var) are still #[cfg(test)]-gated. They are the planned mechanism for runtime-resolved per-request auth (e.g. matching a request URL against many registered scopes), which lands in a follow-up. See Private Registries for user-facing setup.
TokenProvider trait
pub trait TokenProvider: Send + Sync {
fn get_auth_headers(&self, url: &str) -> Option<HeaderMap>;
}
Implementations decide if a request URL falls within their scope and, if so, return the headers to attach. The current built-in is EnvTokenProvider, which issues Authorization: Bearer <token> from an env var.
TokenProviderManager
Stores tokio::sync::RwLock<hashbrown::HashMap<String, Arc<dyn TokenProvider>>> keyed by URL prefix (dependi-lsp/src/auth/mod.rs). Resolution walks the keys and picks the longest matching prefix so a more specific scope (e.g. https://internal.npm.example.com/scoped/) wins over a general one (https://internal.npm.example.com/).
Two safety properties enforced at registration:
- HTTPS only. register is pub async fn returning (); non-HTTPS URLs are rejected and logged via tracing::error!("SECURITY: Refusing to register auth provider for non-HTTPS URL: …"), and the function returns early without inserting. Bearer tokens never travel cleartext.
- No silent overwrite. Re-registering the same prefix is permitted but logged.
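Longest-prefix resolution and the HTTPS-only rule can be sketched with a plain map. This is a simplified model, not the TokenProviderManager itself: `PrefixTable` is hypothetical, it stores bare token strings instead of Arc<dyn TokenProvider>, it is synchronous, and its `register` returns a bool for testability whereas the real function returns () and logs.

```rust
use std::collections::HashMap;

/// Hypothetical simplified table: URL prefix -> bearer token.
#[derive(Default)]
struct PrefixTable {
    tokens: HashMap<String, String>,
}

impl PrefixTable {
    /// Refuse non-HTTPS prefixes so tokens are never sent in cleartext.
    /// Returns whether the prefix was accepted.
    fn register(&mut self, prefix: &str, token: &str) -> bool {
        if !prefix.starts_with("https://") {
            return false;
        }
        self.tokens.insert(prefix.to_string(), token.to_string());
        true
    }

    /// Walk all prefixes and pick the longest one that matches the URL.
    fn lookup(&self, url: &str) -> Option<&str> {
        self.tokens
            .iter()
            .filter(|(prefix, _)| url.starts_with(prefix.as_str()))
            .max_by_key(|(prefix, _)| prefix.len())
            .map(|(_, token)| token.as_str())
    }
}
```

A linear scan is adequate for the handful of registries a workspace typically configures; a trie would only pay off with many registered scopes.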
Credential file parsers
| Parser | Source | Status |
|---|---|---|
| parse_credentials_content | ~/.cargo/credentials.toml ([registries.<name>].token) | Plain pub fn, ready to use; production file I/O integration pending. |
| parse_token_from_content, parse_registry_from_content, extract_auth_token, resolve_env_var | .npmrc (env-var expansion supported) | All #[cfg(test)]-gated; not yet wired. |
Both modules live under dependi-lsp/src/auth/. They are deliberately small and side-effect-free so they can be exercised by unit tests without touching the filesystem.
11. Key Design Decisions
This section captures non-obvious choices and their rationale. Each subsection answers “why this and not the obvious alternative?”.
Hybrid memory + SQLite cache
Choice: two layers, write-through, JSON-serialised payloads in SQLite.
Why not memory-only? A cold start on a polyglot monorepo would refetch hundreds of packages every session. The SQLite layer lets a freshly opened editor show hints within milliseconds.
Why not SQLite-only? Per-keystroke process_document runs need sub-millisecond cache reads. SQLite is fast but not fast enough for hot paths.
Why JSON in SQLite, not a normalised schema? VersionInfo evolves frequently as we surface new registry metadata. JSON lets us add fields without writing a migration; payloads are small (single-digit KB), so the storage cost is irrelevant.
Tokio multi-thread runtime
Choice: default #[tokio::main] (multi-thread).
Why not current_thread? Registry fan-out, vulnerability fetches and SQLite blocking calls all benefit from a thread pool. The LSP itself is mostly I/O-bound, but the work it dispatches is CPU-bound enough (parsers, JSON deserialisation) that a single-threaded runtime would serialise it and create head-of-line blocking on hint publishing.
Debounce via JoinHandle::abort + content equality check
Choice: per-URI JoinHandle plus a pending_changes: DashMap<Url, String> buffer. On didChange: write the new content into pending_changes (overwriting any prior pending value), abort the existing handle, and spawn a new task that sleeps 200 ms then re-reads pending_changes and only proceeds if the buffered value still equals the content this task captured. A separate Arc<AtomicU64> generation counter exists solely to clean up the per-URI debounce_tasks slot via remove_if(stored_gen == generation) — it is not the gate for “should I run”.
Why not tokio_util::sync::CancellationToken? Adding the dependency for one call site felt heavy. Abort + content equality is a handful of lines and uses only tokio core. Cancellation tokens shine when a long-lived task wants to be cooperatively interrupted; debounce is closer to “discard work in flight” — abort fits.
Plain provider functions instead of a trait
Choice: create_inlay_hint(...), create_diagnostics(...), etc. — no shared trait.
Why not trait Provider? The five providers have wildly different signatures (varying input shapes, sync vs async, Vec<X> vs Option<Vec<Y>>). Forcing them through a single trait would mean either (a) an enum-of-input/enum-of-output dance, or (b) a generic with eight type parameters. Both hurt call-site clarity. Free functions, called directly by backend.rs, are simpler and statically dispatched.
Span model relative to line start
Choice: Span { line: u32, line_start: u32, line_end: u32 } — byte offsets relative to the start of the line, not the start of the file.
Why? LSP Position is { line, character } where character is the UTF-16 code-unit count from the start of the line. Storing line-relative offsets makes the conversion a one-line slice, with no need to re-walk the file or maintain a line-start table. Trade-off: spans that cross line boundaries are not representable, but every dependency declaration we care about is single-line in practice.
One shared reqwest::Client for all registries
Choice: every registry receives Arc<reqwest::Client> at construction time.
Why? A polyglot project (e.g. Cargo.toml + package.json + requirements.txt) triggers hits to crates.io, npm and PyPI within the same process_document cycle. A shared client keeps one connection pool with one keep-alive policy. The pool is keyed per host, so an already-open connection (and its TLS session) to a registry host is reused instead of paying for a fresh handshake, which shaves 20–80 ms off requests that would otherwise open a new connection.
Vulnerability fetch on a separate tokio::spawn
Choice: fetch_vulnerabilities_background is fire-and-forget; first publish of diagnostics happens before vuln data exists.
Why? OSV batch query latency is variable (P50 ~150 ms, P99 > 1 s). Blocking inlay hints on it would make the editor feel sluggish on every save. Two-pass diagnostics — one fast pass, one with vulns — gives the user immediate feedback and adds the security signal as soon as it is available.
Bounded concurrency at 5
Choice: Semaphore::new(5) for registry calls and for OSV /vulns/{id} fan-out.
Why 5? Empirically: 1 is too slow for a fresh lockfile, 20 starts to draw rate-limit warnings from public registries (especially npm). Five hits the sweet spot for a “I just opened a project” flood while staying under any sane limit.
12. Glossary & Further Reading
Glossary
| Term | Meaning |
|---|---|
| LSP | Language Server Protocol — JSON-RPC contract between editors and language servers. |
| Inlay hint | Editor-rendered annotation that appears between source tokens (e.g. the → 2.1.0 next to a version). |
| Span | Byte range in source code; here line-relative (see §4). |
| OSV | osv.dev — open vulnerability database used as our security data source. |
| RUSTSEC | Vulnerability ID namespace for the Rust ecosystem; surfaced in OSV. |
| CVSS | Common Vulnerability Scoring System — numeric severity score. |
| Sparse index | Cargo’s HTTP-fetchable alternative to the full registry git index; used by alternative registries. |
| Lockfile graph | Directed dependency graph built from a lockfile (Cargo.lock, package-lock.json, …) and used for transitive vulnerability attribution. |
| Debounce | Coalesce rapid consecutive events into one delayed action; here, 200 ms after the last keystroke. |
| Seen-set | Cache that records “this key was already processed” without storing a payload — used to suppress redundant OSV calls. |
Further reading
- README.md — high-level overview and feature list.
- Configuration — user-facing settings reference (registries, TTLs, debounce).
- Private registries — user-facing private-registry setup.
- Adding a language — tutorial that mirrors the layered architecture described here.
- Features overview — per-feature deep dives (inlay hints, diagnostics, code actions, security).
- Contributing — how to set up a development environment.
- LSP specification — upstream protocol reference.
- tower-lsp — the LSP framework we build on.
- tokio — async runtime.