Architecture
This page explains how the different components of tauri-pilot fit together and the design decisions behind them.
Overview
    ┌──────────────┐   Unix Socket    ┌─────────────────────────────┐
    │ tauri-pilot  │ ◄──────────────► │  tauri-plugin-pilot (Rust)  │
    │    (CLI)     │     JSON-RPC     │    embedded in your app     │
    └──────────────┘                  │                             │
                                      │   ┌─────────────────────┐   │
                                      │   │ JS Bridge (injected)│   │
                                      │   │ window.__PILOT__    │   │
                                      │   └─────────────────────┘   │
                                      │     WebView (WebKitGTK)     │
                                      └─────────────────────────────┘

Three components:
- CLI (tauri-pilot): A standalone Rust binary that connects to the socket, serializes commands as JSON-RPC, and formats the output for the terminal or for machine consumption.
- Plugin (tauri-plugin-pilot): Embedded in your Tauri app (debug builds only). Starts a Unix socket server at boot, accepts connections, and routes incoming requests to the appropriate handler.
- JS Bridge: Vanilla JS injected into the WebView via js_init_script() at startup. Exposes window.__PILOT__ with snapshot, action, and read methods that the plugin calls through WebView eval.
Unix Socket Protocol
Communication happens over a Unix socket at /tmp/tauri-pilot-{identifier}.sock.
Messages are newline-delimited JSON-RPC 2.0 — each message ends with \n. This framing makes the protocol compatible with socat and nc for manual debugging:
    echo '{"jsonrpc":"2.0","id":1,"method":"ping"}' | socat - UNIX-CONNECT:/tmp/tauri-pilot-myapp.sock

JSON-RPC Message Format
Three message types:

    // Request
    {"jsonrpc":"2.0","id":1,"method":"ping"}

    // With params
    {"jsonrpc":"2.0","id":2,"method":"click","params":{"ref":"e3"}}

    // Response (success)
    {"jsonrpc":"2.0","id":1,"result":{"status":"ok"}}

    // Response (error)
    {"jsonrpc":"2.0","id":1,"error":{"code":-32601,"message":"Method not found"}}

The protocol is implemented with three hand-rolled serde structs (~50 lines total); no external JSON-RPC crate is used.
| Struct | Fields |
|---|---|
| Request | jsonrpc, id, method, params? |
| Response | jsonrpc, id, result?, error? |
| RpcError | code, message, data? |
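
As a rough illustration of the newline framing and the Request shape, here is a minimal std-only sketch. It is not the plugin's code: the real structs use serde, whereas this version builds the JSON text by hand so that it runs without any crates. The names to_line, write_message, and read_messages are hypothetical, and params is assumed to hold already-serialized JSON.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical request type mirroring the Request row of the table.
/// `params` holds pre-serialized JSON text (an assumption of this sketch).
pub struct Request {
    pub id: u64,
    pub method: String,
    pub params: Option<String>,
}

impl Request {
    /// Encode the request as one newline-terminated JSON-RPC 2.0 message.
    /// Assumes `method` needs no JSON escaping (true for names like "ping").
    pub fn to_line(&self) -> String {
        let mut msg = format!(
            r#"{{"jsonrpc":"2.0","id":{},"method":"{}""#,
            self.id, self.method
        );
        if let Some(params) = &self.params {
            msg.push_str(r#","params":"#);
            msg.push_str(params);
        }
        msg.push_str("}\n");
        msg
    }
}

/// Write one framed message to any byte sink (a socket, a Vec<u8>, ...).
pub fn write_message<W: Write>(mut writer: W, request: &Request) -> io::Result<()> {
    writer.write_all(request.to_line().as_bytes())?;
    writer.flush()
}

/// Split a byte stream into messages, one per non-empty line.
pub fn read_messages<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    let mut messages = Vec::new();
    for line in reader.lines() {
        let line = line?;
        if !line.trim().is_empty() {
            messages.push(line);
        }
    }
    Ok(messages)
}
```

Because both helpers are generic over Read and Write, the same code would work against a std::os::unix::net::UnixStream or an in-memory buffer.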
42 methods are available: ping, windows.list, snapshot, diff, click, fill, type, press, select, check, scroll, drag, drop, eval, screenshot, text, html, value, attrs, visible, count, checked, wait, watch, navigate, url, title, state, ipc, console.getLogs, console.clear, network.getRequests, network.clear, storage.get, storage.set, storage.list, storage.clear, forms.dump, record.start, record.stop, record.status, record.add.
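
Routing a method name to a handler might look roughly like the sketch below. It is an assumption-laden illustration, not the contents of handler.rs: only ping is wired up, the data field of RpcError is omitted, and the dispatch and handle functions are hypothetical names. The response strings follow the message examples shown earlier on this page.

```rust
/// Hypothetical error type mirroring the RpcError struct (without `data`).
#[derive(Debug, PartialEq)]
pub struct RpcError {
    pub code: i32,
    pub message: String,
}

/// Map a method name to its result, given as raw JSON text.
/// Only "ping" is implemented here; every other name is treated as unknown.
pub fn dispatch(method: &str) -> Result<String, RpcError> {
    match method {
        "ping" => Ok(r#"{"status":"ok"}"#.to_string()),
        _ => Err(RpcError {
            code: -32601, // JSON-RPC 2.0 "Method not found"
            message: "Method not found".to_string(),
        }),
    }
}

/// Build the response text for one request id and method.
/// Assumes error messages need no JSON escaping.
pub fn handle(id: u64, method: &str) -> String {
    match dispatch(method) {
        Ok(result) => format!(r#"{{"jsonrpc":"2.0","id":{},"result":{}}}"#, id, result),
        Err(e) => format!(
            r#"{{"jsonrpc":"2.0","id":{},"error":{{"code":{},"message":"{}"}}}}"#,
            id, e.code, e.message
        ),
    }
}
```

A single match keeps the method table in one place, so an unknown name always falls through to the -32601 error.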
Element Reference System
The snapshot method assigns short references (e1, e2, …) to every DOM element in the page. A ref stays valid only until the next snapshot:
- Refs are stored in a Map inside the JS bridge
- Refs are reset on each new snapshot; they are not persistent across calls
- Always take a fresh snapshot before issuing actions to avoid stale refs
- In CLI commands, refs use the @ prefix: @e1, @e2, etc.
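
The reset-on-snapshot behaviour can be modelled in a few lines. The real ref map lives in the JS bridge; the sketch below is a Rust stand-in for illustration only. RefTable, its methods, and the string used to describe an element are all hypothetical, and stripping the @ prefix before lookup is an assumption about how a CLI ref maps to a bridge ref.

```rust
use std::collections::HashMap;

/// Rust model of the bridge's ref map (the real one is a JS Map).
pub struct RefTable {
    // ref id ("e1", "e2", ...) -> element description (hypothetical payload)
    refs: HashMap<String, String>,
}

impl RefTable {
    pub fn new() -> Self {
        RefTable { refs: HashMap::new() }
    }

    /// Take a snapshot: drop every previous ref, then number the
    /// elements e1, e2, ... in the order given.
    pub fn snapshot(&mut self, elements: &[&str]) {
        self.refs.clear();
        for (i, element) in elements.iter().enumerate() {
            self.refs.insert(format!("e{}", i + 1), element.to_string());
        }
    }

    /// Resolve a CLI-style ref such as "@e3"; returns None when the
    /// prefix is missing or the ref is stale.
    pub fn resolve(&self, cli_ref: &str) -> Option<&str> {
        let id = cli_ref.strip_prefix('@')?;
        self.refs.get(id).map(String::as_str)
    }
}
```

After a second snapshot of a smaller page, a ref such as @e2 from the first snapshot no longer resolves, which is why a fresh snapshot should precede each batch of actions.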
    # Typical workflow
    tauri-pilot snapshot          # assigns e1, e2, ... to current DOM
    tauri-pilot click @e3         # clicks the element with ref e3
    tauri-pilot fill @e5 "hello"  # fills the input at e5

Multi-Window Support
The plugin supports Tauri apps with multiple windows. When a JSON-RPC request includes a "window" parameter, the plugin resolves the target window by label via AppHandle::get_webview_window(label). If the label is present but invalid or doesn’t match any open window, the request returns a “Window ‘{label}’ not found” error; it does not fall back to the default. Without the parameter, the plugin targets "main" if that window exists, and otherwise the first available window.
The windows.list method enumerates all open windows (label, URL, title), sorted by label for deterministic output. From the CLI, use --window <label> to target a specific window, or tauri-pilot windows to list them.
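
The resolution rules above can be sketched as a pure function over window labels. This is an illustration under assumptions, not the plugin's code: resolve_window is a hypothetical name, the open windows are passed in as a slice instead of being read from the AppHandle, the error text uses straight quotes, and the message for the case with no windows at all is invented for the sketch.

```rust
/// Pick the target window label.
/// - An explicit label must match an open window, otherwise it is an error.
/// - Without a label, prefer "main", then the first available window.
pub fn resolve_window<'a>(
    requested: Option<&str>,
    open: &[&'a str],
) -> Result<&'a str, String> {
    match requested {
        Some(label) => open
            .iter()
            .copied()
            .find(|candidate| *candidate == label)
            .ok_or_else(|| format!("Window '{}' not found", label)),
        None => open
            .iter()
            .copied()
            .find(|candidate| *candidate == "main")
            .or_else(|| open.first().copied())
            // Hypothetical message: the source does not describe this case.
            .ok_or_else(|| "no windows open".to_string()),
    }
}
```

Keeping the explicit-label branch strict means a typo in --window surfaces as an error rather than silently acting on the main window.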
Eval + Callback Pattern (ADR-001)
webview.eval() in Tauri v2 is fire-and-forget: it dispatches JS into the WebView but provides no return value. All methods that read from the page require a response, so a callback pattern is used:
- The plugin wraps the target JS in a try/catch block
- On completion, the JS invokes window.__TAURI_INTERNALS__.invoke('plugin:pilot|__callback', {id, result})
- The plugin’s IPC handler for __callback looks up the matching oneshot::Sender and resolves it
- Rust awaits the oneshot channel with a 10-second timeout
The EvalEngine maintains:
- A HashMap<u64, oneshot::Sender<Result<Value, String>>> for in-flight requests
- An AtomicU64 counter for request IDs
The eval function dynamically resolves the target window on each call: it captures the AppHandle and looks up the window by label at eval time, rather than binding to a single window at startup. This allows targeting different windows across requests.
Combined with the oneshot channel, the callback pattern turns every eval into an awaitable call that yields a typed Result<Value, String> on the Rust side.
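
The bookkeeping behind the callback pattern can be approximated with the standard library alone. The sketch below is not the plugin's EvalEngine: it substitutes std::sync::mpsc for the oneshot channel and a blocking recv_timeout for the async await, uses String in place of a JSON Value, and the method names register, resolve, and wait are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::Mutex;
use std::time::Duration;

/// Result of one eval; String stands in for a JSON value in this sketch.
type EvalResult = Result<String, String>;

pub struct EvalEngine {
    pending: Mutex<HashMap<u64, Sender<EvalResult>>>, // in-flight requests
    next_id: AtomicU64,                               // request id counter
}

impl EvalEngine {
    pub fn new() -> Self {
        EvalEngine {
            pending: Mutex::new(HashMap::new()),
            next_id: AtomicU64::new(1),
        }
    }

    /// Register an in-flight eval. The id would be embedded in the injected
    /// JS so that the callback can report back under the same id.
    pub fn register(&self) -> (u64, Receiver<EvalResult>) {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        let (tx, rx) = mpsc::channel();
        self.pending.lock().unwrap().insert(id, tx);
        (id, rx)
    }

    /// Called from the callback handler. Returns false when the id is
    /// unknown, for example because the request already timed out.
    pub fn resolve(&self, id: u64, result: EvalResult) -> bool {
        match self.pending.lock().unwrap().remove(&id) {
            Some(tx) => tx.send(result).is_ok(),
            None => false,
        }
    }

    /// Block until the callback arrives or `timeout` elapses
    /// (the page above mentions a 10-second limit).
    pub fn wait(&self, id: u64, rx: Receiver<EvalResult>, timeout: Duration) -> EvalResult {
        match rx.recv_timeout(timeout) {
            Ok(result) => result,
            Err(_) => {
                // Forget the request so a late callback is ignored.
                self.pending.lock().unwrap().remove(&id);
                Err(format!("eval {} timed out", id))
            }
        }
    }
}
```

Removing the entry on timeout keeps the map from growing when a page never calls back, and makes a late callback a harmless no-op.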
JS Bridge Structure
The JS bridge is compiled into the plugin binary via include_str!("../js/bridge.js") and injected into every WebView at boot through js_init_script(). It is available before any frontend framework code runs.
Key internals:
- Snapshot: Uses a manual recursive traversal over node.children to walk the DOM. A ROLE_MAP maps implicit HTML element roles (e.g. <button> → "button", <a> → "link") for elements without an explicit ARIA role.
- Actions: Dispatch realistic DOM event sequences (focus → mousedown → mouseup → click), ensuring compatibility with React, Vue, and other frameworks that rely on synthetic events.
- fill: Uses the native HTMLInputElement value setter via Object.getOwnPropertyDescriptor(HTMLInputElement.prototype, 'value').set to trigger React’s synthetic change events correctly.
- Console capture: Monkey-patches console.log/warn/error/info, stores entries in a 500-entry ring buffer with id, timestamp, level, args, and source. Exposed via consoleLogs(options) and clearLogs().
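
The bounded console buffer can be illustrated with a small ring buffer. The bridge implements this in JavaScript; the version below is a Rust model for illustration only. LogBuffer and LogEntry are hypothetical names, the entry carries only a subset of the listed fields, and ids are assumed to start at 1.

```rust
use std::collections::VecDeque;

/// One captured console entry (subset of the fields listed above).
pub struct LogEntry {
    pub id: u64,
    pub level: String,
    pub message: String,
}

/// Fixed-capacity buffer that evicts the oldest entry when full.
pub struct LogBuffer {
    entries: VecDeque<LogEntry>,
    capacity: usize,
    next_id: u64,
}

impl LogBuffer {
    pub fn new(capacity: usize) -> Self {
        LogBuffer {
            entries: VecDeque::with_capacity(capacity),
            capacity,
            next_id: 1,
        }
    }

    /// Append an entry, dropping the oldest one once capacity is reached.
    pub fn push(&mut self, level: &str, message: &str) {
        if self.capacity == 0 {
            return; // nothing can be stored
        }
        if self.entries.len() == self.capacity {
            self.entries.pop_front();
        }
        self.entries.push_back(LogEntry {
            id: self.next_id,
            level: level.to_string(),
            message: message.to_string(),
        });
        self.next_id += 1;
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    /// Id of the oldest entry still held, if any.
    pub fn oldest_id(&self) -> Option<u64> {
        self.entries.front().map(|entry| entry.id)
    }

    pub fn clear(&mut self) {
        self.entries.clear();
    }
}
```

Since ids keep increasing while old entries are evicted, a consumer can tell from a gap in ids that some logs were dropped.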
Project Structure
    tauri-pilot/
    ├── Cargo.toml                  # workspace
    ├── crates/
    │   ├── tauri-plugin-pilot/
    │   │   ├── src/
    │   │   │   ├── lib.rs          # Plugin init, js_init_script, setup
    │   │   │   ├── server.rs       # Unix socket server, accept loop
    │   │   │   ├── protocol.rs     # Request, Response, RpcError
    │   │   │   ├── handler.rs      # Dispatch method → handler
    │   │   │   ├── eval.rs         # EvalEngine (callback pattern)
    │   │   │   └── error.rs        # thiserror types
    │   │   └── js/
    │   │       └── bridge.js       # JS bridge (included via include_str!)
    │   └── tauri-pilot-cli/
    │       └── src/
    │           ├── main.rs         # Entry point, tokio::main
    │           ├── cli.rs          # Clap definitions
    │           ├── client.rs       # Unix socket client
    │           ├── protocol.rs     # Request, Response
    │           ├── output.rs       # Formatters text/JSON
    │           └── error.rs        # anyhow wrappers