Reference · MCP toolbelt

36 typed browser tools.

Every tool exposed by the VulpineOS Model Context Protocol bridge. Any MCP-speaking client — Claude, GPT, Gemini, Ollama, OpenRouter, the Foxbridge CDP shim — can call any of these against a hardened Vulpine browser session.

36 tools · 5 categories · MCP-native

Core · 12

The day-one toolbelt — navigation, snapshots, click / type, contexts, and reference-based interaction.

vulpine_navigate
Navigate the active context to a URL. Optionally waits for load + page-settled before returning.
vulpine_snapshot
Return the optimised DOM snapshot for the current page (semantic JSON, ~93.1% smaller than Chrome AX).
vulpine_click
Click by CSS selector. Auto-scrolls into view, action-locks during the press, verifies the click landed.
vulpine_type
Focus a selector and type into it. Triggers input + change events. Falls back to keyboard input where IME interaction is needed.
vulpine_screenshot
Capture a PNG screenshot of the current viewport or a clipped element. Returns base64 + dimensions.
vulpine_scroll
Scroll the page by a delta or to a target selector. Honours overflow containers, not just window scroll.
vulpine_new_context
Allocate a fresh browser context from the pool. Optional citizen id pins identity; otherwise a Nomad session is used.
vulpine_close_context
Release the active context back to the pool. Recorded in the lifecycle audit log.
vulpine_get_ax_tree
Return the full accessibility tree for screen-reader-style reasoning. Filters injected DOM via the Phase 1 filter.
vulpine_click_ref
Click by durable reference id from a prior snapshot — survives minor DOM churn between turns.
vulpine_type_ref
Type into a referenced element. Offers the same durability across DOM churn as vulpine_click_ref.
vulpine_hover_ref
Hover over a referenced element. Useful for menus, tooltips, and lazy-rendered surfaces.
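As a rough illustration of how a client might drive the core tools, the sketch below builds MCP `tools/call` requests (JSON-RPC 2.0) for a navigate, snapshot, click-by-reference sequence. The tool names come from the list above; the argument names (`url`, `ref`) and the reference value `e12` are assumptions made for illustration, not the documented schema.

```python
import itertools
import json

# Monotonic request ids, as JSON-RPC 2.0 expects one id per request.
_ids = itertools.count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request for one tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical argument names: check the bridge's tool schemas for the real ones.
requests = [
    tool_call("vulpine_navigate", {"url": "https://example.com"}),
    tool_call("vulpine_snapshot", {}),
    tool_call("vulpine_click_ref", {"ref": "e12"}),  # "e12" stands in for a ref taken from the snapshot
]

for request in requests:
    print(json.dumps(request))
```

In practice an MCP client library would handle the framing and transport; the point here is only the shape of the call and the order of operations.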

Agent reliability · 8

Primitives that turn flaky scripts into deterministic agent loops — wait, find, verify, page-settled, fill-form.

vulpine_wait
Wait for one of several conditions: an element becomes visible, text appears, the network goes idle, the DOM stabilises, or the URL contains a given string. Timeout and poll interval are configurable.
vulpine_find
Search interactive elements by visible text / aria-label / placeholder. Returns positions, durable refs, and disambiguation hints.
vulpine_verify
Assert state after an action: exists, visible, checked, value, text, url, title. Emits a typed pass/fail trace row.
vulpine_screenshot_diff
Take before / after screenshots and return a structural diff. Confirms an action actually changed the page.
vulpine_page_settled
Block until readyState=complete, no pending images, no in-flight network for N ms, and DOM mutation count below threshold.
vulpine_select_option
Choose from a <select> by value or visible text. Fires change so frameworks update model state.
vulpine_fill_form
Fill multiple fields at once. Triggers correct input/change events per field type. Reports per-field success.
vulpine_page_info
Return URL, scroll position, form / button / link counts, modal visibility, and current focus element.
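The reliability tools are meant to be composed into an act, settle, verify loop. The sketch below shows one way that loop could look. It assumes a caller-supplied `call(name, arguments)` function that sends a tool call and returns the result as a dict; the `passed` result field and the argument names are hypothetical, and a small fake bridge stands in for a real session so the example runs on its own.

```python
from typing import Callable, Tuple

# A transport-agnostic tool caller: (tool name, arguments) -> result dict.
ToolCaller = Callable[[str, dict], dict]


def act_and_verify(call: ToolCaller, action: Tuple[str, dict],
                   check: dict, attempts: int = 3) -> bool:
    """Run an action, wait for the page to settle, then verify; retry on failure."""
    name, arguments = action
    for _ in range(attempts):
        call(name, arguments)
        call("vulpine_page_settled", {})
        result = call("vulpine_verify", check)
        if result.get("passed"):  # "passed" is an assumed field name
            return True
    return False


def make_flaky_bridge(failures: int):
    """Fake bridge whose first `failures` verifications fail, for demonstration."""
    log = []
    state = {"verifies": 0}

    def call(name: str, arguments: dict) -> dict:
        log.append(name)
        if name == "vulpine_verify":
            state["verifies"] += 1
            return {"passed": state["verifies"] > failures}
        return {}

    return call, log


call, log = make_flaky_bridge(failures=1)
ok = act_and_verify(
    call,
    ("vulpine_click", {"selector": "#submit"}),   # hypothetical arguments
    {"check": "url", "expected": "/done"},        # hypothetical arguments
)
print(ok, log)
```

Blind retries suit idempotent actions; for something like a form submission, a real loop would probably verify first and repeat the action only when the page is known to be unchanged.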

Interaction · 3

Lower-level keyboard primitives plus form-error extraction.

vulpine_press_key
Press a key or chord (Ctrl, Shift, Alt, Meta + key). Honours focus context and IME state.
vulpine_clear_input
Select-all + delete on a focused input. Works on contenteditable surfaces too.
vulpine_get_form_errors
Collect HTML5 validation errors, CSS error states, and ARIA error messages from the active form.
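For vulpine_press_key, a client typically has to turn a human-readable chord such as "Ctrl+Shift+K" into structured arguments. The helper below is a minimal sketch of that step. The modifier names come from the description above, while the `key` / `modifiers` argument shape is an assumption rather than the tool's documented schema.

```python
# Modifier names as listed in the vulpine_press_key description.
MODIFIERS = ("Ctrl", "Shift", "Alt", "Meta")


def parse_chord(chord: str) -> dict:
    """Split a chord like 'Ctrl+Shift+K' into a key plus its modifiers.

    The returned shape ({"key": ..., "modifiers": [...]}) is hypothetical.
    """
    parts = [part.strip() for part in chord.split("+") if part.strip()]
    if not parts:
        raise ValueError("empty chord")
    *modifiers, key = parts
    unknown = [m for m in modifiers if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifier(s): {unknown}")
    return {"key": key, "modifiers": modifiers}


print(parse_chord("Ctrl+Shift+K"))
print(parse_chord("Enter"))
```

This simple split does not handle a chord whose final key is the plus sign itself; a fuller parser would need an escape for that case.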

Human realism · 3

Bot-detection-aware variants of click, scroll, and type. Use when sites probe for inhuman timing.

vulpine_human_click
Click with cursor trajectory (Bezier path), arrival overshoot, and timing jitter calibrated to real-user telemetry.
vulpine_human_scroll
Smooth scroll with easing curves, mid-scroll pauses, and direction reversals. Avoids the linear-scroll fingerprint.
vulpine_human_type
Per-character delay variance, occasional corrections, and shift-modifier timing that matches human typing patterns.
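To make the idea of a Bezier cursor trajectory with timing jitter concrete, here is a generic sketch of the technique. It is not the VulpineOS implementation: the control-point spread, step count, and delay range are arbitrary illustrative values, and arrival overshoot is left out.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def bezier_path(start: Point, end: Point, steps: int = 20,
                rng: Optional[random.Random] = None) -> List[Point]:
    """Sample a cubic Bezier curve from start to end with randomised control points."""
    rng = rng or random.Random()
    (x0, y0), (x3, y3) = start, end
    dx, dy = x3 - x0, y3 - y0
    # Push the two control points off the straight line, perpendicular to it.
    k1, k2 = rng.uniform(-0.2, 0.2), rng.uniform(-0.2, 0.2)
    x1, y1 = x0 + dx / 3 - dy * k1, y0 + dy / 3 + dx * k1
    x2, y2 = x0 + 2 * dx / 3 - dy * k2, y0 + 2 * dy / 3 + dx * k2
    path = []
    for i in range(steps + 1):
        t = i / steps
        u = 1 - t
        x = u**3 * x0 + 3 * u**2 * t * x1 + 3 * u * t**2 * x2 + t**3 * x3
        y = u**3 * y0 + 3 * u**2 * t * y1 + 3 * u * t**2 * y2 + t**3 * y3
        path.append((x, y))
    return path


def step_delays(count: int, rng: Optional[random.Random] = None,
                base_ms: float = 12.0, jitter_ms: float = 4.0) -> List[float]:
    """Per-step delays in milliseconds, uniformly jittered around base_ms."""
    rng = rng or random.Random()
    return [rng.uniform(base_ms - jitter_ms, base_ms + jitter_ms) for _ in range(count)]


rng = random.Random(7)  # seeded so the run is repeatable
path = bezier_path((10.0, 10.0), (300.0, 180.0), steps=20, rng=rng)
delays = step_delays(len(path) - 1, rng=rng)
print(len(path), path[0], path[-1])
```

The curve always begins and ends exactly on the requested points; only the route between them varies from call to call.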

Extensions · 10

Pluggable extras — annotated screenshots, credential autofill, audio capture, mobile bridge. Some require optional providers.

vulpine_annotated_screenshot
Screenshot plus labelled interactive elements with durable object ids. Pairs with click_label for vision-driven agents.
vulpine_click_label
Click by the @N label returned from annotated_screenshot. Vision models hallucinate targets less often with labels than with CSS selectors.
vulpine_get_credential
Look up a credential for the active citizen via the credential provider. Returns metadata only, never plaintext.
vulpine_autofill
Inject a stored credential into a targeted form field via Page.secureSetInputValue. Plaintext never crosses the JS boundary.
vulpine_start_audio_capture
Start an audio capture session on the active page. Requires the AudioCapturer extension.
vulpine_stop_audio_capture
Stop an active audio capture session and finalise the segment.
vulpine_read_audio_chunk
Drain captured audio bytes from an active session for downstream transcription.
vulpine_list_mobile_devices
Enumerate Android devices visible to the mobile bridge. Returns device id, model, and Android version.
vulpine_connect_mobile_device
Start a mobile bridge session against a device id. Returns a CDP endpoint usable by the same toolbelt.
vulpine_disconnect_mobile_device
Stop a mobile bridge session and release the device.
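The three audio tools form a start, drain, stop lifecycle. The sketch below shows one way a client might wrap it so the session is always stopped, even if reading fails. It assumes a caller-supplied `call(name, arguments)` function; the `session_id` and base64 `data` fields are hypothetical, and a fake bridge stands in for a real session.

```python
import base64
from typing import Callable, List

# A transport-agnostic tool caller: (tool name, arguments) -> result dict.
ToolCaller = Callable[[str, dict], dict]


def capture_audio(call: ToolCaller, max_reads: int = 100) -> bytes:
    """Start a capture, drain chunks until one comes back empty, then stop."""
    session = call("vulpine_start_audio_capture", {})
    session_id = session["session_id"]  # assumed field name
    audio = bytearray()
    try:
        for _ in range(max_reads):
            chunk = call("vulpine_read_audio_chunk", {"session_id": session_id})
            data = base64.b64decode(chunk.get("data", ""))  # assumed base64 payload
            if not data:
                break
            audio.extend(data)
    finally:
        # Always release the session, mirroring the stop tool's role.
        call("vulpine_stop_audio_capture", {"session_id": session_id})
    return bytes(audio)


def make_fake_bridge(chunks: List[bytes]):
    """Fake bridge that serves queued chunks, then empty reads."""
    pending = list(chunks)
    log = []

    def call(name: str, arguments: dict) -> dict:
        log.append(name)
        if name == "vulpine_start_audio_capture":
            return {"session_id": "s1"}
        if name == "vulpine_read_audio_chunk":
            raw = pending.pop(0) if pending else b""
            return {"data": base64.b64encode(raw).decode("ascii")}
        return {}

    return call, log


call, log = make_fake_bridge([b"abc", b"def"])
audio = capture_audio(call)
print(audio, log)
```

Stopping at the first empty chunk is a simplification; a live capture would more likely poll until a deadline or an explicit end-of-stream signal.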

Drive the toolbelt from any model.

Self-host the runtime, point your MCP client at the bundled bridge, and call any of these 36 tools against a fingerprint-coherent Vulpine browser session. Bring your own keys; no proxy is involved.
