MCP tool catalog

Once an agent is connected (see Overview), a tools/list call returns Autonomy’s tool catalog — on the order of 170 tools spanning screen reading, input, browser automation, speech, VoiceOver, consent, and workflow. That number is the whole reason this page exists: no agent should try to hold 170 tool schemas in context at once, and no person should have to read 170 rows to understand what Autonomy can do.

This page groups the catalog by domain — the same domain taxonomy Autonomy uses internally to scope tool access for specialist subagents — and gives a few representative real tools per group. It is not the exhaustive reference; for that, ask a connected agent to run a7y-cli tools list, or call tools/list directly against the daemon.
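As a rough illustration of the second route: MCP's tools/list is a JSON-RPC 2.0 method, so a client can build the request itself and bucket whatever comes back. The sketch below does that by leading name segment. Treat the prefix bucketing as an assumption made for illustration, since Autonomy's domain taxonomy may not map one-to-one onto name prefixes, and note that the sample result is invented rather than captured daemon output.

```python
import json
from collections import defaultdict


def build_tools_list_request(request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request for the MCP tools/list method."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "method": "tools/list"})


def group_by_prefix(tools: list) -> dict:
    """Bucket tool names by the text before the first underscore.

    Assumption: name prefixes approximate domains. Illustrative only.
    """
    groups = defaultdict(list)
    for tool in tools:
        name = tool["name"]
        groups[name.split("_", 1)[0]].append(name)
    return dict(groups)


# Invented sample shaped like a tools/list result (tool names taken from this page).
sample_result = {
    "tools": [
        {"name": "screen_read_tree"},
        {"name": "screen_capture"},
        {"name": "dom_extract"},
        {"name": "consent_request"},
    ]
}

grouped = group_by_prefix(sample_result["tools"])
print(build_tools_list_request())
print(grouped)
```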

Why domains, not one flat list

In live use, a one-shot Claude connection deferred every tool’s schema once the catalog crossed roughly 160 tools: the classic “tool overload” failure mode, where selection quality drops and context bloats. The fix wasn’t fewer tools, because some tasks genuinely need all of them somewhere in the system. The fix was scoping: a connecting agent (or a domain-specialist subagent) can request just one domain’s tools plus a small cross-cutting set, instead of the full surface.


 full catalog (~170 tools)
   │
   ├─ a7y-cli daemon proxy               → everything (default, back-compatible)
   └─ a7y-cli daemon proxy --domain X    → domain X's tools ∪ cross-cutting set
                                            (consent, action ledger, readiness)

The domain groups below are exactly the ones this filtering understands.
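The filtering rule in the diagram above is a set union, which a minimal sketch can make concrete. The catalog slice here is hypothetical: the tool names come from this page, but how they are grouped and exactly which tools make up the cross-cutting set are assumptions.

```python
from typing import Optional

# Hypothetical slice of the catalog, keyed by domain. The grouping and the
# cross-cutting membership are assumptions, not Autonomy's real configuration.
CATALOG = {
    "screen": {"screen_read_tree", "screen_find_element"},
    "speech": {"speak_text", "speech_status"},
    "consent": {"consent_request", "consent_resolve"},
    "ledger": {"action_ledger_record", "action_ledger_list"},
    "readiness": {"doctor_check_permissions"},
}
CROSS_CUTTING = CATALOG["consent"] | CATALOG["ledger"] | CATALOG["readiness"]


def scoped_tools(domain: Optional[str] = None) -> set:
    """Return the tool names visible to a connection.

    No domain means the full catalog (the default, back-compatible case);
    otherwise the domain's tools plus the cross-cutting set.
    """
    if domain is None:
        return set().union(*CATALOG.values())
    if domain not in CATALOG:
        raise KeyError(f"unknown domain: {domain}")
    return CATALOG[domain] | CROSS_CUTTING


print(sorted(scoped_tools("screen")))
```

A screen-scoped connection in this sketch sees the screen tools and the consent, ledger, and readiness tools, but none of the speech tools.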

Screen — reading and acting on the accessibility tree

The largest domain. Prefers the native accessibility tree over pixels: read what’s on screen, find elements semantically, act on them, and verify the result.

Tool	Purpose
`screen_read_tree`	Read the accessibility tree for the focused or a named window.
`screen_find_element`	Locate an element by role, label, or query instead of coordinates.
`screen_click_element_by_query`	Click an element resolved semantically, not by x/y.
`screen_type_into_element`	Type into a resolved element with built-in waits.
`screen_wait_for_element` / `screen_wait_for_idle`	Block until a target element or app state is ready before acting.
`screen_macro_focus_click_type_verify`	A bundled focus → click → type → verify sequence for common form flows.
`screen_capture`	Take a screenshot when semantic reading isn’t enough (a `capture_screen`-consent action — see Consent & safety).
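To show how these tools chain together, here is a sketch that builds the wait, click, type, then re-read sequence as MCP tools/call requests. The request envelope follows the MCP tools/call shape; the argument keys (`query`, `text`) are hypothetical placeholders, so check each tool's real input schema from tools/list before relying on them.

```python
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def fill_field(query: str, text: str) -> list:
    """Plan a semantic form fill: wait, click, type, then re-read to verify.

    The argument keys are assumptions; the real schemas may differ.
    """
    return [
        tool_call("screen_wait_for_element", {"query": query}),
        tool_call("screen_click_element_by_query", {"query": query}),
        tool_call("screen_type_into_element", {"query": query, "text": text}),
        tool_call("screen_read_tree", {}),
    ]


plan = fill_field("Email address", "user@example.com")
for request in plan:
    print(request["id"], request["params"]["name"])
```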

Browser and DOM — pages, tabs, and page content

Connects browser automation, tab state, and DOM reads to the same evidence model as the screen domain.

Tool	Purpose
`browser_open_url` / `browser_session_open_url`	Navigate to a URL in a managed tab.
`browser_session_list_tabs` / `browser_session_switch_to_tab`	Enumerate and switch between open tabs.
`dom_extract`	Pull structured content out of the current page.
`dom_select_dropdown` / `dom_scroll_to_text`	Semantic page interaction beyond raw clicks.
`browser_cookies`	Read cookie state for the current session.

Guided page exploration (copilot overlay)

A narrower lane for screen-reader-safe, step-at-a-time page exploration — the runtime behind README’s “Guided Live Page Exploration” workflow. Distinct from raw DOM tools because it’s built for narrated, bounded loops rather than one-shot extraction.

Tool	Purpose
`copilot_attach`	Attach to the active guided-overlay UI session.
`copilot_status`	Read current overlay connection status.
`copilot_highlight` / `copilot_cursor`	Visually mark the element being discussed.
`copilot_dom_read_text` / `copilot_dom_find`	Read or locate page content within the guided session.

Speech and voice

Assistive speech recognition (listening) and speech synthesis (speaking), kept separate from the VoiceOver-specific transport below.

Tool	Purpose
`speak_text`	Speak an utterance through the assistive speech output lane.
`speech_list_voices` / `speech_set_voice`	Enumerate and choose a synthesis voice.
`speech_listen_start` / `speech_listen_stop`	Start and stop assistive speech recognition.
`speech_status`	Combined recognition + synthesis runtime status.

VoiceOver transport

A dedicated, deterministic lane for VoiceOver-aware output and navigation, with delivery evidence and AX focus verification built in. This is the lane behind voiceover_transport_announce, which produces the screen-off status updates described in Bring your own agent.

Tool	Purpose
`voiceover_transport_announce`	Post a concise announcement, preferring direct VoiceOver output, with delivery evidence (not a claim the user heard it).
`voiceover_transport_key` / `voiceover_transport_scroll`	Native keyboard navigation with AX focus capture for verification.
`voiceover_transport_snapshot`	Read the focused app/element without claiming spoken output.
`voiceover_transport_session_policy`	Get or set session-level narration mode (e.g. frequent spoken updates for a screen-off user).

Accessibility and readiness

Checks what’s actually available before an agent commits to a plan — the “say what’s granted, missing, blocked, or unknown” principle from the README.

Tool	Purpose
`accessibility_state`	Read the seven-surface accessibility MVP state (surfaces include keyboard, voice, switch, speech, audio, and display).
`doctor_check_permissions`	Report macOS permission states (Accessibility, Screen Recording, …).
`assistive_mode_select`	Choose the assistive mode to operate under, with consent gates and rationale.
`surface_classify`	Classify a target surface (native AX, browser DOM, PDF, 2FA, payment, …) before choosing a tool path.
`driver_policy_check`	Check Autonomy policy before falling back to generic driver-style control.
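One way an agent might apply the granted / missing / blocked / unknown principle before committing to a plan is sketched below. The four states come from this page; the report shape and the "Microphone" entry are hypothetical, and the real doctor_check_permissions output may look different.

```python
from enum import Enum


class Grant(Enum):
    GRANTED = "granted"
    MISSING = "missing"
    BLOCKED = "blocked"
    UNKNOWN = "unknown"


def unmet(report: dict, required: list) -> dict:
    """Return required permissions that are not granted, with their state.

    A permission absent from the report counts as UNKNOWN rather than granted.
    """
    return {
        name: report.get(name, Grant.UNKNOWN)
        for name in required
        if report.get(name, Grant.UNKNOWN) is not Grant.GRANTED
    }


# Hypothetical report; not the real doctor_check_permissions output shape.
report = {"Accessibility": Grant.GRANTED, "Screen Recording": Grant.MISSING}
gaps = unmet(report, ["Accessibility", "Screen Recording", "Microphone"])
for name, state in gaps.items():
    print(f"{name}: {state.value}")
```

Defaulting to unknown instead of granted keeps the agent from planning around a permission it never actually checked.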

Consent and action ledger

The gate for anything in the elevated/critical safety band, and the durable, screen-reader-friendly record of what happened. Covered in full in Consent & safety.

Tool	Purpose
`consent_request`	Create a typed, task-scoped consent request before acting.
`consent_resolve`	Resolve a pending request as granted, declined, or cancelled.
`action_ledger_record` / `action_ledger_list`	Record and read back what the agent did, asked, and observed.
`barrier_report_record`	Record an accessibility barrier as first-class history instead of a generic failure.
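The request, resolve, act, record pattern these tools support can be sketched as a small gate. The three resolution values come from the table above; the ledger here is a plain list standing in for action_ledger_record, and the entry fields are hypothetical.

```python
from enum import Enum
from typing import Callable


class Resolution(Enum):
    GRANTED = "granted"
    DECLINED = "declined"
    CANCELLED = "cancelled"


def run_gated(description: str, resolution: Resolution,
              action: Callable[[], str], ledger: list) -> bool:
    """Run `action` only if consent was granted; log the outcome either way."""
    if resolution is not Resolution.GRANTED:
        # Nothing runs, but the skipped attempt is still recorded.
        ledger.append({"action": description,
                       "outcome": f"skipped ({resolution.value})"})
        return False
    ledger.append({"action": description, "outcome": action()})
    return True


ledger = []  # stand-in for the action ledger; entry fields are hypothetical
run_gated("capture_screen", Resolution.DECLINED, lambda: "captured", ledger)
run_gated("capture_screen", Resolution.GRANTED, lambda: "captured", ledger)
print(ledger)
```

Recording the declined attempt as well as the granted one mirrors the idea that the ledger tracks what the agent asked, not only what it did.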

Session and conversation

The tool family a bring-your-own agent uses to claim the in-app Agent Conversation panel, write to its transcript, and release it again. Full walkthrough in Bring your own agent.

Tool	Purpose
`agent_conversation_session_claim`	Claim the panel as an external MCP-connected agent.
`agent_conversation_append`	Append user-visible transcript content (`user`/`agent`/`status`/`failure`).
`agent_conversation_session_release`	Release the claim without closing the panel.
`agent_conversation_session_get`	Read current session state.
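Because a claim should always be paired with a release, a client might wrap the lifecycle in a context manager. This is a minimal sketch: `call_tool` is a hypothetical callable standing in for whatever your MCP client uses to invoke a tool, and the `kind` and `text` argument keys are assumptions rather than the tool's documented schema. Only the four transcript kinds come from the table above.

```python
from contextlib import contextmanager

ALLOWED_KINDS = {"user", "agent", "status", "failure"}


@contextmanager
def conversation_session(call_tool):
    """Claim the panel, yield an append helper, and always release the claim."""
    call_tool("agent_conversation_session_claim", {})

    def append(kind: str, text: str) -> None:
        if kind not in ALLOWED_KINDS:
            raise ValueError(f"unsupported transcript kind: {kind}")
        # Argument keys are assumptions; check the real schema via tools/list.
        call_tool("agent_conversation_append", {"kind": kind, "text": text})

    try:
        yield append
    finally:
        call_tool("agent_conversation_session_release", {})


calls = []


def record(name, arguments):
    """Hypothetical transport that just records which tools were called."""
    calls.append(name)


with conversation_session(record) as append:
    append("status", "Reading the form")
    append("agent", "Found three required fields.")
print(calls)
```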

Playbooks

Bounded, auditable workflow execution — a step further than one-off tool calls when a task has explicit state and needs a resumable audit trail.

Tool	Purpose
`playbook_list`	List available playbooks with metadata summaries.
`playbook_start` / `playbook_status`	Start a playbook and read back step state and its audit log.
`playbook_cancel`	Cancel a running playbook.
`playbook_export_audit`	Export a run’s audit trail as newline-delimited JSON.
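Newline-delimited JSON means one JSON object per line, so an exported audit trail can be read back one record at a time. The sketch below shows the parsing side only; the `step` and `status` fields in the sample are invented for illustration and are not the real audit schema.

```python
import json


def parse_audit_ndjson(text: str) -> list:
    """Parse newline-delimited JSON into a list of records, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Invented sample; the real audit records may carry different fields.
sample = '{"step": 1, "status": "ok"}\n\n{"step": 2, "status": "failed"}\n'

records = parse_audit_ndjson(sample)
failed_steps = [r["step"] for r in records if r["status"] != "ok"]
print(len(records), failed_steps)
```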

Routing, profiles, and subagents

Supports the ADR-013 pattern of dispatching a task to a domain-scoped specialist instead of one agent holding everything.

Tool	Purpose
`task_route`	Route a task goal to a portable Autonomy agent profile.
`agent_profile_get` / `agent_profile_list`	Read available assistive working-style profiles.
`subagent_task_claim` / `subagent_task_ack`	Claim and acknowledge queued specialist tasks.

Everything else

A handful of smaller, cross-cutting groups round out the catalog: clipboard_read/clipboard_write for clipboard access, host_read_status and host_lima_* for host/VM environment status, preferences_get/preferences_set and access_config_* for stored user preferences, run_checkpoint/run_resume/replay_export for durable workflow recovery, and support_packet_create for local, redacted support handoffs. Every domain-scoped subagent gets consent, the action ledger, and readiness checks in addition to its own domain — so any acting specialist can still gate risk and record what it did.

Pitfalls

Don’t assume one agent should hold the whole catalog. If you’re building a specialist rather than a general-purpose connection, scope it to one domain (a7y-cli daemon proxy --domain screen, for example) plus the cross-cutting set, the same way Autonomy’s own shipped subagents (autonomy:high-consent-guard, autonomy:voiceover-form-review, and others under plugins/autonomy/agents/) are scoped.
Tool counts drift. Treat “~170 tools” as approximate, not a contract — the surface grows as capabilities are added. a7y-cli tools list is the live source of truth, not this page.
A subagent’s tool allowlist alone doesn’t shrink its context. In Claude Code specifically, restricting which tools a subagent may call doesn’t stop every tool’s schema from landing in its context — only a domain-scoped MCP connection does that. See Bring your own agent for the detail.

Overview — how an agent gets connected in the first place
Bring your own agent — domain-scoped connections and the conversation tools
Consent & safety — the model behind the consent and ledger tools
What Autonomy gives agents — the product-level shape of these capabilities