Graham Digital Media · Case study · Desktop engineering

macOS · Electron · Chrome DevTools Protocol

Claude-Browser

A native macOS app that pairs a persistent Chromium webview with a scripting layer of 19 automation tools built on the Chrome DevTools Protocol. Logins persist, the browser is real, and entire workflows can be driven end to end: perceive a page, act on it by coordinates, and re-perceive after every change. It is built as a desktop application, so it runs as a native app rather than a web demo.

01

Perceive

Screenshot the page and overlay a numbered box on every interactive element, returning each with its click coordinates and role.

02

Act

Click or type by coordinate — works even on canvas-rendered and opaque-DOM UIs that selector-based tools can't touch.

03

Re-perceive

After every state change, perceive again. Coordinates are only trusted for the current state — no stale assumptions.
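
The three steps above can be sketched as a single helper. This is a minimal illustration, not the app's actual code: the `PerceivedElement` shape, the `BrowserDriver` interface, and the `FakeDriver` stand-in are all hypothetical names introduced here so the loop can run without the app.

```typescript
// Hypothetical shape: the source names the tools but not their return types.
interface PerceivedElement {
  index: number; // number drawn on the overlay box
  x: number; // click coordinates reported by perceive
  y: number;
  role: string;
  label: string;
}

// Hypothetical driver interface; a real one would wrap the app's tool calls.
interface BrowserDriver {
  perceive(): Promise<PerceivedElement[]>;
  clickAt(x: number, y: number): Promise<void>;
}

// Perceive, act on fresh coordinates, then perceive again.
async function clickByLabel(
  driver: BrowserDriver,
  label: string,
): Promise<PerceivedElement[]> {
  const before = await driver.perceive();
  const target = before.find((el) => el.label === label);
  if (!target) {
    throw new Error(`no element labelled "${label}" in the current state`);
  }
  await driver.clickAt(target.x, target.y);
  // Coordinates from `before` are stale now, so return a fresh perception.
  return driver.perceive();
}

// In-memory stand-in so the sketch runs without the app.
class FakeDriver implements BrowserDriver {
  clicks: Array<[number, number]> = [];
  private signedIn = false;

  async perceive(): Promise<PerceivedElement[]> {
    return this.signedIn
      ? [{ index: 1, x: 90, y: 40, role: "button", label: "Sign out" }]
      : [{ index: 3, x: 412, y: 288, role: "button", label: "Sign in" }];
  }

  async clickAt(x: number, y: number): Promise<void> {
    this.clicks.push([x, y]);
    if (x === 412 && y === 288) this.signedIn = true;
  }
}
```

The helper never reuses coordinates across a state change: every caller gets back a new element list and has to look its next target up again.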

claude-browser
$ claude-browser tool browser_goto --url=https://example.com
{ "ok": true, "text": "navigated to example.com" }
$ claude-browser tool browser_perceive
[3] (412,288) button "Sign in" (ref=...)
$ claude-browser tool browser_click_at --x=412 --y=288
{ "ok": true, "text": "clicked at (412,288)" }
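
A script driving the CLI would need to turn `browser_perceive` output into coordinates. The sketch below parses one line of that output. The line format is inferred from the single sample above and may not cover every case; `ParsedElement` and `parsePerceiveLine` are hypothetical names, and the `ref` value is ignored because it is elided in the sample.

```typescript
// Hypothetical record for one perceived element.
interface ParsedElement {
  index: number;
  x: number;
  y: number;
  role: string;
  label: string;
}

// Matches lines shaped like: [3] (412,288) button "Sign in" (ref=...)
// Assumption: index, coordinates, role, and quoted label always appear in this order.
const PERCEIVE_LINE = /^\[(\d+)\]\s+\((\d+),\s*(\d+)\)\s+(\S+)\s+"([^"]*)"/;

function parsePerceiveLine(line: string): ParsedElement | null {
  const m = PERCEIVE_LINE.exec(line.trim());
  if (!m) return null; // not an element line
  const [, index = "", x = "", y = "", role = "", label = ""] = m;
  return {
    index: Number(index),
    x: Number(x),
    y: Number(y),
    role,
    label,
  };
}
```

Applied to the sample line, this yields index 3, coordinates (412, 288), role `button`, and label `Sign in`, which are the values passed to `browser_click_at` in the session above.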

The 19-tool surface

Navigation & state

  • goto
  • back
  • forward
  • reload
  • state
  • wait

Perception & reading

  • perceive
  • screenshot
  • read
  • extract

Interaction · coordinate-first

  • click_at
  • type_at
  • scroll
  • key
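
The page does not say how `click_at` is implemented. One plausible approach over the Chrome DevTools Protocol is to send a move, press, and release through `Input.dispatchMouseEvent`. The sketch below only builds that command sequence; `CdpCommand` and `clickCommands` are hypothetical names, and the three-event sequence is an assumption.

```typescript
// One CDP call: a method name plus its parameters.
interface CdpCommand {
  method: "Input.dispatchMouseEvent";
  params: {
    type: "mouseMoved" | "mousePressed" | "mouseReleased";
    x: number;
    y: number;
    button: "none" | "left";
    clickCount: number;
  };
}

// Build the events for a single left click at viewport coordinates (x, y).
function clickCommands(x: number, y: number): CdpCommand[] {
  const event = (
    type: CdpCommand["params"]["type"],
    button: CdpCommand["params"]["button"],
    clickCount: number,
  ): CdpCommand => ({
    method: "Input.dispatchMouseEvent",
    params: { type, x, y, button, clickCount },
  });
  return [
    event("mouseMoved", "none", 0), // hover first so the page sees pointer movement
    event("mousePressed", "left", 1),
    event("mouseReleased", "left", 1),
  ];
}
```

In an Electron app, each command could be sent with `webContents.debugger.sendCommand(method, params)` after `webContents.debugger.attach()`. Because the events target viewport coordinates rather than DOM nodes, the same sequence applies to canvas-rendered UIs.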

Interaction · DOM fallback

  • find
  • click
  • type
  • select
  • evaluate
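
The four groups above can be written down as a small registry, which makes the count easy to check. The `browser_` prefix on CLI names is inferred from the three commands in the terminal session (`browser_goto`, `browser_perceive`, `browser_click_at`); that every tool follows it is an assumption. `ToolGroup`, `allTools`, and `cliName` are hypothetical names.

```typescript
interface ToolGroup {
  title: string;
  tools: string[];
}

// Groups and tool names copied from the list above.
const TOOL_GROUPS: ToolGroup[] = [
  { title: "Navigation & state", tools: ["goto", "back", "forward", "reload", "state", "wait"] },
  { title: "Perception & reading", tools: ["perceive", "screenshot", "read", "extract"] },
  { title: "Interaction · coordinate-first", tools: ["click_at", "type_at", "scroll", "key"] },
  { title: "Interaction · DOM fallback", tools: ["find", "click", "type", "select", "evaluate"] },
];

// Flatten the groups into one list of tool names.
function allTools(): string[] {
  const names: string[] = [];
  for (const group of TOOL_GROUPS) {
    names.push(...group.tools);
  }
  return names;
}

// Assumed mapping from a short tool name to its CLI name.
function cliName(tool: string): string {
  if (allTools().indexOf(tool) === -1) {
    throw new Error(`unknown tool: ${tool}`);
  }
  return `browser_${tool}`;
}
```

The groups hold 6, 4, 4, and 5 tools, so `allTools()` returns 19 names, matching the heading.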

Stack

Electron · Chromium · Chrome DevTools Protocol · Node.js · macOS

Overview / case study · the tool runs as a native macOS app · Built by Graham Digital Media