OAuth device-code flows with AI agents

CLI tools that use OAuth device-code login (gh auth login, npm publish, MCP Registry publisher, AWS CLI, Stripe CLI) all hit the same wall: paste the URL, type the code, click Authorize. The agent does the first two automatically; you click once.

Why this is the right human-in-the-loop split: Authorizing an OAuth grant is a real security decision — the human should be the one clicking "Authorize". GitHub specifically disables synthetic clicks on the Authorize button (anti-bot). Chromeflow's job is everything around that one click: open the page, walk through the account selector, fill the 8-character code into 8 separate inputs, highlight the Authorize button so you know what to click. The agent never sees your password or your OAuth token.

The pattern, end to end

1. CLI emits the device code

$ mcp-publisher login github
Logging in with github...

To authenticate, please:
1. Go to: https://github.com/login/device
2. Enter code: 66AA-30CA
3. Authorize this application
Waiting for authorization...

The agent reads the code and URL from the terminal output (via capture_terminal).

2. Agent opens the device-flow URL in a new tab

open_page("https://github.com/login/device", new_tab=true)

3. Agent confirms the account picker

// Most providers show "Continue as YourUsername" before the code input
click_element("Continue", expect_submit=true)

expect_submit=true verifies the navigation actually happened — protects against silent-submit failures.

4. Agent fills the code into the input boxes

GitHub splits the 8-char code across 8 separate inputs with a visible dash. The agent fills each box using the React-aware value setter (synthetic value= assignment doesn't fire React's onChange — Chromeflow's pattern walks the prototype to get the native setter).

const code = '66AA30CA'.split('');
const indices = [0,1,2,3,5,6,7,8];
for (let i = 0; i < indices.length; i++) {
  fill_input(
    selector: `#user-code-${indices[i]}`,
    value: code[i]
  );
}

5. ⚠ Human handoff — the Authorize click

GitHub disables the Authorize button until a real user gesture is detected. document.body.dispatchEvent(new MouseEvent('mousemove')) doesn't work — they specifically filter trusted-vs-untrusted events. So the agent calls:

highlight_region(
  selector: "button[name=authorize]",
  message: "Click Authorize — GitHub blocks synthetic clicks here."
)
wait_for_click()

You click once. The agent detects the click target and the post-click navigation, then continues.

6. CLI's polling detects authorization

$ mcp-publisher login github
...
Successfully authenticated!
✓ Successfully logged in

The agent reads the success from the terminal and continues with the next step (e.g. mcp-publisher publish).

Where this comes up

gh auth login — GitHub CLI authentication. Same flow exactly.
npm login --auth-type=web — npm's web-based auth (OAuth device flow under the hood).
mcp-publisher login github — the MCP Registry publisher (we just did this one in publishing chromeflow v0.9.5).
aws sso login — AWS CLI single sign-on via device code.
stripe login — Stripe CLI authentication.
doctl auth init — DigitalOcean CLI device flow.
Most modern CI/cloud CLIs have adopted device-code login because it's secure and works headlessly.

What you save

Without Chromeflow, each device-code flow is: read the code, switch to browser, paste the URL, click through the account picker, type the 8 characters into 8 inputs, click Authorize, switch back to terminal, watch for success. Maybe a minute per flow, repeated dozens of times across a typical SaaS-publishing cycle.

With Chromeflow: the agent does everything except the one Authorize click. ~5–10 seconds of your attention per auth. Across 20 OAuth dances per week, that's 20 minutes back.

Why isn't this a single "`oauth_device_flow()`" helper tool?

It was proposed (May 13 retro). It was rejected. OAuth device flows have too many variants — GitHub's CSRF + account selector + 8-input code, Google's single-input code + consent screen, AWS's region + role chooser, Stripe's separate confirmation page. Wrapping all of these in one tool either makes it too rigid or too generic. The agent composing from the existing primitives (open_page, click_element, fill_input, highlight_region, wait_for_click) keeps the tool surface lean while handling every variant cleanly.

← Back to all use cases