
[poc] feat(webkit): classic W3C WebDriver backend#41150

Draft
yury-s wants to merge 8 commits into
microsoft:mainfrom
yury-s:webdriver-poc

Conversation

@yury-s
Member

@yury-s yury-s commented Jun 4, 2026

Summary

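  • Proof-of-concept WebDriver (wd*) backend that drives WebKit (Safari, through safaridriver) over the classic W3C WebDriver protocol, alongside the existing webview backend. Reached via webkit.connectOverCDP('webdriver://host:port') for an existing endpoint, or via the webdriver+launch:// sentinel, which launches safaridriver locally.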
  • Synthesizes the frame/context/lifecycle/console events the core expects (WebDriver pushes none), emulates live handles via a page-side registry (window.__pwHandles), and serializes all commands onto a single chain (safaridriver processes one command per session at a time).
  • Native W3C Actions input; reuses one session per worker.
  • Adds tests/webdriver/ reusing the tests/page/ suite (PWPAGE_IMPL=webkit-webdriver, npm run wdtest).

Draft / CI note

  • Draft to exercise the new WebDriver workflow on a macos-15-xlarge bot (runs page-check / page-goto / page-click only).
  • Other PR workflows are temporarily guarded to skip on this PR's branch to keep the draft focused — to be reverted before any merge.

yury-s added 8 commits June 4, 2026 12:10
Adds a wd* backend (next to webview/) that drives Safari over the classic
W3C WebDriver protocol exposed by safaridriver — both by launching
safaridriver locally (webdriver+launch://) and by connecting to an existing
endpoint (webdriver://), reached via webkit.connectOverCDP().

WebDriver pushes no events, so the backend synthesizes the frame, execution
context and lifecycle events the core Page/Frame machinery expects. Since
WebDriver has no persistent object handles, live handles are emulated with a
page-side registry (window.__pwHandles), which lets the InjectedScript
hit-target interceptor, element handles and waitForFunction-style polling work
across the handle-less execute boundary.
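
The registry idea can be sketched roughly as below. This is an illustration only, not the code in this PR: `HandleRegistry`, `register`, `resolve`, `release` and `fakeWindow` are hypothetical names, and only `window.__pwHandles` comes from the commit message. The point is that an execute call returns a serializable id instead of a live object, and later calls pass that id back to reach the same object.

```typescript
// Hypothetical sketch of a page-side handle registry; all names are illustrative.
class HandleRegistry {
  private nextId = 1;
  private readonly objects = new Map<number, unknown>();

  // Store a live object and return a JSON-serializable id for it.
  register(value: unknown): number {
    const id = this.nextId++;
    this.objects.set(id, value);
    return id;
  }

  // Look up the object behind an id; throws if it is unknown or was released.
  resolve(id: number): unknown {
    if (!this.objects.has(id))
      throw new Error(`Unknown or disposed handle: ${id}`);
    return this.objects.get(id);
  }

  // Drop the reference so the page can garbage-collect the object.
  release(id: number): void {
    this.objects.delete(id);
  }
}

// In a page this would be installed once per document (for example as
// window.__pwHandles); a plain object stands in for `window` here.
const fakeWindow: { __pwHandles?: HandleRegistry } = {};
const registry = fakeWindow.__pwHandles ?? (fakeWindow.__pwHandles = new HandleRegistry());

const elementLike = { tagName: 'BUTTON' };
const handleId = registry.register(elementLike);
const sameObject = registry.resolve(handleId) === elementLike;
```

Only the numeric id crosses the execute boundary, so the driver side can keep a lightweight proxy per id and release it when the handle is disposed.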

Verified end-to-end against real Safari: goto, evaluate, screenshot,
fill/type/click, and locator queries.

Also adds tests/webdriver/ which reuses the existing tests/page/ suite via
PWPAGE_IMPL=webkit-webdriver (mirroring tests/webview/). Run with `npm run wdtest`.

Makes tests/page/page-check.spec.ts pass (and unblocks setContent/console-based
tests generally).

- Capture console messages + document readyState as a side-channel of every
  execute, then synthesize console and lifecycle events. This drives
  Frame.setContent (which logs a console.debug tag to clear lifecycle) and
  resolves waitForLoadState over a protocol that pushes no events.
- Dispatch input through the same injected page-side dispatcher the webview
  backend uses (window.__pwWebViewInput) instead of the W3C Actions API. The
  synthetic events are handled deterministically by Playwright's hit-target
  interceptor at the exact target, eliminating the intermittent click flake
  (only mousedown delivered) that trusted OS-level Actions events caused.
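
The side-channel from the first bullet might look roughly like this. It is a sketch under assumptions, not the PR's implementation: `ExecuteEnvelope`, `SideChannelDecoder` and the event shapes are hypothetical, and mapping `readyState` values straight to `domcontentloaded`/`load` is a simplification that ignores navigations resetting the document.

```typescript
// Hypothetical envelope returned by every page-side execute call.
interface ExecuteEnvelope<T> {
  result: T;
  console: { type: string; text: string }[]; // messages buffered since the last call
  readyState: 'loading' | 'interactive' | 'complete';
}

type SyntheticEvent =
  | { kind: 'console'; type: string; text: string }
  | { kind: 'lifecycle'; name: 'domcontentloaded' | 'load' };

// Driver-side decoder: unwraps the result and turns the piggybacked data into
// the events a push-based protocol would have delivered on its own.
class SideChannelDecoder {
  private lastReadyState: ExecuteEnvelope<unknown>['readyState'] = 'loading';
  readonly events: SyntheticEvent[] = [];

  unwrap<T>(envelope: ExecuteEnvelope<T>): T {
    for (const message of envelope.console)
      this.events.push({ kind: 'console', type: message.type, text: message.text });
    this.advanceLifecycle(envelope.readyState);
    return envelope.result;
  }

  // Emit each lifecycle event at most once as readyState moves forward.
  private advanceLifecycle(state: ExecuteEnvelope<unknown>['readyState']): void {
    const order = ['loading', 'interactive', 'complete'];
    const from = order.indexOf(this.lastReadyState);
    const to = order.indexOf(state);
    if (to <= from)
      return;
    if (from < 1)
      this.events.push({ kind: 'lifecycle', name: 'domcontentloaded' });
    if (to === 2)
      this.events.push({ kind: 'lifecycle', name: 'load' });
    this.lastReadyState = state;
  }
}

const decoder = new SideChannelDecoder();
const unwrapped = decoder.unwrap({
  result: 42,
  console: [{ type: 'debug', text: 'tag' }],
  readyState: 'complete',
});
```

Because the data rides on calls the backend already makes, no extra round trip is needed; the trade-off is that events are only observed when some execute happens.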
Switches the WebDriver backend back to the native W3C Actions API for input
(instead of the injected synthetic dispatcher) and fixes the real root cause of
the intermittent dropped clicks.

- Serialize all WebDriver commands onto a single chain. safaridriver processes
  one command per session at a time; Playwright issued some concurrently
  (races, lifecycle-triggered follow-ups), and overlapping /actions + /execute
  requests were reordered by the driver — landing pointerUp a cycle late and
  swallowing parts of a click. Serializing eliminates the desync.
- Cache UtilityScript/InjectedScript as per-document singletons in the page
  registry instead of rebuilding them (and re-installing their global event
  listeners) on every evaluate.
- Test fixture: reuse one session per worker and reset to about:blank between
  tests, and stop safaridriver gracefully (SIGTERM + await exit, never SIGKILL)
  so it deletes its session and unpairs Safari — avoiding the "Continue
  Session" dialog / "already paired" failures on the next run.
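
The command serialization from the first bullet can be sketched as a promise chain. This is a minimal illustration, assuming nothing about the actual WDSession code: `CommandQueue`, `run` and `fakeCommand` are hypothetical names, and timers stand in for HTTP requests to the driver.

```typescript
// Hypothetical sketch: run async commands strictly one at a time.
class CommandQueue {
  private tail: Promise<unknown> = Promise.resolve();

  // Enqueue a command; it starts only after every earlier command has settled.
  run<T>(command: () => Promise<T>): Promise<T> {
    const result = this.tail.then(() => command());
    // Swallow the rejection on the chain only, so one failed command does not
    // block the ones queued after it. Callers still see it via `result`.
    this.tail = result.catch(() => undefined);
    return result;
  }
}

const callLog: string[] = [];
const commandQueue = new CommandQueue();
const delay = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

// Stand-in for a request to the driver; `name` and `ms` are illustrative.
function fakeCommand(name: string, ms: number): () => Promise<string> {
  return async () => {
    callLog.push(`start ${name}`);
    await delay(ms);
    callLog.push(`end ${name}`);
    return name;
  };
}

// Both are issued back to back, yet the slower "actions" call finishes
// before "execute" is allowed to start.
const allDone = Promise.all([
  commandQueue.run(fakeCommand('actions', 20)),
  commandQueue.run(fakeCommand('execute', 1)),
]);
```

Chaining on a single tail promise keeps ordering identical to call order, which is the property the overlapping /actions and /execute requests were missing.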
…reation

- Trim verbose comments and drop dead WDSession methods.
- Reuse one WebDriver session per worker (resetting to about:blank between
  tests): creating a session re-prompts Safari's "remotely controlled" dialog,
  so opening it once per worker minimises the prompt versus a fresh session
  per test.
- Retry session creation while Safari still reports the previous session as
  paired, so a just-released session doesn't fail the next connect.
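
A retry loop of the kind described in the last bullet could look like the sketch below. Everything here is an assumption for illustration: `createSessionWithRetry`, `isStillPaired`, the option names and the 'already paired' error text used in the demo are hypothetical, and the real backend may detect the condition differently.

```typescript
// Hypothetical sketch: retry session creation while the previous session is
// still reported as paired; any other failure is surfaced immediately.
async function createSessionWithRetry<T>(
  createSession: () => Promise<T>,
  isStillPaired: (error: unknown) => boolean,
  options: { attempts: number; delayMs: number },
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.attempts; attempt++) {
    try {
      return await createSession();
    } catch (error) {
      if (!isStillPaired(error))
        throw error;
      lastError = error;
      // Wait before the next attempt, but not after the final one.
      if (attempt < options.attempts)
        await new Promise<void>(resolve => setTimeout(resolve, options.delayMs));
    }
  }
  throw lastError;
}

// Demo with a fake driver that stays "paired" for the first two attempts.
let calls = 0;
const fakeCreate = async (): Promise<string> => {
  calls++;
  if (calls < 3)
    throw new Error('already paired');
  return 'session-1';
};
const pairedCheck = (error: unknown) =>
  error instanceof Error && error.message.includes('already paired');

const sessionPromise = createSessionWithRetry(fakeCreate, pairedCheck, { attempts: 5, delayMs: 5 });
```

Bounding the attempts keeps a genuinely stuck pairing from hanging the worker forever.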
Runs the basic page-check/page-goto/page-click suites against Safari over the
WebDriver backend on a macos-15-xlarge runner (workflow_dispatch + PR on
webdriver paths).

Temporarily guards the other PR-triggered workflows with
`github.head_ref != 'webdriver-safari-poc'` so they don't fire on the draft PR
used to exercise the bot. Revert these TEMP guards before merging.