feat(stage-tamagotchi,computer-use-mcp): implement browser-native DOM action routing by 3361559784 · Pull Request #1648 · moeru-ai/airi

3361559784 · 2026-04-12T16:55:07Z

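This is step 2 of 3 on the Desktop track. Building on the observation layer from 1/3, it addresses execution precision: eligible chrome_dom targets are intercepted and, where possible, bypass OS-level clicks and go straight into the browser extension/CDP event stream.

Included in this PR:

Introduces browser-action-router, which provides an explicitly typed routing decision tree for click, checkCheckbox, and setInputValue.
Adds full handler-layer integration tests.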
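As a rough illustration of what such a routing decision tree might look like, here is a minimal sketch. The `Candidate` shape, the `decideRoute` function, the `reason` strings, and the list of non-text input types are assumptions made for this example; only the action names and the `chrome_dom` / `browser_dom` / `os_input` labels come from the PR discussion. The body of `isTextInputCandidate` is likewise a guess at the behaviour described in this thread, not the actual implementation.

```typescript
// Hypothetical shapes: only the action names and the labels
// 'chrome_dom', 'browser_dom', 'os_input' appear in the PR thread.
type BrowserAction = 'click' | 'checkCheckbox' | 'setInputValue'
type Intent = 'click' | 'type'

interface Candidate {
  source: 'chrome_dom' | 'other'
  tag?: string // lowercase tag name, e.g. 'input'
  inputType?: string // value of the type attribute, if any
  selector?: string // CSS selector the extension can resolve
}

type RouteDecision =
  | { route: 'browser_dom', action: BrowserAction, reason: string }
  | { route: 'os_input', reason: string }

// Input types that do not accept typed text (assumed, not exhaustive).
const NON_TEXT_INPUT_TYPES = new Set([
  'checkbox', 'radio', 'button', 'submit', 'reset', 'file', 'image', 'range', 'color', 'hidden',
])

// Only native <input>/<textarea> count as text inputs; role=textbox does not.
function isTextInputCandidate(c: Candidate): boolean {
  if (c.tag === 'textarea')
    return true
  return c.tag === 'input' && !NON_TEXT_INPUT_TYPES.has(c.inputType ?? 'text')
}

function decideRoute(c: Candidate, intent: Intent, bridgeConnected: boolean): RouteDecision {
  if (c.source !== 'chrome_dom')
    return { route: 'os_input', reason: 'not a chrome_dom target' }
  if (!bridgeConnected)
    return { route: 'os_input', reason: 'bridge unavailable' }
  if (!c.selector)
    return { route: 'os_input', reason: 'no selector for target' }

  if (intent === 'type') {
    return isTextInputCandidate(c)
      ? { route: 'browser_dom', action: 'setInputValue', reason: 'native text input' }
      : { route: 'os_input', reason: 'not a native text input' }
  }

  if (c.tag === 'input' && c.inputType === 'checkbox')
    return { route: 'browser_dom', action: 'checkCheckbox', reason: 'native checkbox' }
  return { route: 'browser_dom', action: 'click', reason: 'generic DOM click' }
}
```

Every branch returns a reason string so that a fallback to OS input can be explained to the agent rather than happening silently.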

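Out of scope for this PR:

Still does not touch Chrome browser lifecycle management (left to 3/3, the core step).
No special injection for native OS dialogs that are not DOM elements.

⚠️ High-risk items and follow-up debt (Known Debts):

Fallback can tear: if a bridge disconnect forces a fallback from browser_dom to os_input and the page happens to reflow at that moment, the landing coordinates may still be invalid.
No lock around concurrent requests: overlapping commands fired in quick succession from the frontend could overwhelm the bridge. Not addressed in this PR.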
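The concurrency debt above is explicitly left open. For reference, one common way to serialize overlapping commands is a small promise-based queue that lets only one task run at a time. The `SerialQueue` class below is a hypothetical sketch, not something this PR contains.

```typescript
// Hypothetical helper: runs async tasks strictly one at a time, in FIFO order.
class SerialQueue {
  private busy = false
  private readonly waiting: Array<() => void> = []

  // Number of tasks queued behind the one currently running.
  get pending(): number {
    return this.waiting.length
  }

  run<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = (): void => {
        this.busy = true
        // Wrapping in a Promise turns a synchronous throw from task() into a rejection,
        // so the queue is always released in finally().
        new Promise<T>(settle => settle(task()))
          .then(resolve, reject)
          .finally(() => {
            this.busy = false
            this.waiting.shift()?.()
          })
      }

      if (this.busy)
        this.waiting.push(start)
      else
        start()
    })
  }
}
```

A caller would wrap each bridge command, for example `queue.run(() => sendToBridge(cmd))`, where `sendToBridge` stands in for whatever function actually talks to the bridge. This trades throughput for ordering, which seems acceptable for click/type commands that must not interleave.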

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50d218e4d6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

gemini-code-assist

Code Review

This pull request implements a comprehensive desktop grounding layer for macOS automation, featuring a transparent visualization overlay, a read-only Chrome extension for DOM observation, and unified observation tools like desktop_observe and desktop_click_target. It also introduces a snap resolution engine and a descriptor-driven tool registration system. Feedback focuses on increasing the staleness threshold for click actions to better handle LLM latency, fixing a memory leak in the overlay polling timer, and improving the robustness of CSS selector generation and application detection consistency.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d9d02ec11b

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7c35c91fa1

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b0861357bd

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7d39ec590f

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d92d1732c2

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3a0d95e21c

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3047fc1aa6

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 09bdcb61b1

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1b5ca6a401

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5bb1b3f69b

Copilot

Pull request overview

This PR adds “desktop grounding” for the Desktop lane by introducing a unified observe→candidate→click flow, and routes eligible Chrome DOM targets into the browser extension/CDP path to improve click/type precision while maintaining OS-input fallback.

Changes:

Add new MCP actions (desktop_observe, desktop_click_target) with RunState support, strategy advisories, and transparency text.
Implement grounding pipeline (snapshot aggregation, candidate ranking/dedup, snap resolver) plus browser-dom routing for click/type.
Introduce tool descriptor registry + add Electron desktop overlay window (polls desktop_get_state) for visualizing snap/candidates.

Reviewed changes

Copilot reviewed 47 out of 50 changed files in this pull request and generated 8 comments.

File	Description
services/computer-use-mcp/src/types.ts	Adds new action kinds/inputs and enriches DOM element metadata (`role`).
services/computer-use-mcp/src/transparency.ts	Adds intent/outcome strings + run-state summary for grounding snapshot/pointer.
services/computer-use-mcp/src/strategy.ts	Adds grounding-related advisories (observe-first, stale, duplicate click).
services/computer-use-mcp/src/state.ts	Stores grounding snapshot, pointer intent, last clicked candidate + update helpers.
services/computer-use-mcp/src/snap-resolver.ts	Implements snapping logic and candidate-ID resolution helpers.
services/computer-use-mcp/src/snap-resolver.test.ts	Unit tests for snap/geometry helpers.
services/computer-use-mcp/src/server/tool-descriptors/vscode.ts	Adds VS Code tool descriptors.
services/computer-use-mcp/src/server/tool-descriptors/types.ts	Defines descriptor types/validation for registry.
services/computer-use-mcp/src/server/tool-descriptors/task-memory.ts	Adds task memory tool descriptors.
services/computer-use-mcp/src/server/tool-descriptors/registry.ts	Implements descriptor registry/query/validation.
services/computer-use-mcp/src/server/tool-descriptors/registry.test.ts	Adds registry validation/query/coverage tests.
services/computer-use-mcp/src/server/tool-descriptors/register-helper.ts	Adds descriptor-driven tool registration helpers.
services/computer-use-mcp/src/server/tool-descriptors/pty.ts	Adds PTY tool descriptors.
services/computer-use-mcp/src/server/tool-descriptors/index.ts	Exports registry + helpers/types.
services/computer-use-mcp/src/server/tool-descriptors/display.ts	Adds display tool descriptors.
services/computer-use-mcp/src/server/tool-descriptors/desktop.ts	Adds desktop/browser-dom/workflow/meta tool descriptors.
services/computer-use-mcp/src/server/tool-descriptors/coding.ts	Adds coding tool descriptors.
services/computer-use-mcp/src/server/tool-descriptors/cdp.ts	Adds CDP tool descriptors.
services/computer-use-mcp/src/server/tool-descriptors/all.ts	Aggregates descriptors and initializes global registry.
services/computer-use-mcp/src/server/tool-descriptors/accessibility.ts	Adds accessibility tool descriptors.
services/computer-use-mcp/src/server/register-desktop-grounding.ts	Registers `desktop_observe` / `desktop_click_target` and routes to browser-dom when possible.
services/computer-use-mcp/src/server/action-executor.ts	Routes `type_text` through browser-dom `setInputValue` when appropriate.
services/computer-use-mcp/src/server.ts	Wires desktop grounding tool registration into server startup.
services/computer-use-mcp/src/desktop-grounding.ts	Implements unified grounding capture + candidate merge/dedup + agent formatting.
services/computer-use-mcp/src/desktop-grounding.test.ts	Unit tests for candidate building and formatting.
services/computer-use-mcp/src/desktop-grounding-types.ts	Adds grounding snapshot/candidate/snap/pointer intent types.
services/computer-use-mcp/src/chrome-semantic-adapter.ts	Captures Chrome semantics via extension/CDP and maps to candidates/selectors.
services/computer-use-mcp/src/chrome-semantic-adapter.test.ts	Unit tests for Chrome semantic adapter + routing integration.
services/computer-use-mcp/src/browser-action-router.ts	Adds explicit routing decision tree for click/type via browser-dom vs OS input.
services/computer-use-mcp/src/browser-action-router.test.ts	Unit tests for routing decisions and consistency.
services/computer-use-mcp/chrome-extension/msg_bridge.js	Adds isolated-world relay between background and main-world content script.
services/computer-use-mcp/chrome-extension/manifest.json	New MV3 extension manifest for grounding bridge.
services/computer-use-mcp/chrome-extension/icon48.png	Extension icon asset.
services/computer-use-mcp/chrome-extension/icon16.png	Extension icon asset.
services/computer-use-mcp/chrome-extension/icon128.png	Extension icon asset.
services/computer-use-mcp/chrome-extension/content.js	Main-world DOM collector + limited DOM action helpers.
services/computer-use-mcp/chrome-extension/background.js	Background worker that relays commands and maintains WS connection.
services/computer-use-mcp/chrome-extension/README.md	Extension documentation.
packages/stage-ui/src/stores/mcp-tool-bridge.ts	Adds a minimal MCP tool bridge interface for UI consumers.
packages/stage-ui/package.json	Exports the new stage-ui store entrypoint.
apps/stage-tamagotchi/src/renderer/pages/desktop-overlay.vue	Adds overlay renderer to visualize pointer/candidate boxes.
apps/stage-tamagotchi/src/renderer/pages/desktop-overlay-polling.ts	Adds polling/extraction logic for overlay (pure module).
apps/stage-tamagotchi/src/renderer/pages/desktop-overlay-polling.test.ts	Unit tests for overlay polling/extraction.
apps/stage-tamagotchi/src/renderer/pages/desktop-overlay-coordinates.ts	Coordinate conversion helpers for overlay rendering.
apps/stage-tamagotchi/src/renderer/pages/desktop-overlay-coordinates.test.ts	Unit tests for overlay coordinate helpers.
apps/stage-tamagotchi/src/main/windows/desktop-overlay/rpc/index.electron.ts	Sets up Electron RPC services for overlay window.
apps/stage-tamagotchi/src/main/windows/desktop-overlay/index.ts	Creates the click-through always-on-top overlay window (env-gated).
apps/stage-tamagotchi/src/main/index.ts	Boots overlay window when `AIRI_DESKTOP_OVERLAY=1`.
apps/stage-tamagotchi/electron.vite.config.ts	Adds explicit alias for stage-ui `stores/mcp-tool-bridge`.

chatgpt-codex-connector

💡 Codex Review

airi/services/computer-use-mcp/src/server/register-tools.ts

Lines 74 to 75 in a3c915f

    if ('success' in record)
      return Boolean(record.success)

Unwrap frame payload before evaluating browser_dom success

This success predicate reads top-level frame.result.success, but extension replies are wrapped as { success: true, data: <tool result> } by content.js for any non-throwing call. That means frame-level failures like { data: { success: false, error: 'not found' } } are still counted as successful, so browser_dom tool summaries can claim N/N frame(s) succeeded when actions actually failed in every frame.
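Whether extension replies are actually wrapped is disputed later in this thread. A predicate that tolerates both the flat shape and the wrapped shape could look roughly like the sketch below. The function name `didFrameSucceed` and the treatment of results without a `success` field are assumptions for illustration, not the repository's `didBrowserDomFrameSucceed`.

```typescript
// Hypothetical predicate: accepts both { success } and { success, data: { success } }.
function didFrameSucceed(result: unknown): boolean {
  if (typeof result !== 'object' || result === null)
    return false

  const record = result as Record<string, unknown>
  // Assumption: a missing or falsy outer `success` counts as failure.
  if (!record['success'])
    return false

  // If a nested payload carries its own `success`, that is the real DOM outcome.
  const data = record['data']
  if (typeof data === 'object' && data !== null && 'success' in data)
    return Boolean((data as Record<string, unknown>)['success'])

  return true
}
```

With this shape, a transport-level `success: true` never masks an inner `success: false`, while flat replies behave exactly as before.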

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d83d0dedbc

3361559784 · 2026-04-24T14:32:27Z

Addressed the two new review comments in 66ac599:

P1 — role=textbox routing (browser-action-router.ts): Fixed. Removed the role === 'textbox' catch-all from isTextInputCandidate. Now only actual <input> and <textarea> elements are routed to setInputValue. Contenteditable divs fall through to OS typing via desktop_type_text.

P2 — simulateKeystrokes (content.js): Added TODO — implementing per-character KeyboardEvent dispatch is non-trivial (need proper key/code/keyCode mapping, composition events for CJK, etc.). The current direct value assignment + input/change events works for the majority of forms. Will track as a follow-up.

Also fixed 2 eslint innerText lint errors (intentional use — innerText returns layout-aware visible text, textContent doesn't).

chatgpt-codex-connector

💡 Codex Review

airi/services/computer-use-mcp/src/server/register-tools.ts

Lines 565 to 567 in 6a3dc6a

    content: [
      textContent(`Clicked selector "${selector}" in frame ${result.targetFrameId} at (${result.targetPoint.x}, ${result.targetPoint.y}).`),
    ],

Check clickResults before reporting browser_dom_click success

browser_dom_click always emits a successful Clicked selector... response after clickSelector resolves, but clickSelector can return clickResults where every frame reports { success: false } (for example after a reflow between target lookup and click dispatch). In that case this tool still reports status: "ok" even though no DOM click happened, which can make downstream automation proceed under a false success assumption.
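A guard of the kind this comment asks for might inspect the per-frame results before composing the response. The sketch below is illustrative only: `FrameClickResult`, `ClickToolResponse`, and `buildClickResponse` are hypothetical names, and the `click_miss` status mirrors the wording used in the author's later reply rather than a confirmed type in the codebase.

```typescript
// Hypothetical shapes for per-frame click results and the tool response.
interface FrameClickResult {
  frameId: number
  result?: { success?: boolean, error?: string }
}

interface ClickToolResponse {
  isError: boolean
  status: 'ok' | 'click_miss'
  text: string
}

function buildClickResponse(selector: string, clickResults: FrameClickResult[]): ClickToolResponse {
  // Success requires at least one frame that explicitly reported success.
  const hit = clickResults.find(fr => fr.result?.success === true)
  if (hit) {
    return {
      isError: false,
      status: 'ok',
      text: `Clicked selector "${selector}" in frame ${hit.frameId}.`,
    }
  }

  // Collect whatever error strings the frames returned, for diagnostics.
  const errors = clickResults
    .map(fr => fr.result?.error)
    .filter((e): e is string => typeof e === 'string')
  const detail = errors.length > 0 ? `: ${errors.join('; ')}` : '.'
  return {
    isError: true,
    status: 'click_miss',
    text: `No frame reported a successful click for "${selector}"${detail}`,
  }
}
```

An empty result list is treated as a miss, so a resolved promise with no frame data can never be reported as a click.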

airi/services/computer-use-mcp/src/server/register-tools.ts

Lines 74 to 75 in 6a3dc6a

    
           if ('success' in record) 
        
             return Boolean(record.success)

Unwrap nested browser_dom payloads before success counting

didBrowserDomFrameSucceed treats the outer frame.result.success as the operation outcome, but extension responses are wrapped and can look like { success: true, data: { success: false, error: ... } }. This causes frame-level failures to be counted as successes in summaries, so tools can report misleading X/Y frame(s) succeeded counts and hide real DOM-action failures.

3361559784 · 2026-04-24T14:52:20Z

Addressed latest review in efd1a13:

P1 — browser_dom_click clickResults (register-tools.ts): Fixed. Now validates per-frame clickResults before reporting success. Returns isError: true with status: 'click_miss' if no frame reported a successful DOM click. Same pattern as register-desktop-grounding.ts fix.

P2 — nested payload unwrapping (didBrowserDomFrameSucceed): Not taking this. The extension content.js returns flat { success: true/false, error?: string } — there is no { success: true, data: { success: false } } nesting in the actual protocol. The reviewer is assuming a wrapping pattern that doesn't exist in the codebase. didBrowserDomFrameSucceed checking record.success is correct.

…ting click success

When executeAction returns a denial or approval-required result the handler was proceeding unconditionally to the 'Clicked: success' response and updating pointer/clicked-candidate state, misleading the agent into assuming the target was clicked when it was only queued or blocked by policy.

Fix: capture the CallToolResult from both the OS-input and browser-dom fallback paths and return it early when structuredContent.status is 'approval_required' or 'denied', or when isError is set. Post-click state updates are now only reached when execution actually occurred.

…k unknown bridge actions as failed

P1 (content.js): getClickTarget returned click coordinates only under the nested `center` object. BrowserDomExtensionBridge.clickSelector reads top-level x/y from the unwrapped payload, so extension clicks always failed to resolve a target point and fell back to OS input. Expose x/y at both the top level and under center for compatibility.

P1 (background.js): unknown actions fell into the default branch and returned { ok: true, result: { error: '...' } }. The bridge treated any ok:true as success, so callers like setInputValue/checkCheckbox saw a resolved promise and never triggered their OS-input fallback paths, producing silent no-ops. Changed to return ok:false so the bridge correctly rejects and callers fall back.
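The second fix can be pictured as a command dispatcher whose default branch reports failure instead of success. This is a simplified sketch under assumed names (`dispatchCommand`, `BridgeReply`, a `Map` of handlers); the real background.js uses a switch statement and is not reproduced here.

```typescript
// Hypothetical reply envelope; `ok` is the transport-level outcome.
interface BridgeReply {
  ok: boolean
  result?: unknown
  error?: string
}

type Handler = (args: Record<string, unknown>) => unknown

function dispatchCommand(
  handlers: Map<string, Handler>,
  action: string,
  args: Record<string, unknown> = {},
): BridgeReply {
  const handler = handlers.get(action)
  // Unknown actions must be a failure so callers can trigger their fallback path.
  if (!handler)
    return { ok: false, error: `unknown action: ${action}` }

  try {
    return { ok: true, result: handler(args) }
  }
  catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) }
  }
}
```

Returning `ok: false` for anything unrecognized keeps a missing handler from looking like a completed action.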

…coords; add write actions to extension

P1 (action-executor.ts): type_text with explicit x/y coordinates was still routing through lastClickedCandidateId to setInputValue, so text could land in a stale Chrome selector when the user had moved to a different app or window. Gate the browser-dom type path out when explicit coordinates are present — coordinates imply the caller has already targeted a specific screen position that may not correspond to the last Chrome candidate.

P1 (background.js + content.js): the extension command switch only handled read-only observation actions. BrowserDomExtensionBridge issues write commands (setInputValue, checkCheckbox, selectOption) that all hit the default branch and returned ok:false, so browser-dom write tools never succeeded and always degraded to OS-input fallback. Add setInputValue, checkCheckbox, selectOption implementations to content.js (bridge-triggered, limited DOM mutations using native value setter + framework-compatible events). Wire the three actions into background.js switch to route them to the correct content.js methods via runCUAction.

rolldown v1.0.0-rc.15 does not reliably resolve exports-field glob patterns ("./stores/*") against new files introduced in the same PR, causing stage-tamagotchi build and typecheck to fail with UNLOADABLE_DEPENDENCY. Add an explicit entry for mcp-tool-bridge alongside existing precise entries to guarantee the module is resolvable regardless of glob support in the bundler.

… setter

P1 (content.js): setInputValue always preferred HTMLInputElement.prototype value setter via || chain. When the target is a <textarea>, calling the input setter throws 'Illegal invocation'. Now selects the prototype matching the actual element type (HTMLTextAreaElement vs HTMLInputElement).

P1 (action-executor.ts): setInputValue only throws on transport failure — per-frame DOM failures (selector not found, frame stale) resolve with { success: false } in the frame result array. The code unconditionally set usedBrowserDom = true, skipping OS typeText fallback on silent DOM failure. Now checks that at least one frame reported success before marking browser-dom used; otherwise re-throws to enter the catch fallback.

P1 (register-desktop-grounding.ts): checkCheckbox has the same frame-result gap. Bridge resolve with all frames returning success:false was treated as success, producing a 'Clicked: success' response without any DOM mutation. Now validates frame results and throws on total failure to trigger OS click fallback through the policy pipeline.

Dispatching a synthetic click after manually setting el.checked = target can cause React controlled-component onChange handlers to toggle the value back, since the browser's real event order is click→change. Only dispatch change, which is sufficient for Vue @change and React onChange bindings. Update the JSDoc to explain why no click event is emitted.

…n and ARIA checkbox guard

Build (stage-tamagotchi): @proj-airi/stage-ui alias points to a directory; rolldown concatenates sub-paths without adding a file extension, so flat .ts files at the stores/ root (mcp-tool-bridge.ts) are not resolved. Add an explicit alias entry for mcp-tool-bridge before the directory-level alias so rolldown finds the file directly.

P1 (content.js checkCheckbox): for custom ARIA checkboxes (div[role=checkbox]) writing el.checked only creates an expando property — no native state changes and nothing is visually toggled. The bridge still returned success:true, so desktop_click_target reported click success and skipped OS-input fallback. Add an instanceof HTMLInputElement guard: non-native elements now return success:false, which triggers the frame-result validation added earlier and lets the handler fall back to OS click through the policy pipeline.

mcp-tool-bridge.ts was deleted in main by a refactor (5bbf955) that moved LLM tool registration to llm-tools.ts. However, that refactor targeted a different concern (xsai tool registration), while desktop-overlay.vue uses getMcpToolBridge().callTool() to invoke computer-use MCP tools at runtime — a separate bridge interface not replaced by the refactor. Restore the file with the original interface plus expanded JSDoc so the desktop overlay pages can resolve and use it.

The overlay window never calls setMcpToolBridge(), so the polling controller failed immediately with 'MCP tool bridge is not available'. Since the computer-use MCP tools are already exposed to the main process via electronMcpCallTool (registered during setupDesktopOverlayElectronInvokes), we can bypass the bridge singleton entirely and invoke the eventa route directly.

The previous commit introduced an unresolved import error in the tamagotchi renderer build because the relative path to shared/eventa stepped one directory too high (../../../shared/eventa instead of ../../shared/eventa). Fixed the path in desktop-overlay.vue, desktop-overlay-polling.ts, and its test file.

…WebSocket relay

- background.js: read blur/simulateKeystrokes from cmd.opts (not top-level) to match BrowserDomExtensionBridge.setInputValue() wire format
- state.ts: split updatePointerIntent into updatePointerIntent (intent only) and recordClickedCandidate (candidate id only), so duplicate-click guard only fires after execution succeeds, not before
- register-desktop-grounding.ts: move recordClickedCandidate after action execution; if policy denies/requires approval, candidate is not marked
- background.js: replace dead ws-incoming listener with native WebSocket relay connecting to BrowserDomExtensionBridge on localhost:8765

…e extension Add readInputValue, getComputedStyles, triggerEvent, waitForElement, and clickAt to both background.js (command router) and content.js (DOM API). These actions are called by BrowserDomExtensionBridge but were missing from the extension switch block, causing them to return 'unknown action' and making the bridge methods fail silently. Also makes the content.js message handler async-aware so waitForElement (which uses MutationObserver and returns a Promise) is properly awaited.

… paths, docstring

- findChromeWindowBounds: prefer exact foregroundApp match over substring search to avoid selecting wrong Chrome window when multiple exist
- README: update to reflect actual DOM-mutation capabilities, document WebSocket relay, remove stale 'no DOM mutations' claim
- Remove local filesystem paths (/Users/liuziheng/...) from provenance comments in background.js, content.js, msg_bridge.js
- ToolDescriptor: fix docstring to acknowledge defaultDeferred is optional
- Add TODO for WebSocket auth handshake follow-up

…porting success

clickSelector can resolve even when clickAt misses (e.g. layout shift after getClickTarget). Previously the handler always reported success and locked lastClickedCandidateId, blocking retries via duplicate-click protection. Now inspects per-frame results from clickAt; if none report success, throws to trigger OS-input fallback — matching the existing checkCheckbox pattern.

… fix lint

- isTextInputCandidate: remove role=textbox catch-all that incorrectly routed contenteditable divs to setInputValue (which only works on native input/textarea). These targets now fall through to OS typing.
- content.js: suppress unicorn/prefer-dom-node-text-content on intentional innerText uses (we want layout-aware visible text, not raw DOM text).
- content.js: add TODO for simulateKeystrokes option (currently ignored).

…dler

Same class of bug as register-desktop-grounding.ts fix (d83d0de). clickSelector can resolve even when clickAt misses after a reflow. Now inspects per-frame results; reports isError + click_miss status if no frame reported success.

…andler

The message handler was wrapping method results as { success: true, data: <methodResult> } which created a double-envelope: methods like setInputValue already return { success: false, error: 'not found' } so consumers checking fr.result.success always saw true (transport layer) while the real DOM failure was buried in fr.result.data.success. Now returns method results directly.

This fixes all clickResults/frameResults success checks in register-desktop-grounding.ts, register-tools.ts, and action-executor.ts — they were checking the correct field (fr.result.success) but getting the wrong semantics due to the wrapper.

- Restore DESKTOP_CLICK_SNAPSHOT_MAX_AGE_MS and foregroundWindowTitle from main
- Align updatePointerIntent signature to accept candidateId (matches main)
- Fix test call sites for updated signature

The noop reference was introduced in moeru-ai#1647 without an import. Use an inline arrow function instead.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: da94fe735e

chatgpt-codex-connector · 2026-04-24T16:10:49Z

-              }
-            : snapshot.screenshot,
-        })
+        runtime.stateManager.updateGroundingSnapshot(snapshot)


Strip screenshot bytes before saving grounding snapshot

desktop_observe now writes the full snapshot into RunState, including screenshot.dataBase64. Because desktop_get_state returns the whole run state and the overlay poller requests it repeatedly, this turns every state response into a large image payload and can significantly increase IPC traffic and memory churn in normal use. Keep returning the image in the observe response, but redact screenshot bytes when persisting lastGroundingSnapshot (as the previous code path did).
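A redaction step along the lines suggested here could copy the snapshot and drop only the image bytes before persisting it. The interfaces below are assumed shapes for illustration (only `screenshot.dataBase64` is named in the comment), and `redactScreenshotBytes` is a hypothetical helper.

```typescript
// Assumed shapes: only `screenshot.dataBase64` is named in the review comment.
interface Screenshot {
  dataBase64?: string
  width: number
  height: number
}

interface GroundingSnapshot {
  capturedAt: number
  screenshot?: Screenshot
}

// Returns a copy suitable for RunState; the input snapshot is left untouched
// so the observe response can still return the full image.
function redactScreenshotBytes(snapshot: GroundingSnapshot): GroundingSnapshot {
  if (!snapshot.screenshot)
    return snapshot

  const redacted: Screenshot = { ...snapshot.screenshot }
  delete redacted.dataBase64
  return { ...snapshot, screenshot: redacted }
}
```

Keeping the screenshot metadata (dimensions here) while dropping the bytes lets state consumers such as the overlay still reason about coordinates without paying for the image on every poll.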

chatgpt-codex-connector · 2026-04-24T16:10:49Z

      const result = await Promise.race([
-        currentCall,
+        config.callTool(MCP_TOOL_NAME),
        new Promise<never>((_, reject) =>


Cap timed-out overlay polls to prevent invoke buildup

This poll loop races callTool against a timeout but never cancels or tracks the timed-out invoke. If the Eventa call hangs (the code comment already calls out this startup race), each cycle schedules a new invoke while previous ones remain pending indefinitely, so prolonged outages accumulate unresolved RPC calls. Reintroduce an in-flight/lease cap (or equivalent guard) so fallback retries do not create unbounded outstanding calls.
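One possible shape for such an in-flight cap is a small guard that refuses to start a new invoke while earlier ones are still outstanding, regardless of whether the caller has already stopped waiting on them. `createPollGuard` and its methods are hypothetical names for this sketch; the real polling controller is not shown in this thread.

```typescript
// Hypothetical guard: caps the number of unresolved callTool invocations.
function createPollGuard(callTool: () => Promise<unknown>, maxInFlight = 1) {
  let inFlight = 0
  const release = (): void => {
    inFlight -= 1
  }

  return {
    // Starts a poll if a slot is free; returns false when the cap is reached.
    tryPoll(): boolean {
      if (inFlight >= maxInFlight)
        return false
      inFlight += 1
      // The slot is held until the call settles, even if the caller's own
      // timeout fires earlier. A synchronous throw is converted to a rejection.
      new Promise<unknown>(settle => settle(callTool())).then(release, release)
      return true
    },
    outstanding: (): number => inFlight,
  }
}
```

In a poll loop, a cycle where `tryPoll()` returns false would simply be skipped, so a hung RPC costs at most `maxInFlight` pending calls instead of one per interval.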

chatgpt-codex-connector · 2026-04-24T16:10:49Z

+// TODO: Add shared-secret auth handshake to prevent rogue localhost processes
+// from hijacking the bridge. The bridge server should generate a token and
+// inject it into chrome.storage.local so the extension can present it on hello.
+const WS_URL = 'ws://localhost:8765'


Use configured bridge endpoint instead of fixed localhost URL

The extension now hardcodes ws://localhost:8765, but the server runtime still supports non-default browser-dom bridge host/port configuration. In environments that set a different endpoint, the extension cannot connect, so browser-dom routing never activates and all DOM-native actions degrade to fallback behavior. Restore endpoint configurability (or handshake-based endpoint discovery) rather than a fixed URL.
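Endpoint configurability could be as simple as resolving the URL from optional settings and falling back to the current default. In this sketch, `BridgeEndpointConfig` and `resolveBridgeUrl` are hypothetical; where the settings come from (for example `chrome.storage.local`, which the TODO above mentions for the auth token) is left open.

```typescript
// Hypothetical config shape; both fields are optional overrides.
interface BridgeEndpointConfig {
  host?: string
  port?: number
}

// Falls back to the values currently hardcoded in the extension.
function resolveBridgeUrl(config: BridgeEndpointConfig = {}): string {
  const host = config.host?.trim() || 'localhost'
  const port = config.port ?? 8765
  if (!Number.isInteger(port) || port < 1 || port > 65535)
    throw new RangeError(`invalid bridge port: ${port}`)
  return `ws://${host}:${port}`
}
```

With no overrides this yields the same `ws://localhost:8765` URL as before, so existing setups would keep working unchanged.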

chatgpt-codex-connector Bot reviewed Apr 12, 2026

Comment thread services/computer-use-mcp/src/server/register-desktop-grounding.ts Outdated

Comment thread services/computer-use-mcp/src/chrome-semantic-adapter.ts

Comment thread services/computer-use-mcp/src/server/register-desktop-grounding.ts Outdated

gemini-code-assist Bot reviewed Apr 12, 2026

3361559784 force-pushed the codex/desktop-v2-browser-routing branch from b815b7f to 4caca43 Compare April 20, 2026 07:13

chatgpt-codex-connector Bot reviewed Apr 20, 2026

Comment thread services/computer-use-mcp/src/server/register-desktop-grounding.ts Outdated

Comment thread services/computer-use-mcp/chrome-extension/content.js Outdated

Comment thread services/computer-use-mcp/chrome-extension/background.js

chatgpt-codex-connector Bot reviewed Apr 20, 2026

Comment thread services/computer-use-mcp/src/desktop-grounding.ts

Comment thread services/computer-use-mcp/src/chrome-semantic-adapter.ts

chatgpt-codex-connector Bot reviewed Apr 23, 2026

Comment thread services/computer-use-mcp/src/server/action-executor.ts Outdated

Comment thread services/computer-use-mcp/chrome-extension/background.js

chatgpt-codex-connector Bot reviewed Apr 23, 2026

Comment thread services/computer-use-mcp/src/server/action-executor.ts

Comment thread services/computer-use-mcp/src/server/register-desktop-grounding.ts Outdated

Comment thread services/computer-use-mcp/chrome-extension/content.js Outdated

3361559784 force-pushed the codex/desktop-v2-browser-routing branch from d015ff7 to d92d173 Compare April 23, 2026 08:36

chatgpt-codex-connector Bot reviewed Apr 23, 2026

Comment thread services/computer-use-mcp/chrome-extension/content.js

chatgpt-codex-connector Bot reviewed Apr 23, 2026

Comment thread apps/stage-tamagotchi/src/renderer/pages/desktop-overlay.vue Outdated

chatgpt-codex-connector Bot reviewed Apr 23, 2026

Comment thread apps/stage-tamagotchi/src/renderer/pages/desktop-overlay.vue Outdated

Comment thread services/computer-use-mcp/src/chrome-semantic-adapter.ts

chatgpt-codex-connector Bot reviewed Apr 23, 2026

Comment thread services/computer-use-mcp/chrome-extension/background.js Outdated

Comment thread services/computer-use-mcp/src/server/register-desktop-grounding.ts Outdated

Comment thread services/computer-use-mcp/chrome-extension/manifest.json

chatgpt-codex-connector Bot reviewed Apr 23, 2026

Comment thread services/computer-use-mcp/src/desktop-grounding.ts Outdated

Comment thread services/computer-use-mcp/chrome-extension/content.js

chatgpt-codex-connector Bot reviewed Apr 23, 2026

Comment thread services/computer-use-mcp/chrome-extension/background.js

Copilot AI review requested due to automatic review settings April 24, 2026 13:58

Copilot started reviewing on behalf of 3361559784 April 24, 2026 13:59

Copilot AI reviewed Apr 24, 2026

chatgpt-codex-connector Bot reviewed Apr 24, 2026

Comment thread services/computer-use-mcp/src/server/register-desktop-grounding.ts Outdated

chatgpt-codex-connector Bot reviewed Apr 24, 2026

Comment thread services/computer-use-mcp/src/browser-action-router.ts Outdated

Comment thread services/computer-use-mcp/chrome-extension/content.js

chatgpt-codex-connector Bot reviewed Apr 24, 2026

autofix-ci Bot and others added 22 commits April 24, 2026 23:46

[autofix.ci] apply automated fixes

cba93f9

[autofix.ci] apply automated fixes

10cd0f0

[autofix.ci] apply automated fixes

0f2a417

[autofix.ci] apply automated fixes

335a1e0

[autofix.ci] apply automated fixes

819f2d3

3361559784 force-pushed the codex/desktop-v2-browser-routing branch from 4b6c5ce to 9c0284f Compare April 24, 2026 15:51

fix: resolve rebase conflicts with main (1/3 merge)

ab1114d

- Restore DESKTOP_CLICK_SNAPSHOT_MAX_AGE_MS and foregroundWindowTitle from main
- Align updatePointerIntent signature to accept candidateId (matches main)
- Fix test call sites for updated signature

3361559784 force-pushed the codex/desktop-v2-browser-routing branch from 9c0284f to ab1114d Compare April 24, 2026 15:54

autofix-ci Bot and others added 2 commits April 24, 2026 15:57

[autofix.ci] apply automated fixes

9b6bbb3

fix(stage-tamagotchi): replace undefined noop with inline arrow

da94fe7

The noop reference was introduced in moeru-ai#1647 without an import. Use an inline arrow function instead.

nekomeowww changed the title ~~[2/3] feat(desktop): implement browser-native DOM action routing~~ feat(stage-tamagotchi,computer-use-mcp): implement browser-native DOM action routing Apr 24, 2026

nekomeowww merged commit 255ce71 into moeru-ai:main Apr 24, 2026
10 checks passed

chatgpt-codex-connector Bot reviewed Apr 24, 2026

	content: [
	textContent(`Clicked selector "${selector}" in frame ${result.targetFrameId} at (${result.targetPoint.x}, ${result.targetPoint.y}).`),
	],
