feat(visual-chat): harden vision-text pipeline and packaged desktop runtime (#1579)
Made-with: Cursor
Code Review
This pull request introduces a new Visual Chat feature, adding a comprehensive architecture for real-time multimodal interaction. It includes a gateway service, an inference worker bridge, and integration with the AIRI ecosystem. My review identified a potential issue with process group termination in the development script, a need for better error feedback in the WebSocket subscription handler, and a missing timeout mechanism for user-triggered inference requests.
const child = spawn('pnpm', ['exec', 'electron-vite', 'dev'], {
  cwd: process.cwd(),
  env,
  stdio: 'inherit',
  shell: true,
})
On non-Windows platforms, process.kill(-child.pid!, 'SIGTERM') is used to terminate the entire process group. For this to work correctly, the child process must be spawned as a process group leader using the detached: true option. Without it, child.pid is not a valid process group ID, and the signal might fail or be sent to the parent's process group.
Suggested change:

const child = spawn('pnpm', ['exec', 'electron-vite', 'dev'], {
  cwd: process.cwd(),
  env,
  stdio: 'inherit',
  shell: true,
  detached: process.platform !== 'win32',
})
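For the teardown side, here is a minimal sketch of how the group signal could be sent once detached: true is in place. killProcessTree is an illustrative name, not the script's actual helper:

```typescript
import { spawn } from 'node:child_process'

// Illustrative teardown helper: with detached: true on POSIX, child.pid doubles
// as a process group ID, so signaling -child.pid reaches the child's descendants.
function killProcessTree(child: { pid?: number, kill: (signal: NodeJS.Signals) => boolean }): void {
  if (child.pid === undefined)
    return // spawn failed; nothing to signal
  if (process.platform === 'win32') {
    child.kill('SIGTERM') // no POSIX process groups on Windows; kill the child directly
    return
  }
  try {
    // Negative PID targets the whole process group.
    process.kill(-child.pid, 'SIGTERM')
  }
  catch {
    // Fall back to signaling the child alone if the group signal fails.
    child.kill('SIGTERM')
  }
}
```

Without the detached option, the spawned child stays in the parent's process group, which is why the negative-PID signal misbehaves in the current script.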
if (!data.sessionToken || !options.authorizeSessionAccess?.(data.sessionId, data.sessionToken))
  return
When a subscription request fails authorization, the message is silently ignored. It would be better to provide feedback to the client by sending an error event. Per repository rules, ensure the event includes a programmatic 'source' identifier for better extensibility.
if (data.type === 'subscribe' && data.sessionId) {
  if (!data.sessionToken || !options.authorizeSessionAccess?.(data.sessionId, data.sessionToken)) {
    peer.send(JSON.stringify({ event: 'error', source: 'subscription_handler', sessionId: data.sessionId, data: { message: 'Unauthorized session access' }, timestamp: Date.now() }))
    return
  }
}

References
- For extensibility, identify the source of an event using a programmatic identifier (e.g., a source field in metadata) rather than relying on string patterns within the message content.
const timeoutId = auto
  ? setTimeout(() => {
      if (!abortController.signal.aborted) {
        state.stats.timedOut++
        log.withTag('realtime').log(`[trace:${traceId}] Inference timed out after ${AUTO_OBSERVE_INFERENCE_TIMEOUT_MS}ms`)
        abortController.abort()
      }
    }, AUTO_OBSERVE_INFERENCE_TIMEOUT_MS)
  : null
User-triggered inference requests (auto: false) currently have no timeout mechanism. If the worker bridge or the upstream Ollama service hangs, the gateway's request will remain pending indefinitely, potentially leading to resource exhaustion or a degraded user experience. Consider applying a default timeout for all inference requests.
const timeoutId = setTimeout(() => {
  if (!abortController.signal.aborted) {
    if (auto)
      state.stats.timedOut++
    log.withTag('realtime').log(`[trace:${traceId}] Inference timed out after ${AUTO_OBSERVE_INFERENCE_TIMEOUT_MS}ms`)
    abortController.abort()
  }
}, AUTO_OBSERVE_INFERENCE_TIMEOUT_MS)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 25761c705e
router.post('/api/sessions/:sessionId/access', defineEventHandler((event) => {
  requireGatewayAccess(event)
  const sessionId = normalizeVisualChatSessionId(getRouterParam(event, 'sessionId')!)
Handle invalid session IDs before normalizing access route
normalizeVisualChatSessionId(...) throws for malformed IDs, but this handler does not catch that exception, so a bad :sessionId currently bubbles out as a 500 instead of a 4xx client error. That makes malformed requests look like server failures and breaks the "unified sessionId validation" behavior expected by callers (the same uncaught pattern is also used in the restore route).
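A hedged sketch of the suggested fix, with hypothetical names: normalizeSessionId stands in for normalizeVisualChatSessionId, and the plain result object stands in for the gateway's actual h3 error handling:

```typescript
// Sketch: map validation failures to a 4xx-style result instead of letting the
// exception escape as a 500. All names and the regex here are illustrative.
interface SessionIdResult {
  status: number
  sessionId?: string
  message?: string
}

// Hypothetical stand-in for normalizeVisualChatSessionId: throws on bad input.
function normalizeSessionId(raw: string | undefined): string {
  if (!raw || !/^[a-z0-9][a-z0-9-]{0,63}$/i.test(raw))
    throw new Error('malformed session id')
  return raw.toLowerCase()
}

function resolveSessionId(raw: string | undefined): SessionIdResult {
  try {
    return { status: 200, sessionId: normalizeSessionId(raw) }
  }
  catch {
    // Malformed client input is a client error, not a server failure.
    return { status: 400, message: 'Invalid sessionId' }
  }
}
```

In the real handler the catch branch would throw an h3 createError with statusCode 400 rather than return an object, and the same guard would apply to the restore route mentioned above.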
state.autoObserve.lastFrameFingerprint = fingerprint

const startedAt = Date.now()
await this.inferWithLatestFrame(sessionId, DEFAULT_AUTO_OBSERVE_PROMPT, true)
Update auto-observe fingerprint only after successful inference
The frame fingerprint is persisted before running inferWithLatestFrame(...). If that inference fails or is aborted (e.g., transient worker outage), lastFrameFingerprint still advances, so subsequent cycles on the same frame are treated as "no change" and skipped. In practice this can stall auto-observe updates until a new frame arrives, even though no successful memory refresh happened.
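The reordering could look roughly like this; the state shape and the runAutoObserveCycle name are illustrative, not the gateway's actual API:

```typescript
// Sketch: persist the fingerprint only after inference succeeds, so a failed
// or aborted cycle retries the same frame on the next tick.
interface AutoObserveState {
  lastFrameFingerprint: string | null
}

async function runAutoObserveCycle(
  state: AutoObserveState,
  fingerprint: string,
  infer: () => Promise<void>, // stand-in for inferWithLatestFrame(...)
): Promise<boolean> {
  if (state.lastFrameFingerprint === fingerprint)
    return false // genuinely unchanged frame: skip

  try {
    await infer()
    // Advance the fingerprint only after a successful memory refresh.
    state.lastFrameFingerprint = fingerprint
    return true
  }
  catch {
    // Leave the fingerprint untouched so the next cycle retries this frame.
    return false
  }
}
```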
💡 Codex Review
Reviewed commit: 4a3346bc5e
function handleChunk(chunk: Uint8Array | string) {
  const text = chunk.toString()
  appendOutput(text)
  const match = text.match(QUICK_TUNNEL_PATTERN)
Parse quick-tunnel URL from accumulated output
cloudflared stdout/stderr chunks are not line- or token-aligned, but the matcher only checks the current text chunk. If the trycloudflare.com URL is split across chunk boundaries, the regex never matches and the command times out even though the tunnel is actually up, causing flaky share/start behavior in real runs.
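A sketch of the accumulated-output approach; the exact pattern and the buffer cap are assumptions, not the repo's actual values:

```typescript
// Sketch: match the quick-tunnel URL against accumulated output rather than a
// single chunk, so a URL split across chunk boundaries is still found.
const QUICK_TUNNEL_PATTERN = /https:\/\/[a-z0-9-]+\.trycloudflare\.com/
const MAX_BUFFER = 64 * 1024 // illustrative cap

function createTunnelUrlScanner() {
  let buffered = ''
  return (chunk: Uint8Array | string): string | null => {
    buffered += chunk.toString()
    // Cap the buffer so a long-running process cannot grow it unboundedly;
    // the URL is far shorter than the cap, so trimming cannot split it.
    if (buffered.length > MAX_BUFFER)
      buffered = buffered.slice(-MAX_BUFFER)
    const match = buffered.match(QUICK_TUNNEL_PATTERN)
    return match ? match[0] : null
  }
}
```

handleChunk would call the scanner instead of matching the current text chunk directly, keeping appendOutput unchanged.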
settled = true
resolve({
  name: tunnelName,
  url: '',
  close: () => child.kill(),
Reject tunnel startup when readiness times out
When no readiness log line appears before TUNNEL_RUN_TIMEOUT_MS, this path resolves success with an empty URL instead of failing. That lets startNamedTunnels continue, write endpoint metadata, and report tunnels as running even when they may never have connected, which can leave users with broken public URLs and no immediate error.
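One possible shape for the fix, with illustrative wiring: the real code resolves inside a log-parsing event handler rather than receiving a readiness promise, but the timeout-vs-readiness race is the same:

```typescript
// Sketch: fail the startup promise when readiness never arrives, instead of
// resolving with an empty URL that downstream code records as a live tunnel.
function waitForTunnelReady(
  readiness: Promise<string>, // resolves with the public URL once the log line appears
  timeoutMs: number,
): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    const timer = setTimeout(() => {
      // Surface a hard error so callers do not write endpoint metadata
      // for a tunnel that may never have connected.
      reject(new Error(`tunnel not ready within ${timeoutMs}ms`))
    }, timeoutMs)
    readiness.then(
      (url) => {
        clearTimeout(timer)
        resolve(url)
      },
      (err) => {
        clearTimeout(timer)
        reject(err)
      },
    )
  })
}
```

With this shape, startNamedTunnels sees a rejection on timeout and can stop or report the failure instead of continuing with an empty URL.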
⏳ Approval required for deploying to Cloudflare Workers (Preview) for stage-web.
Hey, @nekomeowww, @sumimakito, @luoling8192, @LemonNekoGH, kindly take some time to review and approve this deployment when you are available. Thank you! 🙏
💡 Codex Review
Reviewed commit: 55cc0f13c7
if (!actual || expected.length !== actual.length)
  return false
return timingSafeEqual(Buffer.from(expected), Buffer.from(actual))
Guard timing-safe compare against UTF-8 length mismatches
tokensMatch checks string length before calling timingSafeEqual, but Buffer.from(...) compares byte length, not JS string length. A header containing non-ASCII characters can have the same .length as the expected token while producing a different byte length, which makes timingSafeEqual throw and turns auth failures into 500s on requireGatewayAccess routes instead of a clean 403. This is externally triggerable with a crafted token header and should be handled as a normal auth miss.
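A sketch of a byte-length guard; tokensMatch mirrors the snippet above, but the surrounding requireGatewayAccess flow is omitted:

```typescript
import { timingSafeEqual } from 'node:crypto'

// Sketch: compare byte lengths (not JS string lengths) before the
// constant-time comparison, treating any mismatch as a normal auth miss.
function tokensMatch(expected: string, actual: string | undefined): boolean {
  if (!actual)
    return false
  const expectedBytes = Buffer.from(expected)
  const actualBytes = Buffer.from(actual)
  // timingSafeEqual throws if byte lengths differ; e.g. 'ë' has .length 1
  // but encodes to 2 UTF-8 bytes, so guard on bytes rather than characters.
  if (expectedBytes.length !== actualBytes.length)
    return false
  return timingSafeEqual(expectedBytes, actualBytes)
}
```

The early return does reveal that lengths differ, but that matches the original code's behavior and is generally considered acceptable for token comparison; the fix is only about turning the throw into a clean 403.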
}
else {
  try {
    execSync('pkill -f "visual-chat-gateway|visual-chat-worker-minicpmo|visual-chat-ops/.+(share.ts|dev-tamagotchi.ts|setup-tunnel.ts)|cloudflared tunnel"', { stdio: 'ignore' })
Limit stop command to AIRI-managed cloudflared processes
The Unix stop path unconditionally runs pkill -f ...|cloudflared tunnel, which matches any cloudflared tunnel process on the machine, not just tunnels started by this project. Running visual-chat-ops stop can therefore terminate unrelated Cloudflare tunnels used by other apps/sessions, causing collateral outages outside AIRI.
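A sketch of a tightened pattern. The cloudflared marker is an assumption about what appears on AIRI's cloudflared command line (e.g. a config path under the project), and would need to match the actual invocation:

```typescript
import { execSync } from 'node:child_process'

// Sketch: build the pkill pattern so cloudflared is matched only together with
// an AIRI-specific command-line marker, leaving unrelated tunnels untouched.
function buildStopPattern(tunnelMarker: string): string {
  return [
    'visual-chat-gateway',
    'visual-chat-worker-minicpmo',
    'visual-chat-ops/.+(share.ts|dev-tamagotchi.ts|setup-tunnel.ts)',
    // Anchor cloudflared to our marker instead of matching every tunnel process.
    `cloudflared tunnel.*${tunnelMarker}`,
  ].join('|')
}

function stopManagedProcesses(): void {
  try {
    // 'visual-chat-ops' is a hypothetical marker; adjust to whatever actually
    // appears on AIRI's cloudflared command line (config path, tunnel name, …).
    execSync(`pkill -f "${buildStopPattern('visual-chat-ops')}"`, { stdio: 'ignore' })
  }
  catch {
    // pkill exits nonzero when nothing matched; that is fine for a stop command.
  }
}
```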
…ture/ai-visual-chat-multisource-realtime-pipeline

Conflicts:
- apps/stage-tamagotchi/tsconfig.node.json
- apps/stage-tamagotchi/tsconfig.web.json
- packages/stage-pages/tsconfig.json
- packages/stage-ui/package.json
- packages/stage-ui/src/components/markdown/markdown-renderer.vue
- packages/stage-ui/src/composables/use-modules-list.ts
- pnpm-lock.yaml
- pnpm-workspace.yaml
💡 Codex Review
Reviewed commit: ab7c572dd8
console.info('\nInstallation complete.')
}

install()
Guard install CLI from running on module import
Remove the top-level install() call or gate it behind a direct-execution check. This file is re-exported by packages/visual-chat-ops/src/index.ts, and apps/stage-tamagotchi/src/main/services/electron/visual-chat.ts imports that package at runtime; as written, simply importing the module can execute pnpm install and pnpm run build:packages, causing unexpected long-running side effects and startup failures in production/dev flows that only intended to use helper functions.
console.info('\nPrune complete.')
}

prune()
Guard prune CLI from running on module import
Do not invoke prune() at module top level; gate it with an isDirectExecution() check like the other CLI files. Because this module is also re-exported from packages/visual-chat-ops/src/index.ts, importing @proj-airi/visual-chat-ops can immediately run retention cleanup (pruneWithPolicy) against visual-chat data/log/cache directories, which is destructive behavior for callers that only intended to import library APIs.
💡 Codex Review
Reviewed commit: 83497f10f7
+ ' ($_.CommandLine -match \'visual-chat-gateway\')'
+ ' -or ($_.CommandLine -match \'visual-chat-worker-minicpmo\')'
+ ' -or ($_.CommandLine -match \'visual-chat-ops.+(share\\\\.ts|dev-tamagotchi\\\\.ts|setup-tunnel\\\\.ts)\')'
+ ' -or ($_.Name -match \'cloudflared\')'
Scope Windows process kill filter to AIRI-managed tunnels
Tighten the Windows stop filter so it does not match all cloudflared processes by executable name alone. The current predicate includes ($_.Name -match 'cloudflared'), so running visual-chat-ops stop on Windows will terminate unrelated Cloudflare tunnels on the same machine, not just AIRI-managed ones; this causes collateral outages for other local services/users whenever they coexist.
Description
This PR hardens the AIRI visual chat stack and aligns the shipped product around a secure vision+text realtime pipeline. It removes the remaining workspace-only assumptions from the desktop runtime path, tightens session access, and updates the docs / diagnostics / devtools to match the actual ollama-lite worker behavior.
What this PR adds
Technical path
This implementation keeps the current shipped runtime explicitly focused on realtime latest-frame vision plus typed text prompts. The gateway now derives stable local access tokens from a persisted secret, enforces them across session routes and websocket flows, and the desktop runtime can now resolve either a development workspace or a packaged runtime path.
Key design points:
Linked Issues
N/A
Additional Context
Validation / Checks
Reviewer focus