feat(visual-chat): harden vision-text pipeline and packaged desktop runtime by Joker-of-Gotham · Pull Request #1579 · moeru-ai/airi

Joker-of-Gotham · 2026-04-05T21:57:34Z

Description

This PR hardens the AIRI visual chat stack and aligns the shipped product around a secure vision+text realtime pipeline. It removes the remaining workspace-only assumptions from the desktop runtime path, tightens session access, and updates the docs / diagnostics / devtools to match the actual ollama-lite worker behavior.

What this PR adds

Session-scoped and local-admin access tokens for visual chat gateway access
Unified sessionId validation across gateway routes, websocket subscription, and storage paths
Protected room token issuance, diagnostics, and worker proxy surfaces
Packaged desktop runtime startup for bundled visual-chat gateway / worker entries
App-data-backed tunnel config, public endpoint files, and cloudflared cache paths
Safer visual-chat stop behavior that avoids blind port-based process killing
A clearer shipped interaction mode name: �ision-text-realtime
Worker health / diagnostics / devtools / docs alignment with the fixed ollama-lite path
Apple Silicon Ollama discovery improvement and Windows ARM64 cloudflared download support

Technical path

This implementation keeps the current shipped runtime explicitly focused on realtime latest-frame vision plus typed text prompts. The gateway now derives stable local access tokens from a persisted secret, enforces them across session routes and websocket flows, and the desktop runtime can now resolve either a development workspace or a packaged runtime path.

Key design points:

session-scoped access for phone/shared clients
local-admin access for diagnostics, session management, and worker proxy routes
fail-closed desktop setup when neither workspace nor packaged runtime is available
packaged runtime startup via bundled visual-chat service dist entries
visual-chat runtime artifacts moved out of the repository root and into app-data locations
docs and devtools updated to stop advertising native duplex audio in ollama-lite mode

Linked Issues

N/A

Additional Context

Validation / Checks

pnpm install
pnpm -F @proj-airi/electron-eventa build
pnpm -F @proj-airi/stage-tamagotchi run typecheck:node
pnpm -F @proj-airi/visual-chat-ops doctor:visual-chat
pnpm -F @proj-airi/stage-tamagotchi build
pnpm -F @proj-airi/visual-chat-ops start:local
- gateway and worker reported healthy at http://127.0.0.1:6200/health and http://127.0.0.1:6201/health
- the frontend build reached preview startup during the smoke run
targeted pnpm exec moeru-lint --fix ... on touched visual-chat / desktop-runtime paths

Reviewer focus

Gateway auth / websocket subscription enforcement
Packaged desktop runtime startup in stage-tamagotchi
App-data path migration for tunnel and public endpoint artifacts
Docs / diagnostics / devtools alignment with the fixed ollama-lite pipeline

Made-with: Cursor

…untime Made-with: Cursor

gemini-code-assist

Code Review

This pull request introduces a new Visual Chat feature, adding a comprehensive architecture for real-time multimodal interaction. It includes a gateway service, an inference worker bridge, and integration with the AIRI ecosystem. My review identified a potential issue with process group termination in the development script, a need for better error feedback in the WebSocket subscription handler, and a missing timeout mechanism for user-triggered inference requests.

gemini-code-assist · 2026-04-05T22:01:39Z

+const child = spawn('pnpm', ['exec', 'electron-vite', 'dev'], {
+  cwd: process.cwd(),
+  env,
+  stdio: 'inherit',
+  shell: true,
+})


On non-Windows platforms, process.kill(-child.pid!, 'SIGTERM') is used to terminate the entire process group. For this to work correctly, the child process must be spawned as a process group leader using the detached: true option. Without it, child.pid is not a valid process group ID, and the signal might fail or be sent to the parent's process group.

Suggested change

const child = spawn('pnpm', ['exec', 'electron-vite', 'dev'], {

cwd: process.cwd(),

env,

stdio: 'inherit',

shell: true,

})

const child = spawn('pnpm', ['exec', 'electron-vite', 'dev'], {

cwd: process.cwd(),

env,

stdio: 'inherit',

shell: true,

detached: process.platform !== 'win32',

})

gemini-code-assist · 2026-04-05T22:01:39Z

+          if (!data.sessionToken || !options.authorizeSessionAccess?.(data.sessionId, data.sessionToken))
+            return


When a subscription request fails authorization, the message is silently ignored. It would be better to provide feedback to the client by sending an error event. Per repository rules, ensure the event includes a programmatic 'source' identifier for better extensibility.

if (data.type === 'subscribe' && data.sessionId) { if (!data.sessionToken || !options.authorizeSessionAccess?.(data.sessionId, data.sessionToken)) { peer.send(JSON.stringify({ event: 'error', source: 'subscription_handler', sessionId: data.sessionId, data: { message: 'Unauthorized session access' }, timestamp: Date.now() })) return }

References

For extensibility, identify the source of an event using a programmatic identifier (e.g., a source field in metadata) rather than relying on string patterns within the message content.

gemini-code-assist · 2026-04-05T22:01:39Z

+    const timeoutId = auto
+      ? setTimeout(() => {
+          if (!abortController.signal.aborted) {
+            state.stats.timedOut++
+            log.withTag('realtime').log(`[trace:${traceId}] Inference timed out after ${AUTO_OBSERVE_INFERENCE_TIMEOUT_MS}ms`)
+            abortController.abort()
+          }
+        }, AUTO_OBSERVE_INFERENCE_TIMEOUT_MS)
+      : null


User-triggered inference requests (auto: false) currently have no timeout mechanism. If the worker bridge or the upstream Ollama service hangs, the gateway's request will remain pending indefinitely, potentially leading to resource exhaustion or a degraded user experience. Consider applying a default timeout for all inference requests.

const timeoutId = setTimeout(() => { if (!abortController.signal.aborted) { if (auto) state.stats.timedOut++ log.withTag('realtime').log('[trace:' + traceId + '] Inference timed out after ' + AUTO_OBSERVE_INFERENCE_TIMEOUT_MS + 'ms') abortController.abort() } }, AUTO_OBSERVE_INFERENCE_TIMEOUT_MS)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 25761c705e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-05T22:05:49Z

+
+  router.post('/api/sessions/:sessionId/access', defineEventHandler((event) => {
+    requireGatewayAccess(event)
+    const sessionId = normalizeVisualChatSessionId(getRouterParam(event, 'sessionId')!)


Handle invalid session IDs before normalizing access route

normalizeVisualChatSessionId(...) throws for malformed IDs, but this handler does not catch that exception, so a bad :sessionId currently bubbles out as a 500 instead of a 4xx client error. That makes malformed requests look like server failures and breaks the "unified sessionId validation" behavior expected by callers (the same uncaught pattern is also used in the restore route).

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-05T22:05:49Z

+    state.autoObserve.lastFrameFingerprint = fingerprint
+
+    const startedAt = Date.now()
+    await this.inferWithLatestFrame(sessionId, DEFAULT_AUTO_OBSERVE_PROMPT, true)


Update auto-observe fingerprint only after successful inference

The frame fingerprint is persisted before running inferWithLatestFrame(...). If that inference fails or is aborted (e.g., transient worker outage), lastFrameFingerprint still advances, so subsequent cycles on the same frame are treated as "no change" and skipped. In practice this can stall auto-observe updates until a new frame arrives, even though no successful memory refresh happened.

Useful? React with 👍 / 👎.

…source-realtime-pipeline

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4a3346bc5e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-06T09:57:50Z

+    function handleChunk(chunk: Uint8Array | string) {
+      const text = chunk.toString()
+      appendOutput(text)
+      const match = text.match(QUICK_TUNNEL_PATTERN)


Parse quick-tunnel URL from accumulated output

cloudflared stdout/stderr chunks are not line- or token-aligned, but the matcher only checks the current text chunk. If the trycloudflare.com URL is split across chunk boundaries, the regex never matches and the command times out even though the tunnel is actually up, causing flaky share/start behavior in real runs.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-06T09:57:50Z

+        settled = true
+        resolve({
+          name: tunnelName,
+          url: '',
+          close: () => child.kill(),


Reject tunnel startup when readiness times out

When no readiness log line appears before TUNNEL_RUN_TIMEOUT_MS, this path resolves success with an empty URL instead of failing. That lets startNamedTunnels continue, write endpoint metadata, and report tunnels as running even when they may never have connected, which can leave users with broken public URLs and no immediate error.

Useful? React with 👍 / 👎.

github-actions · 2026-04-07T08:52:17Z

⏳ Approval required for deploying to Cloudflare Workers (Preview) for stage-web.

Name	Link
🔭 Waiting for approval	For maintainers, approve here

Hey, @nekomeowww, @sumimakito, @luoling8192, @LemonNekoGH, kindly take some time to review and approve this deployment when you are available. Thank you! 🙏

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 55cc0f13c7

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-04-07T09:02:50Z

+  if (!actual || expected.length !== actual.length)
+    return false
+  return timingSafeEqual(Buffer.from(expected), Buffer.from(actual))


Guard timing-safe compare against UTF-8 length mismatches

tokensMatch checks string length before calling timingSafeEqual, but Buffer.from(...) compares byte length, not JS string length. A header containing non-ASCII characters can have the same .length as the expected token while producing a different byte length, which makes timingSafeEqual throw and turns auth failures into 500s on requireGatewayAccess routes instead of a clean 403. This is externally triggerable with a crafted token header and should be handled as a normal auth miss.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-07T09:02:50Z

+    }
+    else {
+      try {
+        execSync('pkill -f "visual-chat-gateway|visual-chat-worker-minicpmo|visual-chat-ops/.+(share.ts|dev-tamagotchi.ts|setup-tunnel.ts)|cloudflared tunnel"', { stdio: 'ignore' })


Limit stop command to AIRI-managed cloudflared processes

The Unix stop path unconditionally runs pkill -f ...|cloudflared tunnel, which matches any cloudflared tunnel process on the machine, not just tunnels started by this project. Running visual-chat-ops stop can therefore terminate unrelated Cloudflare tunnels used by other apps/sessions, causing collateral outages outside AIRI.

Useful? React with 👍 / 👎.

…ture/ai-visual-chat-multisource-realtime-pipeline # Conflicts: # apps/stage-tamagotchi/tsconfig.node.json # apps/stage-tamagotchi/tsconfig.web.json # packages/stage-pages/tsconfig.json # packages/stage-ui/package.json # packages/stage-ui/src/components/markdown/markdown-renderer.vue # packages/stage-ui/src/composables/use-modules-list.ts # pnpm-lock.yaml # pnpm-workspace.yaml

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ab7c572dd8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-29T14:33:18Z

+  console.info('\nInstallation complete.')
+}
+
+install()


Guard install CLI from running on module import

Remove the top-level install() call or gate it behind a direct-execution check. This file is re-exported by packages/visual-chat-ops/src/index.ts, and apps/stage-tamagotchi/src/main/services/electron/visual-chat.ts imports that package at runtime; as written, simply importing the module can execute pnpm install and pnpm run build:packages, causing unexpected long-running side effects and startup failures in production/dev flows that only intended to use helper functions.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-29T14:33:18Z

+  console.info('\nPrune complete.')
+}
+
+prune()


Guard prune CLI from running on module import

Do not invoke prune() at module top level; gate it with an isDirectExecution() check like the other CLI files. Because this module is also re-exported from packages/visual-chat-ops/src/index.ts, importing @proj-airi/visual-chat-ops can immediately run retention cleanup (pruneWithPolicy) against visual-chat data/log/cache directories, which is destructive behavior for callers that only intended to import library APIs.

Useful? React with 👍 / 👎.

…ture/ai-visual-chat-multisource-realtime-pipeline

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 83497f10f7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-29T15:19:29Z

+          + ' ($_.CommandLine -match \'visual-chat-gateway\')'
+          + ' -or ($_.CommandLine -match \'visual-chat-worker-minicpmo\')'
+          + ' -or ($_.CommandLine -match \'visual-chat-ops.+(share\\\\.ts|dev-tamagotchi\\\\.ts|setup-tunnel\\\\.ts)\')'
+          + ' -or ($_.Name -match \'cloudflared\')'


Scope Windows process kill filter to AIRI-managed tunnels

Tighten the Windows stop filter so it does not match all cloudflared processes by executable name alone. The current predicate includes ($_.Name -match 'cloudflared'), so running visual-chat-ops stop on Windows will terminate unrelated Cloudflare tunnels on the same machine, not just AIRI-managed ones; this causes collateral outages for other local services/users whenever they coexist.

Useful? React with 👍 / 👎.

Joker-of-Gotham added 5 commits April 6, 2026 05:54

feat(visual-chat): add shared protocol and storage foundations

9412027

Made-with: Cursor

feat(visual-chat): add media core and worker runtime layers

9f1e352

Made-with: Cursor

feat(visual-chat): add gateway and worker bridge services

5c64cd3

Made-with: Cursor

feat(visual-chat): add sdk, ops tooling, and bridge plugins

78e99c9

Made-with: Cursor

feat(visual-chat): harden vision-text pipeline and packaged desktop r…

25761c7

…untime Made-with: Cursor

gemini-code-assist Bot reviewed Apr 5, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Apr 5, 2026

View reviewed changes

Merge branch 'main' into joker-of-gotham/feature/ai-visual-chat-multi…

4a3346b

…source-realtime-pipeline

chatgpt-codex-connector Bot reviewed Apr 6, 2026

View reviewed changes

[autofix.ci] apply automated fixes

55cc0f1

chatgpt-codex-connector Bot reviewed Apr 7, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Apr 29, 2026

View reviewed changes

Merge remote-tracking branch 'upstream/main' into joker-of-gotham/fea…

83497f1

…ture/ai-visual-chat-multisource-realtime-pipeline

chatgpt-codex-connector Bot reviewed Apr 29, 2026

View reviewed changes

		if (!data.sessionToken \|\| !options.authorizeSessionAccess?.(data.sessionId, data.sessionToken))
		return

Uh oh!

Conversation

Joker-of-Gotham commented Apr 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

What this PR adds

Technical path

Linked Issues

Additional Context

Validation / Checks

Reviewer focus

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Apr 7, 2026

⏳ Approval required for deploying to Cloudflare Workers (Preview) for stage-web.

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 7, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 7, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Joker-of-Gotham commented Apr 5, 2026 •

edited

Loading