refactor: improve transcription UX and DX #1685

Draft
drHuangMHT wants to merge 14 commits into moeru-ai:main from drHuangMHT:use-transcriptions

Conversation

@drHuangMHT
Contributor

@drHuangMHT drHuangMHT commented Apr 18, 2026

Description

  • Move transcription-related code into a composable to improve readability (tests included).
  • The microphone icon color now reflects whether transcription is running.
  • Support transcription in portrait orientation.

Linked Issues

Additional Context

@github-actions
Contributor

github-actions Bot commented Apr 18, 2026

⏳ Approval required for deploying to Cloudflare Workers (Preview) for stage-web.

Name Link
🔭 Waiting for approval For maintainers, approve here

Hey, maintainers, kindly take some time to review and approve this deployment when you are available. Thank you! 🙏

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request refactors transcription and auto-send logic by extracting it from ChatArea.vue into a new reusable composable, useTranscriptions.ts. This composable is then integrated into both ChatArea.vue and MobileInteractiveArea.vue to provide consistent transcription behavior across layouts. Review feedback suggests removing a redundant try-catch block in the new composable and transitioning from string-prefixed log messages to programmatic identifiers in metadata for better extensibility.

Comment thread packages/stage-layouts/src/composables/use-transcriptions.ts Outdated
Comment thread packages/stage-layouts/src/composables/use-transcriptions.ts Outdated
@drHuangMHT
Contributor Author

blocked by #1569

@lietblue lietblue added the feature, scope/audio-input, and scope/ui labels Apr 21, 2026
@drHuangMHT drHuangMHT marked this pull request as ready for review April 25, 2026 11:16

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f09a8e2ea1

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

{
  messageInputRef: messageInput,
  sendMessage: handleSend,
  isStageTamagotchi: isMobileDevice,

P1: Pass runtime env check instead of mobile-user-agent flag

useTranscriptions expects isStageTamagotchi to mean the Electron runtime, but this call passes isMobileDevice (user-agent mobile detection). On mobile web/Capacitor this evaluates true, so startStreaming treats the app like Electron and skips Web Speech auto-configuration, causing transcription startup to fail when no provider is preconfigured. This regresses the new portrait/mobile transcription path for first-time users.
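
The failure mode described above can be sketched in isolation. This is a minimal model, not the composable's actual code: `planTranscriptionStartup`, `StartupFlags`, and the three plan names are hypothetical, and the rule that the Electron path skips the Web Speech fallback is taken only from the comment above.

```typescript
// Hypothetical model of the startup decision described in the review comment.
interface StartupFlags {
  // Intended meaning: the app runs inside the Electron runtime.
  isStageTamagotchi: boolean
  // Whether the user has already configured a transcription provider.
  hasConfiguredProvider: boolean
}

type StartupPlan
  = | 'use-configured-provider'
    | 'auto-configure-web-speech'
    | 'fail-no-provider'

function planTranscriptionStartup(flags: StartupFlags): StartupPlan {
  if (flags.hasConfiguredProvider)
    return 'use-configured-provider'
  // Assumption from the review: the Electron path skips Web Speech auto-configuration.
  if (flags.isStageTamagotchi)
    return 'fail-no-provider'
  return 'auto-configure-web-speech'
}

// A first-time user on mobile web: mobile user agent, not Electron, no provider yet.
const isMobileDevice = true
const isElectronRuntime = false

// Passing the user-agent flag makes the app behave as if it were Electron.
const buggyPlan = planTranscriptionStartup({
  isStageTamagotchi: isMobileDevice,
  hasConfiguredProvider: false,
})

// Passing a runtime check keeps the Web Speech fallback available.
const fixedPlan = planTranscriptionStartup({
  isStageTamagotchi: isElectronRuntime,
  hasConfiguredProvider: false,
})

console.log(buggyPlan, fixedPlan)
```

Under these assumptions the first call ends in `fail-no-provider` and the second in `auto-configure-web-speech`, which is the difference the comment is pointing at.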

Comment on lines +65 to +66
clearPendingAutoSend()
await stopStreamingTranscription(true)

P2: Send pending transcript when stopping listening

Stopping transcription now cancels the debounce timer unconditionally and never flushes the last pending auto-send, so if a user turns the microphone off during the debounce window their final dictated sentence is left unsent. The previous logic explicitly sent pending text on stop; this refactor drops that behavior and can silently lose expected auto-send delivery in normal use.
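
The two stop behaviors in question, cancelling the debounce versus flushing it, can be compared with a small dependency-free sketch. `createPendingAutoSend` and its method names are hypothetical; only `clearPendingAutoSend()` appears in the diff, and its real implementation is not shown here.

```typescript
// Hypothetical debounce helper: holds the latest transcript until the timer fires.
function createPendingAutoSend(send: (text: string) => void, delayMs: number) {
  let timer: ReturnType<typeof setTimeout> | undefined
  let pendingText: string | undefined

  // Drop the pending transcript without sending it (cancel-on-stop).
  function clear(): void {
    if (timer !== undefined)
      clearTimeout(timer)
    timer = undefined
    pendingText = undefined
  }

  // Send the pending transcript immediately, if there is one (flush-on-stop).
  function flush(): void {
    const text = pendingText
    clear()
    if (text !== undefined)
      send(text)
  }

  // Restart the debounce window with the latest transcript.
  function schedule(text: string): void {
    clear()
    pendingText = text
    timer = setTimeout(flush, delayMs)
  }

  return { schedule, clear, flush }
}

const sent: string[] = []
const autoSend = createPendingAutoSend((text) => { sent.push(text) }, 1000)

// Stop inside the debounce window, cancelling: nothing is sent.
autoSend.schedule('first sentence')
autoSend.clear()

// Stop inside the debounce window, flushing: the sentence is delivered.
autoSend.schedule('second sentence')
autoSend.flush()

console.log(sent) // [ 'second sentence' ]
```

Which of the two a stop action should use is a product decision; the sketch only makes the difference observable.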

Contributor Author

intended behavior


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0cc7a1f169

Comment on lines +203 to +207
watch(autoSendEnabled, (enabled) => {
  if (!enabled) {
    clearPendingAutoSend()
    console.info('Auto-send disabled', { source: 'useTranscriptions' })
  }

P1: Stop active transcription when microphone is disabled

This composable no longer reacts to enabled/stream changes, so turning the microphone off in HearingConfig can leave an active transcription session running until manual stop or unmount. That is especially problematic for the Web Speech provider path, which is designed to run continuously until stopStreamingTranscription() is explicitly called. Repro: start transcription, then disable the mic; isListening stays active and speech callbacks can continue even though the mic UI is off.
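
One way to picture the missing reaction is a subscription on the microphone flag that stops the session when the flag turns off. In the composable this would presumably be a Vue `watch` on `enabled`; the hand-rolled `createObservable` below is a hypothetical stand-in so the sketch runs without dependencies, and the session object and stub functions are assumptions as well.

```typescript
type Listener<T> = (value: T) => void

// Hypothetical stand-in for a reactive ref: notifies subscribers on change.
function createObservable<T>(initial: T) {
  let current = initial
  const listeners = new Set<Listener<T>>()
  return {
    get value(): T {
      return current
    },
    set value(next: T) {
      if (next === current)
        return
      current = next
      listeners.forEach(listener => listener(next))
    },
    subscribe(listener: Listener<T>): () => void {
      listeners.add(listener)
      return () => { listeners.delete(listener) }
    },
  }
}

// Stubbed session state; the real composable talks to a speech provider.
const session = { isListening: false, stopCalls: 0 }

function startStreamingTranscription(): void {
  session.isListening = true
}

function stopStreamingTranscription(): void {
  session.isListening = false
  session.stopCalls += 1
}

const enabled = createObservable(true)

// React to the mic being switched off while a session is active.
enabled.subscribe((isEnabled) => {
  if (!isEnabled && session.isListening)
    stopStreamingTranscription()
})

// Repro from the comment: start transcription, then disable the mic.
startStreamingTranscription()
enabled.value = false

console.log(session.isListening) // false
```

Without the subscription, `session.isListening` would stay `true` after the mic is disabled, which matches the repro in the comment.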

Comment on lines 166 to 168
watch([enabled], () => {
  setupAnalyzer()
}, { immediate: true })

P2: Re-run analyzer setup when popover or stream changes

setupAnalyzer() now depends on hearingPopoverOpen and stream, but the watcher only observes enabled. If the mic is already enabled, opening the hearing popover (or receiving the stream slightly later) no longer re-initializes the analyzer, so the volume visualization can stay inactive until enabled is toggled again.
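
The effect of watching too few sources can be simulated without Vue. In this rough sketch, `simulate`, `HearingState`, and the boolean `hasStream` are hypothetical simplifications; the only facts taken from the comment are that `setupAnalyzer()` gates on the popover and the stream, and that the watcher runs immediately.

```typescript
interface HearingState {
  enabled: boolean
  hearingPopoverOpen: boolean
  hasStream: boolean
}

type StateKey = keyof HearingState

// Replays state changes and re-runs setup only when a watched key changes.
function simulate(watched: StateKey[], changes: Array<Partial<HearingState>>): boolean {
  const state: HearingState = { enabled: false, hearingPopoverOpen: false, hasStream: false }
  let analyzerActive = false

  // Assumed gate: the analyzer only starts when all three conditions hold.
  const setupAnalyzer = (): void => {
    analyzerActive = state.enabled && state.hearingPopoverOpen && state.hasStream
  }

  setupAnalyzer() // mirrors { immediate: true }

  for (const change of changes) {
    const changedKeys = (Object.keys(change) as StateKey[])
      .filter(key => change[key] !== state[key])
    Object.assign(state, change)
    if (changedKeys.some(key => watched.includes(key)))
      setupAnalyzer()
  }

  return analyzerActive
}

// Mic enabled first, stream arrives later, popover opened last.
const changes: Array<Partial<HearingState>> = [
  { enabled: true },
  { hasStream: true },
  { hearingPopoverOpen: true },
]

const onlyEnabledWatched = simulate(['enabled'], changes)
const allSourcesWatched = simulate(['enabled', 'hearingPopoverOpen', 'hasStream'], changes)

console.log(onlyEnabledWatched, allSourcesWatched) // false true
```

With only `enabled` watched, the last setup run happens before the stream and popover are ready, so the analyzer never activates in this ordering.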

Comment on lines +91 to +94
v-if="autoSend"
label="Auto send"
:variant="autoSend ? 'primary' : 'secondary'"
flex-1

P2: Render auto-send toggle even when the value is false

The auto-send button is gated by v-if="autoSend", so it disappears whenever the setting is false. Since autoSendEnabled defaults to false, users cannot enable auto-send from this dialog at all, and once disabled they cannot turn it back on here.

@drHuangMHT drHuangMHT marked this pull request as draft May 2, 2026 03:56
@drHuangMHT drHuangMHT changed the title from "refactor: move transcription related code into composable" to "refactor: improve transcription UX and DX" May 2, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9915f883a3

}

- watch([hearingDialogOpen, enabled, stream], () => {
+ watch([enabled], () => {

P2: Watch dialog and stream changes when setting up analyzer

setupAnalyzer() still gates on hearingDialogOpen and stream, but the watcher now only tracks enabled. If the mic is already enabled, opening the hearing dialog or receiving the stream later will no longer re-run setup, so the volume meter can remain inactive until enabled is toggled again. This is a regression from the previous dependency set and affects normal UI interaction on mobile.

})

it('should expose startListening and stopListening', () => {
  const { startListening, stopListening } = useTranscriptions(createOptions())

P1: Use the composable's exported method names in tests

The tests destructure startListening/stopListening, but useTranscriptions() now returns startStreamingTranscription/stopStreamingTranscription. As written, these variables are undefined, so calls like await startListening() will throw and make the new test suite fail instead of validating behavior.
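
The runtime effect of the name mismatch can be reproduced with a stub. `useTranscriptionsStub` is a hypothetical stand-in that returns only the two method names mentioned above; the cast to `Record<string, unknown>` mimics a test run that does not type-check, since a type-checked destructure of a missing key would be rejected at compile time.

```typescript
// Hypothetical stub exposing only the method names the comment says are returned.
function useTranscriptionsStub() {
  return {
    startStreamingTranscription: async (): Promise<string> => 'started',
    stopStreamingTranscription: async (): Promise<string> => 'stopped',
  }
}

// Loosen the type to mimic a test runner that executes without type checking.
const looselyTyped: Record<string, unknown> = useTranscriptionsStub()

// Old names: the keys do not exist, so both bindings are undefined.
const { startListening, stopListening } = looselyTyped

// Current names: both bindings are callable functions.
const { startStreamingTranscription, stopStreamingTranscription } = useTranscriptionsStub()

console.log(typeof startListening, typeof stopListening)
console.log(typeof startStreamingTranscription, typeof stopStreamingTranscription)
```

Calling an `undefined` binding throws a `TypeError`, so a test written against the old names would fail before reaching any assertion.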

Labels

  • feature: Related to feature
  • scope/audio-input: Scope related to audio input (ASR, Whisper, STT, etc.)
  • scope/ui: Scope related to UI/UX, or interface improve, perf, and bugs

2 participants