Silence-based Chunked Transcription for long recordings with optional toggle #1252
Closed
meetuljain wants to merge 20 commits into cjpais:chunking from
Conversation
… template (cjpais#1175) Add a visible feature freeze section near the top of CONTRIBUTING.md so contributors see it before starting work. Fix "GATEHRED" typo in the PR template.
* audio_toolkit: remove long repeating words

  I've noticed a couple of times that Parakeet v3 may produce repeated words that are not folded by Handy, so I propose lifting the word-length limitation.

* Update text.rs

Co-authored-by: CJ Pais <cj@cjpais.com>
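The folding discussed above can be illustrated with a minimal sketch. This is a hypothetical example, not Handy's actual text.rs implementation: it collapses immediate repeats of the same word, compared case-insensitively, with no length cutoff.

```rust
// Hypothetical sketch of consecutive-duplicate folding (not Handy's actual
// text.rs logic): drop a word when it immediately repeats the previous one,
// regardless of word length.
fn fold_repeats(text: &str) -> String {
    let mut out: Vec<&str> = Vec::new();
    for word in text.split_whitespace() {
        // Keep the word only if it differs from the previous kept word.
        if out.last().map_or(true, |prev| !prev.eq_ignore_ascii_case(word)) {
            out.push(word);
        }
    }
    out.join(" ")
}
```

With input like "the the the cat sat sat" this yields "the cat sat".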
* fix: don't log cloud provider keys
* test: make sure keys are redacted
* change to newtype
* change secret struct to public
* test: test secretmap directly
* Update bindings.ts

Co-authored-by: Shaan <shaankhosla@macbook-pro.mynetworksettings.com>
Co-authored-by: CJ Pais <cj@cjpais.com>
Fixes cjpais#1184. The tray icon was created and refreshed without setting a tooltip, which caused Windows to show an empty hover tooltip. This change sets the tooltip when building the tray icon and refreshes it during tray menu updates using the existing version label.
Co-authored-by: CJ Pais <cj@cjpais.com>
* add cohere
* format
* add appropriate translations
* add chinese properly
* 0.3.8 sentencepiece fix
* update
* fix crash on old cpu
* fix
* disable whisper extensions
… in local models (cjpais#1221)

* perf: add reasoning_effort passthrough to avoid thinking-mode latency

  Reasoning models (Gemma 4, Qwen 3, etc.) default to thinking mode, adding 10-40x latency for simple transcript cleanup. Pass through the OpenAI-compatible reasoning_effort parameter so users can disable thinking when speed matters more than deep reasoning.

  - Add optional reasoning_effort field to ChatCompletionRequest
  - Thread through send_chat_completion / send_chat_completion_with_schema
  - New set_post_process_reasoning_effort Tauri command for persistence
  - Dropdown in post-processing settings, Custom provider only
  - i18n for en and ja

* refactor: simplify reasoning_effort to on/off toggle

  Addresses review feedback on the reasoning_effort passthrough: post-processing rarely benefits from reasoning, so a 5-way dropdown (default/none/low/medium/high) is overkill. Collapse to a single boolean toggle "Disable reasoning" that defaults to ON for Custom providers (sends reasoning_effort: "none"). Cloud providers continue to receive no reasoning_effort parameter.

  - Replace post_process_reasoning_effort: Option<String> with post_process_disable_reasoning: bool (default true) in settings
  - Map bool to Option<String> at the request site (true -> "none", false -> omit) so the API contract is unchanged
  - Rename Tauri command to set_post_process_disable_reasoning(disable) so parameter naming matches the setting and avoids inversion bugs
  - Swap the Dropdown for a ToggleSwitch in the Custom provider section
  - Collapse five i18n strings to one label + description, and add the translation to all 18 additional locales so non-en/ja users don't fall back to English for this control

* refactor: remove disable-reasoning toggle, always disable for custom provider

  Per PR cjpais#1221 discussion with @cjpais: since this is an alpha feature, simplify the UI by always sending reasoning_effort: "none" for custom provider post-processing. If this causes issues with specific provider setups, we can revisit.

* disable for openrouter too

Co-authored-by: CJ Pais <cj@cjpais.com>
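The bool-to-Option mapping described in that commit can be sketched as a standalone helper. The function name here is hypothetical; only the mapping itself (true sends "none", false omits the field so the request payload is unchanged for cloud providers) comes from the commit message.

```rust
// Sketch of the setting-to-parameter mapping from the commit above; the
// function name is hypothetical. A true value maps to reasoning_effort:
// "none"; false maps to None, i.e. the field is omitted from the request.
fn reasoning_effort_param(disable_reasoning: bool) -> Option<&'static str> {
    if disable_reasoning { Some("none") } else { None }
}
```

Mapping at the request site, rather than storing a string in settings, keeps the on-disk setting a simple boolean while leaving the API contract untouched.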
…#1198)

* fix: surface paste errors as UI toast notification (cjpais#522)

  When pasting the transcription fails (e.g. wtype/xdotool/dotool not available or returns an error on Linux), the error was silently logged to the backend console only. Users had no idea why their text was not pasted. Emit a new 'paste-error' Tauri event from the Rust side and listen for it in the frontend App component, showing a sonner toast with the error detail, mirroring the existing 'recording-error' pattern exactly.

  Files changed:
  - src-tauri/src/actions.rs: add PasteErrorEvent struct, emit event on paste failure
  - src/App.tsx: add useEffect listener that shows toast.error on paste-error
  - src/lib/types/events.ts: add PasteErrorEvent interface
  - src/i18n/locales/en/translation.json: add pasteFailedTitle/pasteFailed keys

* chore: apply cargo fmt and prettier formatting
* add translations

Co-authored-by: CJ Pais <cj@cjpais.com>
This PR adds an optional chunked transcription mode for long recordings.
Instead of waiting until the user stops recording and then transcribing the entire audio at once, Handy can now detect silence, split the recording into chunks, and transcribe completed chunks in the background while recording is still in progress.
This significantly improves perceived speed for long dictation / monologue use cases.
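The idea above can be sketched in a few lines. This is an illustrative sketch, not the PR's actual implementation; it assumes mono f32 samples and treats "silence" as a run of at least `min_silence` consecutive samples below an amplitude threshold (both parameters are assumptions for the example).

```rust
// Illustrative sketch of silence-based chunking (not the PR's actual code):
// split a mono f32 sample buffer into chunks wherever the signal stays below
// an amplitude threshold for at least `min_silence` consecutive samples.
// In the real feature, each completed chunk would be handed to the
// transcription backend while recording continues.
fn split_on_silence(samples: &[f32], threshold: f32, min_silence: usize) -> Vec<Vec<f32>> {
    let mut chunks = Vec::new();
    let mut current = Vec::new();
    let mut quiet_run = 0usize;
    for &s in samples {
        if s.abs() < threshold {
            quiet_run += 1;
        } else {
            quiet_run = 0;
        }
        current.push(s);
        // Close the chunk once a long enough stretch of silence has passed.
        if quiet_run >= min_silence {
            chunks.push(std::mem::take(&mut current));
            quiet_run = 0;
        }
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

A real implementation would run this incrementally on the live capture stream rather than on a finished buffer, but the boundary rule is the same.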
I'm aware Handy is currently under feature freeze, so I want to call out that this is based on a previously requested use case:
#179
Changes
- chunked_transcription_enabled

Trade-offs
Chunking improves speed, but because chunks are transcribed independently, context can be lost at chunk boundaries and transcription quality may be slightly lower in some cases. That is why this feature is optional.
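One concrete boundary problem is a word duplicated across a seam when chunk transcripts are concatenated. The sketch below is hypothetical and is not the PR's merging logic; it just illustrates one simple mitigation, dropping the first word of a chunk when it repeats the last word of the previous one.

```rust
// Hypothetical seam-handling sketch (not the PR's actual merging logic):
// join independently transcribed chunk texts, dropping a word that appears
// on both sides of a chunk boundary.
fn merge_transcripts(parts: &[&str]) -> String {
    let mut merged = String::new();
    for part in parts {
        let mut part = part.trim();
        if part.is_empty() {
            continue;
        }
        // Compare the last merged word with the chunk's first word.
        let last = merged.rsplit(' ').next().unwrap_or("");
        let first = part.split(' ').next().unwrap_or("");
        if !merged.is_empty() && !last.is_empty() && last.eq_ignore_ascii_case(first) {
            part = part[first.len()..].trim_start();
        }
        if !merged.is_empty() && !part.is_empty() {
            merged.push(' ');
        }
        merged.push_str(part);
    }
    merged
}
```

This catches only exact word-level duplicates; lost sentence context at boundaries cannot be recovered this way, which is the core reason the feature stays opt-in.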
Testing
Related discussion
AI disclosure
AI used: Yes
Tools used:
How it was used: