Silence-based Chunked Transcription for long recordings with optional toggle #1252

Closed
meetuljain wants to merge 20 commits into cjpais:chunking from meetuljain:chunking

Conversation

@meetuljain

This PR adds an optional chunked transcription mode for long recordings.

Instead of waiting until the user stops recording and then transcribing the entire audio at once, Handy can now detect silence, split the recording into chunks, and transcribe completed chunks in the background while recording is still in progress.
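As a rough illustration of the idea above (not the code in this PR), silence detection can be sketched with a per-frame RMS energy check. The `SilenceChunker` type, its fields, and the threshold values are all hypothetical; the actual implementation may use a different detector entirely.

```rust
/// Hypothetical silence-based chunker. Names and thresholds are
/// illustrative assumptions, not taken from the Handy codebase.
struct SilenceChunker {
    rms_threshold: f32,       // frames quieter than this count as silence
    min_silent_frames: usize, // consecutive silent frames that close a chunk
    silent_run: usize,
    has_speech: bool,
    buffer: Vec<f32>,
}

impl SilenceChunker {
    fn new(rms_threshold: f32, min_silent_frames: usize) -> Self {
        Self {
            rms_threshold,
            min_silent_frames,
            silent_run: 0,
            has_speech: false,
            buffer: Vec::new(),
        }
    }

    /// Feed one frame of samples. Returns a completed chunk once enough
    /// silence has followed speech; otherwise returns None.
    fn push_frame(&mut self, frame: &[f32]) -> Option<Vec<f32>> {
        if frame.is_empty() {
            return None;
        }
        let rms = (frame.iter().map(|s| s * s).sum::<f32>() / frame.len() as f32).sqrt();
        if rms < self.rms_threshold {
            self.silent_run += 1;
        } else {
            self.silent_run = 0;
            self.has_speech = true;
        }
        // Leading silence before any speech is dropped rather than buffered.
        if !self.has_speech {
            return None;
        }
        self.buffer.extend_from_slice(frame);
        if self.silent_run >= self.min_silent_frames {
            self.silent_run = 0;
            self.has_speech = false;
            return Some(std::mem::take(&mut self.buffer));
        }
        None
    }

    /// Flush whatever audio remains when the user stops recording.
    fn finish(&mut self) -> Option<Vec<f32>> {
        self.silent_run = 0;
        self.has_speech = false;
        if self.buffer.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.buffer))
        }
    }
}
```

In this sketch, each chunk returned by `push_frame` could be handed to a background transcription task while recording continues, and `finish` would supply the final chunk when recording stops.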

This improves perceived speed for long dictation and monologue use cases, because most of the audio has already been transcribed by the time the user stops recording.

I'm aware Handy is currently under feature freeze, so I want to call out that this is based on a previously requested use case:
#179

Changes

  • Added silence-based chunked transcription during recording
  • Added a toggle in Advanced Settings
  • Persisted the setting with chunked_transcription_enabled
  • Preserved the original transcription flow when disabled
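The relationship between the toggle and the two flows can be sketched as follows. Only the `chunked_transcription_enabled` name comes from this PR; the `Settings` struct, the `transcribe_recording` function, and the closure-based transcriber are hypothetical stand-ins. In the real feature the chunks are transcribed in the background during recording, which this synchronous sketch does not model.

```rust
/// Hypothetical settings struct; only the field name is from the PR.
struct Settings {
    chunked_transcription_enabled: bool,
}

/// Sketch of the dispatch: transcribe chunks independently when the
/// toggle is on, otherwise fall back to one pass over the whole recording.
fn transcribe_recording<F>(settings: &Settings, chunks: &[Vec<f32>], transcribe: F) -> String
where
    F: Fn(&[f32]) -> String,
{
    if settings.chunked_transcription_enabled {
        // Each chunk is transcribed without context from its neighbours,
        // which is where quality can drop at chunk boundaries.
        chunks
            .iter()
            .map(|c| transcribe(c.as_slice()))
            .filter(|t| !t.is_empty())
            .collect::<Vec<_>>()
            .join(" ")
    } else {
        // Original flow: the model sees the full audio at once.
        let whole: Vec<f32> = chunks.iter().flatten().copied().collect();
        transcribe(&whole)
    }
}
```

Keeping the disabled branch as a single call over the full audio is what lets the original behavior stay unchanged when the toggle is off.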

Trade-offs

Chunking improves speed, but because chunks are transcribed independently, context can be lost at chunk boundaries and transcription quality may be slightly lower in some cases. That is why this feature is optional.

Testing

  • Tested long recordings with chunking enabled
  • Confirmed earlier background transcription
  • Confirmed legacy behavior when disabled
  • Verified existing paste-error behavior remains intact

Related discussion

AI disclosure

AI used: Yes

Tools used:

  • Codex
  • ChatGPT

How it was used:

  • Used Codex to understand the code structure
  • Gave Codex pseudocode and it generated an initial implementation
  • I then manually reviewed the generated code, identified bugs, fixed them, and refined/consolidated the implementation
  • ChatGPT was used to help polish the PR text

Roger-Wu and others added 20 commits March 28, 2026 10:50
… template (cjpais#1175)

Add a visible feature freeze section near the top of CONTRIBUTING.md
so contributors see it before starting work. Fix "GATEHRED" typo in
the PR template.
* audio_toolkit: remove long repeating words

I've noticed a couple of times that Parakeet v3 may produce
repeated words that Handy does not fold, so I propose
lifting the word length limit.

* Update text.rs

---------

Co-authored-by: CJ Pais <cj@cjpais.com>
* fix: don't log cloud provider keys

* test: make sure keys are redacted

* change to newtype

* change secret struct to public

* test: test secretmap directly

* Update bindings.ts

---------

Co-authored-by: Shaan <shaankhosla@macbook-pro.mynetworksettings.com>
Co-authored-by: CJ Pais <cj@cjpais.com>
Fixes cjpais#1184.

The tray icon was created and refreshed without setting a tooltip, which caused Windows to show an empty hover tooltip.

This change sets the tooltip when building the tray icon and refreshes it during tray menu updates using the existing version label.
Co-authored-by: CJ Pais <cj@cjpais.com>
* add cohere

* format

* add appropriate translations

* add chinese properly

* 0.3.8 sentencepiece fix

* update
* fix crash on old cpu

* fix

* disable whisper extensions
… in local models (cjpais#1221)

* perf: add reasoning_effort passthrough to avoid thinking-mode latency

Reasoning models (Gemma 4, Qwen 3, etc.) default to thinking mode,
adding 10-40x latency for simple transcript cleanup. Pass through the
OpenAI-compatible reasoning_effort parameter so users can disable
thinking when speed matters more than deep reasoning.

- Add optional reasoning_effort field to ChatCompletionRequest
- Thread through send_chat_completion / send_chat_completion_with_schema
- New set_post_process_reasoning_effort Tauri command for persistence
- Dropdown in post-processing settings, Custom provider only
- i18n for en and ja

* refactor: simplify reasoning_effort to on/off toggle

Addresses review feedback on the reasoning_effort passthrough:
post-processing rarely benefits from reasoning, so a 5-way dropdown
(default/none/low/medium/high) is overkill. Collapse to a single
boolean toggle "Disable reasoning" that defaults to ON for Custom
providers (sends reasoning_effort: "none"). Cloud providers continue
to receive no reasoning_effort parameter.

- Replace post_process_reasoning_effort: Option<String> with
  post_process_disable_reasoning: bool (default true) in settings
- Map bool to Option<String> at the request site (true -> "none",
  false -> omit) so the API contract is unchanged
- Rename Tauri command to set_post_process_disable_reasoning(disable)
  so parameter naming matches the setting and avoids inversion bugs
- Swap the Dropdown for a ToggleSwitch in the Custom provider section
- Collapse five i18n strings to one label + description, and add the
  translation to all 18 additional locales so non-en/ja users don't
  fall back to English for this control

* refactor: remove disable-reasoning toggle, always disable for custom provider

Per PR cjpais#1221 discussion with @cjpais: since this is an alpha feature,
simplify the UI by always sending reasoning_effort: "none" for custom
provider post-processing. If this causes issues with specific provider
setups, we can revisit.

* disable for openrouter too

---------

Co-authored-by: CJ Pais <cj@cjpais.com>
…#1198)

* fix: surface paste errors as UI toast notification (cjpais#522)

When pasting the transcription fails (e.g. wtype/xdotool/dotool not
available or returns an error on Linux), the error was silently logged
to the backend console only.  Users had no idea why their text was not
pasted.

Emit a new 'paste-error' Tauri event from the Rust side and listen for
it in the frontend App component, showing a sonner toast with the error
detail — mirroring the existing 'recording-error' pattern exactly.

Files changed:
- src-tauri/src/actions.rs: add PasteErrorEvent struct, emit event on paste failure
- src/App.tsx: add useEffect listener that shows toast.error on paste-error
- src/lib/types/events.ts: add PasteErrorEvent interface
- src/i18n/locales/en/translation.json: add pasteFailedTitle/pasteFailed keys

* chore: apply cargo fmt and prettier formatting

* add translations

---------

Co-authored-by: CJ Pais <cj@cjpais.com>
@cjpais
Owner

cjpais commented Apr 8, 2026

#1173.

@cjpais cjpais closed this Apr 8, 2026