feat: add Fish Audio as native TTS provider #1526

Open
xuan0x0 wants to merge 14 commits into moeru-ai:main from xuan0x0:xuanpan/feat/fish-audio-tts-provider

Conversation

@xuan0x0 xuan0x0 commented Mar 30, 2026

Closes #1477

Summary

Adds Fish Audio as a native TTS provider, alongside existing providers like OpenAI, ElevenLabs, and Kokoro.

  • Provider integration — custom fetch override translates @xsai/generate-speech requests to Fish Audio's POST /v1/tts format (text + reference_id in the JSON body)
  • Voice discovery — fetches the user's own voice models and the top community public models in parallel, merges and deduplicates them (own models listed first)
  • CORS workaround (dev) — Fish Audio's API is server-to-server only and does not send Access-Control-Allow-Origin for browser origins. A Vite dev server proxy (/fish-audio-api → https://api.fish.audio) is added to apps/stage-web/vite.config.ts so requests appear same-origin in development. Users who deploy their own instance can point the base URL at their own reverse proxy.
  • Settings page — dedicated /settings/providers/speech/fish-audio page with API key input, base URL override, voice selector, and test playground
  • i18n — strings added for all 9 supported locales (en, zh-Hans, zh-Hant, ja, ko, fr, es, ru, vi)
  • Error visibility — TTS pipeline in Stage.vue was silently swallowing errors; now logs provider/model/voice/error on failure
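
The request translation in the first bullet can be sketched roughly as follows. This is an illustration, not the PR's actual code: it assumes the incoming body is an OpenAI-style JSON string with `input` and `voice` fields, and the helper name `toFishAudioRequest` is hypothetical. Only `POST /v1/tts`, `text`, `reference_id`, and the bearer token come from this thread.

```typescript
// Hypothetical helper: maps an OpenAI-style speech request body onto the
// Fish Audio TTS shape described above (text + reference_id in a JSON body).
interface FishAudioRequest {
  url: string
  method: 'POST'
  headers: Record<string, string>
  body: string
}

function toFishAudioRequest(baseUrl: string, apiKey: string, rawBody: unknown): FishAudioRequest {
  // Assumption: the caller sends a JSON string carrying `input` and `voice`.
  let parsed: { input?: unknown, voice?: unknown } = {}
  if (typeof rawBody === 'string') {
    try {
      const value: unknown = JSON.parse(rawBody)
      if (value !== null && typeof value === 'object')
        parsed = value as { input?: unknown, voice?: unknown }
    }
    catch {
      throw new Error('Speech request body is not valid JSON.')
    }
  }

  if (typeof parsed.input !== 'string' || parsed.input.length === 0)
    throw new Error('Speech request is missing input text.')

  // The selected voice id becomes Fish Audio's reference_id.
  const payload: { text: string, reference_id?: string } = { text: parsed.input }
  if (typeof parsed.voice === 'string' && parsed.voice.length > 0)
    payload.reference_id = parsed.voice

  return {
    // Drop trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, '')}/v1/tts`,
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${apiKey.trim()}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(payload),
  }
}
```

Keeping the translation as a pure function, separate from the actual `fetch` call, makes the body-parsing edge cases easy to unit test.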

Known limitations / follow-ups

  • Model selection is not forwarded — Fish Audio accepts the model via a model HTTP request header, but sending custom headers from the browser triggers a CORS preflight that Fish Audio's CDN rejects. The server defaults to s2-pro, which is the only model listed. This matches the behaviour of the official Fish Audio JS SDK.
  • Production deployment — the Vite dev proxy only covers local development. Production deployments need a user-provided reverse proxy (e.g. nginx, Cloudflare Worker) and should set the base URL field accordingly.
  • stage-tamagotchi (Electron) — Electron's renderer is also subject to CORS. A separate proxy via the Electron main process is not yet implemented and is left as a follow-up.
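
For reference, a dev proxy entry of the kind described above might look like the sketch below. The prefix and target are taken from the summary; `changeOrigin` and the exact `rewrite` rule are assumptions about how the PR configures it, and in a real `vite.config.ts` the object would be passed as `server.proxy` inside `defineConfig`.

```typescript
// Sketch of a Vite dev-server proxy entry (options are assumptions, see above).
const FISH_AUDIO_PROXY_PREFIX = '/fish-audio-api'

const fishAudioDevProxy = {
  [FISH_AUDIO_PROXY_PREFIX]: {
    target: 'https://api.fish.audio',
    // Send the upstream hostname in the Host header instead of localhost.
    changeOrigin: true,
    // Strip the local prefix: /fish-audio-api/v1/tts -> /v1/tts
    rewrite: (path: string): string => path.replace(/^\/fish-audio-api/, ''),
  },
}
```

A production reverse proxy (nginx, Cloudflare Worker) would need to perform the same prefix stripping, or the base URL field can point at the proxy root directly.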

Test plan

  1. Go to Settings → Providers → Fish Audio
  2. Enter a valid Fish Audio API key
  3. Open the voice dropdown — own models appear first, followed by popular public models
  4. Click Test to generate a sample clip
  5. Go to Settings → Modules → Speech, select Fish Audio as the active provider, pick a model (s2-pro) and a voice
  6. Start a chat — AIRI should respond with Fish Audio TTS audio

- Add fish-audio provider entry with a custom fetch override that
  translates xsai generateSpeech requests to Fish Audio's POST /v1/tts
  format (text + reference_id in body)
- Add Vite dev proxy (/fish-audio-api → https://api.fish.audio) to
  bypass browser CORS — Fish Audio's API is server-to-server only and
  does not send Access-Control-Allow-Origin for browser origins
- Add listFishAudioVoices helper that fetches own models and top public
  models in parallel, merges and deduplicates them with own models first
- Add fish-audio settings page with SpeechPlayground integration
- Add i18n strings for all 9 supported locales
- Improve TTS error logging in Stage.vue (was silently swallowing errors)

Made-with: Cursor
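
The merge-and-deduplicate step described for listFishAudioVoices can be sketched as below. The `VoiceModel` shape and both function names are hypothetical, and the loaders are injected so the sketch makes no assumption about which Fish Audio endpoints the PR calls.

```typescript
// Hypothetical voice shape; the real model objects likely carry more fields.
interface VoiceModel {
  id: string
  name: string
}

// Merge two voice lists, keeping the user's own models first and dropping
// any later entry whose id has already been seen.
function mergeVoices(own: VoiceModel[], popular: VoiceModel[]): VoiceModel[] {
  const seen = new Set<string>()
  const merged: VoiceModel[] = []
  for (const voice of [...own, ...popular]) {
    if (seen.has(voice.id))
      continue
    seen.add(voice.id)
    merged.push(voice)
  }
  return merged
}

// Both lookups start at once; Promise.all waits for the slower of the two.
async function listVoices(
  loadOwn: () => Promise<VoiceModel[]>,
  loadPopular: () => Promise<VoiceModel[]>,
): Promise<VoiceModel[]> {
  const [own, popular] = await Promise.all([loadOwn(), loadPopular()])
  return mergeVoices(own, popular)
}
```

Note that `Promise.all` rejects if either lookup fails; `Promise.allSettled` would be the alternative if the dropdown should still show own models when the public listing is unavailable.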

github-actions bot commented Mar 30, 2026

⏳ Approval required for deploying to Cloudflare Workers (Preview) for stage-web.

Name Link
🔭 Waiting for approval For maintainers, approve here

Hey, maintainers, kindly take some time to review and approve this deployment when you are available. Thank you! 🙏

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 71bd789da9

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread packages/stage-ui/src/stores/providers.ts Outdated
Comment thread packages/stage-pages/src/pages/settings/providers/speech/fish-audio.vue Outdated

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for the Fish Audio speech provider, including a new settings page, a Vite dev server proxy to handle CORS, and the necessary store logic for text-to-speech generation and voice listing. Feedback focuses on improving the robustness of the custom fetch implementation by forwarding abort signals for better resource management and adding safer handling for body parsing to prevent potential runtime exceptions.

Comment thread packages/stage-ui/src/stores/providers.ts
Comment thread packages/stage-ui/src/stores/providers.ts
xuan0x0 and others added 3 commits March 30, 2026 15:08
…voice reload

- Forward init.signal to the custom Fish Audio fetch so the HTTP request
  is cancelled when the TTS pipeline is aborted (e.g. user interrupts playback)
- Guard the Vite dev-server proxy rewrite with a userAgent Electron check;
  Electron's renderer has no matching proxy route so the rewrite caused 404s
- Debounce the apiKey/baseUrl watcher in fish-audio.vue (800 ms) to avoid
  firing repeated requests with partial credentials on every keystroke

Made-with: Cursor
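
Forwarding the abort signal, as in the first item of this commit, amounts to passing the caller's `signal` through to the rewritten request. A minimal sketch, with a hypothetical wrapper name and a reduced init type so it does not depend on DOM typings:

```typescript
// Minimal request-init shape; the real code would use the platform RequestInit.
interface MinimalInit {
  method?: string
  headers?: Record<string, string>
  body?: string
  signal?: AbortSignal | null
}

type FetchLike<R> = (url: string, init?: MinimalInit) => Promise<R>

// Wraps a fetch implementation so the rewritten request keeps the caller's
// AbortSignal instead of silently dropping it.
function withForwardedSignal<R>(baseFetch: FetchLike<R>, targetUrl: string): FetchLike<R> {
  return (_originalUrl, init) => baseFetch(targetUrl, {
    method: 'POST',
    headers: init?.headers,
    body: init?.body,
    // Aborting the TTS pipeline now cancels the upstream HTTP request too.
    signal: init?.signal,
  })
}
```

Without the `signal` line the wrapper would still work, but an interrupted playback would leave the request running until the server finished responding.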
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 80427db7be

Comment thread packages/stage-ui/src/stores/providers.ts
@nekomeowww added labels on Apr 7, 2026:

  • scope/providers: Scope related to providers we support
  • scope/agent: Scope related to how we harness agent, or build the agent workflow
  • scope/audio-output: Scope related to audio output (TTS, Voice cloning, etc.)
  • priority/nice-to-have: Issue, or Pull Request that nice to have but can be handled later

xuan0x0 and others added 4 commits April 7, 2026 20:20
Add /fish-audio-api proxy to apps/stage-pocket/vite.config.ts so that
stage-pocket DEV builds don't 404 when providers.ts rewrites the base
URL. Previously the rewrite was only guarded against Electron, causing
404s in Capacitor dev mode where no matching proxy existed. Update the
NOTICE comment in providers.ts to reference both vite configs.

Made-with: Cursor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c8bc7bfcc9

Comment thread packages/stage-ui/src/stores/providers.ts
validateProviderConfig now surfaces a clear error when the default
api.fish.audio URL is used in contexts where browser CORS will block
the request and no proxy/IPC path exists:
- Electron (all modes): renderer enforces CORS; IPC path not yet built
- Web/Capacitor production: Vite dev-server proxy is absent

Capacitor native builds (iOS/Android) are exempt because the OS HTTP
stack is not subject to browser CORS.

Users with a custom base URL (self-hosted CORS proxy) are unaffected.
Adds a TODO comment marking the Electron IPC path as follow-up work.

Made-with: Cursor
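
The context rules in this commit message can be expressed as a small pure function. The sketch below is an interpretation of the message, not the PR's validateProviderConfig: the `RuntimeContext` flags, the function name, and the error texts are all hypothetical.

```typescript
const DEFAULT_FISH_AUDIO_URL = 'https://api.fish.audio'

// Hypothetical environment flags; the real code detects these at runtime.
interface RuntimeContext {
  isElectron: boolean
  isCapacitorNative: boolean
  isDevServer: boolean
}

// Returns errors for contexts where the default URL would be blocked by
// browser CORS and no proxy or IPC path exists.
function checkFishAudioBaseUrl(baseUrl: string, ctx: RuntimeContext): Error[] {
  const usesDefault = baseUrl.trim().replace(/\/+$/, '') === DEFAULT_FISH_AUDIO_URL
  // A custom base URL (for example a self-hosted proxy) is left alone.
  if (!usesDefault)
    return []
  // Native iOS/Android builds use the OS HTTP stack, which ignores browser CORS.
  if (ctx.isCapacitorNative)
    return []
  // Electron's renderer enforces CORS in every mode; no IPC path exists yet.
  if (ctx.isElectron)
    return [new Error('Fish Audio needs a custom base URL in the desktop app.')]
  // Web builds only have the Vite proxy while the dev server is running.
  if (!ctx.isDevServer)
    return [new Error('Fish Audio needs a reverse proxy base URL in production.')]
  return []
}
```

Checking the custom-URL case first keeps the "users with a custom base URL are unaffected" guarantee independent of how the environment flags are derived.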

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 540df09ba7

Comment thread packages/stage-ui/src/stores/providers.ts
xuan0x0 and others added 4 commits April 9, 2026 20:26
…or guard

`navigator.gpu` is a WebGPU-only property not present in the standard
TypeScript DOM lib, causing TS2339 errors in packages whose tsconfigs do
not include WebGPU types. Replace all live usages of `!!navigator.gpu`
with `'gpu' in navigator` (boolean presence check, no extra lib needed),
and for call sites that also invoke `navigator.gpu.requestAdapter()` use
a minimal structural cast after the `in` guard confirms existence.

Made-with: Cursor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50aa33ca34

defaultOptions: () => {
const capabilities = getCachedWebGPUCapabilities()
- const hasWebGPU = capabilities?.supported ?? (typeof navigator !== 'undefined' && !!navigator.gpu)
+ const hasWebGPU = capabilities?.supported ?? (typeof navigator !== 'undefined' && 'gpu' in navigator)

P2: Check navigator.gpu value, not just property presence

Using 'gpu' in navigator can report WebGPU as available even when navigator.gpu resolves to undefined (for example in contexts where the API is exposed but not usable), which makes Kokoro default/model-selection logic treat unsupported environments as supported and then fail later when loading WebGPU models. The previous !!navigator.gpu guard avoided this false-positive path.
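
One way to satisfy both the type-checking constraint and this concern is to combine the `in` guard with a value check through a structural cast. The sketch below is a suggestion only; `hasUsableWebGPU` and `GpuLike` are hypothetical names, and the function takes the navigator object as a parameter so it can be exercised outside a browser.

```typescript
// Minimal structural type: only the member this check relies on.
interface GpuLike {
  requestAdapter: () => Promise<unknown>
}

// True only when `gpu` exists on the object AND holds a usable value, so an
// exposed-but-undefined property is not reported as WebGPU support.
function hasUsableWebGPU(nav: object | undefined): boolean {
  if (nav === undefined || !('gpu' in nav))
    return false
  // The cast avoids needing WebGPU typings in the package's tsconfig.
  const gpu = (nav as { gpu?: GpuLike | null }).gpu
  return gpu != null && typeof gpu.requestAdapter === 'function'
}
```

A call site would pass `typeof navigator !== 'undefined' ? navigator : undefined`, which keeps the existing guard for non-browser environments.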

Comment on lines +1818 to +1819
if (!config.apiKey) {
errors.push(new Error('API key is required.'))

P2: Trim Fish API key during provider validation

validateProviderConfig only checks !config.apiKey, so a whitespace-only key is treated as valid, but createProvider trims the key before use and will send an empty bearer token. This marks the provider configured while all voice/TTS requests fail with auth errors; validating config.apiKey.trim() would keep configured state aligned with runtime behavior.
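
A sketch of the suggested check, using a hypothetical standalone function rather than the PR's validateProviderConfig:

```typescript
// Validates the key the same way it will be used: after trimming.
function validateApiKey(config: { apiKey?: string }): Error[] {
  const errors: Error[] = []
  // Optional chaining covers a missing key; trim() rejects whitespace-only input.
  if (!config.apiKey?.trim())
    errors.push(new Error('API key is required.'))
  return errors
}
```

With this, a key of `'   '` is reported as missing instead of producing an empty bearer token at request time.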

Development

Successfully merging this pull request may close these issues.

feature request: natively support Fish Audio TTS

2 participants