Binary file added api.jar
Binary file not shown.
3 changes: 2 additions & 1 deletion build.gradle
@@ -16,11 +16,12 @@ configurations.runtimeClasspath {
 
 dependencies {
     implementation platform('run.halo.tools.platform:plugin:2.24.0')
-    implementation 'com.theokanning.openai-gpt3-java:api:0.17.0'
 
     compileOnly 'run.halo.app:api'
+    compileOnly files('api.jar')
 
     testImplementation 'run.halo.app:api'
+    testImplementation files('api.jar')
     testImplementation 'org.springframework.boot:spring-boot-starter-test'
     testRuntimeOnly 'org.junit.platform:junit-platform-launcher'
 }
@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-05-15
@@ -0,0 +1,80 @@
## Context

Live2D chat currently sends the frontend's message history to `/live2d/ai/chat-process`, where the plugin prepends a system prompt and then delegates generation to plugin-owned `ChatClient` implementations. The only concrete client today is an OpenAI-specific streaming client backed by plugin settings for proxy, token, base URL, and model selection.
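
The prompt-prepending step described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: `ChatMessage` and `PromptAssembler` are hypothetical names, and the `"system"` role string is an assumption about the message format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical message type; the plugin's real class is not shown in this change.
record ChatMessage(String role, String content) {}

final class PromptAssembler {
    private PromptAssembler() {}

    /**
     * Returns a new immutable list with the system prompt first,
     * followed by the frontend's message history in its original order.
     * A null or blank prompt is skipped rather than sent as an empty message.
     */
    static List<ChatMessage> withSystemPrompt(String systemPrompt, List<ChatMessage> history) {
        List<ChatMessage> messages = new ArrayList<>(history.size() + 1);
        if (systemPrompt != null && !systemPrompt.isBlank()) {
            messages.add(new ChatMessage("system", systemPrompt));
        }
        messages.addAll(history);
        return List.copyOf(messages);
    }
}
```

Keeping this step on the server means the persona prompt never has to be shipped to, or trusted from, the browser.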

The target state is to keep the Live2D chat UX and endpoint shape stable while delegating model discovery and invocation to Halo AI infrastructure. The upstream SDK is not yet available from a published repository, so this change must integrate through the provided `api-1.0.0-SNAPSHOT.jar` and the documented `AiServices` / `AiModelService` APIs.

## Goals / Non-Goals

**Goals:**
- Route Live2D chat generation through Halo AI foundation instead of the plugin's custom OpenAI client stack.
- Preserve the existing frontend chat contract so the widget keeps working with minimal behavioral change.
- Reduce plugin-owned AI configuration to chat-specific controls only: enablement, anonymous access, role prompt, model selection, and frontend timing values.
- Remove obsolete backend classes, settings, and dependencies once Halo AI integration is in place.

**Non-Goals:**
- Redesign the Live2D chat UI or change the existing streaming presentation model.
- Add provider management, credential storage, or model provisioning inside the Live2D plugin.
- Generalize this change into a reusable abstraction for non-chat AI features.

## Decisions

### Use Halo AI foundation as the only chat backend
The backend will replace `ChatClient`-based provider dispatch with a single service path built on `AiServices.getModelService()`. The endpoint will obtain the configured Halo model name, resolve a `LanguageModel`, build a foundation `ChatRequest`, and stream foundation `ChatChunk` events back into the existing SSE response format consumed by the frontend.

**Why this approach**
- It removes duplicated provider integration code from the plugin.
- It aligns provider credentials and model lifecycle with Halo's centralized AI management.
- It minimizes frontend churn because the SSE endpoint can still emit the current `ChatResult` payload shape.

**Alternatives considered**
- Keep the `ChatClient` abstraction and add a Halo-backed implementation: rejected because the abstraction exists only to multiplex plugin-owned providers and would preserve unnecessary complexity.
- Proxy raw Halo AI responses directly to the frontend: rejected because the frontend already depends on `ChatResult` chunks and changing that contract would widen the migration surface.
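
To make the adapter idea concrete, here is a minimal sketch of turning text chunks into SSE frames that end with a `[DONE]` marker. It deliberately avoids the Halo AI SDK types, whose signatures are not shown in this change: chunks are modeled as plain strings, `SseChatAdapter` is a hypothetical name, and the `{"text": ...}` payload is only a stand-in for the real `ChatResult` shape.

```java
import java.util.ArrayList;
import java.util.List;

final class SseChatAdapter {
    // Completion marker of the existing stream, as named in the risks section.
    static final String DONE = "[DONE]";

    private SseChatAdapter() {}

    /** Formats one payload as an SSE frame: one "data:" field per payload line, then a blank line. */
    static String frame(String payload) {
        StringBuilder sb = new StringBuilder();
        for (String line : payload.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        return sb.append('\n').toString();
    }

    /**
     * Hypothetical stand-in for serializing a ChatResult.
     * Minimal escaping for illustration only; real code should use a JSON library.
     */
    static String toChatResultJson(String text) {
        String escaped = text.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
        return "{\"text\":\"" + escaped + "\"}";
    }

    /** Converts text chunks into SSE frames, skipping empty chunks and appending the terminator. */
    static List<String> adapt(Iterable<String> textChunks) {
        List<String> frames = new ArrayList<>();
        for (String chunk : textChunks) {
            if (chunk == null || chunk.isEmpty()) {
                continue;
            }
            frames.add(frame(toChatResultJson(chunk)));
        }
        frames.add(frame(DONE));
        return frames;
    }
}
```

Concentrating the translation in one place like this is what lets the frontend contract stay unchanged even if the foundation's chunk types evolve.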

### Keep a plugin-level model selector, but not provider credentials
The plugin will retain a single AI-model identifier field that stores the Halo AI model resource name (for example `openai/gpt-4o`) alongside the existing persona prompt and access-control settings. Provider token, base URL, proxy, and provider enablement will move fully out of this plugin.

**Why this approach**
- Halo AI foundation still requires the caller to request a concrete model.
- Selecting a model per plugin preserves expected plugin autonomy without reintroducing provider-specific configuration.
- It avoids brittle "first configured model wins" behavior.

**Alternatives considered**
- Always use the first available Halo model: rejected because availability order is undefined and can change across environments.
- Add no plugin-level model choice and wait for a Halo-wide default model API: rejected because the current SDK and documentation do not define that capability.
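
Because the model identifier is a free-form setting, the backend would need to normalize it before use. The sketch below shows one possible shape for that check, under stated assumptions: `ModelNameSetting` is a hypothetical helper, the error wording is illustrative, and no particular identifier syntax is enforced because this document only gives `openai/gpt-4o` as an example.

```java
import java.util.Optional;

final class ModelNameSetting {
    private ModelNameSetting() {}

    /** Trims the raw setting value; null or blank input yields an empty Optional. */
    static Optional<String> normalize(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String trimmed = raw.trim();
        return trimmed.isEmpty() ? Optional.empty() : Optional.of(trimmed);
    }

    /** Returns the configured model name or fails with a message aimed at administrators. */
    static String requireConfigured(String raw) {
        return normalize(raw).orElseThrow(() ->
                new IllegalStateException("Live2D AI chat has no Halo AI model configured"));
    }
}
```

Whether the name actually resolves to a model can only be decided by the Halo AI model service at request time, so this check covers only the missing or blank case.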

### Depend on the local SDK jar until the upstream artifact is published
The Gradle build will reference the provided `api-1.0.0-SNAPSHOT.jar` as a compile-time dependency and remove the old OpenAI SDK dependency that is no longer needed.

**Why this approach**
- It matches the currently available distribution mechanism.
- It keeps the migration unblocked without inventing a separate publishing pipeline.

**Alternatives considered**
- Depend on a Maven coordinate that is not yet published: rejected because builds would not be reproducible in the current repository state.
- Vendor copied source or decompiled classes: rejected because it would increase maintenance cost and drift from the upstream SDK.

### Preserve backend-to-frontend error semantics at the plugin boundary
The endpoint will translate Halo AI foundation failures into readable plugin responses consistent with the current chat UX: unauthorized access remains an HTTP 401, while model/plugin/provider failures stream a user-facing error message through the existing SSE contract and terminate cleanly.

**Why this approach**
- The current frontend already handles 401 and streamed error text.
- It avoids exposing raw provider internals in the browser while still giving administrators actionable logs on the server side.
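
The streamed-failure half of this decision could look roughly like the sketch below: the failure is logged in full on the server, while the browser only sees a fixed message. `SafeChatInvoker` and the message wording are hypothetical, and generation is modeled as a plain `Supplier<String>` rather than the real streaming API.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

final class SafeChatInvoker {
    private static final Logger LOG = Logger.getLogger(SafeChatInvoker.class.getName());

    // Hypothetical wording; the plugin's actual user-facing text is not shown in this change.
    static final String USER_MESSAGE = "AI chat is temporarily unavailable. Please try again later.";

    private SafeChatInvoker() {}

    /**
     * Runs the generation call. On failure, logs the full exception for administrators
     * and returns a fixed message so provider internals never reach the browser.
     */
    static String replyOrFallback(Supplier<String> generate) {
        try {
            return generate.get();
        } catch (RuntimeException e) {
            LOG.log(Level.WARNING, "Halo AI chat generation failed", e);
            return USER_MESSAGE;
        }
    }
}
```

In the real endpoint the fallback text would be emitted through the same SSE path as normal replies, followed by the usual terminator.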

## Risks / Trade-offs

- **[SDK distribution friction]** Local jar dependency is less ergonomic than a published artifact → Document the dependency strategy in the change and isolate it so it can be swapped to a Maven coordinate later.
- **[Model selection discoverability]** A free-form model name field is easier to misconfigure than a populated dropdown → Validate the field on use, surface a clear error when the model cannot be resolved, and consider a follow-up enhancement for dynamic options if Halo exposes them.
- **[Streaming contract translation]** Halo AI chunk types may not map 1:1 to the plugin's current `[DONE]`-terminated SSE stream → Normalize chunk types in one backend adapter layer and keep the frontend contract unchanged.
- **[Plugin dependency coupling]** Live2D chat will now require the ai-foundation plugin at runtime when AI chat is enabled → Fail fast with a clear administrative message when the dependency is missing or disabled.

## Migration Plan

1. Add the Halo AI SDK dependency from the provided jar and remove the obsolete OpenAI SDK usage.
2. Replace the custom backend generation path with a Halo AI-backed service that adapts foundation chunks into existing `ChatResult` SSE events.
3. Simplify `settings.yaml` and runtime config shaping to remove provider/proxy fields while keeping chat-specific controls.
4. Verify the frontend works unchanged against the preserved endpoint contract, then remove dead backend classes and configuration paths.

## Open Questions

- Will Halo soon expose a canonical default chat model API? If it does, the plugin-level model name field could be simplified in a follow-up change.
@@ -0,0 +1,28 @@
## Why

Live2D chat currently depends on plugin-owned AI settings, custom OpenAI client logic, and proxy/token management that duplicate capabilities Halo is beginning to provide centrally. Migrating to Halo AI infrastructure will reduce duplicated backend code, align chat behavior with Halo-wide model management, and let administrators configure AI once instead of per plugin.

## What Changes

- Replace the plugin's custom OpenAI client pipeline with Halo AI infrastructure as the Live2D chat backend integration.
- Use the AI foundation SDK from the provided `api-1.0.0-SNAPSHOT.jar` until the upstream repository artifact is publicly available.
- Keep the existing Live2D chat entry point and streaming user experience, but source model access and streaming responses from Halo AI services.
- Simplify plugin settings so AI chat keeps only widget-specific behavior controls and persona prompt configuration.
- **BREAKING** Remove plugin-specific provider settings such as OpenAI token/base URL/model and proxy configuration from the Live2D plugin.
- Remove unused custom backend AI classes and dependency wiring after Halo AI integration is in place.

## Capabilities

### New Capabilities
- `halo-ai-chat-integration`: Live2D chat streams replies through Halo AI foundation services while preserving the widget-facing chat API and persona behavior.

### Modified Capabilities
- None.

## Impact

- Backend chat flow under `src/main/java/run/halo/live2d/chat/**`
- Plugin settings schema in `src/main/resources/extensions/settings.yaml`
- Runtime config shaping in `src/main/java/run/halo/live2d/Live2dSettingProcess.java`
- Frontend chat client and widget integration in `packages/live2d/src/**`
- Build dependency management in `build.gradle`
@@ -0,0 +1,46 @@
## ADDED Requirements

### Requirement: Live2D chat SHALL stream replies through Halo AI foundation
When AI chat is enabled, the Live2D backend SHALL invoke Halo AI foundation for chat generation instead of any plugin-owned provider client, while preserving the existing widget-facing streaming endpoint contract.

#### Scenario: Chat requests are fulfilled by Halo AI foundation
- **WHEN** a user sends a chat message to `/live2d/ai/chat-process`
- **THEN** the backend MUST resolve the configured Halo AI language model through `AiServices.getModelService()`
- **AND** it MUST submit the system prompt plus conversation history to that model
- **AND** it MUST stream assistant text chunks back through the existing SSE response format

#### Scenario: Stream completion preserves the current frontend terminator
- **WHEN** Halo AI foundation reports a successful end of generation
- **THEN** the backend MUST emit the plugin's existing completion marker payload for the frontend consumer
- **AND** it MUST complete the SSE response without requiring frontend protocol changes

### Requirement: Live2D AI chat settings SHALL only keep plugin-specific chat controls
The plugin SHALL configure Live2D chat through Halo-integrated settings that keep widget-specific behavior in this plugin and remove provider-specific connection settings from the plugin schema.

#### Scenario: Plugin settings retain persona and model selection
- **WHEN** an administrator enables AI chat in the Live2D plugin settings
- **THEN** the plugin MUST allow configuration of anonymous-access behavior, persona/system prompt, Halo AI model identifier, and frontend timing controls
- **AND** those settings MUST be used to build chat requests and widget behavior

#### Scenario: Provider-specific settings are removed from the plugin
- **WHEN** an administrator opens the Live2D AI chat settings after this change
- **THEN** OpenAI token, base URL, provider toggle, proxy host, and proxy port settings MUST NOT be present in the plugin configuration form
- **AND** provider credentials and provider enablement MUST be managed through Halo AI infrastructure instead

#### Scenario: Public runtime config excludes backend-only AI foundation settings
- **WHEN** the plugin exposes public runtime configuration to the frontend
- **THEN** it MUST continue exposing AI chat enablement and widget timing fields needed by the browser
- **AND** it MUST NOT expose backend-only Halo model identifiers or provider configuration details to the public config payload
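
One way to satisfy this scenario is an allow-list: only keys known to be needed by the browser are copied into the public payload, so a newly added backend-only field is hidden by default. The sketch below is illustrative only. `PublicConfigFilter` and every key name in it are hypothetical, not the plugin's real setting names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class PublicConfigFilter {
    // Hypothetical key names for chat enablement and widget timing fields.
    private static final Set<String> PUBLIC_KEYS =
            Set.of("aiChatEnabled", "typingDelayMs", "messageTimeoutMs");

    private PublicConfigFilter() {}

    /** Copies only allow-listed keys, preserving the input's iteration order. */
    static Map<String, Object> toPublic(Map<String, Object> settings) {
        Map<String, Object> publicConfig = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : settings.entrySet()) {
            if (PUBLIC_KEYS.contains(entry.getKey())) {
                publicConfig.put(entry.getKey(), entry.getValue());
            }
        }
        return publicConfig;
    }
}
```

An allow-list is chosen here over a deny-list because forgetting to update it fails safe: the worst case is a missing widget field, not a leaked model identifier.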

### Requirement: Live2D chat SHALL surface Halo AI dependency failures clearly
The Live2D chat integration SHALL translate Halo AI foundation availability and model-resolution failures into stable plugin behavior for both users and administrators.

#### Scenario: Anonymous-disabled chat still enforces login
- **WHEN** AI chat is enabled but anonymous chat is disabled and an unauthenticated user sends a message
- **THEN** the endpoint MUST reject the request with HTTP 401
- **AND** it MUST NOT invoke Halo AI foundation for that request
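
The ordering requirement in this scenario (reject first, never call the model) can be sketched as a guard that runs before the generation call is even evaluated. `ChatAccessGate` and its `Response` record are hypothetical, and the AI call is modeled as a `Supplier<String>` so the example stays independent of the SDK.

```java
import java.util.function.Supplier;

final class ChatAccessGate {
    /** Simplified response: HTTP status plus body text. */
    record Response(int status, String body) {}

    private ChatAccessGate() {}

    /**
     * Assumes AI chat is already enabled. Rejects unauthenticated callers with 401
     * when anonymous chat is disabled; the supplier is not invoked in that case.
     */
    static Response handle(boolean anonymousAllowed, boolean authenticated,
                           Supplier<String> generate) {
        if (!anonymousAllowed && !authenticated) {
            return new Response(401, "");
        }
        return new Response(200, generate.get());
    }
}
```

Passing the generation step as a lazily evaluated supplier makes the "MUST NOT invoke" clause easy to verify in a unit test with a call counter.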

#### Scenario: Missing AI foundation dependency or model configuration is reported cleanly
- **WHEN** the ai-foundation plugin is unavailable, disabled, or the configured Halo model cannot be resolved
- **THEN** the backend MUST log the failure for administrators
- **AND** it MUST return a user-facing chat failure message through the existing plugin response flow instead of an unhandled server error
@@ -0,0 +1,21 @@
## 1. Dependency and integration setup

- [x] 1.1 Add the Halo AI foundation SDK from `api-1.0.0-SNAPSHOT.jar` to the Gradle build and remove the obsolete OpenAI SDK dependency.
- [x] 1.2 Introduce a Halo AI-backed chat service path that obtains `AiModelService` via `AiServices` and resolves the configured language model.

## 2. Backend chat migration

- [x] 2.1 Replace the current `ChatClient`-based backend flow with a single Halo AI streaming adapter that converts foundation chunks into existing `ChatResult` SSE events.
- [x] 2.2 Preserve current access-control behavior in `AiChatEndpoint`, including anonymous toggle handling and HTTP 401 responses for unauthenticated requests.
- [x] 2.3 Add backend error handling for missing ai-foundation availability, disabled providers, and unknown model names so failures become readable plugin responses and server logs.

## 3. Settings and runtime config cleanup

- [x] 3.1 Simplify `settings.yaml` to remove OpenAI and proxy settings while adding or retaining only chat-specific controls needed after the Halo migration.
- [x] 3.2 Update `Live2dSettingProcess` so public runtime config still exposes frontend timing fields but does not leak backend-only Halo model configuration.
- [x] 3.3 Remove dead custom AI backend classes, configuration records, and other unused code paths after the Halo AI integration is wired.

## 4. Frontend compatibility verification

- [x] 4.1 Confirm the frontend chat client continues to work against the preserved `/live2d/ai/chat-process` SSE contract without protocol changes.
- [x] 4.2 Update any frontend typing or config handling needed to stay compatible with the simplified backend settings shape.
46 changes: 46 additions & 0 deletions openspec/specs/halo-ai-chat-integration/spec.md
@@ -0,0 +1,46 @@
## ADDED Requirements

### Requirement: Live2D chat SHALL stream replies through Halo AI foundation
When AI chat is enabled, the Live2D backend SHALL invoke Halo AI foundation for chat generation instead of any plugin-owned provider client, while preserving the existing widget-facing streaming endpoint contract.

#### Scenario: Chat requests are fulfilled by Halo AI foundation
- **WHEN** a user sends a chat message to `/live2d/ai/chat-process`
- **THEN** the backend MUST resolve the configured Halo AI language model through `AiServices.getModelService()`
- **AND** it MUST submit the system prompt plus conversation history to that model
- **AND** it MUST stream assistant text chunks back through the existing SSE response format

#### Scenario: Stream completion preserves the current frontend terminator
- **WHEN** Halo AI foundation reports a successful end of generation
- **THEN** the backend MUST emit the plugin's existing completion marker payload for the frontend consumer
- **AND** it MUST complete the SSE response without requiring frontend protocol changes

### Requirement: Live2D AI chat settings SHALL only keep plugin-specific chat controls
The plugin SHALL configure Live2D chat through Halo-integrated settings that keep widget-specific behavior in this plugin and remove provider-specific connection settings from the plugin schema.

#### Scenario: Plugin settings retain persona and model selection
- **WHEN** an administrator enables AI chat in the Live2D plugin settings
- **THEN** the plugin MUST allow configuration of anonymous-access behavior, persona/system prompt, Halo AI model identifier, and frontend timing controls
- **AND** those settings MUST be used to build chat requests and widget behavior

#### Scenario: Provider-specific settings are removed from the plugin
- **WHEN** an administrator opens the Live2D AI chat settings after this change
- **THEN** OpenAI token, base URL, provider toggle, proxy host, and proxy port settings MUST NOT be present in the plugin configuration form
- **AND** provider credentials and provider enablement MUST be managed through Halo AI infrastructure instead

#### Scenario: Public runtime config excludes backend-only AI foundation settings
- **WHEN** the plugin exposes public runtime configuration to the frontend
- **THEN** it MUST continue exposing AI chat enablement and widget timing fields needed by the browser
- **AND** it MUST NOT expose backend-only Halo model identifiers or provider configuration details to the public config payload

### Requirement: Live2D chat SHALL surface Halo AI dependency failures clearly
The Live2D chat integration SHALL translate Halo AI foundation availability and model-resolution failures into stable plugin behavior for both users and administrators.

#### Scenario: Anonymous-disabled chat still enforces login
- **WHEN** AI chat is enabled but anonymous chat is disabled and an unauthenticated user sends a message
- **THEN** the endpoint MUST reject the request with HTTP 401
- **AND** it MUST NOT invoke Halo AI foundation for that request

#### Scenario: Missing AI foundation dependency or model configuration is reported cleanly
- **WHEN** the ai-foundation plugin is unavailable, disabled, or the configured Halo model cannot be resolved
- **THEN** the backend MUST log the failure for administrators
- **AND** it MUST return a user-facing chat failure message through the existing plugin response flow instead of an unhandled server error