feat: integrate LiveKit for real-time voice agents #286
Open
pulinduvidmal
wants to merge
11
commits into
develop
from
feature/livekit-integration
a318304
feat: integrate LiveKit for real-time voice agents
pulinduvidmal 8012ce7
fix: copilot comments
pulinduvidmal 1d7c159
fix: copilot comments
pulinduvidmal d81ffc1
fix: copilot comments
pulinduvidmal 7b7128d
feat: implement livekit voice and vision integration
pulinduvidmal 8bdf586
Merge branche
pulinduvidmal 89458f3
refactor: .env
pulinduvidmal 56cd02d
fix:docs
pulinduvidmal b76e87e
fix: copilot comments
pulinduvidmal 7c95dbf
fix: add optional auth_dependency
pulinduvidmal 03967bd
Merge branch 'develop' into feature/livekit-integration
lakindu-yl
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,82 @@ | ||
| # LiveKit Voice Integration | ||
|
|
||
| Agent Kernel supports real-time, ultra-low latency voice integrations via [LiveKit](https://livekit.io/). | ||
|
|
||
| By treating LiveKit as an **Integration**, you can build an agent once (using CrewAI, LangGraph, OpenAI, etc.), equip it with tools and memory via Agent Kernel, and then use LiveKit to allow users to **talk to your agent over a real-time voice call**. | ||
|
|
||
| LiveKit handles the WebRTC voice connection, Speech-to-Text (STT), and Text-to-Speech (TTS), while Agent Kernel handles the intelligence, routing, and tools. | ||
|
|
||
| ## Architecture | ||
|
|
||
| When you use the LiveKit integration: | ||
| 1. The user speaks into their microphone via a LiveKit frontend. | ||
| 2. LiveKit's **Speech-to-Text (STT)** plugin transcribes the voice into text. | ||
| 3. The transcribed text is intercepted by our custom `LiveKitLLM` bridge. | ||
| 4. The bridge forwards the text to **Agent Kernel** (`AgentService().run(text)`). | ||
| 5. Agent Kernel's selected agent processes the text and generates a response. | ||
| 6. The response is sent back to LiveKit's **Text-to-Speech (TTS)** plugin. | ||
| 7. The TTS plugin synthesizes the voice and streams it back to the user. | ||
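The seven steps above amount to a three-stage pipeline with Agent Kernel in the middle. A minimal sketch of that data flow follows. It uses stand-in functions rather than the real LiveKit plugins or the `LiveKitLLM` bridge, and every name in it is hypothetical and exists only for illustration:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurnPipeline:
    """Hypothetical stand-in for the STT -> agent -> TTS round trip."""

    transcribe: Callable[[bytes], str]   # steps 1-2: audio in, text out (STT)
    run_agent: Callable[[str], str]      # steps 3-5: the bridge hands text to the agent
    synthesize: Callable[[str], bytes]   # steps 6-7: reply text back to audio (TTS)

    def handle_turn(self, audio: bytes) -> bytes:
        text = self.transcribe(audio)
        reply = self.run_agent(text)
        return self.synthesize(reply)


# Toy stages so the sketch runs without LiveKit or an LLM.
pipeline = VoiceTurnPipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    run_agent=lambda text: f"You said: {text}",
    synthesize=lambda reply: reply.encode("utf-8"),
)

print(pipeline.handle_turn(b"hello"))  # b'You said: hello'
```

The point of the shape is that the agent stage only ever sees and returns text, which is why an agent built for chat can be reused for voice unchanged.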
|
|
||
| ## Setup | ||
|
|
||
| First, ensure you have installed the LiveKit optional dependencies: | ||
|
|
||
| ```bash | ||
| pip install "agentkernel[livekit]" | ||
| ``` | ||
|
|
||
| You will also need: | ||
| 1. A free account on [LiveKit Cloud](https://cloud.livekit.io/). | ||
| 2. Your LiveKit API keys (`AK_LIVEKIT__URL`, `AK_LIVEKIT__API_KEY`, `AK_LIVEKIT__API_SECRET`). | ||
| 3. API keys for your preferred STT/TTS providers (e.g., `OPENAI_API_KEY`, `DEEPGRAM_API_KEY`, etc.). | ||
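Before starting the server it can help to confirm that the three LiveKit variables listed above are actually set. The check below is a sketch for deployments that rely on environment variables only; it is not part of Agent Kernel, and the helper name is hypothetical:

```python
import os
from typing import List, Mapping

# Variable names taken from the list above.
REQUIRED_LIVEKIT_VARS = (
    "AK_LIVEKIT__URL",
    "AK_LIVEKIT__API_KEY",
    "AK_LIVEKIT__API_SECRET",
)


def missing_livekit_vars(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_LIVEKIT_VARS if not environ.get(name)]


# Example with an explicit mapping instead of the real environment.
print(missing_livekit_vars({"AK_LIVEKIT__URL": "wss://example.livekit.cloud"}))
# ['AK_LIVEKIT__API_KEY', 'AK_LIVEKIT__API_SECRET']
```

If the same values are supplied through `config.yaml` instead, this check would report false positives, so treat it as optional.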
|
|
||
|
|
||
|
|
||
| ## Configuration | ||
|
|
||
| In your `config.yaml`, configure which Agent Kernel agent should respond to LiveKit voice interactions, as well as your preferred STT and TTS providers: | ||
|
|
||
| ```yaml | ||
| livekit: | ||
|   agent: "my-voice-agent" | ||
|   stt_provider: "deepgram" # Options: deepgram, openai | ||
|   tts_provider: "openai" # Options: openai, elevenlabs, google | ||
|   url: "wss://your-project-id.livekit.cloud" # Optional, can use AK_LIVEKIT__URL env var | ||
|   api_key: "your_api_key" # Optional, can use AK_LIVEKIT__API_KEY env var | ||
|   api_secret: "your_api_secret" # Optional, can use AK_LIVEKIT__API_SECRET env var | ||
| ``` | ||
|
|
||
| You can also set these via environment variables: | ||
| ```bash | ||
| export AK_LIVEKIT__AGENT="my-voice-agent" | ||
| export AK_LIVEKIT__STT_PROVIDER="openai" | ||
| ``` | ||
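The variable names suggest a naming convention: an `AK_` prefix, then the YAML section, then a double underscore, then the key. The loader itself is not shown in this PR, so the sketch below is an assumption about that mapping rather than Agent Kernel's actual implementation, and `env_to_nested` is a hypothetical helper:

```python
from typing import Any, Dict, Mapping


def env_to_nested(
    environ: Mapping[str, str], prefix: str = "AK_", delimiter: str = "__"
) -> Dict[str, Any]:
    """Fold PREFIX_SECTION__KEY variables into a nested dict (assumed convention)."""
    config: Dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # ignore unrelated variables such as PATH
        path = name[len(prefix):].lower().split(delimiter)
        node = config
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return config


print(env_to_nested({
    "AK_LIVEKIT__AGENT": "my-voice-agent",
    "AK_LIVEKIT__STT_PROVIDER": "openai",
    "PATH": "/usr/bin",
}))
# {'livekit': {'agent': 'my-voice-agent', 'stt_provider': 'openai'}}
```

Under that assumption, the two `export` lines above are equivalent to setting `agent` and `stt_provider` under `livekit:` in `config.yaml`.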
|
|
||
| ## Example Usage | ||
|
|
||
| Create a Python script (e.g., `server.py`) that initializes your Agent Kernel agent and starts the REST API. The `AgentLiveKitRequestHandler` will automatically launch the LiveKit background worker alongside your FastAPI server. | ||
|
|
||
| ```python | ||
| import os | ||
| import logging | ||
| from agentkernel.api import RESTAPI | ||
| from agentkernel.openai import OpenAIModule | ||
| from agentkernel.livekit import AgentLiveKitRequestHandler | ||
| from agents import Agent as OpenAIAgent | ||
|
|
||
| logging.basicConfig(level=logging.INFO) | ||
|
|
||
| # 1. Define your Agent Kernel Agent | ||
| voice_agent = OpenAIAgent( | ||
|     name="my-voice-agent", | ||
|     handoff_description="Agent for voice interactions", | ||
|     instructions="You are a concise voice assistant. Do not use markdown or emojis.", | ||
| ) | ||
|
|
||
| # 2. Register the agent with Agent Kernel | ||
| OpenAIModule([voice_agent]) | ||
|
|
||
| # 3. Start the server with the LiveKit Handler | ||
| if __name__ == "__main__": | ||
|     RESTAPI.run([AgentLiveKitRequestHandler()]) | ||
| ``` | ||
|
|
||
| > **Note:** The `AgentLiveKitRequestHandler` exposes a `/livekit/token` API endpoint on your FastAPI server. Your frontend (e.g., a React application or the LiveKit Agent Console) can hit this endpoint to generate secure access tokens for users joining the voice room. | ||
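For context on what such a token contains: a LiveKit access token is a JWT signed with HS256 using the API secret, carrying the API key as issuer, the participant identity as subject, and a `video` grant naming the room. How `AgentLiveKitRequestHandler` builds its tokens is not shown in this PR, and in practice the `livekit-api` package's token helper would be the usual route. The standard-library sketch below only illustrates the token's shape, and `make_room_token` is a hypothetical name:

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_room_token(
    api_key: str,
    api_secret: str,
    identity: str,
    room: str,
    ttl_seconds: int = 3600,
    now: Optional[int] = None,
) -> str:
    """Build an HS256 JWT with a room-join grant (illustrative only)."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,        # which API key signed the token
        "sub": identity,       # participant identity
        "nbf": issued,
        "exp": issued + ttl_seconds,
        "video": {"room": room, "roomJoin": True},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        api_secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{_b64url(signature)}"


token = make_room_token("your_api_key", "your_api_secret", "alice", "demo-room")
print(token.count("."))  # 2
```

Because the secret signs the token, it must stay on the server; the frontend should only ever receive the finished token from the endpoint.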
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| """ | ||
| Agent Kernel Integration with LiveKit | ||
|
|
||
| This package contains the Agent Kernel integration implementations for LiveKit Voice Agents. | ||
| """ | ||
|
|
||
| import importlib.metadata | ||
|
|
||
| try: | ||
|     __version__ = importlib.metadata.version("agentkernel") | ||
| except importlib.metadata.PackageNotFoundError: | ||
|     __version__ = "0.1.0" | ||
|
|
||
| from .livekit_handler import AgentLiveKitRequestHandler, LiveKitLLM |