Changes from 10 commits
18 changes: 18 additions & 0 deletions .env
Original file line number Diff line number Diff line change
@@ -192,3 +192,21 @@ JAEGER_GRPC_PORT=4317
PROMETHEUS_PORT=9090
PROMETHEUS_HOST=prometheus
PROMETHEUS_ADDR=${PROMETHEUS_HOST}:${PROMETHEUS_PORT}

# Agent
GRAPH_RECURSION_LIMIT=25
MCP_ENDPOINT=mcp
MCP_PORT=8011
MCP_ENABLED=False
AGENT_ENDPOINT=agent
AGENT_PORT=8010
CHATBOT_PORT=7860
LLM_BASE_URL=
LLM_MODEL=Azure/gpt-4o
Comment thread (outdated) — julianocosta89 marked this conversation as resolved.
USE_VCR=True
Comment on lines +204 to +207
Contributor
.env has LLM_BASE_URL= and LLM_MODEL= blank, and USE_VCR=True by default, which expects VCR cassette files that are gitignored and don't exist. First-time users will get confusing errors. Please consider adding a comment section explaining the setup.
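One way to surface these misconfigurations early is a fail-fast check at agent startup. This is only a sketch: `check_agent_env`, the message strings, and the `VCR_CASSETTE_DIR` key are hypothetical suggestions, not part of this PR.

```python
import os


def check_agent_env(env=None):
    """Return a list of human-readable problems with the agent configuration.

    Hypothetical helper: the variable names mirror the .env keys in this PR,
    but the checks themselves are a suggestion, not part of the change.
    VCR_CASSETTE_DIR is an assumed (not existing) key for where cassettes live.
    """
    env = dict(os.environ) if env is None else env
    problems = []
    if not env.get("LLM_BASE_URL"):
        problems.append("LLM_BASE_URL is blank; set it to your LLM provider's endpoint")
    if not env.get("LLM_MODEL"):
        problems.append("LLM_MODEL is blank; e.g. Azure/gpt-4o")
    if env.get("USE_VCR", "False") == "True" and not env.get("VCR_CASSETTE_DIR"):
        problems.append("USE_VCR=True but no cassette directory is configured")
    return problems


# With the PR's defaults (blank URL/model, USE_VCR=True) all three checks fire.
print(len(check_agent_env({"USE_VCR": "True"})))  # → 3
```

Logging (or raising on) these problems before the agent starts serving would replace the confusing downstream errors with one actionable message.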

AGENT_DOCKERFILE=src/agent/Dockerfile
MCP_DOCKERFILE=src/mcp/Dockerfile
CHATBOT_DOCKERFILE=src/chatbot/Dockerfile
AGENT_ADDR=${AGENT_ENDPOINT}:${AGENT_PORT}
APPLICATION_ENDPOINT=frontend:8080
OTEL_EXPORTER_OTLP_INSECURE=true
2 changes: 2 additions & 0 deletions .gitignore
@@ -55,3 +55,5 @@ test/tracetesting/tracetesting-vars.yaml
!src/currency/build
src/cart/src/Program.cs.bak
src/flagd/demo.flagd.json.bak
Comment on lines +64 to +65
Contributor

Can we please remove these, if not needed?


src/agent/fixtures/*
117 changes: 107 additions & 10 deletions docker-compose.yml
@@ -603,7 +603,7 @@ services:
deploy:
resources:
limits:
memory: 500M # This is high to enable supporting the recommendationCache feature flag use case
memory: 500M # This is high to enable supporting the recommendationCache feature flag use case
restart: unless-stopped
ports:
- "${RECOMMENDATION_PORT}"
@@ -674,18 +674,13 @@ services:
- GOMEMLIMIT=60MiB
- OTEL_RESOURCE_ATTRIBUTES=${OTEL_RESOURCE_ATTRIBUTES},service.criticality=low
- OTEL_SERVICE_NAME=flagd
command: [
"start",
"--uri",
"file:./etc/flagd/demo.flagd.json"
]
command: ["start", "--uri", "file:./etc/flagd/demo.flagd.json"]
ports:
- "${FLAGD_PORT}"
- "${FLAGD_OFREP_PORT}"
volumes:
- ./src/flagd:/etc/flagd
logging:
*logging
logging: *logging

# Flagd UI for configuring the feature flag service
flagd-ui:
@@ -778,6 +773,105 @@ services:
condition: service_started
logging: *logging

# Agent
agent:
Contributor

The agent, mcp, and chatbot services are missing OTEL_SERVICE_NAME and OTEL_RESOURCE_ATTRIBUTES environment variables that all other services define. For an observability demo, telemetry data from these services won't be properly labeled.

Author

Added for 3 new services
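For reference, OTEL_RESOURCE_ATTRIBUTES is a comma-separated list of key=value pairs (per the OpenTelemetry SDK environment-variable specification), and SDKs fold OTEL_SERVICE_NAME in as `service.name`. A stdlib-only sketch of that parsing — `parse_resource_attrs` is a hypothetical illustration, not an SDK API:

```python
def parse_resource_attrs(raw, service_name=None):
    """Parse an OTEL_RESOURCE_ATTRIBUTES-style string into a dict.

    Hypothetical helper illustrating the key=value,comma format; real SDKs
    (e.g. opentelemetry-sdk's Resource.create) do this for you and also
    apply OTEL_SERVICE_NAME as the service.name attribute.
    """
    attrs = {}
    for pair in (raw or "").split(","):
        if "=" in pair:
            key, _, value = pair.partition("=")
            attrs[key.strip()] = value.strip()
    if service_name:  # OTEL_SERVICE_NAME wins for service.name
        attrs["service.name"] = service_name
    return attrs


attrs = parse_resource_attrs("service.criticality=low,team=demo", service_name="agent")
print(attrs["service.name"], attrs["service.criticality"])  # → agent low
```

Without these variables set, backends typically display the telemetry under a default name such as `unknown_service`, which is why the reviewer flags them for an observability demo.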

image: ${IMAGE_NAME}:${DEMO_VERSION}-agent
container_name: agent
build:
context: ./
dockerfile: ${AGENT_DOCKERFILE}
cache_from:
- ${IMAGE_NAME}:${IMAGE_VERSION}-agent
deploy:
resources:
limits:
memory: 500M
restart: unless-stopped
ports:
- "${AGENT_PORT}"
environment:
- OTEL_EXPORTER_OTLP_ENDPOINT=${OTEL_COLLECTOR_HOST}:${OTEL_COLLECTOR_PORT_GRPC}
- GRAPH_RECURSION_LIMIT
- MCP_ENDPOINT
- MCP_PORT
- MCP_ENABLED
- AGENT_ENDPOINT
- AGENT_PORT
- LLM_BASE_URL
- LLM_MODEL
- USE_VCR=True
- OPENAI_API_KEY
- APPLICATION_ENDPOINT=frontend:8080
- OTEL_EXPORTER_OTLP_INSECURE=true
depends_on:
jaeger:
condition: service_started
otel-collector:
condition: service_started
product-catalog:
condition: service_started
logging: *logging

# MCP
mcp:
image: ${IMAGE_NAME}:${DEMO_VERSION}-mcp
container_name: mcp
build:
context: ./
dockerfile: ${MCP_DOCKERFILE}
cache_from:
- ${IMAGE_NAME}:${IMAGE_VERSION}-mcp
deploy:
resources:
limits:
memory: 500M
restart: unless-stopped
ports:
- "${MCP_PORT}"
environment:
- OTEL_EXPORTER_OTLP_ENDPOINT=${OTEL_COLLECTOR_HOST}:${OTEL_COLLECTOR_PORT_GRPC}
- MCP_ENDPOINT
- MCP_PORT
- OTEL_EXPORTER_OTLP_INSECURE=true
- APPLICATION_ENDPOINT=frontend:8080
depends_on:
jaeger:
condition: service_started
otel-collector:
condition: service_started
product-catalog:
condition: service_started
logging: *logging

# ChatBot
chatbot:
Contributor

The chatbot service (port 7860) has no route configured in the envoy frontend-proxy, so it's only accessible via Docker's random port mapping. Other UI services (Jaeger, Grafana, LoadGen) are all exposed through port 8080.

image: ${IMAGE_NAME}:${DEMO_VERSION}-chatbot
container_name: chatbot
build:
context: ./
dockerfile: ${CHATBOT_DOCKERFILE}
cache_from:
- ${IMAGE_NAME}:${IMAGE_VERSION}-chatbot
deploy:
resources:
limits:
memory: 500M
restart: unless-stopped
ports:
- "${CHATBOT_PORT}"
environment:
- AGENT_PORT
- AGENT_ENDPOINT
- CHATBOT_PORT
depends_on:
jaeger:
condition: service_started
otel-collector:
condition: service_started
product-catalog:
condition: service_started
Comment thread (outdated) — julianocosta89 marked this conversation as resolved.
logging: *logging

# Postgresql used by Accounting service
postgresql:
Contributor

The postgresql: service key now appears before astronomy-db. This may cause docker compose to fail.

image: ${POSTGRES_IMAGE}
@@ -811,7 +905,6 @@ services:
- "${VALKEY_PORT}"
logging: *logging


# ********************
# Telemetry Components
# ********************
@@ -867,7 +960,11 @@ services:
limits:
memory: 200M
restart: unless-stopped
command: [ "--config=/etc/otelcol-config.yml", "--config=/etc/otelcol-config-extras.yml" ]
command:
[
"--config=/etc/otelcol-config.yml",
"--config=/etc/otelcol-config-extras.yml",
]
user: 0:0
volumes:
- ${HOST_FILESYSTEM}:/hostfs:ro
23 changes: 23 additions & 0 deletions src/agent/Dockerfile
@@ -0,0 +1,23 @@
# Copyright The OpenTelemetry Authors
# SPDX-License-Identifier: Apache-2.0

FROM python:3.11-slim

ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1
WORKDIR /app

RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
&& rm -rf /var/lib/apt/lists/*

COPY src/agent/requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt

COPY src/agent/src src
COPY src/agent/run.py .

EXPOSE ${AGENT_PORT}

CMD ["python", "run.py"]
19 changes: 19 additions & 0 deletions src/agent/requirements.txt
@@ -0,0 +1,19 @@
requests==2.32.5
traceloop-sdk==0.47.5
dotenv==0.9.9
httpx==0.28.1
langchain_openai==1.0.1
langchain==1.0.5
langgraph-prebuilt==1.0.5
mcp==1.22.0
fastmcp>=2.13.1
langchain_mcp_adapters==0.1.14
langchain_community==0.4.1
fastapi==0.120.4
opentelemetry-api==1.38.0
opentelemetry-sdk==1.38.0
opentelemetry-semantic-conventions==0.59b0
opentelemetry-instrumentation-langchain<0.53.0
fastmcp==2.13.1
Contributor

looks like a dupe of line 9 where we have fastmcp>=2.13.1
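Duplicates like this can be caught mechanically in CI. `duplicate_requirements` below is a hypothetical lint helper, not part of this PR:

```python
import re
from collections import Counter


def duplicate_requirements(lines):
    """Return package names listed more than once in a requirements file.

    Hypothetical helper; it normalizes names roughly the way pip does
    (case-insensitive, '_' and '-' equivalent) but ignores extras/markers.
    """
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = re.match(r"[A-Za-z0-9._-]+", line)  # name before any specifier
        if match:
            names.append(match.group(0).lower().replace("_", "-"))
    return sorted(name for name, count in Counter(names).items() if count > 1)


reqs = ["fastmcp>=2.13.1", "gradio==6.0.1", "fastmcp==2.13.1"]
print(duplicate_requirements(reqs))  # → ['fastmcp']
```

Here the conflicting `fastmcp>=2.13.1` and `fastmcp==2.13.1` happen to be satisfiable together, but pip's behavior with duplicate lines is resolver-dependent, so keeping a single pin is safer.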

gradio==6.0.1
vcrpy==7.0.0
37 changes: 37 additions & 0 deletions src/agent/run.py
@@ -0,0 +1,37 @@
#!/usr/bin/python

# Copyright The OpenTelemetry Authors
# SPDX-License-Identifier: Apache-2.0


import asyncio
import logging
import os

from dotenv import load_dotenv
from src.agents.agents import Agent
from traceloop.sdk import Traceloop

logging.basicConfig(level=logging.INFO)

load_dotenv()

Traceloop.init(
app_name="AstronomyShopAgent",
api_endpoint=os.getenv("OTEL_EXPORTER_OTLP_ENDPOINT", "localhost:4317"),
)


async def start_servers():
"""Run both the LangGraph Agent and the MCP server concurrently."""
tasks = []
agent = Agent()
tasks.append(agent.launch())
await asyncio.gather(*tasks)


if __name__ == "__main__":
try:
asyncio.run(start_servers())
except KeyboardInterrupt:
logging.info("Shutting down servers...")
109 changes: 109 additions & 0 deletions src/agent/src/agents/agents.py
@@ -0,0 +1,109 @@
#!/usr/bin/python

# Copyright The OpenTelemetry Authors
# SPDX-License-Identifier: Apache-2.0

import logging
import os
from contextlib import asynccontextmanager
from typing import Dict, List

import uvicorn
from fastapi import FastAPI, HTTPException
from langchain.agents import create_agent
from langchain.tools import tool
from langchain_mcp_adapters.tools import load_mcp_tools
from pydantic import BaseModel
from src.agents.llm import ChatLLM
from src.agents.mcp_client import MCPClient
from src.agents.tools import (
add_to_cart,
checkout,
convert_currency,
empty_cart,
get_ads,
get_cart,
get_product,
get_recommendations,
get_shipping_quote,
get_supported_currencies,
list_products,
)
from traceloop.sdk.decorators import workflow


class ChatRequest(BaseModel):
message: str
history: List[Dict] = []


class Agent:
def __init__(self):
self.app = FastAPI(lifespan=self.lifespan)
self.app.post("/prompt")(self.handle_prompt)
self.agentRecursionLimit = int(os.getenv("GRAPH_RECURSION_LIMIT", "25"))
self.mcp_server_url = f"http://{os.getenv('MCP_ENDPOINT', '127.0.0.1')}:{os.getenv('MCP_PORT', '8011')}/mcp"

self.mcp_server = None

@asynccontextmanager
async def lifespan(self, app: FastAPI):
mcp_enabled = os.getenv("MCP_ENABLED", "False") == "True"
if mcp_enabled:
logging.info("MCP tools enabled")
self.mcp_server = MCPClient()
yield
if self.mcp_server:
await self.mcp_server.cleanup()

async def handle_prompt(self, request: ChatRequest):
return await self.run_agent(request.message)

async def get_tool_list(self):
mcp_enabled = os.getenv("MCP_ENABLED", "False") == "True"
if mcp_enabled and self.mcp_server is not None:
await self.mcp_server.connect_to_mcp_server(self.mcp_server_url)
Contributor
get_tool_list() calls connect_to_mcp_server() on every /prompt request, creating a new MCP session each time. The connection should be established once during the lifespan startup and reused.

Author
Changed this to initialize the session in lifespan and reuse it for each request
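The connect-once-and-reuse shape the author describes can be sketched as follows. `FakeMCPClient` stands in for the real `MCPClient`; all names here are illustrative, not the PR's code:

```python
import asyncio
from contextlib import asynccontextmanager


class FakeMCPClient:
    """Stand-in for MCPClient that counts how often a session is created."""

    def __init__(self):
        self.connects = 0
        self.session = None

    async def connect_to_mcp_server(self, url):
        self.connects += 1
        self.session = object()  # pretend session handle

    async def cleanup(self):
        self.session = None


@asynccontextmanager
async def lifespan(client, url):
    # Connect exactly once at startup; every request reuses client.session.
    await client.connect_to_mcp_server(url)
    try:
        yield client
    finally:
        await client.cleanup()


async def main():
    client = FakeMCPClient()
    async with lifespan(client, "http://mcp:8011/mcp"):
        for _ in range(3):  # simulate three /prompt requests
            assert client.session is not None  # same session each time
    return client.connects


print(asyncio.run(main()))  # → 1
```

FastAPI's lifespan parameter (already used by the `Agent` class in this diff) is the natural place for this, since it brackets the application's whole lifetime rather than a single request.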

return await load_mcp_tools(self.mcp_server.session)
else:
tool_list = [
add_to_cart,
checkout,
convert_currency,
Contributor
convert_currency is in the tool list but is never imported or defined. Will this crash at runtime when MCP_ENABLED=False?

empty_cart,
get_ads,
get_cart,
get_product,
get_recommendations,
get_shipping_quote,
get_supported_currencies,
list_products,
]
return [tool(t) for t in tool_list]

@workflow(name="astronomy_shop_agent_workflow")
async def run_agent(self, input_prompt):
model = ChatLLM()
tools = await self.get_tool_list()
agent = create_agent(
model,
tools=tools,
system_prompt="You are a helpful assistant. Be concise and accurate.",
)
try:
result = await agent.ainvoke(
{"messages": [{"role": "user", "content": input_prompt}]}
)
return {"response": result}
except Exception as e:
raise HTTPException(status_code=500, detail=str(e))

async def launch(self):
agent_port = int(os.getenv("AGENT_PORT", "8010"))
agent_config = uvicorn.Config(
app=self.app,
host="0.0.0.0",
port=agent_port,
log_level="info",
)
agent_server = uvicorn.Server(agent_config)
await agent_server.serve()
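One small robustness note on the code above: `os.getenv("MCP_ENABLED", "False") == "True"` is a case-sensitive string comparison, so `MCP_ENABLED=true` or `MCP_ENABLED=1` would silently leave MCP disabled. A tolerant parser avoids that — `env_flag` is a hypothetical helper, not in the PR:

```python
def env_flag(value, default=False):
    """Interpret common truthy strings ("1", "true", "yes", "on") as True.

    Hypothetical helper; the PR's strict == "True" comparison treats
    MCP_ENABLED=true or MCP_ENABLED=1 as disabled.
    """
    if value is None:
        return default
    return value.strip().lower() in ("1", "true", "yes", "on")


print(env_flag("True"), env_flag("true"), env_flag("0"))  # → True True False
```

The agent reads MCP_ENABLED in both `lifespan` and `get_tool_list`, so routing both reads through one helper would also keep the two checks from drifting apart.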