feat: Migrate dependency management from pip to uv workspaces #202

Open
haroon0x wants to merge 2 commits into kubeflow:main from haroon0x:uv

@haroon0x (Contributor) commented Apr 9, 2026

Summary

This PR replaces the per-directory requirements.txt files with pyproject.toml files managed by uv as a workspace. The three Python components (server, server-https, pipelines) are now workspace members under a single root configuration. Dependencies are resolved together, locked into a single uv.lock file, and installed into a shared venv at the root.

Motivation

The existing setup had each component managing its own requirements.txt with no lockfile and no cross-component dependency resolution. This meant:

  • No guarantee that server and server-https (which share most of their dependency tree) were using compatible versions of shared packages like pymilvus, sentence-transformers, or torch.
  • No lockfile, so builds were not reproducible. Running pip install on two different days could produce different dependency trees.
  • Slow Docker builds. pip resolves dependencies at install time.

uv solves all three problems. It resolves dependencies across the entire workspace in a single pass, produces a deterministic lockfile, and installs packages faster than pip (10-20x in practice).

Over the coming months, the codebase will include:

  • An agent/ layer with Kagent for multi-tool routing and stateful reasoning
  • MCP (Model Context Protocol) tool implementations for both the website frontend and developer IDE integrations
  • A frontend/ chat UI with feedback loops and golden dataset construction
  • Evaluation frameworks (RAGAS) for continuous quality assurance
  • Helm charts, Terraform modules, and CI/CD workflows for reproducible deployments

Each of these additions will introduce its own set of Python dependencies. Without a workspace-level dependency manager and a deterministic lockfile, the dependency graph will become unmanageable as the repository scales. Migrating to uv now, while the codebase is still small, avoids a much more painful migration later when there are five or six workspace members instead of three.

What changed

New files

| File | Purpose |
| --- | --- |
| pyproject.toml (root) | Workspace definition. Lists workspace members, configures the PyTorch CPU-only index, and sets requires-python >= 3.11 as the minimum supported version. |
| server/pyproject.toml | Declares dependencies for the WebSocket server: websockets, httpx, pymilvus, sentence-transformers, torch, numpy. |
| server-https/pyproject.toml | Declares dependencies for the FastAPI server: fastapi, uvicorn[standard], pydantic, httpx, pymilvus, sentence-transformers, torch, numpy. |
| pipelines/pyproject.toml | Declares dependencies for the Kubeflow pipelines: kfp, requests, beautifulsoup4, sentence-transformers, langchain-text-splitters, torch, feast[milvus], pandas, numpy. |
| .python-version | Pins the workspace to Python 3.12, matching the local development environment. |
| uv.lock | Auto-generated lockfile. 136 packages resolved deterministically. |
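As a rough illustration of why resolving the members together matters, the sketch below computes which declared dependencies are shared between members. The dependency sets are copied from the pyproject.toml descriptions in this PR; the helper name shared_dependencies is hypothetical and only for illustration.

```python
# Direct dependencies per workspace member, as listed in this PR.
MEMBER_DEPS = {
    "server": {
        "websockets", "httpx", "pymilvus",
        "sentence-transformers", "torch", "numpy",
    },
    "server-https": {
        "fastapi", "uvicorn[standard]", "pydantic", "httpx", "pymilvus",
        "sentence-transformers", "torch", "numpy",
    },
    "pipelines": {
        "kfp", "requests", "beautifulsoup4", "sentence-transformers",
        "langchain-text-splitters", "torch", "feast[milvus]", "pandas", "numpy",
    },
}


def shared_dependencies(members: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each dependency declared by more than one member to those members."""
    users: dict[str, list[str]] = {}
    for member, deps in members.items():
        for dep in deps:
            users.setdefault(dep, []).append(member)
    return {dep: sorted(ms) for dep, ms in users.items() if len(ms) > 1}


if __name__ == "__main__":
    for dep, members in sorted(shared_dependencies(MEMBER_DEPS).items()):
        print(f"{dep}: {', '.join(members)}")
```

Every package this reports is one that separate requirements.txt files could resolve to different versions; a single workspace lock resolves each of them once.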

Modified files

| File | Change |
| --- | --- |
| server/Dockerfile | Replaced pip install -r requirements.txt with uv pip install -r pyproject.toml. The uv binary is pulled in as a multi-stage copy from ghcr.io/astral-sh/uv:latest. |
| server-https/Dockerfile | Same Dockerfile migration as above. |
| .gitignore | Added test-venv-swfs/ (stale test virtual environment that should not be tracked). |

Deleted files

| File | Reason |
| --- | --- |
| server/requirements.txt | Replaced by server/pyproject.toml. |
| server-https/requirements.txt | Replaced by server-https/pyproject.toml. |
| pipelines/requirements.txt | Replaced by pipelines/pyproject.toml. |

PyTorch CPU index

The root pyproject.toml configures a PyTorch CPU-only index:

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

[tool.uv.sources]
torch = { index = "pytorch-cpu" }

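This replaces the --extra-index-url https://download.pytorch.org/whl/cpu that was previously in the requirements.txt files. Because the index is marked explicit = true, uv consults it only for packages pinned to it under [tool.uv.sources] (here, torch); every other package still resolves from the default index. The servers do not run inference locally (they call KServe), so CUDA support is unnecessary. The CPU-only wheel is ~180 MB versus ~2 GB for the full CUDA build.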

The Dockerfiles also pass --extra-index-url explicitly since they run uv pip install outside the workspace context.

What does NOT change

  • Kubernetes manifests (deployment.yaml, service.yaml, etc.) are unchanged. They reference Docker image names, which have not changed.
  • Docker build commands are unchanged. docker build -t <image> server/ works exactly as before.
  • Pipeline compilation is unchanged. uv run python pipelines/kubeflow-pipeline.py produces the same YAML output.
  • KFP component dependencies are unchanged. The @dsl.component decorators still specify their own packages_to_install lists, which are resolved independently inside the KFP container runtime.
  • kagent-feast-mcp/ is not part of this migration. It retains its own dependency management.

How to use

# Install all workspace member dependencies into .venv at root
uv sync --all-packages

Once synced, there are two equivalent ways to run scripts:

Option A: Activate the venv (recommended for active development)

source .venv/bin/activate
python server/app.py
python pipelines/kubeflow-pipeline.py

Option B: Use uv run (useful for one-off commands and CI)

uv run python server/app.py

uv run executes a single command inside the project's .venv without requiring you to activate it first, and it checks that the environment is up to date before running. For day-to-day work where you run commands repeatedly, activating the venv is simpler.
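If it is unclear whether a script is picking up the workspace environment under either option, a small check like the following can help. It is a generic sketch that relies only on the standard sys module; the function names are hypothetical and nothing here is specific to uv.

```python
import sys


def in_virtualenv() -> bool:
    """True when the interpreter runs from a virtual environment."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def describe_interpreter() -> str:
    """Summarize which interpreter and environment prefix are in use."""
    kind = "virtual environment" if in_virtualenv() else "base interpreter"
    return f"{sys.executable} ({kind}, prefix={sys.prefix})"


if __name__ == "__main__":
    print(describe_interpreter())
```

Running this through either option from the repository root should report a prefix ending in .venv, assuming uv sync --all-packages has created the environment there.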

Managing dependencies:

# Add a dependency to a specific member
uv add --package docs-agent-server some-new-package

# Re-lock after editing any pyproject.toml
uv lock
uv sync --all-packages

Verification

All of the following were tested and pass:

| Test | Result |
| --- | --- |
| uv lock | 136 packages resolved |
| uv sync --all-packages | All packages installed |
| Import checks for server (websockets, httpx, pymilvus, sentence-transformers, numpy) | Pass |
| Import checks for server-https (fastapi, uvicorn, pydantic, httpx, pymilvus, sentence-transformers, numpy) | Pass |
| Import checks for pipelines (kfp, requests, bs4, langchain-text-splitters, feast, pandas, numpy) | Pass |
| Torch version is CPU-only (2.11.0+cpu) | Pass |
| Pipeline YAML compilation (kubeflow-pipeline.py) | Pass |
| Docker build for server | Pass |
| Docker build for server-https | Pass |
| Docker container runtime test (all imports inside container) | Pass |
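For anyone who wants to repeat the import and torch checks, a minimal sketch follows. It is not the exact script used for the results above; the helper names are hypothetical, and the module list uses import names (for example sentence_transformers) rather than distribution names.

```python
import importlib.util

# Import names for the server member's dependencies listed in this PR.
SERVER_MODULES = ["websockets", "httpx", "pymilvus", "sentence_transformers", "numpy"]


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level modules that cannot be located in this environment."""
    # find_spec only locates a module; it does not execute it, so this is a
    # cheaper but weaker check than actually importing each package.
    return [name for name in names if importlib.util.find_spec(name) is None]


def is_cpu_only_build(torch_version: str) -> bool:
    """True when a torch version string carries the '+cpu' local version tag."""
    return torch_version.endswith("+cpu")


if __name__ == "__main__":
    print("missing:", missing_modules(SERVER_MODULES))
    print("cpu-only:", is_cpu_only_build("2.11.0+cpu"))
```

In a synced environment the version string would come from torch.__version__ instead of a literal.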

Signed-off-by: haroon0x <haroonbmc0@gmail.com>
@google-oss-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign franciscojavierarceo for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@haroon0x (Contributor, Author)

/assign @franciscojavierarceo

@haroon0x (Contributor, Author) commented Apr 13, 2026

@franciscojavierarceo when you have a moment, could you please review this PR and merge if everything looks good?

Thank you!
