Merged
16 changes: 12 additions & 4 deletions docs/user_guide/admin_guide/deployment/brev_deployment.rst
@@ -302,11 +302,15 @@ Dockerfile, install the dependency in the image:

.. code-block:: dockerfile

RUN pip install kubernetes
RUN pip install "kubernetes!=36.0.0"

The repository ``docker/Dockerfile.parent`` already installs the NVFlare
``K8S`` extra, which includes this dependency. Keep that install line, or add
the explicit ``pip install kubernetes`` line above before building your image.
the explicit ``pip install "kubernetes!=36.0.0"`` line above before building your image.

The prepared Brev launcher uses in-cluster Kubernetes config
(``job_launcher.config_file_path: null``), so the parent pod authenticates with
its ServiceAccount token.

.. code-block:: shell

@@ -576,7 +580,9 @@ in that namespace.
Copy the prepared server ``startup/`` and ``local/`` directories into the
``nvflws`` PVC. The chart starts the server with
``-m /var/tmp/nvflare/workspace``, so the PVC root must contain ``startup/``
and ``local/`` directly.
and ``local/`` directly. The temporary copy pod image must contain ``tar``
because ``kubectl cp`` requires it in the target container; ``busybox:1.36``
includes ``tar``.

.. code-block:: shell

@@ -701,7 +707,9 @@ same launcher settings from ``/tmp/nvflare-k8s.yaml``. Keep the Helm namespace
consistent with the ``namespace`` value used by ``nvflare deploy prepare``.

Copy the prepared ``site-1`` ``startup/`` and ``local/`` directories into the
client ``nvflws`` PVC:
client ``nvflws`` PVC. The temporary copy pod image must contain ``tar``
because ``kubectl cp`` requires it in the target container; ``busybox:1.36``
includes ``tar``:

.. code-block:: shell

@@ -94,6 +94,11 @@ if prepared_namespace != namespace:
f"Prepared launcher namespace is {prepared_namespace!r}, but launch NAMESPACE is {namespace!r}. "
"Use the same NAMESPACE for prepare and launch."
)
config_file_path = args.get("config_file_path")
if config_file_path not in (None, ""):
raise SystemExit(
f"Prepared launcher config_file_path is {config_file_path!r}; expected null/empty for in-cluster config."
)
if not args.get("workspace_mount_path"):
raise SystemExit("k8s_launcher args missing workspace_mount_path")

@@ -197,6 +202,7 @@ spec:
restartPolicy: Never
containers:
- name: copy
# kubectl cp requires tar in the target container; busybox includes it.
image: busybox:1.36
command:
- sh
@@ -298,6 +304,14 @@ install_chart() {
helm "${helm_args[@]}"
}

verify_parent_kubernetes_client() {
kubectl -n "${NAMESPACE}" exec "deploy/${PARTICIPANT}" -- "${PARENT_PYTHON_PATH}" -c '
import kubernetes

print(f"kubernetes-python-client={kubernetes.__version__}")
'
}

if [[ "${1:-}" == "-h" || "${1:-}" == "--help" ]]; then
usage
exit 0
@@ -318,6 +332,7 @@ ARCHIVE="${ARCHIVE:-${HOME}/nvflare-${PARTICIPANT}.tgz}"
COPY_POD="${COPY_POD:-nvflare-pvc-copy}"
ROLLOUT_TIMEOUT="${ROLLOUT_TIMEOUT:-300s}"
LOG_TAIL="${LOG_TAIL:-100}"
PARENT_PYTHON_PATH="${PARENT_PYTHON_PATH:-/usr/local/bin/python3}"

require_cmd kubectl helm tar python3
[[ -f "${ARCHIVE}" ]] || fail "Archive not found: ${ARCHIVE}"
@@ -361,6 +376,7 @@ fi
install_chart

kubectl -n "${NAMESPACE}" rollout status "deployment/${PARTICIPANT}" --timeout="${ROLLOUT_TIMEOUT}"
verify_parent_kubernetes_client
kubectl -n "${NAMESPACE}" get pods
kubectl -n "${NAMESPACE}" logs "deploy/${PARTICIPANT}" --tail="${LOG_TAIL}" || true

13 changes: 11 additions & 2 deletions docs/user_guide/admin_guide/deployment/helm_chart.rst
@@ -28,6 +28,9 @@ Before you start, make sure you have:
``nvflare deploy prepare``.
* ``kubectl`` configured for the target cluster. Use a ``kubectl`` version that
is compatible with the Kubernetes API server.
* ``tar`` installed locally and in any temporary pod image used with
``kubectl cp``. The staging examples below use ``busybox:1.36``, which
includes ``tar``.
* Helm 3.
* A Kubernetes cluster with standard ``apps/v1`` Deployment,
``rbac.authorization.k8s.io/v1`` Role/RoleBinding, Service, Secret, and PVC
@@ -94,6 +97,9 @@ The generated Helm chart does not run submitted jobs directly. It installs the
parent participant process, its Kubernetes Service, its ServiceAccount, and the
Role/RoleBinding that allow the launcher to create job pods.

When ``job_launcher.config_file_path`` is omitted or set to ``null``, the
launcher uses Kubernetes in-cluster config from the parent pod's ServiceAccount.
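
The rule above can be sketched with the Kubernetes Python client. This is a minimal illustration, not NVFlare's launcher code: the helper names ``kube_config_source`` and ``load_client_config`` are hypothetical, and treating an empty string the same as ``null`` is an assumption. Only ``load_incluster_config`` and ``load_kube_config`` come from the ``kubernetes`` package.

```python
from typing import Optional


def kube_config_source(config_file_path: Optional[str]) -> str:
    """Return which config source a launcher setting would select (hypothetical helper)."""
    # None (YAML null or an omitted key) means no kubeconfig file was given.
    # Assumption: an empty string is treated the same way.
    if config_file_path in (None, ""):
        return "in-cluster"
    return "kubeconfig"


def load_client_config(config_file_path: Optional[str]) -> None:
    """Load Kubernetes client configuration (hypothetical helper)."""
    # Imported lazily so kube_config_source() works without the package installed.
    from kubernetes import config

    if kube_config_source(config_file_path) == "in-cluster":
        # Uses the ServiceAccount token and CA certificate mounted into the pod.
        config.load_incluster_config()
    else:
        config.load_kube_config(config_file=config_file_path)
```

``load_client_config(None)`` only succeeds inside a pod, because in-cluster config depends on the mounted ServiceAccount credentials.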

The parent Service is the stable in-cluster address for dynamically launched job
pods. ``nvflare deploy prepare`` patches the prepared kit's internal
communication settings to use the generated Service name and ``parent_port``.
@@ -315,7 +321,9 @@ mounts ``parent.workspace_pvc`` at ``parent.workspace_mount_path``, but it does
not upload files to the PVC. Copy the prepared kit's ``startup/`` and
``local/`` directories into the root of that workspace PVC before installing the
chart. For server kits, also create or copy ``transfer/`` at the workspace root
for admin file-transfer storage.
for admin file-transfer storage. If you use ``kubectl cp`` as shown below, the
temporary copy pod image must contain ``tar`` because ``kubectl cp`` requires it
in the target container.

Example ``workspace-pvc.yaml``:

@@ -977,7 +985,8 @@ Check the parent logs for Kubernetes import or authorization failures:
--as=system:serviceaccount:"$NAMESPACE":server

If the logs show that the ``kubernetes`` Python package is missing, rebuild the
parent image with the NVFlare ``K8S`` extra or ``pip install kubernetes``.
parent image with the NVFlare ``K8S`` extra or
``pip install "kubernetes!=36.0.0"``.
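
To tell a missing client apart from an installed but excluded release, a small check like the following could be run with the parent image's Python interpreter. It is a sketch under stated assumptions: ``installed_version`` and ``check_client`` are hypothetical names, and the excluded set simply mirrors the ``!=36.0.0`` pin shown above.

```python
from importlib import metadata
from typing import Optional

# Mirrors the "kubernetes!=36.0.0" requirement; extend if more releases are excluded.
EXCLUDED_VERSIONS = {"36.0.0"}


def installed_version(dist_name: str = "kubernetes") -> Optional[str]:
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_client(version: Optional[str]) -> str:
    """Classify a version string as 'missing', 'excluded', or 'ok' (hypothetical helper)."""
    if version is None:
        return "missing"
    if version in EXCLUDED_VERSIONS:
        return "excluded"
    return "ok"


if __name__ == "__main__":
    found = installed_version()
    print(f"kubernetes-python-client={found} status={check_client(found)}")
```

A status of ``missing`` or ``excluded`` points to rebuilding the parent image; ``ok`` suggests the failure lies elsewhere, such as authorization.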

If the logs show ``SSLCertVerificationError`` with
``CA cert does not include key usage extension``, the parent Kubernetes client
1 change: 1 addition & 0 deletions docs/user_guide/admin_guide/deployment/index.rst
@@ -13,6 +13,7 @@ Deployment Guide
operation
containerized_deployment
helm_chart
openshift/index
brev_scripted_deployment
brev_deployment
cloud_deployment
77 changes: 77 additions & 0 deletions docs/user_guide/admin_guide/deployment/openshift/README.md
@@ -0,0 +1,77 @@
# OpenShift Deployment Helpers

This directory contains the OpenShift-specific NVFlare deployment guide and helper scripts.

- `index.rst` is the user guide page for OpenShift deployment.
- Repository `docker/Dockerfile.parent` builds the parent image used by server/client and admin pods.
- Repository `docker/Dockerfile.job` builds the workload image used by job pods.
- `scripts/create_openshift_cluster.sh` configures Red Hat OpenShift Local (CRC) and optionally starts it.
- `scripts/start_openshift_cluster.sh` starts CRC, logs in with `oc`, and prepares the target project.
- `scripts/cleanup_openshift_cluster.sh` deletes scripted deployment resources and stops CRC.
- `scripts/k8s_provision.sh` runs `nvflare provision` for the sample server, `site-1`, `site-2`, and admin.
- `scripts/k8s_deploy.sh` prepares K8s startup kits, stages PVC workspaces, installs Helm charts, and verifies parent pods can import the Kubernetes Python client.
- `scripts/k8s_submit_job.sh` submits `hello-numpy` from an in-cluster admin pod and waits for successful completion.
- `scripts/k8s_watch.sh` shows a live, in-place Rich table of the created pods.
- `scripts/k8s_watch.py` implements the Rich table used by the shell wrapper.
- `scripts/k8s_e2e.sh` runs provision, deploy, and submit in order.

## Create a Local OpenShift Cluster

Use the CRC helper scripts only when you need a single-node Red Hat OpenShift
Local cluster for development or testing. Production OpenShift clusters are
platform-specific; create those with your organization's approved installer or
cloud service workflow, then use the deployment scripts here against that
cluster.

Before using the local-cluster scripts, install Red Hat OpenShift Local so the
`crc` command is available, download your Red Hat OpenShift pull secret from
`https://console.redhat.com/openshift/create/local`, enable host hardware
virtualization, and make sure the host has enough CPU, memory, and disk for
OpenShift plus the NVFlare test pods. The create script defaults to 6 vCPUs,
24576 MiB memory, and 120 GiB disk.
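
As a quick pre-flight check, the sketch below compares a host against those defaults. It is an illustrative helper, not part of the repository scripts: `shortfalls` and `host_memory_mib` are hypothetical names, the memory lookup is best-effort, and matching the defaults is a necessary condition only, since the host also needs headroom for its own workloads.

```python
import os
import shutil
from typing import List, Optional

# Defaults stated above for create_openshift_cluster.sh.
DEFAULT_CPUS = 6
DEFAULT_MEMORY_MIB = 24576  # 24576 MiB / 1024 = 24 GiB
DEFAULT_DISK_GIB = 120


def shortfalls(cpus: int, memory_mib: float, free_disk_gib: float) -> List[str]:
    """List the resources that fall below the create script's defaults."""
    problems = []
    if cpus < DEFAULT_CPUS:
        problems.append(f"cpus: have {cpus}, need {DEFAULT_CPUS}")
    if memory_mib < DEFAULT_MEMORY_MIB:
        problems.append(f"memory: have {memory_mib:.0f} MiB, need {DEFAULT_MEMORY_MIB} MiB")
    if free_disk_gib < DEFAULT_DISK_GIB:
        problems.append(f"disk: have {free_disk_gib:.0f} GiB free, need {DEFAULT_DISK_GIB} GiB")
    return problems


def host_memory_mib() -> Optional[float]:
    """Best-effort physical memory lookup; None where sysconf is unavailable."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 2**20
    except (AttributeError, ValueError, OSError):
        return None


if __name__ == "__main__":
    memory = host_memory_mib()
    free_gib = shutil.disk_usage(os.path.expanduser("~")).free / 2**30
    # Unknown memory is treated as sufficient so only measured values are reported.
    known_memory = memory if memory is not None else DEFAULT_MEMORY_MIB
    report = shortfalls(os.cpu_count() or 0, known_memory, free_gib)
    print("\n".join(report) if report else "host meets the default CRC sizing")
```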

Use `scripts/create_openshift_cluster.sh` for first-time local CRC setup. It
validates that `crc` exists, requires `PULL_SECRET_FILE` when the cluster will
be started, writes CRC settings such as resource sizing and shared-directory
behavior, runs `crc setup`, and starts the cluster by delegating to
`scripts/start_openshift_cluster.sh` unless `START_AFTER_CREATE=false` is set.

```bash
export PULL_SECRET_FILE="$HOME/Downloads/pull-secret.txt"
export NAMESPACE=nvflare-e2e

bash docs/user_guide/admin_guide/deployment/openshift/scripts/create_openshift_cluster.sh
```

Use `scripts/start_openshift_cluster.sh` after CRC has already been configured,
or when restarting after `crc stop`. It runs `crc start` when needed, adds the
CRC-provided `oc` to `PATH` if needed, waits for OpenShift to report running,
logs in with `oc`, creates or selects `NAMESPACE`, and prints the console URL
and available StorageClasses.

```bash
PULL_SECRET_FILE="$HOME/Downloads/pull-secret.txt" \
bash docs/user_guide/admin_guide/deployment/openshift/scripts/start_openshift_cluster.sh
```

Run scripts from the repository root. Build the maintained images from `docker/Dockerfile.parent` and `docker/Dockerfile.job`, push them to a registry the cluster can pull from, then set `IMAGE` to the parent image and `JOB_IMAGE` to the workload image. `ADMIN_IMAGE` defaults to `IMAGE`, so the parent image can also be used for the temporary admin pod. The parent image needs NVFlare and the Kubernetes Python client, which the `K8S` extra installs. A custom `COPY_IMAGE` needs `sh`, `sleep`, and `tar`; `JOB_IMAGE` needs `tar` only when the job workload itself uses it.

```bash
export IMAGE=registry.example.com/nvflare-parent:dev
export JOB_IMAGE=registry.example.com/nvflare-job:dev
export NAMESPACE=nvflare-e2e

bash docs/user_guide/admin_guide/deployment/openshift/scripts/k8s_e2e.sh
```

The watch tool requires the Python `rich` package:

```bash
python3 -m pip install rich
```

Clean up generated resources and stop OpenShift Local:

```bash
bash docs/user_guide/admin_guide/deployment/openshift/scripts/cleanup_openshift_cluster.sh
```