Closed
Changes from 1 commit
27 changes: 20 additions & 7 deletions docker/Build_instructions_blackwell.md
@@ -12,21 +12,34 @@ This will create a Docker image named `openfold-3-blackwell` with the `latest` t
## test Pytorch and CUDA

```bash
docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 openfold-3-blackwell:latest python -c "import torch; print('CUDA:', torch.version.cuda); print('PyTorch:', torch.__version__)"
docker run \
--gpus all \
--ipc=host \
--ulimit memlock=-1 \
openfold-3-blackwell:latest \
python -c "import torch; print('CUDA:', torch.version.cuda); print('PyTorch:', torch.__version__)"
```

Should print something like:
CUDA: 12.8
PyTorch: 2.7.0a0+ecf3bae40a.nv25.02

```
CUDA: 13.1
Review comment (Collaborator, Author):
This is important: with CUDA 12.9+ we get sm121 support out of the box

PyTorch: 2.10.0a0+b4e4ee81d3.nv25.12
```
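
The review comment above ties sm121 support to the CUDA version that this check prints. A minimal sketch of automating that comparison, assuming the 12.9 threshold stated in the comment; the helper names `parse_version` and `meets_min_cuda` are hypothetical and belong to neither OpenFold3 nor PyTorch:

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '13.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Assumption: 12.9 is the minimum named in the review comment, not a value
# checked against NVIDIA's release notes.
MIN_CUDA_FOR_SM121 = (12, 9)


def meets_min_cuda(cuda_version: str,
                   minimum: tuple[int, ...] = MIN_CUDA_FOR_SM121) -> bool:
    """Return True if the reported CUDA version is at least `minimum`."""
    return parse_version(cuda_version) >= minimum


if __name__ == "__main__":
    # The two versions that appear in the old and new expected output.
    for reported in ("12.8", "13.1"):
        print(reported, meets_min_cuda(reported))
```

Inside the container, the input would be `torch.version.cuda`, the same value the `python -c` check prints.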

## test run_openfold inference example

docker run --gpus all -it --ipc=host --ulimit memlock=-1 \
-v $(pwd):/output \
```bash
docker run \
--gpus all -it \
--ipc=host \
--ulimit memlock=-1 \
-v /home/jandom/.openfold3:/root/.openfold3 \
-v $(pwd)/output:/output \
-w /output openfold-3-blackwell:latest \
run_openfold predict \
--query_json=/opt/openfold-3/examples/example_inference_inputs/query_ubiquitin.json \
--query_json=/opt/openfold3/examples/example_inference_inputs/query_ubiquitin.json \
--num_diffusion_samples=1 \
--num_model_seeds=1 \
--use_templates=false
--use_templates=false
```
56 changes: 22 additions & 34 deletions docker/Dockerfile.blackwell
@@ -1,5 +1,5 @@
# Simple OpenFold3 Dockerfile using NVIDIA PyTorch container
FROM nvcr.io/nvidia/pytorch:25.02-py3
FROM nvcr.io/nvidia/pytorch:25.12-py3

# Install system dependencies
RUN apt-get update && apt-get install -y \
@@ -13,15 +13,21 @@ RUN apt-get update && apt-get install -y \
libxft2 \
&& rm -rf /var/lib/apt/lists/*

# Clone OpenFold3 source and modify environment file
# Install CUTLASS for DeepSpeed Evoformer attention kernel
# We need only the headers for DeepSpeed JIT, don't need the pip package with bindings
WORKDIR /opt
RUN git clone https://github.com/aqlaboratory/openfold-3.git && \
cd openfold-3 && \
cp -p environments/production-linux-64.yml environments/production.yml.backup && \
grep -v "pytorch::pytorch" environments/production.yml > environments/production.yml.tmp && \
mv environments/production.yml.tmp environments/production.yml
Review comment on lines -18 to -22 (Collaborator, Author):
This was completely unused: everything is installed via the system python+pip

RUN git clone https://github.com/NVIDIA/cutlass --branch v3.6.0 --depth 1

# Pre-compile DeepSpeed operations for Blackwell GPUs to avoid runtime compilation
# Create necessary cache directories
RUN python3 -c "import os; os.makedirs('/root/.triton/autotune', exist_ok=True)"
Review comment (Collaborator, Author):
This is empirically needed in my tests, which is a bit odd


WORKDIR /opt/openfold-3
# Set environment variables including CUDA architecture for Blackwell
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
KMP_AFFINITY=none \
CUTLASS_PATH=/opt/cutlass \
TORCH_CUDA_ARCH_LIST="12.1"
Review comment on lines +25 to +30 (Collaborator, Author):
I think we can still remove some of these – all of those could be provided at runtime, and are quite specific to the use case here
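
A minimal sketch of what "provided at runtime" could look like, assuming the variables are passed with Docker's `-e KEY=VALUE` flag instead of being baked into the image. The helper `docker_run_command` is hypothetical, and which variables to move out of the Dockerfile is left open by the comment:

```python
import shlex


def docker_run_command(image: str, env: dict[str, str],
                       command: list[str]) -> list[str]:
    """Build a `docker run` argv that passes each variable with `-e KEY=VALUE`."""
    argv = ["docker", "run", "--gpus", "all", "--ipc=host"]
    for key, value in env.items():
        argv += ["-e", f"{key}={value}"]
    return argv + [image] + command


# Values copied from the ENV block in the diff.
runtime_env = {
    "KMP_AFFINITY": "none",
    "CUTLASS_PATH": "/opt/cutlass",
    "TORCH_CUDA_ARCH_LIST": "12.1",
}

# The container command just echoes one variable to confirm it arrived.
argv = docker_run_command(
    "openfold-3-blackwell:latest",
    runtime_env,
    ["python", "-c", "import os; print(os.environ['TORCH_CUDA_ARCH_LIST'])"],
)
print(shlex.join(argv))
```

The printed line can be pasted into a shell. A different GPU target would then only need a different `TORCH_CUDA_ARCH_LIST` value at launch, with no image rebuild.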


# Install Python dependencies
RUN pip install --no-cache-dir \
@@ -46,36 +52,18 @@ RUN pip install --no-cache-dir \
awscli \
memory_profiler \
func_timeout \
biotite==1.2.0 \
"nvidia-cutlass<4" \
"cuda-python<12.9.1"
Review comment on lines -50 to -51 (Collaborator, Author):

  • We get cuda-python with the image, so there is no need to duplicate it
  • We only need the CUTLASS headers, so there is no need to install the package
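
Since the image now relies on a header-only CUTLASS checkout, a quick sanity check that `CUTLASS_PATH` actually points at the headers may be useful. This is a sketch only: the function name is hypothetical, and the `include/cutlass/cutlass.h` location is an assumption about the layout of the cloned repository:

```python
import os
from pathlib import Path


def cutlass_headers_present(cutlass_path: str) -> bool:
    """Check for the main CUTLASS header under `<cutlass_path>/include`."""
    # Assumption: the checkout keeps its headers at include/cutlass/cutlass.h.
    header = Path(cutlass_path) / "include" / "cutlass" / "cutlass.h"
    return header.is_file()


if __name__ == "__main__":
    # Fall back to the path used in the Dockerfile when the variable is unset.
    path = os.environ.get("CUTLASS_PATH", "/opt/cutlass")
    status = "headers found" if cutlass_headers_present(path) else "headers missing"
    print(path, status)
```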

biotite==1.2.0

# Install CUTLASS for DeepSpeed Evoformer attention kernel
WORKDIR /opt
RUN git clone https://github.com/NVIDIA/cutlass --branch v3.6.0 --depth 1
COPY pyproject.toml /opt/openfold3/
COPY openfold3/__init__.py /opt/openfold3/openfold3/
COPY scripts/ /opt/openfold3/scripts/

# Install OpenFold3 package itself (provides run_openfold command)
WORKDIR /opt/openfold-3
RUN python3 -m pip install --editable --no-deps .

# Set environment variables including CUDA architecture for Blackwell
ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
KMP_AFFINITY=none \
CUTLASS_PATH=/opt/cutlass \
TORCH_CUDA_ARCH_LIST="12.0"

# Pre-compile DeepSpeed operations for Blackwell GPUs to avoid runtime compilation
# Create necessary cache directories
RUN python3 -c "import os; os.makedirs('/root/.triton/autotune', exist_ok=True)"
WORKDIR /opt/openfold3
RUN python3 -m pip install --no-deps --editable .

# Create a Python sitecustomize.py to set TORCH_CUDA_ARCH_LIST before any imports
# This ensures the variable is set before PyTorch's cpp_extension checks it
RUN mkdir -p /usr/local/lib/python3.12/site-packages && \
echo 'import os' > /usr/local/lib/python3.12/site-packages/sitecustomize.py && \
echo 'os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "12.0")' >> /usr/local/lib/python3.12/site-packages/sitecustomize.py && \
echo 'os.environ.setdefault("CUTLASS_PATH", "/opt/cutlass")' >> /usr/local/lib/python3.12/site-packages/sitecustomize.py && \
echo 'os.environ.setdefault("KMP_AFFINITY", "none")' >> /usr/local/lib/python3.12/site-packages/sitecustomize.py
Review comment on lines -74 to -78 (Collaborator, Author):
All of this can be removed
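
One possible reading of why the `sitecustomize.py` lines are redundant (an inference, not something the author states): `os.environ.setdefault` only writes a value when the variable is missing, so once the Dockerfile's `ENV` block exports the same names, those lines have no effect. A small sketch of that behaviour, using values from the diff:

```python
import os

# Simulate the container environment after the Dockerfile's ENV block.
os.environ["TORCH_CUDA_ARCH_LIST"] = "12.1"
os.environ.pop("CUTLASS_PATH", None)  # pretend this one was not exported

# What the removed sitecustomize.py lines would run at interpreter start.
arch = os.environ.setdefault("TORCH_CUDA_ARCH_LIST", "12.0")
cutlass = os.environ.setdefault("CUTLASS_PATH", "/opt/cutlass")

print(arch)     # the existing value is kept; the "12.0" default is ignored
print(cutlass)  # only the missing variable picks up its default
```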

# Copy the entire source tree directly (at the very end for optimal caching)
COPY . /opt/openfold3

# Default command
CMD ["/bin/bash"]