Build Alma9 base images for WMCore services, specifically MSUnmerged #1604

Closed
d-ylee wants to merge 6 commits into dmwm:master from d-ylee:fix-11922_msunmerged-alma9-image

Conversation

d-ylee commented May 8, 2025

Contributor

fixes #11922
extends #1452

Images Added

  • cmsweb-alma9-base
  • dmwm-base, based on cmsweb-alma9-base and including Python 3.12
  • msunmerged, based on the Alma9 dmwm-base

@d-ylee d-ylee requested review from amaltaro and mapellidario May 8, 2025 14:54
@d-ylee d-ylee changed the title cmsweb-alma9-base, dmwm-base alma9 with py3.12, msunmerged alma9 Build Alma9 base images for WMCore services, specifically MSUnmerged May 8, 2025

mapellidario left a comment

Member

I suggest a couple of name changes to make our setup a bit clearer:

  • move docker/pypi/dmwm-base/Dockerfile.alma9 to docker/pypi/dmwm-alma9-base/Dockerfile
    • I really do not like having completely different images (Debian-based and Alma9-based) differentiated only by a tag. The image name should be different, so I suggest that we create a new directory/docker image, not simply a new Dockerfile for new tags of an already existing image.
  • move docker/pypi/msunmerged/Dockerfile.alma9 to docker/pypi/msunmerged/Dockerfile
    • be bold! The current docker/pypi/msunmerged/Dockerfile should go away. We can keep it around with a .cc7 suffix for the time being, but we should make sure that the new file gets the shiny Dockerfile filename.

then, as a bonus, we could also

  • deprecate docker/pypi/alma-base

what do you think? :)

d-ylee commented May 8, 2025

Contributor Author

@d-ylee That sounds good.

The error I got when importing imp is fixed by this PR: dmwm/WMCore#12336

It was caused by the older future library.
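
For context, the `imp` module was removed from the standard library in Python 3.12, so any dependency that still imports it fails on the new image. A minimal sketch for checking this inside a container, using only the standard library; the helper name `module_available` is illustrative and not part of WMCore:

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Report on the removed module and on one that is always present.
    for name in ("imp", "importlib"):
        state = "available" if module_available(name) else "missing"
        print(f"{name}: {state}")
```

Running this with the image's interpreter shows quickly whether a failing `import imp` comes from the interpreter version rather than from a broken installation.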

d-ylee commented May 8, 2025

Contributor Author

Here are the tags on Harbor that I'm using for the images. I put the OS version and Python version in the tag names:

registry.cern.ch/cmsweb/cmsweb-alma9-base:20250506
registry.cern.ch/cmsweb/dmwm-base:alma9-py3.12-20250506
registry.cern.ch/cmsweb/msunmerged:alma9-2.3.11.1-py3.12-20250506
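
The tag scheme described above (OS, then Python version, then a date or release suffix) can be sketched as a small helper. This is only an illustration: `image_ref` and its parameter names are hypothetical, and the scheme is not fully uniform, since the msunmerged tag above also embeds the service version.

```python
def image_ref(name, os_tag, python_version, suffix,
              registry="registry.cern.ch/cmsweb"):
    """Compose an image reference of the form <registry>/<name>:<os>-py<python>-<suffix>."""
    tag = f"{os_tag}-py{python_version}-{suffix}"
    return f"{registry}/{name}:{tag}"


if __name__ == "__main__":
    print(image_ref("dmwm-base", "alma9", "3.12", "20250506"))
```

Keeping the OS and Python version in the tag makes it possible to tell images apart at a glance, although the review below argues that images with different base operating systems should also get different image names.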

Renamed Dockerfiles

Separated dmwm-alma9-base
Removed alma-base
Renamed current msunmerged Dockerfile to Dockerfile.cc7
Set alma9 based msunmerged to Dockerfile

mapellidario left a comment

Member

Thanks for the changes! They look good to me :)

d-ylee commented May 12, 2025

Contributor Author

Thanks to @amaltaro for creating a new RC with changes by @mapellidario, I created a new msunmerged image:
registry.cern.ch/cmsweb/msunmerged:alma9-py3.12-2.4.0rc2

Added grid certificates from egi repo
Update dmwm-alma9-base to use updated cmsweb-alma9-base
Added boost and gfal build image for alma9 and python 3.12
Copy built boost and gfal for msunmerged image
Refactored msunmerged Dockerfile steps to better use build cache on rebuilds

Removed rotatelogs references

d-ylee commented May 16, 2025

Contributor Author

@amaltaro @mapellidario
I found out that the python3-gfal2 package in the DNF repositories only supports Python 3.9. As a result, I had to build my own gfal2 for Python 3.12.

I added a new Dockerfile.alma9 under docker/pypi/gfal to build gfal2. In the process, I also discovered that the boost-devel package required to build gfal2 only ships libboost_python39.so, so I also had to build boost from source against the Python 3.12 installed from the dnf package. After that, I was able to pip install gfal2-python with Python 3.12. I then copied the resulting gfal2.so to the proper location in the msunmerged image.
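
The mismatch described above follows from how Boost.Python names its shared library: the file name carries the Python major and minor version, so a package built for Python 3.9 provides `libboost_python39.so` and nothing for 3.12. A minimal sketch of a check for the running interpreter; the function names and the default search directories (`/usr/lib64`, `/usr/local/lib`) are assumptions for illustration, not taken from this PR:

```python
import sys
from pathlib import Path


def boost_python_libname(version=None):
    """Return the Boost.Python library file name for a (major, minor) version."""
    major, minor = (version or sys.version_info)[:2]
    return f"libboost_python{major}{minor}.so"


def find_boost_python(search_dirs=("/usr/lib64", "/usr/local/lib"), version=None):
    """List matching Boost.Python libraries, including versioned .so.X.Y files."""
    name = boost_python_libname(version)
    found = []
    for directory in search_dirs:
        path = Path(directory)
        if path.is_dir():
            found.extend(sorted(path.glob(name + "*")))
    return found


if __name__ == "__main__":
    print("expected:", boost_python_libname())
    print("found:", [str(p) for p in find_boost_python()])
```

An empty result in the build image would indicate that Boost still has to be built against the target interpreter before gfal2-python can be compiled.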

Additionally, since the new Alma9 images do not have apache-utils for rotatelogs, I manually applied the patch from this PR: dmwm/WMCore#11955

These are the image tags I have built so far to make this work:

  • registry.cern.ch/cmsweb/cmsweb-alma9-base:20250515
  • registry.cern.ch/cmsweb/dmwm-base:alma9-py3.12-20250515
  • registry.cern.ch/cmsweb/gfal:alma9-py3.12-v1.82.0
  • registry.cern.ch/cmsweb/msunmerged:alma9-py3.12-2.4.0rc2-patch11955

I am currently using the registry.cern.ch/cmsweb/msunmerged:alma9-py3.12-2.4.0rc2-patch11955 image in cmsweb-testbed for ms-unmer-t2t3. So far, there haven't been any errors. I will continue to monitor it.

tl;dr

  • Built our own boost and gfal2 libraries/packages for Python 3.12
  • Copied boost and gfal2 into the msunmerged image
  • Running registry.cern.ch/cmsweb/msunmerged:alma9-py3.12-2.4.0rc2-patch11955 in cmsweb-testbed for ms-unmer-t2t3

@d-ylee d-ylee requested a review from mapellidario May 16, 2025 05:08

amaltaro left a comment

Contributor

Thank you for providing all these changes, @d-ylee. I left a few comments/questions along the code, some only for my own education.

Instead of building gfal2 ourselves, I am considering reaching out to the gfal2 developers to see if they have any suggestions regarding the gfal2-python package.

Let me tag @belforte as well, as he might have some insights on the GFAL2 package.

@@ -0,0 +1,23 @@
FROM cern/alma9-base:latest

Contributor

This looks like a clean base image. I would suggest discussing this with Aroosha to see if there is anything that we can reuse from different images, or if CMSWEB could actually adopt this image as base for other services.

Collaborator

Alan,

I would recommend against adopting pypi/alma-base as the base image for other services for several reasons:

  • Lack of Generality: A base image should be minimal and serve as a common foundation for all services, regardless of the programming language or specific team requirements. alma-base is tailored to the needs of the WM group and includes packages such as curl-minimal, libcurl-minimal, vim, python3-pycurl, pip, sudo, and less. These are not broadly applicable—for example, Python is irrelevant for Go-based applications, and tools like vim are unnecessary unless one intends to log into containers for interactive editing, which we generally discourage.

  • On-Demand Tooling: Tools like vim or curl should be installed on demand within a pod when needed for debugging or maintenance. This can be done easily using standard package managers (yum, dnf, apt-get, etc.). This approach keeps the base image lean and reduces the surface area for potential vulnerabilities, while still allowing flexibility for developers. Once debugging or inspection is complete, the pod should be terminated and allowed to restart in a clean state, in line with the stateless nature of Kubernetes workloads.

  • Security implications: Each package added to a base image introduces additional dependencies and potential vulnerabilities. If a service does not require these tools, including them unnecessarily increases risk and image size without clear benefit.

While the alma-base image may be appropriate for the WM group's specific needs, I would suggest that we not promote it as the standard base for all services.

As a side note, the only consistently shared requirement across our services appears to be access to CA certificates. However, the CMSWEB team has recently transitioned to retrieving these from Kubernetes nodes, now that CERN IT provides them via cluster configuration. This might even allow us to remove them from the CMSWEB base image in the future, though that's a separate discussion.

Contributor

Yes, I agree with basically everything that you said.
I think the base image needs to provide requirements that we have for CERN-related systems only (mostly CAs as you pointed out). Nonetheless, even if this one is still more inflated than it needs to be, it is already progress compared to the CC7 base image that many of the services still rely on.

CERN-CA-certs ca-certificates dummy-ca-certs ca-policy-lcg ca-policy-egi-core \
ca_EG-GRID ca_CERN-GridCA ca_CERN-LCG-IOTA-CA ca_CERN-Root-2 \
&& yum install -y --nogpgcheck wlcg-voms-cms && yum clean all && rm -rf /var/cache/yum && ln -s /bin/bash /usr/bin/bashs && echo "32 */6 * * * root ! /usr/sbin/fetch-crl -q -r 360" > /etc/cron.d/fetch-crl-docker
ca_EG-GRID ca_CERN-GridCA ca_CERN-LCG-IOTA-CA ca_CERN-Root-2 && \

Contributor

How about installing these in the cmsweb-alma9-base image and no longer using this base image in our Dockerfiles (or at least in our MSUnmerged Dockerfile)?

Member

Beware that CA certificates may change. Unless you plan to rebuild often, picking up the latest packages from yum each time, there's a danger of shipping stale ones. CMSWEB folks have a way to keep them updated on the pod hosts; you should pick the certs up from there.

Member

i agree, with the new py312 images we confirmed that we can reliably load the certificates / CAs / CRLs / vomses from the k8s hosts, there is no need to install these packages inside the docker images
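
If the certificates and VOMS data come from the Kubernetes hosts rather than from the image, a service can verify at startup that the mounted directories are actually populated. A minimal sketch, assuming the data is mounted at the same two paths this PR copies from cmsweb-base (`/etc/grid-security/certificates` and `/etc/grid-security/vomsdir`); the function name `unpopulated` is hypothetical:

```python
from pathlib import Path

# Assumed mount points; adjust to whatever the pod manifests actually use.
GRID_DIRS = ("/etc/grid-security/certificates", "/etc/grid-security/vomsdir")


def unpopulated(dirs=GRID_DIRS):
    """Return the directories that are missing or contain no entries."""
    bad = []
    for directory in dirs:
        path = Path(directory)
        if not path.is_dir() or not any(path.iterdir()):
            bad.append(directory)
    return bad


if __name__ == "__main__":
    problems = unpopulated()
    if problems:
        print("missing or empty:", ", ".join(problems))
    else:
        print("grid-security directories look populated")
```

A check like this fails loudly when a host mount is missing, instead of surfacing later as an opaque TLS or VOMS error.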

RUN git clone --branch v1.13.0 https://github.com/cern-fts/gfal2-python.git

WORKDIR /tmp/gfal2-python
RUN ./ci/fedora-packages.sh

Contributor

Perhaps add a brief comment here to say what this script is supposed to do.

RUN python -m venv .venv
ENV PATH="$WDIR/.venv/bin:$PATH"

RUN .venv/bin/pip install -v gfal2-python

Contributor

Is this command actually installing the gfal2-python package that was built in the lines above? Or is it pulling the package from the PyPI registry?
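
As general background, a bare requirement name such as `gfal2-python` is resolved by pip against the configured package index (PyPI by default), not against a checkout in the working directory. One way to confirm this for an installed package is the PEP 610 `direct_url.json` record, which installers write only for installs from a direct URL or local path. A minimal sketch using the standard library; the function name `direct_url_record` is illustrative:

```python
import json
from importlib import metadata


def direct_url_record(dist_name):
    """Return the parsed PEP 610 direct_url.json of a distribution, or None.

    None means the distribution was installed by name from an index.
    Raises importlib.metadata.PackageNotFoundError if it is not installed.
    """
    text = metadata.distribution(dist_name).read_text("direct_url.json")
    return json.loads(text) if text else None
```

Calling `direct_url_record("gfal2-python")` with the interpreter inside the build image would return `None` for an index install and a dictionary containing a `url` key for a local or VCS install.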

RUN cat requirements.txt | grep -v gfal2 > req.txt
RUN pip install -r req.txt
RUN pip install --no-deps msunmerged==$TAG
RUN cat requirements.txt | grep dbs3-client > req.txt

Contributor

I don't think we need dbs3-client in this image, so I would suggest removing the line that greps for it entirely.
About L17 above, didn't you fix that FIXME message in the line below?
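
The filtering step in the snippet above (`grep -v gfal2`) drops every requirement line that mentions gfal2 so that the remaining dependencies can be installed separately. The same logic can be sketched in Python; the function name `filter_requirements` is hypothetical and, like `grep -v`, it uses a plain substring match:

```python
def filter_requirements(lines, exclude):
    """Keep only the lines that do not contain the `exclude` substring."""
    return [line for line in lines if exclude not in line]


if __name__ == "__main__":
    sample = ["gfal2-python>=1.0", "pycurl", "requests"]
    # Equivalent in spirit to: grep -v gfal2 requirements.txt > req.txt
    for line in filter_requirements(sample, "gfal2"):
        print(line)
```

A substring match also removes any unrelated line that happens to contain the excluded text, which is worth keeping in mind for short patterns.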

Comment thread docker/pypi/msunmerged/Dockerfile Outdated
ENV PATH="$WDIR/.venv/bin:$PATH"

# patch dmwm/WMCore https://github.com/dmwm/WMCore/pull/11955
ADD 11955.patch $WDIR/11955.patch

Contributor

We must not forget to remove this patch before merging this in.

@@ -0,0 +1,23 @@
FROM registry.cern.ch/cmsweb/dmwm-base:alma9-py3.12-20250515

WORKDIR /tmp

Contributor

I am inclined to say that we could rename this dockerfile as:

  • the current debian/conda based to be named as Dockerfile.debian
  • and this file to Dockerfile

What do you think?

@belforte

Member

Thanks for tagging me, Alan. But I am afraid that I have no knowledge of gfal2-python, nor have I ever used it. Asking Mihai is surely always good.

vkuznet commented May 16, 2025

Collaborator

@d-ylee, I recommend running a vulnerability scanner such as Trivy, which is also used by the CERN container registry, to compare the number of vulnerabilities in the updated images (reflecting your proposed changes) against the current ones. As we make changes to our Docker images, we should aim to minimize vulnerabilities as much as possible.
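
The comparison suggested here can be sketched as a small script over two Trivy JSON reports. It assumes reports produced with something like `trivy image --format json --output report.json <image>` and the usual report layout, a top-level `Results` list whose entries may carry a `Vulnerabilities` list with a `Severity` field. The function names are illustrative:

```python
import json
from collections import Counter


def load_report(path):
    """Load a Trivy JSON report from disk."""
    with open(path) as handle:
        return json.load(handle)


def severity_counts(report):
    """Count vulnerabilities per severity across all results in a report."""
    counts = Counter()
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def compare(old, new):
    """Return new-minus-old vulnerability counts per severity."""
    before, after = severity_counts(old), severity_counts(new)
    return {sev: after[sev] - before[sev] for sev in sorted(set(before) | set(after))}
```

For example, `compare(load_report("old.json"), load_report("new.json"))` yields negative numbers for severities where the new image has fewer findings than the current one.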

mapellidario left a comment

Member

Thanks Dennis! I agree with the recipe to install gfal in docker/pypi/gfal/Dockerfile.alma9 (that we can also rename, this will be the production image, no need for any suffix), it's the same one i explored when trying to install it in the debian image.

I left a few comments along the code. I think we do not need to add the CAs / CRLs / vomses inside our images, everything should come from the k8s hosts, and i already changed the manifests for msunmerged as well in my PR [1]

[1] https://github.com/dmwm/CMSKubernetes/pull/1620/files

Comment on lines +4 to +17
# Install EPEL repository (required for voms, fetch-crl and CA-related packages)
RUN dnf -y install epel-release && dnf -y upgrade && \
dnf install -y http://linuxsoft.cern.ch/wlcg/el9/x86_64/wlcg-repo-1.0.0-1.el9.noarch.rpm && \
dnf clean all

ADD http://repository.egi.eu/sw/production/cas/1/current/repo-files/egi-trustanchors.repo /etc/yum.repos.d/egi.repo

# Upgrade packages from the base image and install CMSWEB required packages
RUN dnf -y install fetch-crl cronie cern-get-certificate CERN-CA-certs ca-certificates \
dummy-ca-certs ca-policy-lcg ca-policy-egi-core \
ca_CERN-GridCA ca_CERN-Root-2 \
wlcg-voms-cms && \
dnf clean all && \
echo "32 */6 * * * root ! /usr/sbin/fetch-crl -q -r 360" > /etc/cron.d/fetch-crl-docker

Member

I am not sure we really need these anymore, all of these things should come from the k8s hosts

Comment thread docker/pypi/dmwm-alma9-base/Dockerfile Outdated
RUN dnf -y upgrade && \
dnf -y install --skip-broken curl libcurl && \
dnf -y install sudo vim less procps && \
dnf -y install python3.12 python3.12-pip python3-pycurl pip && \

Member

i would remove python3-pycurl from this list. it's not for python3.12, and we already install pycurl from requirements.txt

https://github.com/dmwm/WMCore/blob/ecf8b2e0a934145f60df7c52647f0cb4d4f1671f/requirements.txt#L34

Comment on lines +8 to +9
COPY --from=cmsweb-base /etc/grid-security/certificates /etc/grid-security/certificates
COPY --from=cmsweb-base /etc/grid-security/vomsdir /etc/grid-security/vomsdir

Member

again, i would remove these two lines :)

amaltaro commented Aug 1, 2025

Contributor

@d-ylee I suspect this PR is no longer relevant. If you can confirm this, shall we close this out?

d-ylee commented Aug 1, 2025

Contributor Author

Yes, I think we can close this with the merging of #1627

amaltaro commented Aug 2, 2025

Contributor

Thank you Dennis. Closing this out.

@amaltaro amaltaro closed this Aug 2, 2025

Development

Successfully merging this pull request may close these issues.

Build a MSUnmerged docker image based on Almalinux

5 participants