Build Alma9 base images for WMCore services, specifically MSUnmerged #1604
Changes from 4 commits
1ce38ba
67362fe
6dafe2c
4c3f22b
3ad0631
0c7207b
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,23 @@ | ||
| FROM cern/alma9-base:latest | ||
| LABEL author="Dennis Lee dylee@fnal.gov" | ||
|
|
||
| # Install EPEL repository (required for voms, fetch-crl and CA-related packages) | ||
| RUN dnf -y install epel-release && dnf -y upgrade && \ | ||
| dnf install -y http://linuxsoft.cern.ch/wlcg/el9/x86_64/wlcg-repo-1.0.0-1.el9.noarch.rpm && \ | ||
| dnf clean all | ||
|
|
||
| ADD http://repository.egi.eu/sw/production/cas/1/current/repo-files/egi-trustanchors.repo /etc/yum.repos.d/egi.repo | ||
|
|
||
| # Upgrade packages from the base image and install CMSWEB required packages | ||
| RUN dnf -y install fetch-crl cronie cern-get-certificate CERN-CA-certs ca-certificates \ | ||
| dummy-ca-certs ca-policy-lcg ca-policy-egi-core \ | ||
| ca_CERN-GridCA ca_CERN-Root-2 \ | ||
| wlcg-voms-cms && \ | ||
| dnf clean all && \ | ||
| echo "32 */6 * * * root ! /usr/sbin/fetch-crl -q -r 360" > /etc/cron.d/fetch-crl-docker | ||
|
Comment on lines +4 to +17
Member
I am not sure we really need these anymore, all of these things should come from the k8s hosts |
||
|
|
||
| # Required OS packages | ||
| RUN dnf -y install vim less procps python3-pycurl pip && dnf clean all | ||
| RUN ln -s /usr/bin/python3 /usr/bin/python | ||
|
|
||
| RUN update-ca-trust | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -7,7 +7,10 @@ ADD slc7-cernonly.repo /etc/yum.repos.d/slc7-cernonly.repo | |
| # see https://developers.redhat.com/blog/2016/03/09/more-about-docker-images-size/ | ||
| RUN yum install -y sudo cern-get-certificate fetch-crl \ | ||
| CERN-CA-certs ca-certificates dummy-ca-certs ca-policy-lcg ca-policy-egi-core \ | ||
| ca_EG-GRID ca_CERN-GridCA ca_CERN-LCG-IOTA-CA ca_CERN-Root-2 \ | ||
| && yum install -y --nogpgcheck wlcg-voms-cms && yum clean all && rm -rf /var/cache/yum && ln -s /bin/bash /usr/bin/bashs && echo "32 */6 * * * root ! /usr/sbin/fetch-crl -q -r 360" > /etc/cron.d/fetch-crl-docker | ||
| ca_EG-GRID ca_CERN-GridCA ca_CERN-LCG-IOTA-CA ca_CERN-Root-2 && \ | ||
|
Contributor
How about installing these in the cmsweb-alma9-base image and no longer using this base image in our Dockerfiles (or at least in our MSUnmerged Dockerfile)?
Member
Beware that CA certificates may change. Unless you plan to rebuild often, picking up the latest packages from yum, the certificates baked into the image can go stale. The CMSWEB folks have a way to keep them updated on the pod hosts; you should pick the certs from there.
Member
I agree. With the new py312 images we confirmed that we can reliably load the certificates / CAs / CRLs / vomses from the k8s hosts, so there is no need to install these packages inside the Docker images. |
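If the images stop shipping the CA packages, a service could check at startup that the host-provided material is actually visible in the container. The following is a minimal sketch of such a check, not something this PR contains: the function name check_grid_security is hypothetical, and it assumes the host directories are mounted under /etc/grid-security (the path used elsewhere in this PR).

```shell
# Sketch: verify that grid-security material mounted from the host is usable.
# check_grid_security is a hypothetical helper; the default path mirrors the
# directories named in this PR, and the host mount itself is an assumption.
check_grid_security() {
    local base="${1:-/etc/grid-security}"
    local d
    for d in certificates vomsdir; do
        if [ ! -d "$base/$d" ]; then
            echo "missing directory: $base/$d" >&2
            return 1
        fi
        # an empty directory usually means the host mount is absent
        if [ -z "$(ls -A "$base/$d" 2>/dev/null)" ]; then
            echo "empty directory: $base/$d" >&2
            return 1
        fi
    done
    return 0
}
```

A start script could call it once before launching the service and exit early with a clear message when the mount is missing.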
||
| yum install -y --nogpgcheck wlcg-voms-cms && \ | ||
| yum clean all && rm -rf /var/cache/yum && \ | ||
| ln -s /bin/bash /usr/bin/bashs && \ | ||
| echo "32 */6 * * * root ! /usr/sbin/fetch-crl -q -r 360" > /etc/cron.d/fetch-crl-docker | ||
|
|
||
| RUN update-ca-trust | ||
This file was deleted.
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| FROM registry.cern.ch/cmsweb/cmsweb-alma9-base:20250515 AS cmsweb-base | ||
| FROM registry.cern.ch/cmsweb/exporters AS exporters | ||
| FROM almalinux:latest | ||
| LABEL org.opencontainers.image.authors="Alan Malta alan.malta@cern.ch" | ||
|
|
||
| # base image stuff: certificates, monitoring, exporters, etc | ||
| RUN mkdir /etc/grid-security | ||
| COPY --from=cmsweb-base /etc/grid-security/certificates /etc/grid-security/certificates | ||
| COPY --from=cmsweb-base /etc/grid-security/vomsdir /etc/grid-security/vomsdir | ||
|
Comment on lines +8 to +9
Member
again, i would remove these two lines :) |
||
| COPY --from=exporters /data/cmsweb-ping /usr/bin/cmsweb-ping | ||
| COPY --from=exporters /data/process_exporter /usr/bin/process_exporter | ||
| COPY --from=exporters /data/cpy_exporter /usr/bin/cpy_exporter | ||
|
|
||
| # Required OS packages | ||
| RUN dnf -y upgrade && \ | ||
| dnf -y install --skip-broken curl libcurl && \ | ||
| dnf -y install sudo vim less procps && \ | ||
| dnf -y install python3.12 python3.12-pip python3-pycurl pip && \ | ||
|
Member
I would remove python3-pycurl from this list. It is not built for python3.12, and we already install pycurl from requirements.txt: https://github.com/dmwm/WMCore/blob/ecf8b2e0a934145f60df7c52647f0cb4d4f1671f/requirements.txt#L34 |
||
| dnf clean all | ||
| RUN ln -s /usr/bin/python3.12 /usr/bin/python | ||
|
|
||
| ENV WDIR=/data | ||
| ADD run.sh $WDIR/run.sh | ||
| ADD monitor.sh $WDIR/monitor.sh | ||
| ADD manage $WDIR/manage | ||
| WORKDIR /data | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,105 @@ | ||
| #!/bin/bash | ||
| ##H Usage: manage ACTION [ATTRIBUTE] [SECURITY-STRING] | ||
| ##H | ||
| ##H Available actions: | ||
| ##H help show this help | ||
| ##H version get current version of the service | ||
| ##H status show current service's status | ||
| ##H restart (re)start the service | ||
| ##H start (re)start the service | ||
| ##H stop stop the service | ||
|
|
||
| # common settings to prettify output | ||
| echo_e=-e | ||
| COLOR_OK="\\033[0;32m" | ||
| COLOR_WARN="\\033[0;31m" | ||
| COLOR_NORMAL="\\033[0;39m" | ||
|
|
||
| # service settings | ||
| srv=`echo $USER | sed -e "s,_,,g" | sed -e "s,t0req,t0_req,g"` | ||
| LOGDIR=/data/srv/logs/$srv | ||
| AUTHDIR=/data/srv/current/auth/$srv | ||
| STATEDIR=/data/srv/state/$srv | ||
| CFGDIR=/data/srv/current/config/$srv | ||
| CFGFILE=$CFGDIR/config.py | ||
| # some MS services use a different config naming convention, therefore we'll | ||
| # adjust CFGFILE assignment | ||
| for c in monitor output ruleCleaner transferor unmerged; do | ||
| if [ -f $CFGDIR/config-${c}.py ]; then | ||
| CFGFILE=$CFGDIR/config-${c}.py | ||
| fi | ||
| done | ||
|
|
||
| # necessary env settings for all WM services | ||
| export PYTHONPATH=$PYTHONPATH:/etc/secrets:/data/srv/current/config/$srv | ||
| export X509_USER_KEY=$AUTHDIR/dmwm-service-key.pem | ||
| export X509_USER_CERT=$AUTHDIR/dmwm-service-cert.pem | ||
| export REQMGR_CACHE_DIR=$STATEDIR | ||
| export WMCORE_CACHE_DIR=$STATEDIR | ||
| # MSUnmerged also needs to access a proxy with additional voms roles | ||
| if [ -f $AUTHDIR/proxy.cert ]; then | ||
| export X509_USER_PROXY=$AUTHDIR/proxy.cert | ||
| fi | ||
|
|
||
| # by default Rucio relies on /opt/rucio/etc/config.cfg | ||
| # if necessary we may setup RUCIO_HOME which should provide this location | ||
| # but in k8s we mount rucio config.cfg under /opt/rucio/etc area | ||
|
|
||
| usage() | ||
| { | ||
| cat $0 | grep "^##H" | sed -e "s,##H,,g" | ||
| } | ||
|
|
||
| start_srv() | ||
| { | ||
| wmc-httpd -r -d $STATEDIR -l "$LOGDIR/$srv-`hostname -s`.log" $CFGFILE | ||
| } | ||
|
|
||
| stop_srv() | ||
| { | ||
| local pid=`ps auxwww | egrep "wmc-httpd" | grep -v grep | awk 'BEGIN{ORS=" "} {print $2}'` | ||
| echo "Stop $srv service... ${pid}" | ||
| if [ -n "${pid}" ]; then | ||
| kill -9 ${pid} | ||
| fi | ||
| } | ||
|
|
||
| status_srv() | ||
| { | ||
| local pid=`ps auxwww | egrep "wmc-httpd" | grep -v grep | awk 'BEGIN{ORS=" "} {print $2}'` | ||
| if [ -z "${pid}" ]; then | ||
| echo "$srv service is not running" | ||
| return | ||
| fi | ||
| if [ ! -z "${pid}" ]; then | ||
| echo $echo_e "$srv service is ${COLOR_OK}RUNNING${COLOR_NORMAL}, PID=${pid}" | ||
| ps -f -wwww -p ${pid} | ||
| else | ||
| echo $echo_e "$srv service is ${COLOR_WARN}NOT RUNNING${COLOR_NORMAL}" | ||
| fi | ||
| } | ||
|
|
||
| # Main routine, perform action requested on command line. | ||
| case ${1:-status} in | ||
| start | restart ) | ||
| stop_srv | ||
| start_srv | ||
| ;; | ||
|
|
||
| status ) | ||
| status_srv | ||
| ;; | ||
|
|
||
| stop ) | ||
| stop_srv | ||
| ;; | ||
|
|
||
| help ) | ||
| usage | ||
| ;; | ||
|
|
||
| * ) | ||
| echo "$0: unknown action '$1', please try '$0 help' or documentation." 1>&2 | ||
| exit 1 | ||
| ;; | ||
| esac |
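The stop_srv function above sends kill -9 straight away, which gives the service no chance to shut down cleanly. As a possible alternative, the sketch below sends SIGTERM first and only falls back to SIGKILL after a grace period. The helper name graceful_stop and its timeout argument are hypothetical and not part of this PR.

```shell
# Sketch: stop processes with SIGTERM first and fall back to SIGKILL.
# graceful_stop and its timeout argument are hypothetical, not part of the PR.
# Usage: graceful_stop TIMEOUT_SECONDS PID [PID...]
graceful_stop() {
    local timeout="$1"; shift
    local pid waited
    for pid in "$@"; do
        kill -TERM "$pid" 2>/dev/null || continue   # already gone
        waited=0
        # kill -0 sends no signal; it only tests whether the PID still exists
        while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$timeout" ]; do
            sleep 1
            waited=$((waited + 1))
        done
        # still alive after the grace period: force it
        if kill -0 "$pid" 2>/dev/null; then
            kill -KILL "$pid" 2>/dev/null || true
        fi
    done
    return 0
}
```

Inside manage it could be called as `graceful_stop 10 $pid`, reusing the PID list that stop_srv already collects; whether wmc-httpd handles SIGTERM cleanly is an assumption that would need checking.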
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| #!/bin/bash | ||
|
|
||
| echo -e "\nTrying to start process_exporter..." | ||
| # start process exporter | ||
| configs="config config-monitor config-output config-transferor config-ruleCleaner config-unmerged" | ||
| for p in $configs; do | ||
| if [ -f /etc/secrets/${p}.py ]; then | ||
| echo " Using configuration file: /etc/secrets/${p}.py" | ||
| pat="wmc-httpd.*$p" | ||
| pid=`ps axjfwww | grep "$pat" | grep -v grep | grep -v process_monitor | grep -v " 1 " | awk '{print $1}'` | ||
| if [ -n "$pid" ]; then | ||
| app=`grep ^main.application /etc/secrets/${p}.py | grep -v application_dir | sed -e 's,#.*,,g' | awk '{split($0,a,"="); print a[2]}' | sed -e "s, ,,g" -e 's,",,g' -e "s,-,_,g"` | ||
| echo " Using PID: $pid and app name: '$app'" | ||
| if [ -n "$app" ]; then | ||
| prefix=${app} | ||
| port=`grep main.port /etc/secrets/${p}.py | sed -e 's,#.*,,g' | awk '{split($0,a,"="); print a[2]}' | sed -e "s, ,,g"` | ||
| address=":1${port}" | ||
| echo " Starting process_exporter with prefix ${prefix} on ${address}" | ||
| nohup process_exporter -pid $pid -prefix $prefix -address "$address" 2>&1 1>& ${prefix}.log < /dev/null & | ||
| #cpyAddr=`echo ${address} | sed "s,8,9,g"` | ||
| #echo "Start cpy_exporter on ${cpyAddr}" | ||
| #nohup cpy_exporter -address "$address" 2>&1 1>& cpy_${prefix}.log < /dev/null & | ||
| fi | ||
| fi | ||
| fi | ||
| done |
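monitor.sh derives the exporter address by reading main.port from the service configuration and prefixing it with ":1". The sketch below isolates that step in one function so it can be tried against a sample file. The helper name exporter_address is hypothetical, and unlike the grep in monitor.sh it only matches lines that start with main.port.

```shell
# Sketch: derive the process_exporter listen address from a service config.
# exporter_address is a hypothetical helper; the ":1<port>" convention is
# taken from monitor.sh in this PR.
exporter_address() {
    local cfg="$1" port
    port=$(awk -F= '/^[ \t]*main\.port/ {
        sub(/#.*/, "", $2)      # drop a trailing comment
        gsub(/[ \t]/, "", $2)   # drop surrounding whitespace
        print $2
        exit
    }' "$cfg")
    # fail when the file has no usable main.port line
    [ -n "$port" ] || return 1
    echo ":1${port}"
}
```

For a config line such as `main.port = 8242  # service port` (an illustrative value, not taken from the PR), the function prints `:18242`.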
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,86 @@ | ||
| #!/bin/bash | ||
| # script to start ReqMgr2 | ||
|
|
||
| srv=`echo $USER | sed -e "s,_,,g"` | ||
| STATEDIR=/data/srv/state/$srv | ||
| LOGDIR=/data/srv/logs/$srv | ||
| AUTHDIR=/data/srv/current/auth/$srv | ||
| CONFIGDIR=/data/srv/current/config/$srv | ||
| CONFIGFILE=${CONFIGFILE:-config.py} | ||
| CFGFILE=/etc/secrets/$CONFIGFILE | ||
|
|
||
| ### permission update to workaround issues with mounting logs volume | ||
| sudo chown -R $USER.$USER /data | ||
|
|
||
| mkdir -p $LOGDIR | ||
| mkdir -p $STATEDIR | ||
| mkdir -p $AUTHDIR | ||
| mkdir -p $CONFIGDIR | ||
| mkdir -p $AUTHDIR/../wmcore-auth | ||
|
|
||
| # environment variables required to run some of the WMCore services | ||
| export REQMGR_CACHE_DIR=$STATEDIR | ||
| export WMCORE_CACHE_DIR=$STATEDIR | ||
|
|
||
| # overwrite host PEM files in /data/srv area by the robot certificate | ||
| # Note that the proxy file is not required and used | ||
| if [ -f /etc/robots/robotkey.pem ]; then | ||
| sudo cp /etc/robots/robotkey.pem $AUTHDIR/dmwm-service-key.pem | ||
| sudo cp /etc/robots/robotcert.pem $AUTHDIR/dmwm-service-cert.pem | ||
| sudo chown $USER.$USER $AUTHDIR/dmwm-service-key.pem | ||
| sudo chown $USER.$USER $AUTHDIR/dmwm-service-cert.pem | ||
| sudo chmod 0400 $AUTHDIR/dmwm-service-key.pem | ||
| fi | ||
|
|
||
| if [ -e $AUTHDIR/dmwm-service-cert.pem ] && [ -e $AUTHDIR/dmwm-service-key.pem ]; then | ||
| export X509_USER_CERT=$AUTHDIR/dmwm-service-cert.pem | ||
| export X509_USER_KEY=$AUTHDIR/dmwm-service-key.pem | ||
| fi | ||
|
|
||
| # overwrite header-auth key file with one from secrets | ||
| if [ -f /etc/hmac/hmac ]; then | ||
| sudo cp /etc/hmac/hmac $AUTHDIR/../wmcore-auth/header-auth-key | ||
| sudo chown $USER.$USER $AUTHDIR/../wmcore-auth/header-auth-key | ||
| sudo chmod 0600 $AUTHDIR/../wmcore-auth/header-auth-key | ||
| fi | ||
|
|
||
| # use service configuration files from /etc/secrets if they are present | ||
| files=`ls /etc/secrets` | ||
| for fname in $files; do | ||
| if [ -f /etc/secrets/$fname ]; then | ||
| if [ -f $CONFIGDIR/$fname ]; then | ||
| rm $CONFIGDIR/$fname | ||
| fi | ||
| sudo cp /etc/secrets/$fname $CONFIGDIR/$fname | ||
| sudo chown $USER.$USER $CONFIGDIR/$fname | ||
| if [ "$fname" == "$CONFIGFILE" ]; then | ||
| CFGFILE=$CONFIGDIR/$CONFIGFILE | ||
| fi | ||
| fi | ||
| done | ||
| files=`ls /etc/secrets` | ||
| for fname in $files; do | ||
| if [ ! -f $CONFIGDIR/$fname ]; then | ||
| sudo cp /etc/secrets/$fname $AUTHDIR/$fname | ||
| sudo chown $USER.$USER $AUTHDIR/$fname | ||
| fi | ||
| done | ||
|
|
||
| export PYTHONPATH=$PYTHONPATH:/etc/secrets:$AUTHDIR/$fname | ||
|
|
||
| # backward compatible changes for RPM based deployment location of aux files | ||
| if [ -d /usr/local/data ] && [ "$USER" == "_reqmgr2" ]; then | ||
| sudo mkdir -p /data/srv/current/apps/reqmgr2 | ||
| sudo ln -s /usr/local/data /data/srv/current/apps/reqmgr2 | ||
| fi | ||
|
|
||
| # start the service | ||
| wmc-httpd -r -d $STATEDIR -l "|rotatelogs $LOGDIR/$srv-%Y%m%d-`hostname -s`.log 86400" $CFGFILE | ||
|
|
||
| # start monitor.sh script | ||
| if [ -f /data/monitor.sh ]; then | ||
| /data/monitor.sh | ||
| fi | ||
|
|
||
| # hack to keep the container running | ||
| tail -f /etc/hosts |
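run.sh iterates over the output of ls /etc/secrets, which breaks on file names containing whitespace. A glob-based loop avoids that. The sketch below shows the idea with a hypothetical helper, sync_configs; it leaves out the sudo and chown calls from run.sh so that it can run anywhere.

```shell
# Sketch: copy every regular file from a secrets directory into a config
# directory using a glob instead of parsing `ls` output.
# sync_configs is a hypothetical helper; the sudo/chown handling from run.sh
# is left out to keep the example self-contained.
sync_configs() {
    local src="$1" dst="$2" path fname
    mkdir -p "$dst"
    for path in "$src"/*; do
        [ -f "$path" ] || continue   # skips directories and an unmatched glob
        fname=$(basename "$path")
        cp -f "$path" "$dst/$fname"
    done
}
```

In run.sh this would correspond to something like `sync_configs /etc/secrets "$CONFIGDIR"`, with ownership fixed afterwards as the script already does.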
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,33 @@ | ||
| FROM registry.cern.ch/cmsweb/dmwm-base:alma9-py3.12-20250522 | ||
| LABEL org.opencontainers.image.authors="Dennis Lee dylee@fnal.gov" | ||
|
|
||
| WORKDIR /tmp | ||
|
Contributor
I am inclined to say that we could rename this dockerfile as:
What do you think? |
||
| RUN dnf -y upgrade && \ | ||
| dnf install epel-release -y && dnf clean all && \ | ||
| dnf -y install git bzip2 gfal2-devel python3.12-devel | ||
|
|
||
| RUN git clone --branch v1.13.0 https://github.com/cern-fts/gfal2-python.git | ||
|
|
||
| WORKDIR /tmp/gfal2-python | ||
| RUN ./ci/fedora-packages.sh | ||
|
Contributor
Perhaps add a brief comment here to say what this script is supposed to do. |
||
|
|
||
| WORKDIR /tmp | ||
| # get and install boost | ||
| # RUN curl -ksL https://archives.boost.io/release/1.82.0/source/boost_1_82_0.tar.bz2 -o boost_1_82_0.tar.bz2 && \ | ||
| # tar --bzip2 -xf /tmp/boost_1_82_0.tar.bz2 && \ | ||
| # cd boost_1_82_0 && \ | ||
| # ./bootstrap.sh --with-python=/usr/bin/python3.12 && \ | ||
| # ./b2 install | ||
|
|
||
| RUN git clone https://github.com/boostorg/boost.git -b boost-1.85.0 boost_1_85_0 --depth 1 && \ | ||
| cd boost_1_85_0 && \ | ||
| git submodule update --depth 1 -q --init tools/boostdep && \ | ||
| git submodule update --depth 1 -q --init libs/python && \ | ||
| python tools/boostdep/depinst/depinst.py -X test -g "--depth 1" python && \ | ||
| ./bootstrap.sh --with-python=/usr/bin/python3.12 && \ | ||
| ./b2 install --with-python | ||
|
|
||
| RUN python -m venv .venv | ||
| ENV PATH="$WDIR/.venv/bin:$PATH" | ||
|
|
||
| RUN .venv/bin/pip install -v gfal2-python | ||
|
Contributor
Is this command actually pulling the |
||
This looks like a clean base image. I would suggest discussing this with Aroosha to see if there is anything that we can reuse from different images, or if CMSWEB could actually adopt this image as a base for other services.
Alan,
I would recommend against adopting pypi/alma-base as the base image for other services for several reasons:
Lack of Generality: A base image should be minimal and serve as a common foundation for all services, regardless of the programming language or specific team requirements. alma-base is tailored to the needs of the WM group and includes packages such as curl-minimal, libcurl-minimal, vim, python3-pycurl, pip, sudo, and less. These are not broadly applicable: for example, Python is irrelevant for Go-based applications, and tools like vim are unnecessary unless one intends to log into containers for interactive editing, which we generally discourage.
On-Demand Tooling: Tools like vim or curl should be installed on demand within a pod when needed for debugging or maintenance. This can be done easily using standard package managers (yum, dnf, apt-get, etc.). This approach keeps the base image lean and reduces the surface area for potential vulnerabilities, while still allowing flexibility for developers. Once debugging or inspection is complete, the pod should be terminated and allowed to restart in a clean state, in line with the stateless nature of Kubernetes workloads.
Security implications: Each package added to a base image introduces additional dependencies and potential vulnerabilities. If a service does not require these tools, including them unnecessarily increases risk and image size without clear benefit.
While the alma-base image may be appropriate for the WM group's specific needs, I would suggest that we not promote it as the standard base for all services.
As a side note, the only consistently shared requirement across our services appears to be access to CA certificates. However, the CMSWEB team has recently transitioned to retrieving these from Kubernetes nodes, now that CERN IT provides them via cluster configuration. This might even allow us to remove them from the CMSWEB base image in the future, though that's a separate discussion.
Yes, I agree with basically everything that you said.
I think the base image needs to provide only the requirements that we have for CERN-related systems (mostly CAs, as you pointed out). Nonetheless, even if this one is still more inflated than it needs to be, it is already progress compared to the CC7 base image that many of the services still rely on.