ci: add NVTX to CI container images for CUDA builds #1224

Open
bibrakc wants to merge 1 commit into aws:master from bibrakc:ci/nvtx-container-support

Contributor

@bibrakc bibrakc commented May 5, 2026

Description of changes:

NVTX tracing is not built by CI, so build breakages under
--with-nvtx=... have been landing undetected. For example, the
shared_ptr refactor in PR #1193 left an implicit pointer cast in
tracing_impl/nvtx.h that no CI job caught.

Extend the tracing-enabled Ubuntu images to install cuda-nvtx in
addition to liblttng-ust-dev whenever ENABLE_CUDA is also set. The
two tracing backends are independent in the plugin: they sit behind
separate HAVE_LIBLTTNG_UST and HAVE_NVTX_TRACING guards, and each
tracepoint macro expands to calls into both. They therefore coexist
cleanly in a single container and a single build. Neuron images still
get just lttng because NVTX requires the CUDA toolkit.

Parameterise the CUDA toolkit release as a CUDA_VERSION build arg
(default 12-6) so cuda-cudart-dev, cuda-crt, and cuda-nvtx all stay
on the same release.
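
As a rough illustration of the package selection described above, here is a minimal shell sketch. It is not the actual Dockerfile logic from this PR: the helper name `tracing_packages` is hypothetical, and the `-<CUDA_VERSION>` package-name suffix is an assumption based on the usual naming in NVIDIA's apt repository.

```shell
# Hypothetical helper (not part of the PR): prints the apt packages an image
# would install for a given ENABLE_TRACING / ENABLE_CUDA / CUDA_VERSION combo.
# Assumption: versioned package names take the form <name>-<CUDA_VERSION>.
tracing_packages() {
    enable_tracing="$1"
    enable_cuda="$2"
    cuda_version="${3:-12-6}"   # same default as the CUDA_VERSION build arg
    pkgs=""

    # CUDA images pull the toolkit pieces, all pinned to one release.
    if [ "$enable_cuda" = "true" ]; then
        pkgs="cuda-cudart-dev-${cuda_version} cuda-crt-${cuda_version}"
    fi

    # Tracing images always get lttng; NVTX is added only alongside CUDA.
    if [ "$enable_tracing" = "true" ]; then
        pkgs="${pkgs:+$pkgs }liblttng-ust-dev"
        if [ "$enable_cuda" = "true" ]; then
            pkgs="$pkgs cuda-nvtx-${cuda_version}"
        fi
    fi

    printf '%s\n' "$pkgs"
}
```

Calling `tracing_packages true false` prints only `liblttng-ust-dev`, which corresponds to the Neuron case, while `tracing_packages true true` adds `cuda-nvtx-12-6` on top of the CUDA runtime packages.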

No matrix, workflow, or tag changes: the existing 'lttng' tracing
variant now also brings in NVTX on CUDA containers. A follow-up
PR will add --with-nvtx=... to the configure line in distcheck.yaml
so the plugin actually gets built with NVTX in CI.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@bibrakc bibrakc requested a review from a team as a code owner May 5, 2026 20:51
Comment thread .github/matrix-config.json Outdated
"tracing": ["lttng", "lttng-nvtx", "none"],
"sdk": ["cuda", "neuron"],
"exclude": [
{"tracing": "lttng-nvtx", "sdk": "neuron"},
Contributor

I don't think we should add variants here and make more containers. There's no reason that lttng and nvtx can't live in the same container.

Contributor Author

@bibrakc bibrakc May 5, 2026

Updated. CUDA-enabled containers now have both lttng and nvtx.

@bibrakc bibrakc force-pushed the ci/nvtx-container-support branch from 9562b68 to 6d7d255 Compare May 5, 2026 21:03
Signed-off-by: Bibrak Qamar Chandio <bibracha@amazon.com>
@bibrakc bibrakc force-pushed the ci/nvtx-container-support branch from 6d7d255 to 28691da Compare May 5, 2026 21:09
ARG CC_VERSION
ARG ENABLE_TRACING=false
ARG ENABLE_CUDA=false
ARG CUDA_VERSION=12-6
Contributor

unrelated change.
