{math}[GCCcore/13.2.0] ArmComputeLibrary v23.08#21309

Merged
ocaisa merged 4 commits into easybuilders:develop from migueldiascosta:20240904163603_new_pr_ArmComputeLibrary2308 on Apr 8, 2026

@migueldiascosta
Member

@migueldiascosta migueldiascosta commented Sep 4, 2024

(created using eb --new-pr)

The motivation for this easyconfig: on Arm (at least on a64fx, and probably on other Arm processors too) a pip-installed PyTorch was several times faster than an easybuilt one. An analysis with perf showed that the pip-installed build was using ACL. It was also using a recent OpenBLAS with ARM_SVE support, but that should be taken care of by building PyTorch with a more recent toolchain and OpenBLAS, e.g. #20489.

This is not the most recent version of ACL, but PyTorch 2.3 (the one in #20489) says that the maximum supported version is 23.08

Using this with PyTorch 2.3 requires setting USE_MKLDNN=ON, USE_MKLDNN_ACL=ON, USE_MKLDNN_CBLAS=ON, and a patch derived from Ryo-not-rio/oneDNN@ca60ff4 to the bundled oneDNN
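
As a rough illustration of the switches listed above, here is a minimal Python sketch that prepares a build environment with them set. The helper name `pytorch_acl_build_env` and the install path in the usage line are hypothetical; `ACL_ROOT_DIR` is the variable discussed later in this thread. This is not how the PyTorch easyblock does it, only a sketch of which variables are involved.

```python
import os


def pytorch_acl_build_env(acl_root, base_env=None):
    """Return a copy of the environment with the ACL-related PyTorch switches set.

    acl_root is the ACL installation prefix (hypothetical value supplied by the caller).
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "USE_MKLDNN": "ON",        # build the bundled oneDNN
        "USE_MKLDNN_ACL": "ON",    # let oneDNN use Arm Compute Library
        "USE_MKLDNN_CBLAS": "ON",  # let oneDNN use the CBLAS backend
        "ACL_ROOT_DIR": acl_root,  # where the build should look for ACL
    })
    return env


# Hypothetical prefix, for illustration only
env = pytorch_acl_build_env("/path/to/ArmComputeLibrary", base_env={})
```

The patch to the bundled oneDNN mentioned above is still needed; setting these variables alone does not replace it.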

@migueldiascosta migueldiascosta marked this pull request as draft September 4, 2024 08:36

buildopts = "os=linux arch=armv8a build=native multi_isa=1 "
buildopts += "Werror=0 debug=0 neon=1 opencl=0 embed_kernels=0 "
buildopts += "fixed_format_kernels=1 openmp=1 cppthreads=0 "
Member Author

from https://github.com/pytorch/pytorch/blob/main/.ci/docker/common/install_acl.sh#L13-L16

in particular, arch=armv8a multi_isa=1 should be more generic without losing functionality/performance:

https://github.com/ARM-software/ComputeLibrary/blob/de7288cb71e6b9190f52e50a44ed68c309e4a041/docs/user_guide/library.dox#L567-L578

benchmarks on a64fx compared to arch=armv8.2-a-sve didn't show any difference
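
To make the options under review easier to inspect, the following sketch splits the concatenated `buildopts` string into the individual `key=value` arguments that are passed to SCons. The helper `parse_scons_opts` is hypothetical and only for illustration; it is not part of EasyBuild or SCons.

```python
# The three buildopts lines from the easyconfig, concatenated as in the PR
buildopts = "os=linux arch=armv8a build=native multi_isa=1 "
buildopts += "Werror=0 debug=0 neon=1 opencl=0 embed_kernels=0 "
buildopts += "fixed_format_kernels=1 openmp=1 cppthreads=0 "


def parse_scons_opts(opts):
    """Split a whitespace-separated 'key=value' string into a dict."""
    return dict(item.split("=", 1) for item in opts.split())


options = parse_scons_opts(buildopts)

# The two settings discussed in this comment
print(options["arch"], options["multi_isa"])
```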

@migueldiascosta
Member Author

Test report by @migueldiascosta
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
cna0003.deucalion.macc.fccn.pt - Linux Rocky Linux 8.5, AArch64, UNKNOWN, Python 3.6.8
See https://gist.github.com/migueldiascosta/4f83eebbd8ba97bd57069d1d5a16be30 for a full test report.

@migueldiascosta migueldiascosta marked this pull request as ready for review September 11, 2024 03:59
@migueldiascosta migueldiascosta added this to the release after 4.9.3 milestone Sep 13, 2024
@boegel boegel modified the milestones: release after 4.9.4, release after 5.0.0 Mar 18, 2025
@Flamefire
Contributor

When we set $ACL_ROOT_DIR we do not need the part of the patch where the FindACL is changed, see easybuilders/easybuild-easyblocks#4096

@boegel
Member

boegel commented Mar 25, 2026

@migueldiascosta Is it still worth merging this now that we have a newer version merged? See

@Flamefire
Contributor

@boegel That one is for a different toolchain. Are you suggesting updating the version for the toolchain used in this PR?

@migueldiascosta
Member Author

I'm ok with closing this PR, since we're not likely to enable ACL for PyTorch/2.3-foss-2023b (the one this was originally targeted at)

@Flamefire
Contributor

Flamefire commented Apr 2, 2026

Why not? Given the huge performance difference, I'd actually update all PyTorch easyconfigs to use imkl on x86 and ACL on Arm, maybe starting at 2023a, as 2022b is the oldest active one

As for versions, I'd use the PyPI PyTorch packages as reference

@migueldiascosta
Member Author

@Flamefire just thought we would likely not bother. let me fix the shared library extension in this PR then, same as in the merged one

@github-actions github-actions Bot added update and removed new labels Apr 2, 2026
@github-actions

github-actions Bot commented Apr 2, 2026

Updated software ArmComputeLibrary-23.08-GCCcore-13.2.0.eb

Diff against ArmComputeLibrary-25.02-GCCcore-14.3.0.eb

easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-25.02-GCCcore-14.3.0.eb

diff --git a/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-25.02-GCCcore-14.3.0.eb b/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-23.08-GCCcore-13.2.0.eb
index ceccbe8f67..8a0c0b24dd 100644
--- a/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-25.02-GCCcore-14.3.0.eb
+++ b/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-23.08-GCCcore-13.2.0.eb
@@ -1,21 +1,21 @@
 easyblock = 'SCons'
 
 name = 'ArmComputeLibrary'
-version = '25.02'
+version = '23.08'
 
 homepage = 'https://github.com/ARM-software/ComputeLibrary'
 description = """The Arm Compute Library is a collection of low-level machine learning functions optimized for
  Arm® Cortex®-A, Arm® Neoverse® and Arm® Mali™ GPUs architectures."""
 
-toolchain = {'name': 'GCCcore', 'version': '14.3.0'}
+toolchain = {'name': 'GCCcore', 'version': '13.2.0'}
 
 source_urls = ['https://github.com/ARM-software/ComputeLibrary/archive/refs/tags/']
 sources = ['v%(version)s.tar.gz']
-checksums = ['339376cd05b5efe83a3909333956d7663022f0dd8c7977a35e04b35551546be6']
+checksums = ['62f514a555409d4401e5250b290cdf8cf1676e4eb775e5bd61ea6a740a8ce24f']
 
 builddependencies = [
-    ('binutils', '2.44'),
-    ('SCons', '4.9.1'),
+    ('binutils', '2.40'),
+    ('SCons', '4.6.0'),
 ]
 
 prefix_arg = 'install_dir='
Diff against ArmComputeLibrary-25.02-GCCcore-14.2.0.eb

easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-25.02-GCCcore-14.2.0.eb

diff --git a/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-25.02-GCCcore-14.2.0.eb b/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-23.08-GCCcore-13.2.0.eb
index 31d540119e..8a0c0b24dd 100644
--- a/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-25.02-GCCcore-14.2.0.eb
+++ b/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-23.08-GCCcore-13.2.0.eb
@@ -1,21 +1,21 @@
 easyblock = 'SCons'
 
 name = 'ArmComputeLibrary'
-version = '25.02'
+version = '23.08'
 
 homepage = 'https://github.com/ARM-software/ComputeLibrary'
 description = """The Arm Compute Library is a collection of low-level machine learning functions optimized for
  Arm® Cortex®-A, Arm® Neoverse® and Arm® Mali™ GPUs architectures."""
 
-toolchain = {'name': 'GCCcore', 'version': '14.2.0'}
+toolchain = {'name': 'GCCcore', 'version': '13.2.0'}
 
 source_urls = ['https://github.com/ARM-software/ComputeLibrary/archive/refs/tags/']
 sources = ['v%(version)s.tar.gz']
-checksums = ['339376cd05b5efe83a3909333956d7663022f0dd8c7977a35e04b35551546be6']
+checksums = ['62f514a555409d4401e5250b290cdf8cf1676e4eb775e5bd61ea6a740a8ce24f']
 
 builddependencies = [
-    ('binutils', '2.42'),
-    ('SCons', '4.9.1'),
+    ('binutils', '2.40'),
+    ('SCons', '4.6.0'),
 ]
 
 prefix_arg = 'install_dir='
Diff against ArmComputeLibrary-25.02-GCCcore-13.3.0.eb

easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-25.02-GCCcore-13.3.0.eb

diff --git a/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-25.02-GCCcore-13.3.0.eb b/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-23.08-GCCcore-13.2.0.eb
index 0c178b7c0b..8a0c0b24dd 100644
--- a/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-25.02-GCCcore-13.3.0.eb
+++ b/easybuild/easyconfigs/a/ArmComputeLibrary/ArmComputeLibrary-23.08-GCCcore-13.2.0.eb
@@ -1,21 +1,21 @@
 easyblock = 'SCons'
 
 name = 'ArmComputeLibrary'
-version = '25.02'
+version = '23.08'
 
 homepage = 'https://github.com/ARM-software/ComputeLibrary'
 description = """The Arm Compute Library is a collection of low-level machine learning functions optimized for
  Arm® Cortex®-A, Arm® Neoverse® and Arm® Mali™ GPUs architectures."""
 
-toolchain = {'name': 'GCCcore', 'version': '13.3.0'}
+toolchain = {'name': 'GCCcore', 'version': '13.2.0'}
 
 source_urls = ['https://github.com/ARM-software/ComputeLibrary/archive/refs/tags/']
 sources = ['v%(version)s.tar.gz']
-checksums = ['339376cd05b5efe83a3909333956d7663022f0dd8c7977a35e04b35551546be6']
+checksums = ['62f514a555409d4401e5250b290cdf8cf1676e4eb775e5bd61ea6a740a8ce24f']
 
 builddependencies = [
-    ('binutils', '2.42'),
-    ('SCons', '4.9.0'),
+    ('binutils', '2.40'),
+    ('SCons', '4.6.0'),
 ]
 
 prefix_arg = 'install_dir='

@Flamefire
Contributor

I expect that fewer tests rather than more will fail, so changing those ECs will be little work with high gain, which makes it worth going back as far as reasonably possible.
Are you going to create PRs for ACL in the other toolchains, i.e. 12.3, 14.2, 14.3? Then I'll open the PRs to add them to PyTorch

@migueldiascosta
Member Author

I can create those ACL PRs, yes. For PyTorch-2.1.2-foss-2023a.eb (GCCcore 12.3.0) though, not sure which ACL version to use, there was no .ci/docker/common/install_acl.sh on that version of PyTorch, any suggestions?

@migueldiascosta
Member Author

migueldiascosta commented Apr 2, 2026

from https://github.com/pytorch/pytorch/blob/v2.1.2/cmake/public/ComputeLibrary.cmake looks like anything higher than ACL 21.02 should be ok, but probably safer to use exactly ACL 21.02 for PyTorch 2.1.2

update:

versions taken from the corresponding torch aarch64 wheel on PyPI

@Flamefire
Contributor

I found a way: Extract the wheel and run strings torch.libs/libarm_compute-*.so | grep arm_compute_version

  • 2.3.0: arm_compute_version=v23.08
  • 2.1.2: arm_compute_version=v23.05.1
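
The same check can be sketched in Python without unpacking the wheel by hand. The function name `acl_version_from_wheel` and the regular expression are illustrative assumptions; the only facts taken from the comment above are the `torch.libs/libarm_compute-*.so` path pattern and the `arm_compute_version=` marker embedded in the library.

```python
import fnmatch
import re
import zipfile

# Matches the marker that `strings ... | grep arm_compute_version` would show
VERSION_RE = re.compile(rb"arm_compute_version=([\w.\-]+)")


def acl_version_from_wheel(wheel):
    """Return the ACL version embedded in a torch wheel, or None if not found.

    wheel may be a path or a binary file-like object (a wheel is a zip archive).
    """
    with zipfile.ZipFile(wheel) as archive:
        for name in archive.namelist():
            if fnmatch.fnmatch(name, "torch.libs/libarm_compute-*.so"):
                # Reads the whole shared library into memory; fine for a one-off check
                match = VERSION_RE.search(archive.read(name))
                if match:
                    return match.group(1).decode("ascii")
    return None
```

Called with the path to a downloaded aarch64 torch wheel, this should return the same string the `strings | grep` pipeline prints after `arm_compute_version=`.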

@migueldiascosta migueldiascosta added the aarch64 Related to Arm 64-bit (aarch64) label Apr 8, 2026
@ocaisa ocaisa self-assigned this Apr 8, 2026
@ocaisa
Member

ocaisa commented Apr 8, 2026

Closing and re-opening to trigger the tagbot

@ocaisa ocaisa closed this Apr 8, 2026
@ocaisa ocaisa reopened this Apr 8, 2026
Member

@ocaisa ocaisa left a comment

LGTM

@ocaisa
Member

ocaisa commented Apr 8, 2026

Test report by @ocaisa
SUCCESS
Build succeeded for 2 out of 2 (total: 7 mins 12 secs) (1 easyconfigs in total)
aoc-laptop - Linux Ubuntu 24.04.4 LTS (Noble Numbat), AArch64, UNKNOWN, Python 3.13.4
See https://gist.github.com/ocaisa/e918fb344929541b911adbfff981c3e3 for a full test report.

@ocaisa
Member

ocaisa commented Apr 8, 2026

@migueldiascosta Can you sync this with the target branch, CI is outdated in the feature branch

EDIT: Never mind, I misunderstood what was happening!

@ocaisa ocaisa merged commit d573555 into easybuilders:develop Apr 8, 2026
12 checks passed
@boegel boegel modified the milestones: 5.x, next release (5.3.0) Apr 9, 2026

Labels

2023b aarch64 Related to Arm 64-bit (aarch64) update

5 participants