Add a16w8 per-op test for bmm by christine-long-meta · Pull Request #19599 · pytorch/executorch

christine-long-meta · 2026-05-14T16:48:38Z

Summary:
Add int16 activation / int8 weight (a16w8) quantization tests for aten.bmm on Ethos-U55 and Ethos-U85.

Context

Batch matrix multiply (bmm) implements the core Q @ K^T and attn_weights @ V operations in the multi-head attention of the EMG2Pose Conformer. At int16 IO precision the accumulator width and rescale path differ between U55 and U85, so dedicated per-op coverage is needed to catch numerics divergence before it surfaces as an end-to-end SNR regression. The test matrix includes square, rectangular, and large-batch configurations to exercise different tiling strategies in the Vela backend.
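To make the numerics concern concrete, here is a minimal pure-Python sketch of an int16 batch matrix multiply. It is not the Ethos-U or Vela implementation. It assumes per-tensor symmetric quantization and, because both `bmm` operands are activations, quantizes both inputs to int16. The helper names (`symmetric_scale`, `quantize`, `int16_bmm`, `max_abs_error`) are hypothetical. Python integers have arbitrary precision, so the sketch does not model a hardware accumulator width or the rescale path.

```python
from typing import List

Batch = List[List[List[float]]]  # nested lists shaped (B, rows, cols)


def bmm(a, b):
    """Batched matmul on nested lists: (B, M, K) @ (B, K, N) -> (B, M, N)."""
    return [
        [[sum(x * y for x, y in zip(row, col)) for col in zip(*mb)] for row in ma]
        for ma, mb in zip(a, b)
    ]


def symmetric_scale(t: Batch, num_bits: int = 16) -> float:
    """Per-tensor symmetric scale (assumption): max |x| maps to the largest int."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for m in t for row in m for v in row)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(t: Batch, scale: float, num_bits: int = 16):
    """Round to the nearest integer step and clamp to the symmetric range."""
    qmax = 2 ** (num_bits - 1) - 1
    return [
        [[max(-qmax, min(qmax, round(v / scale))) for v in row] for row in m]
        for m in t
    ]


def int16_bmm(a: Batch, b: Batch) -> Batch:
    """Quantize both operands to int16, multiply in integers, then dequantize."""
    sa, sb = symmetric_scale(a), symmetric_scale(b)
    acc = bmm(quantize(a, sa), quantize(b, sb))  # exact integer accumulation
    return [[[v * sa * sb for v in row] for row in m] for m in acc]


def max_abs_error(x: Batch, y: Batch) -> float:
    """Largest element-wise absolute difference between two batches."""
    return max(
        abs(p - q)
        for mx, my in zip(x, y)
        for rx, ry in zip(mx, my)
        for p, q in zip(rx, ry)
    )


# Illustrative inputs only; they stand in for Q and K^T in attention.
q = [[[1.0, -2.0], [0.5, 3.0]]]
k_t = [[[0.25, 1.0], [-1.5, 2.0]]]
print(max_abs_error(int16_bmm(q, k_t), bmm(q, k_t)))
```

Comparing the dequantized result against the float reference, as the last line does, is the same kind of check a per-op test performs, only here with a hand-rolled error measure rather than the test pipeline's own tolerances.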

Also removes unused aten_op_mm / exir_op_mm variables that were dead code in test_bmm.py.

Changes

- Add a16w8_bmm_test_parameters dict with 5 test configurations covering same-shape, different-shape, rectangular, batch-10, and negative-value tensors
- Add test_bmm_a16w8_u55_INT using EthosU55PipelineINT with a16w8_quantization=True, symmetric_io_quantization=True, qtol=128, epsilon=2**-16
- Add test_bmm_a16w8_u85_INT using EthosU85PipelineINT with same kwargs
- Remove unused aten_op_mm and exir_op_mm variables
- Register ops/test_bmm.py in fbcode/ and xplat/ targets.bzl
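As a rough illustration of the five-case matrix, the sketch below lists one hypothetical shape pair per case and checks it against the `bmm` shape rule (both operands 3-D, equal batch size, matching inner dimension). The PR text names the cases but not their dimensions, so the shapes, the `example_bmm_shapes` dict, and the `bmm_output_shape` helper are all assumptions, not the contents of `a16w8_bmm_test_parameters`.

```python
from typing import Dict, Tuple

Shape = Tuple[int, int, int]

# Hypothetical shapes: only the case names come from the PR description.
example_bmm_shapes: Dict[str, Tuple[Shape, Shape]] = {
    "same_shape": ((1, 3, 3), (1, 3, 3)),
    "different_shape": ((2, 4, 5), (2, 5, 3)),
    "rectangular": ((1, 2, 8), (1, 8, 6)),
    "batch_10": ((10, 4, 4), (10, 4, 4)),
    "negative_values": ((1, 3, 3), (1, 3, 3)),  # same shape, negative data
}


def bmm_output_shape(lhs: Shape, rhs: Shape) -> Shape:
    """Return the (B, M, N) result shape of (B, M, K) @ (B, K, N), or raise."""
    if len(lhs) != 3 or len(rhs) != 3:
        raise ValueError("bmm expects two 3-D operands")
    if lhs[0] != rhs[0]:
        raise ValueError("batch sizes differ")
    if lhs[2] != rhs[1]:
        raise ValueError("inner dimensions differ")
    return (lhs[0], lhs[1], rhs[2])


for name, (lhs, rhs) in example_bmm_shapes.items():
    print(name, bmm_output_shape(lhs, rhs))
```

Keeping shape validation separate from data generation makes it easy to add further cases without touching the check.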

bypass-pytorch-oss-checks

Reviewed By: Ninja91

Differential Revision: D104532363

pytorch-bot · 2026-05-14T16:48:43Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19599

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 2 Active SEVs

There are 2 currently active SEVs. If your PR is affected, please view them below:

⏳ No Failures, 269 Pending

As of commit 1c06ec5 with merge base 41a38d8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot · 2026-05-14T16:49:00Z

~~Workflows were awaiting approval.~~ CI has now been triggered for the ciflow labels on this PR.

github-actions · 2026-05-14T16:49:48Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

meta-codesync · 2026-05-14T16:54:24Z

@christine-long-meta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D104532363.

zingo · 2026-05-18T07:35:40Z

You seem to have hit a strange error in Test ARM Backend. It looks unrelated, so I reran the tests; maybe it was a network glitch and will go away.

zingo

The test re-run passed, approving :) Thanks!

Summary:
Add int16 activation / int8 weight (a16w8) quantization tests for `aten.bmm` on Ethos-U55 and Ethos-U85.

## Changes

- Add `a16w8_bmm_test_parameters` dict with 5 test configurations covering same-shape, different-shape, rectangular, batch-10, and negative-value tensors
- Add `test_bmm_a16w8_u55_INT` using `OpNotSupportedPipeline` to verify that bmm with INT16 inputs is correctly rejected on U55 (which does not support bmm with int16)
- Add `test_bmm_a16w8_u85_INT` using `EthosU85PipelineINT` with `a16w8_quantization=True, symmetric_io_quantization=True`
- Remove unused `aten_op_mm` and `exir_op_mm` variables
- Register `ops/test_bmm.py` in `fbcode/` and `xplat/` `targets.bzl`

Reviewed By: Ninja91

Differential Revision: D104532363

christine-long-meta requested a review from digantdesai as a code owner May 14, 2026 16:48

github-actions Bot added the ciflow/trunk and module: arm labels May 14, 2026

christine-long-meta force-pushed the export-D104532363 branch from c0c8f38 to 1854072 Compare May 14, 2026 16:48

meta-cla Bot added the CLA Signed label May 14, 2026

meta-codesync Bot added the fb-exported and meta-exported labels May 14, 2026

christine-long-meta force-pushed the export-D104532363 branch from 1854072 to d5349b3 Compare May 14, 2026 16:53

meta-codesync Bot changed the title ~~Add a16w8 per-op test for bmm~~ Add a16w8 per-op test for bmm (#19599) May 14, 2026

christine-long-meta force-pushed the export-D104532363 branch from d5349b3 to 77a9a41 Compare May 14, 2026 16:53

christine-long-meta force-pushed the export-D104532363 branch 2 times, most recently from 6a8da35 to 71103e7 Compare May 16, 2026 02:07

meta-codesync Bot changed the title ~~Add a16w8 per-op test for bmm (#19599)~~ Add a16w8 per-op test for bmm May 16, 2026

christine-long-meta force-pushed the export-D104532363 branch from 71103e7 to 2381b0f Compare May 16, 2026 02:08

meta-codesync Bot changed the title ~~Add a16w8 per-op test for bmm~~ Add a16w8 per-op test for bmm (#19599) May 16, 2026

christine-long-meta force-pushed the export-D104532363 branch from 2381b0f to aa997b7 Compare May 17, 2026 16:06

meta-codesync Bot changed the title ~~Add a16w8 per-op test for bmm (#19599)~~ Add a16w8 per-op test for bmm May 17, 2026

zingo approved these changes May 18, 2026

christine-long-meta force-pushed the export-D104532363 branch from f45c559 to 37d1935 Compare May 19, 2026 19:22

meta-codesync Bot changed the title ~~Add a16w8 per-op test for bmm~~ Add a16w8 per-op test for bmm (#19599) May 19, 2026

christine-long-meta force-pushed the export-D104532363 branch from 914bead to da4ffde Compare May 19, 2026 19:39

meta-codesync Bot changed the title ~~Add a16w8 per-op test for bmm (#19599)~~ Add a16w8 per-op test for bmm May 19, 2026

christine-long-meta force-pushed the export-D104532363 branch from da4ffde to 1c06ec5 Compare May 19, 2026 19:49

christine-long-meta merged commit afd32cc into pytorch:main May 19, 2026
433 checks passed

christine-long-meta merged 1 commit into pytorch:main from christine-long-meta:export-D104532363
