Add a16w8 per-op test for bmm #19599
c0c8f38 to 1854072
Summary: Add int16 activation / int8 weight (a16w8) quantization tests for `aten.bmm` on Ethos-U55 and Ethos-U85.

## Changes
- Add `a16w8_bmm_test_parameters` dict with 5 test configurations covering same-shape, different-shape, rectangular, batch-10, and negative-value tensors
- Add `test_bmm_a16w8_u55_INT` using `EthosU55PipelineINT` with `a16w8_quantization=True, symmetric_io_quantization=True, qtol=128, epsilon=2**-16`
- Add `test_bmm_a16w8_u85_INT` using `EthosU85PipelineINT` with same kwargs
- Remove unused `aten_op_mm` and `exir_op_mm` variables
- Register `ops/test_bmm.py` in `fbcode/` and `xplat/` `targets.bzl`

bypass-pytorch-oss-checks

Differential Revision: D104532363
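
The five configurations named above could be laid out roughly as follows. This is only a sketch: the PR does not list the tensor sizes, so every shape here is a hypothetical placeholder, and `BMM_CASES` and `bmm_output_shape` are illustrative names rather than identifiers from `test_bmm.py`. The helper encodes the shape rule for `bmm`: both inputs are 3-D, the batch sizes match, and the inner dimensions agree.

```python
# Hypothetical shapes: the PR names the five cases but does not give their sizes.
BMM_CASES = {
    "same_shape": ((2, 3, 3), (2, 3, 3)),
    "different_shape": ((1, 3, 5), (1, 5, 2)),
    "rectangular": ((2, 4, 8), (2, 8, 4)),
    "batch_10": ((10, 2, 3), (10, 3, 2)),
    "negative_values": ((1, 2, 2), (1, 2, 2)),
}


def bmm_output_shape(a_shape, b_shape):
    """Return the (B, N, P) result shape for inputs of shape (B, N, M) and (B, M, P)."""
    if len(a_shape) != 3 or len(b_shape) != 3:
        raise ValueError("bmm expects two 3-D tensors")
    batch_a, n, m_a = a_shape
    batch_b, m_b, p = b_shape
    if batch_a != batch_b:
        raise ValueError("batch sizes must match (bmm does not broadcast)")
    if m_a != m_b:
        raise ValueError("inner dimensions must match")
    return (batch_a, n, p)


# Every placeholder case should be a valid bmm configuration.
OUTPUT_SHAPES = {name: bmm_output_shape(a, b) for name, (a, b) in BMM_CASES.items()}
```

A quick shape check like this is a cheap way to confirm that a parameter table only contains legal `bmm` inputs before it is handed to a test pipeline.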
1854072 to d5349b3
d5349b3 to 77a9a41
@christine-long-meta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D104532363.
Summary: Pull Request resolved: pytorch#19599

Add int16 activation / int8 weight (a16w8) quantization tests for `aten.bmm` on Ethos-U55 and Ethos-U85.

## Changes
- Add `a16w8_bmm_test_parameters` dict with 5 test configurations covering same-shape, different-shape, rectangular, batch-10, and negative-value tensors
- Add `test_bmm_a16w8_u55_INT` using `EthosU55PipelineINT` with `a16w8_quantization=True, symmetric_io_quantization=True, qtol=128, epsilon=2**-16`
- Add `test_bmm_a16w8_u85_INT` using `EthosU85PipelineINT` with same kwargs
- Remove unused `aten_op_mm` and `exir_op_mm` variables
- Register `ops/test_bmm.py` in `fbcode/` and `xplat/` `targets.bzl`

bypass-pytorch-oss-checks

Differential Revision: D104532363
6a8da35 to 71103e7
71103e7 to 2381b0f
2381b0f to aa997b7
You seem to have hit a strange error in Test ARM Backend. It looks unrelated, so I re-ran the tests; maybe it was a network glitch and it will go away.
zingo left a comment
The test re-run passed, approving :) Thanks!
f45c559 to 37d1935
Summary: Add int16 activation / int8 weight (a16w8) quantization tests for `aten.bmm` on Ethos-U55 and Ethos-U85.

## Changes
- Add `a16w8_bmm_test_parameters` dict with 5 test configurations covering same-shape, different-shape, rectangular, batch-10, and negative-value tensors
- Add `test_bmm_a16w8_u55_INT` using `OpNotSupportedPipeline` to verify that bmm with INT16 inputs is correctly rejected on U55 (which does not support bmm with int16)
- Add `test_bmm_a16w8_u85_INT` using `EthosU85PipelineINT` with `a16w8_quantization=True, symmetric_io_quantization=True`
- Remove unused `aten_op_mm` and `exir_op_mm` variables
- Register `ops/test_bmm.py` in `fbcode/` and `xplat/` `targets.bzl`

Reviewed By: Ninja91

Differential Revision: D104532363
914bead to da4ffde
Summary: Add int16 activation / int8 weight (a16w8) quantization tests for `aten.bmm` on Ethos-U55 and Ethos-U85.

## Context
Batch matrix multiply (`bmm`) implements the core `Q @ K^T` and `attn_weights @ V` operations in the multi-head attention of the EMG2Pose Conformer. At int16 IO precision the accumulator width and rescale path differ between U55 and U85, so dedicated per-op coverage is needed to catch numerics divergence before it surfaces as an end-to-end SNR regression. The test matrix includes square, rectangular, and large-batch configurations to exercise different tiling strategies in the Vela backend.

Also removes unused `aten_op_mm` / `exir_op_mm` variables that were dead code in `test_bmm.py`.

## Changes
- Add `a16w8_bmm_test_parameters` dict with 5 test configurations covering same-shape, different-shape, rectangular, batch-10, and negative-value tensors
- Add `test_bmm_a16w8_u55_INT` using `EthosU55PipelineINT` with `a16w8_quantization=True, symmetric_io_quantization=True, qtol=128, epsilon=2**-16`
- Add `test_bmm_a16w8_u85_INT` using `EthosU85PipelineINT` with same kwargs
- Remove unused `aten_op_mm` and `exir_op_mm` variables
- Register `ops/test_bmm.py` in `fbcode/` and `xplat/` `targets.bzl`

bypass-pytorch-oss-checks

Reviewed By: Ninja91

Differential Revision: D104532363
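
To make the numerics concern concrete, the following pure-Python sketch quantizes both `bmm` operands symmetrically, multiplies them with integer accumulation, rescales, and compares the result with a float reference. It is not the Ethos-U or Vela code path. It assumes per-tensor symmetric quantization with a zero point of 0 and assumes both operands are treated as activations (int16); all function names and input values are illustrative.

```python
def flatten(tensor):
    """Yield every scalar of a 3-D nested list (batch, rows, cols)."""
    for matrix in tensor:
        for row in matrix:
            yield from row


def quantize_symmetric(tensor, num_bits):
    """Per-tensor symmetric quantization: zero point 0, scale from max |value|."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in flatten(tensor))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [
        [[max(-qmax, min(qmax, round(v / scale))) for v in row] for row in matrix]
        for matrix in tensor
    ]
    return quantized, scale


def bmm(a, b):
    """Batch matrix multiply on nested lists: (B, N, M) x (B, M, P) -> (B, N, P)."""
    return [
        [[sum(x * y for x, y in zip(row, col)) for col in zip(*mb)] for row in ma]
        for ma, mb in zip(a, b)
    ]


def quantized_bmm(a, b, num_bits=16):
    """Quantize both inputs, accumulate in exact integers, then rescale to float."""
    qa, scale_a = quantize_symmetric(a, num_bits)
    qb, scale_b = quantize_symmetric(b, num_bits)
    acc = bmm(qa, qb)  # Python ints never overflow; hardware accumulators are finite
    return [[[v * scale_a * scale_b for v in row] for row in m] for m in acc]


def max_abs_error(x, y):
    return max(abs(p - q) for p, q in zip(flatten(x), flatten(y)))


# Illustrative inputs, including negative values.
A = [[[0.5, -1.25], [2.0, 0.75]]]
B = [[[1.5, -0.5], [-2.0, 0.25]]]
REFERENCE = bmm(A, B)
ERR_INT16 = max_abs_error(quantized_bmm(A, B, num_bits=16), REFERENCE)
ERR_INT8 = max_abs_error(quantized_bmm(A, B, num_bits=8), REFERENCE)
```

On these inputs the int16 error comes out well below the int8 error, which is the kind of margin a per-op test can check on each target independently of an end-to-end model.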
da4ffde to 1c06ec5