Faster {Copy,Extract}Sub{Matrix,Vector} for plist matrices/vectors #6329

Merged: fingolfin merged 1 commit into master from mh/faster-ExtractCopy-Sub-MatrixVector on Apr 22, 2026

Member

@fingolfin fingolfin commented Apr 17, 2026

... by avoiding some method dispatch, providing optimized methods
using {...} syntax, and turning ExtractSubVector, CopySubVector,
ExtractSubMatrix, and CopySubMatrix into kernel ops.
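
To make the intended semantics concrete: as far as I understand the GAP documentation, `ExtractSubMatrix(m, rows, cols)` is meant to give the same result as `m{rows}{cols}`, and `CopySubMatrix(src, dst, srows, drows, scols, dcols)` the same as `dst{drows}{dcols} := src{srows}{scols}`. Below is a minimal C sketch of those two operations on a toy dense matrix. The `ToyMat` type and all function names are hypothetical and are not the kernel code from this PR; indices are 0-based here, unlike GAP's 1-based lists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy matrix: row-major ints. NOT GAP's plist representation. */
typedef struct {
    size_t nrows, ncols;
    int   *data;
} ToyMat;

/* Address of entry (r, c); indices are 0-based, unlike GAP. */
int *toy_at(const ToyMat *m, size_t r, size_t c)
{
    assert(r < m->nrows && c < m->ncols);
    return &m->data[r * m->ncols + c];
}

/* Models out := src{rows}{cols}; out must already have size nr x nc. */
void toy_extract_sub_matrix(const ToyMat *src,
                            const size_t *rows, size_t nr,
                            const size_t *cols, size_t nc,
                            ToyMat *out)
{
    assert(out->nrows == nr && out->ncols == nc);
    for (size_t i = 0; i < nr; i++)
        for (size_t j = 0; j < nc; j++)
            *toy_at(out, i, j) = *toy_at(src, rows[i], cols[j]);
}

/* Models dst{drows}{dcols} := src{srows}{scols}. */
void toy_copy_sub_matrix(const ToyMat *src, ToyMat *dst,
                         const size_t *srows, const size_t *drows, size_t nr,
                         const size_t *scols, const size_t *dcols, size_t nc)
{
    for (size_t i = 0; i < nr; i++)
        for (size_t j = 0; j < nc; j++)
            *toy_at(dst, drows[i], dcols[j]) = *toy_at(src, srows[i], scols[j]);
}
```

Both functions are plain double loops over the selected positions, which is why a kernel version can avoid the per-call method dispatch overhead that shows up in the timings.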

Then run some benchmarks via benchmark/matobj/bench-submat.g and
the new benchmark/matobj/bench-subvec.g.

Before:

+----------------------------------------------------------------+
| Testing submatrix extraction for integer matrix: list of lists |
+----------------------------------------------------------------+
Testing m{}{}:
  129 µs per iteration; 7749 iterations per second; (200 iterations)
Testing ExtractSubMatrix:
  720 µs per iteration; 1388 iterations per second; (200 iterations)
...now testing submatrix copying...
Testing m{}{}:=n{}{}:
  342 µs per iteration; 2921 iterations per second; (200 iterations)
Testing CopySubMatrix:
  1032 µs per iteration; 969 iterations per second; (200 iterations)

+------------------------------------------------+
| Testing submatrix extraction for GF(2) rowlist |
+------------------------------------------------+
Testing m{}{}:
  167 µs per iteration; 5999 iterations per second; (200 iterations)
Testing ExtractSubMatrix:
  3288 µs per iteration; 304 iterations per second; (92 iterations)
...now testing submatrix copying...
Testing m{}{}:=n{}{}:
  1992 µs per iteration; 502 iterations per second; (151 iterations)
Testing CopySubMatrix:
  2606 µs per iteration; 384 iterations per second; (116 iterations)

+-------------------------------------------------------------+
| Testing subvector extraction for integer vector: plain list |
+-------------------------------------------------------------+
Testing x{}:=v{}:
  393 µs per iteration; 2542 iterations per second; (200 iterations)
Testing CopySubVector:
  2324 µs per iteration; 430 iterations per second; (130 iterations)

+---------------------------------------------------+
| Testing subvector extraction for GF(2) row vector |
+---------------------------------------------------+
Testing x{}:=v{}:
  441 µs per iteration; 2266 iterations per second; (200 iterations)
Testing CopySubVector:
  2458 µs per iteration; 407 iterations per second; (123 iterations)

After:

+----------------------------------------------------------------+
| Testing submatrix extraction for integer matrix: list of lists |
+----------------------------------------------------------------+
Testing m{}{}:
  136 µs per iteration; 7355 iterations per second; (200 iterations)
Testing ExtractSubMatrix:
  140 µs per iteration; 7135 iterations per second; (200 iterations)
...now testing submatrix copying...
Testing m{}{}:=n{}{}:
  301 µs per iteration; 3319 iterations per second; (200 iterations)
Testing CopySubMatrix:
  299 µs per iteration; 3347 iterations per second; (200 iterations)

+------------------------------------------------+
| Testing submatrix extraction for GF(2) rowlist |
+------------------------------------------------+
Testing m{}{}:
  154 µs per iteration; 6490 iterations per second; (200 iterations)
Testing ExtractSubMatrix:
  146 µs per iteration; 6854 iterations per second; (200 iterations)
...now testing submatrix copying...
Testing m{}{}:=n{}{}:
  1895 µs per iteration; 528 iterations per second; (159 iterations)
Testing CopySubMatrix:
  1935 µs per iteration; 517 iterations per second; (156 iterations)

+-------------------------------------------------------------+
| Testing subvector extraction for integer vector: plain list |
+-------------------------------------------------------------+
Testing x{}:=v{}:
  428 µs per iteration; 2336 iterations per second; (200 iterations)
Testing CopySubVector:
  453 µs per iteration; 2209 iterations per second; (200 iterations)

+---------------------------------------------------+
| Testing subvector extraction for GF(2) row vector |
+---------------------------------------------------+
Testing x{}:=v{}:
  427 µs per iteration; 2344 iterations per second; (200 iterations)
Testing CopySubVector:
  460 µs per iteration; 2172 iterations per second; (200 iterations)

So overall the Copy/Extract methods are now more or less on par
with the code using {...} syntax.
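
For a quick read of the numbers, here is a small C sketch that computes the before/after speedup of each Copy/Extract operation and its ratio to the `{...}` variant after the change. The figures are copied from the benchmark output above; the `Timing` struct and the function names are my own and purely illustrative.

```c
#include <assert.h>
#include <stdio.h>

/* One benchmark row; all times in microseconds per iteration. */
typedef struct {
    const char *label;
    double before_us;        /* Copy/Extract operation before the change */
    double after_us;         /* Copy/Extract operation after the change  */
    double braces_after_us;  /* matching {...} variant after the change  */
} Timing;

/* Values taken from the "Before" and "After" listings. */
const Timing timings[] = {
    {"ExtractSubMatrix, integer list of lists",  720.0,  140.0,  136.0},
    {"CopySubMatrix, integer list of lists",    1032.0,  299.0,  301.0},
    {"ExtractSubMatrix, GF(2) rowlist",         3288.0,  146.0,  154.0},
    {"CopySubMatrix, GF(2) rowlist",            2606.0, 1935.0, 1895.0},
    {"CopySubVector, integer plain list",       2324.0,  453.0,  428.0},
    {"CopySubVector, GF(2) row vector",         2458.0,  460.0,  427.0},
};
const size_t n_timings = sizeof timings / sizeof timings[0];

/* How many times faster the operation became. */
double speedup(const Timing *t)
{
    assert(t->after_us > 0.0);
    return t->before_us / t->after_us;
}

/* Time relative to the {...} variant; 1.0 means exactly on par. */
double ratio_to_braces(const Timing *t)
{
    assert(t->braces_after_us > 0.0);
    return t->after_us / t->braces_after_us;
}

/* Print one summary line per benchmark row. */
void print_report(FILE *out)
{
    for (size_t i = 0; i < n_timings; i++)
        fprintf(out, "%-42s %5.1fx faster, %.2fx the {...} time\n",
                timings[i].label, speedup(&timings[i]),
                ratio_to_braces(&timings[i]));
}
```

Calling `print_report(stdout)` lists all six rows. The largest gain is `ExtractSubMatrix` on the GF(2) rowlist (3288 µs down to 146 µs, roughly 22x), and after the change every operation is within roughly 8% of its `{...}` counterpart.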

But actually this ExtractSubMatrix for a GF(2) rowlist is now faster
than the optimized one for a compressed GF(2) matrix:

+----------------------------------------------------------+
| Testing submatrix extraction for GF(2) compressed matrix |
+----------------------------------------------------------+
Testing m{}{}:
  1057 µs per iteration; 946 iterations per second; (200 iterations)
Testing ExtractSubMatrix:
  1618 µs per iteration; 618 iterations per second; (186 iterations)

A substantial part of that is due to the ExtractSubMatrix
method for IsGF2MatrixRep calling ConvertToMatrixRepNC(mm,2);
after removing that, it went down from 1618 to 392 µs (which is still
slower than the 146 µs for the GF(2) rowlist).


AI assistance: Codex implemented the kernel plumbing,
method cleanup, and test updates. It also created
benchmark/matobj/bench-subvec.g.

Co-authored-by: Codex <codex@openai.com>

@fingolfin fingolfin requested a review from ThomasBreuer April 17, 2026 12:58
@fingolfin fingolfin added the kind: enhancement, topic: performance, topic: library, and release notes: use title labels Apr 17, 2026
@fingolfin
Member Author

OK I have a kernel implementation now. Pushed it here, but the timings are not yet updated (will do that later)

Comment thread src/lists.c

static Obj FuncEXTRACT_SUB_MATRIX(Obj self, Obj mat, Obj rows, Obj cols)
{
    if (IS_PLIST(mat)) {
Member Author

The AI originally had this using IS_LIST, which would mean it also handles the IsGF2MatrixRep case. But it would also prevent us from installing better methods for these cases. Plus in all other similar cases, we checked for IS_PLIST.

That's why I changed this to IS_PLIST here.
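
To illustrate the trade-off, here is a self-contained toy model in C. It is not GAP kernel code: the representation tags, the method table, and every `toy_` name are hypothetical stand-ins for plain lists, `IsGF2MatrixRep`, and GAP's method dispatch. It only shows why a broad check in the kernel fast path would shadow a specialised method, while a narrow check lets it run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical representations: stand-ins for plain lists and IsGF2MatrixRep. */
typedef enum { REP_PLIST = 0, REP_GF2MAT = 1, REP_COUNT } ToyRep;

/* Which code path ended up handling the call. */
typedef enum { PATH_KERNEL_PLIST, PATH_INSTALLED_METHOD, PATH_NO_METHOD } ToyPath;

typedef struct { ToyRep rep; } ToyObj;

typedef ToyPath (*ToyMethod)(const ToyObj *);

/* Toy method table, standing in for method dispatch; starts out empty. */
ToyMethod toy_methods[REP_COUNT];

void toy_install_method(ToyRep rep, ToyMethod m)
{
    toy_methods[rep] = m;
}

/* A specialised method someone might install for the compressed rep. */
ToyPath toy_gf2_method(const ToyObj *obj)
{
    (void)obj;
    return PATH_INSTALLED_METHOD;
}

int toy_is_plist(const ToyObj *obj) { return obj->rep == REP_PLIST; }

/* Every representation in this toy model counts as a list. */
int toy_is_list(const ToyObj *obj) { return obj->rep < REP_COUNT; }

/* Fall back to whatever method is installed for this representation. */
ToyPath toy_dispatch(const ToyObj *obj)
{
    assert(obj->rep < REP_COUNT);
    ToyMethod m = toy_methods[obj->rep];
    return m ? m(obj) : PATH_NO_METHOD;
}

/* Narrow check: only plain lists take the kernel fast path. */
ToyPath toy_extract_narrow(const ToyObj *obj)
{
    if (toy_is_plist(obj))
        return PATH_KERNEL_PLIST;
    return toy_dispatch(obj);
}

/* Broad check: every list takes the fast path, so dispatch is never reached. */
ToyPath toy_extract_broad(const ToyObj *obj)
{
    if (toy_is_list(obj))
        return PATH_KERNEL_PLIST;
    return toy_dispatch(obj);
}
```

With `toy_gf2_method` installed for `REP_GF2MAT`, `toy_extract_narrow` reaches it for a GF(2) object, whereas `toy_extract_broad` handles the object in the generic fast path and the installed method is never called.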

Contributor

Perhaps this comment should go to the code not just to the pull request?

Contributor

@ThomasBreuer ThomasBreuer left a comment

just one minor comment

Contributor

@james-d-mitchell james-d-mitchell left a comment

I agree with @ThomasBreuer's comment, o/w this looks good to me!

@fingolfin fingolfin force-pushed the mh/faster-ExtractCopy-Sub-MatrixVector branch from 3b43425 to ea5aa4a Compare April 21, 2026 23:11
@fingolfin fingolfin enabled auto-merge (squash) April 21, 2026 23:12

codecov Bot commented Apr 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78.56%. Comparing base (22d51cc) to head (ea5aa4a).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6329      +/-   ##
==========================================
- Coverage   78.56%   78.56%   -0.01%     
==========================================
  Files         684      684              
  Lines      292844   292882      +38     
  Branches     8657     8660       +3     
==========================================
+ Hits       230070   230096      +26     
- Misses      60964    60977      +13     
+ Partials     1810     1809       -1     


@fingolfin fingolfin merged commit 6cb8237 into master Apr 22, 2026
33 checks passed
@fingolfin fingolfin deleted the mh/faster-ExtractCopy-Sub-MatrixVector branch April 22, 2026 00:25
cdwensley pushed a commit that referenced this pull request Apr 22, 2026
…6329)
