Add IsGenericMatrixRep matrix object type#6322
Conversation
Store dense matrices as plain row lists for faster entry access.
Some microbenchmarks show the benefit, namely that performance for
basic arithmetic in many cases matches that of ist-of-list
matrices, while IsPlistMatrixRep is basically always slower,
sometimes quite substantially so.
Setup:
R:=GF(257);;n:=20;;
M1:=IdentityMat(n,R);;
M2:=IdentityMatrix(IsPlistMatrixRep,R,n);;
M3:=IdentityMatrix(IsFlatPlistMatrixRep,R,n);;
Addition:
gap> for i in [1..1000] do x:=M1+M1; od; time;
4
gap> for i in [1..1000] do x:=M2+M2; od; time;
12
gap> for i in [1..1000] do x:=M3+M3; od; time;
4
Multiplication:
gap> for i in [1..1000] do x:=M1*M1; od; time;
66
gap> for i in [1..1000] do x:=M2*M2; od; time;
1448
gap> for i in [1..1000] do x:=M3*M3; od; time;
60
Inversion:
gap> for i in [1..1000] do x:=Inverse(M1); od; time;
7
gap> for i in [1..1000] do x:=Inverse(M2); od; time;
83
gap> for i in [1..1000] do x:=Inverse(M3); od; time;
4
Powers:
gap>
gap> for i in [1..1000] do x:=M1^5; od; time;
187
gap> for i in [1..1000] do x:=M2^5; od; time;
4350
gap> for i in [1..1000] do x:=M3^5; od; time;
182
Element access (plist-of-plist benefits from a direct kernel
implementation; we could add that for both IsPlistMatrixRep
and IsFlatPlistMatrixRep, if so desired)
gap> for i in [1..1000000] do x:=M1[1,1]; od; time;
19
gap> for i in [1..1000000] do x:=M2[1,1]; od; time;
78
gap> for i in [1..1000000] do x:=M3[1,1]; od; time;
66
Example of a typical matrix property: IsDiagonalMat
gap> for i in [1..10000] do x:=IsDiagonalMat(M1); od; time;
132
gap> for i in [1..10000] do x:=IsDiagonalMat(M2); od; time;
191
gap> for i in [1..10000] do x:=IsDiagonalMat(M3); od; time;
146
IsOne (surprisingly is slower for IsFlatPlistMatrixRep, I do not
yet know why)
gap> for i in [1..10000] do x:=IsOne(M1); od; time;
144
gap> for i in [1..10000] do x:=IsOne(M2); od; time;
145
gap> for i in [1..10000] do x:=IsOne(M3); od; time;
172
PositionNonZeroInRow is among the core functionality for row reduction:
gap> for i in [1..1000000] do x:=PositionNonZeroInRow(M1,1); od; time;
334
gap> for i in [1..1000000] do x:=PositionNonZeroInRow(M2,1); od; time;
486
gap> for i in [1..1000000] do x:=PositionNonZeroInRow(M3,1); od; time;
376
Co-authored-by: Codex <codex@openai.com>
ThomasBreuer
left a comment
There was a problem hiding this comment.
just a few minor remarks
if we want to describe this representation (in comparison with others) then my impression is:
- a matrix type that can be multiplied with any type of vector object, but formally defines
IsPlistVectorRepas itsCompatibleVectorFilter, - a matrix type that deliberately does not support row access,
- matrix type that supports matrices with zero rows or columns (something which should hold for all matrix types except
IsPlistRep.
for curiosity:
Where does the name IsFlatPlistMatrixRep come from?
GAP has a function Flat, thus one could get the idea that an m times n "flat" matrix is internally represented by a "flat" list of length mn.
The number of methods for + may change quite a lot
|
The name In PR #6227 I suggested that we could rename However, it might break private user code, so I am not keen on that. I am open to alternatives. Some braingstorming with AI yields these, but perhaps someone has a better idea:
|
|
I like the idea When one wants to "upgrade" code that was written for list-of-lists matrices to matrix objects, switching to |
I have the same question as @ThomasBreuer, I also expected this. Could you possibly indicate what the key difference between matrices in |
|
Concerning the difference between |
|
Everything @ThomasBreuer wrote is true, but the main advantage of this approach is speed: by not having every row be a "rich" and thus expensive object, we avoid a lot overhead in both memory usage and speed. Also, we can directly call into GAP kernel routines for e.g. multiplying two "list of list matrices". As a result, the new type can be almost as fast as the classic "list of list matrices", and in some cases even faster (as it avoids expensive method dispatch overhead) |
IsFlatPlistMatrixRep matrix object typeIsGenericMatrixRep matrix object type
|
I've now renamed this to |
|
@ThomasBreuer @james-d-mitchell if there are no further objections (also to the new name), I'll merge this later today |
I have no objections but also haven't looked in detail at this. I can do that if desired but not until later today. |
|
As far as I see, the changes refer only to the renaming. Since I had voted for the new name, I do not object. |
|
@james-d-mitchell here is my view: if you would like to have a look at this, I'd be happy to wait for you, also beyond today! But don't feel compelled to look at it -- in the end @ThomasBreuer had a look, and even if I messed up a lot, we can still revert or fix things :-) |
|
One more comment: Since I do now understand the implementation of
|
| ## <Description> | ||
| ## An object <A>obj</A> in <Ref Filt="IsGenericMatrixRep"/> describes | ||
| ## a matrix object (see <Ref Filt="IsMatrixObj"/>) whose entries are stored | ||
| ## as a dense plain list of dense plain row lists. |
There was a problem hiding this comment.
It'd be good to clarify what the difference is between IsPlistMatrixRep and IsGenericMatrixRep is here, as described in one of the comments elsewhere.
There was a problem hiding this comment.
For example, although I know what is intended "a dense plain list of dense plain row lists" and "is not a row list matrix" seem contradictory. More words might indicate more clearly what is meant here.
|
|
||
|
|
||
| # Internal positions for flat plist matrices. | ||
| BindConstant( "FBDPOS", 1 ); |
There was a problem hiding this comment.
Does the leading "F" stand for "Flat"? If so it should probably be "G" for generic no?
|
|
||
| # Internal positions for flat plist matrices. | ||
| BindConstant( "FBDPOS", 1 ); | ||
| BindConstant( "FCOLSPOS", 2 ); |
There was a problem hiding this comment.
Is there a reason that this exists? It seems to only be used in NumberColumns couldn't is just be Length(m![FROWSPOS][1]) instead? Or I suppose
if Length(m![FROWSPOS]) = 0 then
return 0;
else
return Length(m![FROWSPOS][1]);
fi;
If this is for performance reasons, then maybe a comment indicating that?
There was a problem hiding this comment.
Then we'd loose support for 0 x m matrices, which is really annoying in lots of situations in my practical experience.
| if ValueOption( "check" ) <> false then | ||
| if not ob in BaseDomain( M ) then | ||
| Error( "<ob> must lie in the base domain of <M>" ); | ||
| elif not row in [1..NrRows(M)] then |
There was a problem hiding this comment.
This isn't uniformly formatted see line 76 for example
| if not ob in BaseDomain( M ) then | ||
| Error( "<ob> must lie in the base domain of <M>" ); | ||
| elif not row in [1..NrRows(M)] then | ||
| Error( "<row> is out of bounds" ); |
There was a problem hiding this comment.
| Error( "<row> is out of bounds" ); | |
| Error( "<row> ", row, " is out of bounds, expected value in [", 1, ", ", NrRows(M),"]"); |
| function( M, N, srcrows, dstrows, srccols, dstcols ) | ||
| if ValueOption( "check" ) <> false and | ||
| not IsIdenticalObj( BaseDomain(M), BaseDomain(N) ) then | ||
| Error( "<M> and <N> are not compatible" ); |
There was a problem hiding this comment.
| Error( "<M> and <N> are not compatible" ); | |
| Error( "the 1st and 2nd arguments <M> and <N> are not compatible, their BaseDomains must be identical objects" ); |
There was a problem hiding this comment.
The user will be on a break loop, where M and N can be accessed, and if they so care, they can also see that they are argument 1 and 2 of the function in which the error is raised; so I really don't like adding these position information to error message.
| ( not IsIdenticalObj( BaseDomain( a ), BaseDomain( b ) ) or | ||
| NrRows( a ) <> NrRows( b ) or | ||
| NrCols( a ) <> NrCols( b ) ) then | ||
| Error( "<a> and <b> are not compatible" ); |
There was a problem hiding this comment.
This could be more nuanced too.
There was a problem hiding this comment.
Yes it could be. But this is a first iteration. We might decided to discard it again in a couple months. I am not sure it is worth it polishing it to this degree before we even have a single place (other than the tests) using it?
In fact I am tempted to also remove it from the documentation, I think it is a mistake to put it there right now, until we have actually used it a bit and decide to keep it.
|
|
||
| if ValueOption( "check" ) <> false then | ||
| if colsA <> rowsB then | ||
| ErrorNoReturn( "\\*: Matrices do not fit together" ); |
There was a problem hiding this comment.
| ErrorNoReturn( "\\*: Matrices do not fit together" ); | |
| ErrorNoReturn( "\\*: the factors (matrices) do not fit together, the 1st factor has ", colsA, "columns, and the 2nd factor has ", rowsB, " rows, expected these to be equal"); |
or something similar.
|
|
||
| if ValueOption( "check" ) <> false then | ||
| if cols <> Length( v ) then | ||
| Error( "<M> and <v> are not compatible" ); |
There was a problem hiding this comment.
These error messages could be more descriptive or the if-statements could be combined into one.
| return Vector( res, v ); | ||
| end ); | ||
|
|
||
| InstallMethod( ChangedBaseDomain, |
There was a problem hiding this comment.
All of these 1 line functions could be lambdas to save "vertical screen real estate"
There was a problem hiding this comment.
Some but not all (just those with a return)
| fi; | ||
| Print( M![FROWSPOS][i], "\n" ); | ||
| od; | ||
| Print( "]>\n" ); |
There was a problem hiding this comment.
Looks slightly weird to me that the opening bracket is in the same line as the first row, but the closing bracket is essentially in a line by itself.
james-d-mitchell
left a comment
There was a problem hiding this comment.
There are some remnants of the original name "Flat" that should be changed to "Generic" to avoid confusion in the future. The documentation should also be updated to clearly explain what the difference between IsGenericMatrixRep and IsPlistMatrixRep. Otherwise, I've made some suggestions that I think might improve the error messages/usability of the new code, which are less important.
Store dense matrices as plain row lists for faster entry access.
Partially addresses issue #2148 (the unresolved part is that of a "vector object that is not in
IsList)Closes #2973 (an older attempt of something similar)
This PR was created with help of Codex by OpenAI: I planned the work with it, then it did a full first implementation, which was improved by me at first via some rounds of prompts requesting tweaks, then a bunch of manual work to implement details and speedups the AI was not aware.
Since this is fresh, I am sure there are some bugs, and definitely opportunities for things to be improved. And of course plenty of places in the code where using these MatrixObj will blow up because they are not row lists... but that's for future work.
Some microbenchmarks show the benefit, namely that performance for basic arithmetic in many cases matches that of ist-of-list matrices, while IsPlistMatrixRep is basically always slower, sometimes quite substantially so.
I am deliberately doing this over a finite field that is not covered by our compressed matrices, as that is IMHO going to be one of the central use cases in GAP; though for integer/rational matrices, I don't think things will be substantially difference, at least relative to each other. There is no point comparing e.g. over GF(2) to our compressed GF(2) matrices, which of course will be much faster than any of the three implementations being tested here.
Setup:
Addition:
Multiplication:
Inversion:
Powers:
Element access (plist-of-plist benefits from a direct kernel implementation; we could add that for both IsPlistMatrixRep and IsFlatPlistMatrixRep, if so desired)
Example of a typical matrix property: IsDiagonalMat
IsOne (surprisingly is slower for IsFlatPlistMatrixRep, I do not yet know why)
PositionNonZeroInRow is among the core functionality for row reduction: