Skip to content

Add unique local updates#716

Open
TimWhiting wants to merge 1 commit intokoka-lang:devfrom
TimWhiting:unique_locals
Open

Add unique local updates#716
TimWhiting wants to merge 1 commit intokoka-lang:devfrom
TimWhiting:unique_locals

Conversation

@TimWhiting
Copy link
Copy Markdown
Collaborator

@TimWhiting TimWhiting commented May 24, 2025

This allows for truly unique updating of locals.

To do so it needs to be able to copy a value such that it has a unique reference, and the update function needs to not have handleable effects.

Ideally we would introduce some syntax for this, or optimize the assignment syntax to use this when the rhs is total.

I have a version of unique vectors #550 that rebases on top of this https://github.com/TimWhiting/koka/tree/unique-vectorsx.

Comment thread kklib/src/ref.c Outdated
@TimWhiting TimWhiting force-pushed the unique_locals branch 2 times, most recently from 82fe1a3 to cf0b311 Compare June 24, 2025 14:14
@HeikoRibberink
Copy link
Copy Markdown

HeikoRibberink commented Jun 29, 2025

I have a 2d array implementation on top of this:

pub value struct matrix<a>
  rows : int
  cols : int
  data : vector<a>

pub fun ref/assign/@index(mat : ref<h, matrix<a>>, idx : idx, assigned : a) : <pure,read<h>,write<h>> ()
  trace(show(idx) ++ " in " ++ show(((!mat).rows, (!mat).cols))) // For debugging
  (!mat).assert-valid-idx(idx)
  fun unique-copy(matr : matrix<_>)
    matr(data = matr.data.copy)
  mat.ref/update fn(m : matrix<_>)
    val Matrix(r, c, d) = m
    val i = idx.int(c)
    val d' = unsafe-total
      d.set(i, assigned)
    Matrix(r, c, d')

Where assert-valid-idx throws an exception if an index is not valid for a certain matrix. That is why unsafe-total is valid, as we are certain that the idx.int(c) is a valid index for the backing vector.
But when I perform some modifications, like this:

import ds/matrix

fun main()
  val matrix = ref(matrix(10, 10) fn(idx) idx)
  matrix[(3, 3)] := (1, 2)
  matrix.(!)[(3, 3)].show

I get:

(3,3) in (10,10)
uncaught exception: Index (3,3) out of bounds (-578721382704613385,-578721382704613385) at ds/matrix
.kk(74)

Sometimes I also get an error from mimalloc saying that the free list is corrupted:
mimalloc: error: corrupted free list entry of size 40b at 0x0200000700D8: value 0x20003C70150

The bounds above have an interesting binary form:
-0b100000001000000010000000100000001000000010000000100000001001
or hex -0x808080808080809

@HeikoRibberink
Copy link
Copy Markdown

It doesn't work with local-vars:

pub fun local-var/assign/@index(mat : local-var<h, matrix<a>>, idx : idx, assigned : a) : <pure,read<h>,write<h>> ()
  trace(show(idx) ++ " in " ++ show((mat.rows, mat.cols)))
  mat.assert-valid-idx(idx)
  mat.local-var/update(?unique-copy = unique-copy) fn(m)
    m.unsafe-total-set(idx, assigned)

Which fails with:

type error: mat does not match the argument types
  context      :   mat
  term         :   mat
  inferred type: local-var<$h,matrix<$a>>
  expected type: local-var<_h2,local-var<_h1,_b>>

Which I find really strange, as I've never had this problem myself before while working with local-var function parameters.

@TimWhiting
Copy link
Copy Markdown
Collaborator Author

If you look at the linked branch in the PR description you should see how to use this with vectors. I'm not sure what other code you have going on. Especially this last commit: TimWhiting@b0ef63b

For unique-copy you need to dip into C code likely and make sure that you make a truly fresh copy with a reference count of 1, returning it. As I don't know how you defined copy, I'm unsure if that is causing issues.

However, I can point to something that is for sure causing issues - unsafe-total is never safe when you try to mask a handled effect. exn is a handleable effect - not builtin like console. If you mask it - even if you guarantee it will never be called it will for sure mess up other effects in that scope, and possibly outside of it. Use unsafe-assign (copy it into your codebase since it is not currently public).

For the local var issue:
std/core/types/@byref(mat).local-var/update(...)

@HeikoRibberink
Copy link
Copy Markdown

I don't know how you defined copy

copy is predefined in vector.kk in the standard library.

However, I can point to something that is for sure causing issues - unsafe-total is never safe when you try to mask a handled effect. exn is a handleable effect - not builtin like console. If you mask it - even if you guarantee it will never be called it will for sure mess up other effects in that scope, and possibly outside of it. Use unsafe-assign (copy it into your codebase since it is not currently public).

Thanks! I will try your method.

std/core/types/@byref(mat).local-var/update(...)

Amazing, I didn't know that this was possible.

@HeikoRibberink
Copy link
Copy Markdown

I created a MRE of the issue with the corrupted free list on Koka v3.1.3 with this commit merged in this gist
Unfortunately, I very often get something like:
mimalloc: error: corrupted free list entry of size 40b at 0x0200000F1048: value 0x2C0000F1020
There have been a few times while profiling that this did not happen, and in those cases the array indeed did not seem to be copied.
Unfortunately because of mimalloc, I don't know how to debug this myself.

@TimWhiting
Copy link
Copy Markdown
Collaborator Author

TimWhiting commented Jul 22, 2025

Wow that is really strange. Yes, I wouldn't expect the array to be copied using unique-copy (that is the point of this PR).

However, I see the same strange behavior where it randomly will complete correctly, but randomly fail in mimalloc. It gets to around element 102 in the array when this happens. So I don't think it has anything to do with this PR in particular. It gets to element 129 if I compile -O2, so I'm assuming it actually gets further than that but the console doesn't quite keep up with the logging I'm doing.

It seems to be an issue with mimalloc. The only thing that I can think of causing the issue is the repetitive allocation and subsequent deallocation of the update function.
Running it with only 100 elements never fails.
Running it with the vector PR with the inlined assignment @index (not the updating version that takes a lambda), never fails, even at extremely high elements counts, however, the update also fails (I'm assuming since it allocates and deallocates closures immediately very fast). We should probably look into making some closures value types if possible.

@HeikoRibberink
Copy link
Copy Markdown

I'm happy to see that you can reproduce the behaviour. Do you know if Daan has any ideas?

The only thing that I can think of causing the issue is the repetitive allocation and subsequent deallocation of the update function.

Can you explain why? Don't we de/allocate the same closures very often anyway? And then why is this the first time we've encountered this?

@daanx
Copy link
Copy Markdown
Member

daanx commented Jul 22, 2025

Oh, very interesting! We observed a similar error in the wild with mimalloc v3 .. I blamed it on an out of bounds error in the program but now I am wondering if it is a bug in mimalloc.

I will try to repro.

(see the end of issue microsoft/mimalloc#1098. edit: this may be not the case after all as I misunderstood that mimalloc was upgraded inside koka as well.)

@daanx
Copy link
Copy Markdown
Member

daanx commented Jul 22, 2025

Ah, I cannot build it at this point. Can you clarify how to repro? I pull'd the latest dev and then merged this PR. If I look in the kklib/mimalloc directory it is still at v2.2.4.

  • Should I update kklib/mimalloc by hand to v3.1.3 ?
  • If I try to compile the gist, I get that kk_vector_unsafe_assign_borrow is not defined
  • If I rename that primitive to kk_ref_vector_assign_borrow (and remove the ssize_t conversion) it compiles though
  • But then fails with an assertion error: Assertion failed: (i < len), function kk_ref_vector_assign_borrow, file vector.c, line 67.
  • If I compile with debug info -O-1 --debug I get other compile errors due to kk_block_is_no_free in an assertion; if I comment those out I can compile again, but then get another assertion failure: Assertion failed: (kk_block_tag(b) == tag || kk_block_tag(b) == KK_TAG_BOX_ANY), function kk_block_assertx, file kklib.h, line 653 which comes from: ref.c,
53  	kk_unit_t kk_ref_vector_assign_borrow(kk_ref_t _r, kk_integer_t idx, kk_box_t value, kk_context_t* ctx) {
-> 54  	  struct kk_ref_s* r = kk_datatype_as_assert(struct kk_ref_s*, _r, KK_TAG_REF, ctx);

.. so not sure; clearly there are some other bugs as well that should be fixed, but I am also quite interested to repro as is in case there is a mimalloc bug. Let me know how I can repro :-)

@HeikoRibberink
Copy link
Copy Markdown

HeikoRibberink commented Jul 22, 2025

This is my git log:

commit d2d2c4b75f7dff7d5da984ffbd05f7d0b0e6ce61 (HEAD -> dev)
Merge: f0ea5dee 03f181e5
Author: HeikoRibberink <heiko.ribberink@deblijdeaankomst.nl>
Date:   Sat Jun 28 14:13:20 2025 +0200

    Merge branch 'unique-vectors' of https://github.com/TimWhiting/koka into dev

commit f0ea5dee2d93d2f3be10d9ded4c20b0fac830fe2
Merge: 561a11cc cf0b3113
Author: HeikoRibberink <heiko.ribberink@deblijdeaankomst.nl>
Date:   Sat Jun 28 14:12:43 2025 +0200

    Merge branch 'unique_locals' of https://github.com/TimWhiting/koka into dev

commit 561a11cc1944f2cd974562b329c0a31b63365b9e (origin/dev, origin/HEAD)
Author: Tim Whiting <tim@whitings.org>
Date:   Thu Jun 26 20:42:10 2025 -0600

    fix js match issue (#737)

commit cf0b311313ebc8d48e6169164a8c76d06873500d
Author: Tim Whiting <tim@whitings.org>
Date:   Fri May 23 19:28:41 2025 -0600

    Add unique update for locals if the function is total

koka --version:

Koka 3.1.3, 14:14:04 Jun 28 2025 (ghc release version)
version: 3.1.3
bin    : /usr/local/bin
lib    : /usr/local/lib/koka/v3.1.3
share  : /usr/local/share/koka/v3.1.3
output : .koka/v3.1.3/gcc-debug-24e143
cc     : /usr/bin/gcc
flags  : 24e143

mimalloc reamde.md reads Latest release tag: v2.1.6
Please let me know if you need to know anything else to reproduce.

@daanx
Copy link
Copy Markdown
Member

daanx commented Jul 22, 2025 via email

@HeikoRibberink
Copy link
Copy Markdown

No, I did not upgrade mimalloc. How can I best do that? Maybe I should upgrade to Koka v3.2.0 and try again?

@daanx
Copy link
Copy Markdown
Member

daanx commented Jul 22, 2025 via email

@TimWhiting
Copy link
Copy Markdown
Collaborator Author

TimWhiting commented Jul 22, 2025

Hi @daanx, I have been testing on this branch: https://github.com/TimWhiting/koka/tree/unique-vectorsx (PR now available #762 for viewing the diff online), which updates my unique_vectors PR to use this PR for local-vars. That is where the missing primitive is (which is just renamed, removing the drop because we made it a borrow). I've tried various branches of mimalloc all with the same issue. I don't believe it is something in Koka, because it happens in the middle of a repetitive loop -> where 100 identical earlier iterations of the loop finish fine, but I could be wrong.

@TimWhiting TimWhiting mentioned this pull request Jul 22, 2025
@daanx
Copy link
Copy Markdown
Member

daanx commented Jul 22, 2025

@TimWhiting : maybe try to compile with -O-1 --debug and then use lldb <exe> to execute. I got an assertion failure on the tag of the reference which is suspicious..

@TimWhiting
Copy link
Copy Markdown
Collaborator Author

TimWhiting commented Jul 22, 2025

Thanks for the encouragement to take a closer look. I was dropping an already dropped reference (I had switched the closure from borrowed to owned, and thought I needed a drop, but had forgotten that closures take ownership of themselves when called). Speaking of which, I probably need to dup the other function which is borrowed.

I don't know why this didn't fail on the first iteration though. It does with debug assertions.

Any idea of why that is? Does mimalloc not try to commit any deallocations until later - such that it can handle some amount of double frees until it flushes some buffer or something?

@TimWhiting TimWhiting marked this pull request as draft July 31, 2025 02:41
@TimWhiting
Copy link
Copy Markdown
Collaborator Author

TimWhiting commented Jul 31, 2025

Converted to draft, because I hadn't really considered value types, and there are some improvements I want to make that will make this a lot cleaner.

@TimWhiting TimWhiting marked this pull request as ready for review July 31, 2025 15:21
@TimWhiting
Copy link
Copy Markdown
Collaborator Author

Okay, I was way overcomplicating this. As long as we ensure totality / no access of the local we can do this by simply passing ownership to the update function. I'd like to allow slightly more effects, but it would require some more implicit constraint work.

@TimWhiting TimWhiting force-pushed the dev branch 3 times, most recently from 2c83d90 to 00399ab Compare August 13, 2025 15:46
@TimWhiting
Copy link
Copy Markdown
Collaborator Author

TimWhiting commented Aug 15, 2025

@daanx, I misspoke during our meeting, actually the string buffer that outperforms C++ concatenation is in #762 - based on this PR. Though this PR helps for the variants that use local variables.

That PR adds unique updates to vectors, and unique updates to vectors in locals. It extends the test in this PR with another variant of string buffers that is faster than C++. More details on the performance evaluation is in this discussion, starting here: #765 (reply in thread)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants