8380060: C2: Wrong execution with COH and arraycopy by rkennke · Pull Request #30775 · openjdk/jdk

rkennke · 2026-04-16T19:48:09Z

C2's generate_block_arraycopy miscompiles when compact object headers are enabled. With compact headers, arrayOopDesc::base_offset_in_bytes(T_BYTE) returns 12 instead of 16. Since 12 is not 8-byte aligned, the block arraycopy optimization picks off an initial 32-bit word via a typed LoadI/StoreI pair before switching to a long-word copy.

The LoadI reads from the source array using the typed memory slice. When the source array was initialized by a prior arraycopy (e.g. Arrays.copyOfRange optimized to a clone), that arraycopy wrote to raw memory. The typed memory slice does not reflect these raw writes. The IGVN folds the typed load to zero, corrupting the first 4 bytes of the copied array.

The initial-word pick-off path is only reached when the array data offset is not 8-byte aligned. Without compact headers the offset is 16 (aligned), so this path is never taken through standard library code. With compact headers it is always taken for byte/boolean/short/char/int/float arrays copied via tightly-coupled allocations.

The proposed fix is to make the clone AC update all memory slices, not just raw. This is already done for all other places where we create an ArrayCopyNode.

I also turned the Test.java test into a jtreg test (with some improvements to cover more scenarios).

Special thanks to Luca Ardueser, BSI Software AG who found and reported the issue, and helped with initial investigation and came up with a test-case.

Testing:

the originally submitted test-case
the new derived jtreg test
tier1 (default)
tier1 (+UseCompactObjectHeaders)

I confirm that I make this contribution in accordance with the OpenJDK Interim AI Policy.

Progress

Change must not contain extraneous whitespace
Commit message must refer to an issue
Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

JDK-8380060: C2: Wrong execution with COH and arraycopy (Bug - P3)

Reviewers

Vladimir Kozlov (@vnkozlov - Reviewer)
Quan Anh Mai (@merykitty - Reviewer)

Contributors

Christian Hagedorn <chagedorn@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/30775/head:pull/30775
$ git checkout pull/30775

Update a local copy of the PR:
$ git checkout pull/30775
$ git pull https://git.openjdk.org/jdk.git pull/30775/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 30775

View PR using the GUI difftool:
$ git pr show -t 30775

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/30775.diff

Using Webrev

Link to Webrev Comment

rkennke · 2026-04-16T19:49:12Z

/contributor add chagedorn

rkennke · 2026-04-16T19:49:46Z

/contributor add Luca Ardueser, BSI Software AG

bridgekeeper · 2026-04-16T19:50:10Z

👋 Welcome back rkennke! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2026-04-16T19:54:06Z

@rkennke This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8380060: C2: Wrong execution with COH and arraycopy

Co-authored-by: Christian Hagedorn <chagedorn@openjdk.org>
Reviewed-by: kvn, qamai

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 111 new commits pushed to the master branch:

3d72318: 8382359: Align 32-bit data in AOT Code Cache to sizeof(uint)
35a499e: 8335355: Shenandoah: Fix race condition in gc/shenandoah/mxbeans/TestPauseNotifications.java
0993467: 8382306: Remove JvmtiTagMap::check_hashmap
... and 108 more: https://git.openjdk.org/jdk/compare/ac74e6f9433f1e0ad177e579bde7f2338189d6c6...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk · 2026-04-16T20:23:19Z

@rkennke
Contributor Christian Hagedorn <chagedorn@openjdk.org> successfully added.

openjdk · 2026-04-16T20:39:59Z

@rkennke Luca Ardueser, BSI Software AG was not found in the census.

Syntax: /contributor (add|remove) [@user | openjdk-user | Full Name <email@address>]. For example:

/contributor add @openjdk-bot
/contributor add duke
/contributor add J. Duke <duke@openjdk.org>

User names can only be used for users in the census associated with this repository. For other contributors you need to supply the full name and email address.

openjdk · 2026-04-16T20:40:57Z

@rkennke The following label will be automatically applied to this pull request:

hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

openjdk · 2026-04-16T20:41:04Z

The total number of required reviews for this PR has been set to 2 based on the presence of this label: hotspot-compiler. This can be overridden with the /reviewers command.

merykitty · 2026-04-17T06:52:23Z

Just a drive-by comment: I don't really understand. The comment talks about the possibility that src is unintialized, but the code checks dest_uninitialized. Besides, if generating a Load from the result of an arraycopy leads to incorrect folding, how do normal loads work? It seems that this load we generate here reads the incorrect memory slice.

A better fix would have been to turn the picked-off LoadI to load from raw slice, but this doesn't easily work, because we can't do raw accesses from oops (or I could not figure out how to do it).

An AllocateNode has a raw pointer output, which is CheckCastPPed into a typed oop. If you want to do a raw access, you need to look for that raw pointer, you need to ensure that those accesses lie between the AllocateNode and the InitializeNode, too.

@rwestrel recently work with this, I believe he should take a look and/or have a better idea.

openjdk · 2026-04-17T11:40:42Z

@rkennke hotspot has been added to this pull request based on files touched in new commit(s).

rkennke · 2026-04-17T11:50:18Z

Just a drive-by comment: I don't really understand. The comment talks about the possibility that src is unintialized, but the code checks dest_uninitialized. Besides, if generating a Load from the result of an arraycopy leads to incorrect folding, how do normal loads work? It seems that this load we generate here reads the incorrect memory slice.

Right. The dest_initialized check did nothing. That path is only ever taken when dest_initialized=true to begin with, meaning the change disabled the pick-off load/store wholesale. And it is not the right fix.

The real problem (as far as my limited understanding of C2 goes) is that the preceding clone/arraycopy only feeds into the 'raw' memory slice. The pick-off load/store would (via complicated ways) not see this, but instead see the initialize node, and get turned into constant 0.

I am not sure why regular Java-level loads are not affected by this. Maybe a lucky coincidence? The trailing membar after clone might also have helped?

Anyhow, the real fix seems to be that the AC in clone needs to update all memory slices, not just raw. This is also what all other occurances of ArrayCopyNode do (e.g. copyOf/copyOfRange/arraycopy). See updated proposed change. It fixes the test-case for me. I'm currently running tier1 tests...

…hape of AC memory

mlbridge · 2026-04-20T08:56:12Z

Webrevs

merykitty · 2026-04-20T15:01:53Z

+      // Expected pattern: out_mem → MergeMem → MemBar, or
+      // out_mem → MemBar (when MergeMem was folded away).
+      Node* mem_use = out_mem->raw_out(0);
+      if (out_mem->outcnt() != 1 ||


This De Morgan is quite hard to grasp, do you think !(mem_use->is_MemBar() || (mem_use->is_MergeMem() && mem_use->outcnt() == 1 && mem_use->raw_out(0)->is_MemBar()) would be a little easier to read? Not so sure tbh.

I agree, it is easier to read when it's mostly positive tests, with a single leading negation. Also, I am pulling out the outcnt == 1 test to before we actually fetch out_mem->raw_out(0), while that should be safe, it looks brittle and is definitely safer if checked before.

merykitty · 2026-04-20T15:06:31Z

        return false;
      }

-      igvn->replace_node(out_mem->raw_out(0), mem);


I don't quite follow this piece, why do we try to replace only the first output of out_mem, what happens if out_mem has multiple outputs? If out_mem should only have 1 output, then an assert seems more suitable.

I think the assumption is that out_mem only has one use. We check that in the big expression above and it was pre-existing. I also think this should be an assertion, but I'm a little hesitant tbh.

ah wait, I think the assertion there already answers this: the known case where the expectation outcnt == 1 might not hold true is with card-marking, and we bail out and not decompose the clone to begin with.

Ah yes, I misparsed that.

merykitty · 2026-04-20T15:07:54Z

  return phase->C->top();
 }

 bool ArrayCopyNode::finish_transform(PhaseGVN *phase, bool can_reshape,


It would be helpful if there is a description about what is mem, and what memory states it captures.

I'm adding a comment.

vnkozlov

@rwestrel recently work with this, I believe he should take a look and/or have a better idea.

I completely agree with this. Lets Roland to look on this - he is expert in this part.

Current fix does not look right based on the issue, @rkennke, you described.

merykitty · 2026-04-20T17:59:43Z

@vnkozlov Can you clarify? The fix looks pretty correct to me.

rkennke · 2026-04-20T19:08:01Z

@rwestrel recently work with this, I believe he should take a look and/or have a better idea.

I completely agree with this. Lets Roland to look on this - he is expert in this part.

Current fix does not look right based on the issue, @rkennke, you described.

The original fix was not right, as @merykitty pointed out last week, but I think this new fix should be mostly correct. It'd be good if @rwestrel would look at it anyway.

vnkozlov · 2026-04-20T19:49:38Z

-    const TypePtr* raw_adr_type = TypeRawPtr::BOTTOM;
    ac->set_adr_type(TypeRawPtr::BOTTOM);
-    kit->set_predefined_output_for_runtime_call(ac, ac->in(TypeFunc::Memory), raw_adr_type);
+    // The clone copies the entire object content. Set all memory slices
+    // to the clone's output so that subsequent typed loads see the
+    // clone's writes. Using set_predefined_output_for_runtime_call with
+    // TypeRawPtr::BOTTOM would only update the raw slice, leaving typed
+    // per-instance slices (from escape analysis) pointing to the
+    // pre-clone memory state via the Initialize's NarrowMemProj.
+    kit->set_predefined_output_for_runtime_call(ac);
+    kit->set_all_memory_call(ac);


Why we can't use ac->connect_outputs(kit); here?

connect_outputs() also handles IO and exception paths, which are not needed/used here?

Looking on set_predefined_output_for_runtime_call(ac) and it is only do set_control() and set_all_memory_call(call) for your case. So you don't actually need call memory set again.

Correct, that is redundant. I'll remove it.

vnkozlov

Good. I will submit testing

merykitty

Looks good to me, thanks for making the changes and answering my questions.

merykitty · 2026-04-21T01:05:37Z

        return false;
      }

-      igvn->replace_node(out_mem->raw_out(0), mem);


Ah yes, I misparsed that.

merykitty · 2026-04-21T01:06:57Z

+ *
+ * @run main/othervm -Xbatch -XX:+UseCompactObjectHeaders
+ *                   -XX:CompileCommand=compileonly,compiler.arraycopy.TestBlockArrayCopyMisaligned::test*
+ *                   compiler.arraycopy.TestBlockArrayCopyMisaligned


* @run main/othervm -Xbatch -XX:+UseCompactObjectHeaders * -XX:CompileCommand=compileonly,${test.main.class}::test* * ${test.main.class}

This should be better.

rwestrel · 2026-04-21T10:40:41Z

Does this only reproduce with product builds?
I could only reproduce it with a product build and was puzzled that compilation would be different with a debug build I found that this:

diff --git a/src/hotspot/share/opto/escape.cpp b/src/hotspot/share/opto/escape.cpp
index a05ad0ef99a..19bdb3bc389 100644
--- a/src/hotspot/share/opto/escape.cpp
+++ b/src/hotspot/share/opto/escape.cpp
@@ -4539,7 +4539,7 @@ void ConnectionGraph::split_unique_types(GrowableArray<Node *>  &alloc_worklist,
           const TypePtr* adr_type = proj->adr_type();
           const TypePtr* new_adr_type = tinst->add_offset(adr_type->offset());
           if (adr_type != new_adr_type && !init->already_has_narrow_mem_proj_with_adr_type(new_adr_type)) {
-            DEBUG_ONLY( uint alias_idx = _compile->get_alias_index(new_adr_type); )
+            uint alias_idx = _compile->get_alias_index(new_adr_type);
             assert(_compile->get_general_index(alias_idx) == _compile->get_alias_index(adr_type), "new adr type should be narrowed down from existing adr type");
             NarrowMemProjNode* new_proj = new NarrowMemProjNode(init, new_adr_type);
             igvn->set_type(new_proj, new_proj->bottom_type());

explains why (get_alias_index() has the side effect of allocating a new alias so if we don't perform the call then code that iterates over all aliases until num_alias_types() to fix the memory graph during EA skips the newly added NarrowMemProjNode. With this fixed, the test that's attached to the bug now passes.

rkennke · 2026-04-21T11:32:41Z

Does this only reproduce with product builds? I could only reproduce it with a product build and was puzzled that compilation would be different with a debug build I found that this:
diff --git a/src/hotspot/share/opto/escape.cpp b/src/hotspot/share/opto/escape.cpp
index a05ad0ef99a..19bdb3bc389 100644
--- a/src/hotspot/share/opto/escape.cpp
+++ b/src/hotspot/share/opto/escape.cpp
@@ -4539,7 +4539,7 @@ void ConnectionGraph::split_unique_types(GrowableArray<Node *>  &alloc_worklist,
           const TypePtr* adr_type = proj->adr_type();
           const TypePtr* new_adr_type = tinst->add_offset(adr_type->offset());
           if (adr_type != new_adr_type && !init->already_has_narrow_mem_proj_with_adr_type(new_adr_type)) {
-            DEBUG_ONLY( uint alias_idx = _compile->get_alias_index(new_adr_type); )
+            uint alias_idx = _compile->get_alias_index(new_adr_type);
             assert(_compile->get_general_index(alias_idx) == _compile->get_alias_index(adr_type), "new adr type should be narrowed down from existing adr type");
             NarrowMemProjNode* new_proj = new NarrowMemProjNode(init, new_adr_type);
             igvn->set_type(new_proj, new_proj->bottom_type());
explains why (get_alias_index() has the side effect of allocating a new alias so if we don't perform the call then code that iterates over all aliases until num_alias_types() to fix the memory graph during EA skips the newly added NarrowMemProjNode. With this fixed, the test that's attached to the bug now passes.

I just checked and I could not reproduce with debug build. Likewise, if I comment that line in debug, then it reproduces there, too. And yes, your change fixes the problem for me, too. That's a nasty side-effect...

Do you think the way that clone is currently wired up, feeding into the raw memory, instead of all memory, is correct, though? Because other arraycopy cases seems to set all memory. Or does it not matter?

rwestrel · 2026-04-21T13:10:06Z

Do you think the way that clone is currently wired up, feeding into the raw memory, instead of all memory, is correct, though? Because other arraycopy cases seems to set all memory. Or does it not matter?

The wiring is correct AFAIU. There's a MemBar node following the ArrayCopy node. That MemBar should guarantee correct ordering.

But, the graph here after array copy expansion is also broken, because even with the patch above, the memory input of the LoadI ends up above that MemBar which is not correct even though, it appears to be harmless. The reason the memory input of the LoadI is allowed to be above the MemBar is that C2 treats loads from non escaping allocations specially wrt to MemBars. There used to be code ArrayCopyNode::may_modify() that would pattern match the expanded ArrayCopy shapes so a load from an array would not bypass an ArrayCopy that writes to that array but I reworked it with : https://github.com/openjdk/jdk/pull/23465/changes (and likely broke this particular case). The expectation now is that we would run into a MemBar that's flagged TrailingExpandedArrayCopy when we walk the memory graph after expansion. But we don't add one in this particular case MemBar. So I would suggest that as a fix:

diff --git a/src/hotspot/share/opto/macroArrayCopy.cpp b/src/hotspot/share/opto/macroArrayCopy.cpp
index 139775506db..6c8cb467d53 100644
--- a/src/hotspot/share/opto/macroArrayCopy.cpp
+++ b/src/hotspot/share/opto/macroArrayCopy.cpp
@@ -1248,6 +1248,21 @@ void PhaseMacroExpand::expand_arraycopy_node(ArrayCopyNode *ac) {
   MergeMemNode* merge_mem = nullptr;
 
   if (ac->is_clonebasic()) {
+    if (ac->_dest_type != TypeOopPtr::BOTTOM) {
+      MemBarNode* membar = ac->proj_out(TypeFunc::Control)->unique_ctrl_out()->as_MemBar();
+      MergeMemNode* mm = membar->in(TypeFunc::Memory)->as_MergeMem();
+      assert(mm->memory_at(Compile::AliasIdxRaw)->as_Proj()->in(0) == ac, "");
+      int mem_bar_alias_idx = C->get_alias_index(ac->_dest_type->add_offset(Type::OffsetBot)->is_ptr());
+      Node* ctl = membar->in(0);
+      Node* mem = mm;
+      insert_mem_bar(&ctl, &mem, Op_MemBarCPUOrder, mem_bar_alias_idx);
+      _igvn.replace_input_of(membar, 0, ctl);
+      _igvn.replace_input_of(membar, TypeFunc::Memory, mem);
+      assert(ctl->is_Proj(), "MemBar control projection");
+      assert(ctl->in(0)->isa_MemBar(), "MemBar node");
+      ctl->in(0)->isa_MemBar()->set_trailing_expanded_array_copy();
+    }
+
     BarrierSetC2* bs = BarrierSet::barrier_set()->barrier_set_c2();
     bs->clone_at_expansion(this, ac);
     return;

rkennke · 2026-04-21T13:33:49Z

Do you think the way that clone is currently wired up, feeding into the raw memory, instead of all memory, is correct, though? Because other arraycopy cases seems to set all memory. Or does it not matter?

The wiring is correct AFAIU. There's a MemBar node following the ArrayCopy node. That MemBar should guarantee correct ordering.

But, the graph here after array copy expansion is also broken, because even with the patch above, the memory input of the LoadI ends up above that MemBar which is not correct even though, it appears to be harmless. The reason the memory input of the LoadI is allowed to be above the MemBar is that C2 treats loads from non escaping allocations specially wrt to MemBars. There used to be code ArrayCopyNode::may_modify() that would pattern match the expanded ArrayCopy shapes so a load from an array would not bypass an ArrayCopy that writes to that array but I reworked it with : https://github.com/openjdk/jdk/pull/23465/changes (and likely broke this particular case). The expectation now is that we would run into a MemBar that's flagged TrailingExpandedArrayCopy when we walk the memory graph after expansion. But we don't add one in this particular case MemBar. So I would suggest that as a fix:

Ok, great - thank you for your suggestions! Do you want to reassign the bug to yourself and propose a fix? Because what you are suggesting is totally disjoint from my changes, and it seems wrong that I am pushing your changes? Whatever you prefer, really.

rwestrel · 2026-04-21T13:47:23Z

Ok, great - thank you for your suggestions! Do you want to reassign the bug to yourself and propose a fix? Because what you are suggesting is totally disjoint from my changes, and it seems wrong that I am pushing your changes? Whatever you prefer, really.

Ok. Let me assign it to myself.

vnkozlov · 2026-04-21T14:54:29Z

Ok. Let me assign it to myself.

Thank you, @rwestrel

I assume you will have separate PR and we can close this one.

rwestrel · 2026-04-21T14:59:21Z

I assume you will have separate PR and we can close this one.

Yes, I will do that.

krk · 2026-04-21T15:44:26Z

Is #30075 a potential solution to this as well?

ysramakrishna · 2026-04-21T23:54:45Z

Ok, great - thank you for your suggestions! Do you want to reassign the bug to yourself and propose a fix? Because what you are suggesting is totally disjoint from my changes, and it seems wrong that I am pushing your changes? Whatever you prefer, really.

Ok. Let me assign it to myself.

Could you link the new PR here when you create it, @rwestrel. FWIW, the linked JBS ticket isn't visible at this time, as far as I can tell; not sure if that's intentional.

rwestrel · 2026-04-22T07:39:01Z

Is #30075 a potential solution to this as well?

I think so. The issue from:
#30775 (comment)
needs to be fixed as well. I'll take care of that one.

8380060: C2: Wrong execution with COH and arraycopy

66060f4

openjdk Bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Apr 16, 2026

Better fix: clone should update all memory slices, not just raw

493c098

openjdk Bot added the hotspot hotspot-dev@openjdk.org label Apr 17, 2026

rkennke added 2 commits April 17, 2026 12:46

Fix wiring of memory

276b968

Adjust ArrayCopyNode::finish_transform() to accept the new possible s…

15144c4

…hape of AC memory

schlosna reviewed Apr 19, 2026

View reviewed changes

Comment thread src/hotspot/share/opto/arraycopynode.cpp Outdated

Simplify boolean expression

c213dea

rkennke marked this pull request as ready for review April 20, 2026 08:49

openjdk Bot added the rfr Pull request is ready for review label Apr 20, 2026

merykitty reviewed Apr 20, 2026

View reviewed changes

vnkozlov reviewed Apr 20, 2026

View reviewed changes

Clarify and simplify ACN::finish_transform()

f1943ff

vnkozlov reviewed Apr 20, 2026

View reviewed changes

rkennke added 2 commits April 20, 2026 21:02

Fix comment

3d6e038

Remove redundant set_all_memory_call

9c1b3e5

vnkozlov approved these changes Apr 20, 2026

View reviewed changes

merykitty approved these changes Apr 21, 2026

View reviewed changes

openjdk Bot added the ready Pull request is ready to be integrated label Apr 21, 2026

rwestrel mentioned this pull request Apr 22, 2026

8380060: C2: Wrong execution with COH and arraycopy #30873

Open

4 tasks

Conversation

rkennke commented Apr 16, 2026 • edited by openjdk Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Progress

Issue

Reviewers

Contributors

Reviewing

Uh oh!

rkennke commented Apr 16, 2026

Uh oh!

rkennke commented Apr 16, 2026

Uh oh!

bridgekeeper Bot commented Apr 16, 2026

Uh oh!

openjdk Bot commented Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

openjdk Bot commented Apr 16, 2026

Uh oh!

openjdk Bot commented Apr 16, 2026

Uh oh!

openjdk Bot commented Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

openjdk Bot commented Apr 16, 2026

Uh oh!

merykitty commented Apr 17, 2026

Uh oh!

openjdk Bot commented Apr 17, 2026

Uh oh!

rkennke commented Apr 17, 2026

Uh oh!

Uh oh!

mlbridge Bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Webrevs

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

vnkozlov left a comment

Choose a reason for hiding this comment

Uh oh!

merykitty commented Apr 20, 2026

Uh oh!

rkennke commented Apr 20, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

vnkozlov left a comment

Choose a reason for hiding this comment

Uh oh!

merykitty left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rkennke commented Apr 16, 2026 •

edited by openjdk Bot

Loading

openjdk Bot commented Apr 16, 2026 •

edited

Loading

openjdk Bot commented Apr 16, 2026 •

edited

Loading

mlbridge Bot commented Apr 20, 2026 •

edited

Loading