Skip to content
Open
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 8 additions & 2 deletions src/hotspot/share/gc/shared/c2/barrierSetC2.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -742,9 +742,15 @@ void BarrierSetC2::clone(GraphKit* kit, Node* src_base, Node* dst_base, Node* si
}
Node* n = kit->gvn().transform(ac);
if (n == ac) {
const TypePtr* raw_adr_type = TypeRawPtr::BOTTOM;
ac->set_adr_type(TypeRawPtr::BOTTOM);
kit->set_predefined_output_for_runtime_call(ac, ac->in(TypeFunc::Memory), raw_adr_type);
// The clone copies the entire object content. Set all memory slices
// to the clone's output so that subsequent typed loads see the
// clone's writes. Using set_predefined_output_for_runtime_call with
// TypeRawPtr::BOTTOM would only update the raw slice, leaving typed
// per-instance slices (from escape analysis) pointing to the
// pre-clone memory state via the Initialize's NarrowMemProj.
kit->set_predefined_output_for_runtime_call(ac);
kit->set_all_memory_call(ac);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we can't use ac->connect_outputs(kit); here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

connect_outputs() also handles IO and exception paths, which are not needed/used here?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking on set_predefined_output_for_runtime_call(ac) and it is only do set_control() and set_all_memory_call(call) for your case. So you don't actually need call memory set again.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, that is redundant. I'll remove it.

} else {
kit->set_all_memory(n);
}
Expand Down
14 changes: 11 additions & 3 deletions src/hotspot/share/opto/arraycopynode.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -507,13 +507,21 @@ bool ArrayCopyNode::finish_transform(PhaseGVN *phase, bool can_reshape,
Node* out_mem = proj_out(TypeFunc::Memory);

BarrierSetC2* bs = BarrierSet::barrier_set()->barrier_set_c2();
if (out_mem->outcnt() != 1 || !out_mem->raw_out(0)->is_MergeMem() ||
out_mem->raw_out(0)->outcnt() != 1 || !out_mem->raw_out(0)->raw_out(0)->is_MemBar()) {
// Expected pattern: out_mem → MergeMem → MemBar, or
// out_mem → MemBar (when MergeMem was folded away).
Node* mem_use = out_mem->raw_out(0);
if (out_mem->outcnt() != 1 ||
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This De Morgan is quite hard to grasp, do you think !(mem_use->is_MemBar() || (mem_use->is_MergeMem() && mem_use->outcnt() == 1 && mem_use->raw_out(0)->is_MemBar()) would be a little easier to read? Not so sure tbh.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it is easier to read when it's mostly positive tests, with a single leading negation. Also, I am pulling out the outcnt == 1 test to before we actually fetch out_mem->raw_out(0), while that should be safe, it looks brittle and is definitely safer if checked before.

!(mem_use->is_MergeMem() || mem_use->is_MemBar()) ||
(mem_use->is_MergeMem() &&
(mem_use->outcnt() != 1 || !mem_use->raw_out(0)->is_MemBar()))) {
Comment thread
rkennke marked this conversation as resolved.
Outdated
assert(bs->array_copy_requires_gc_barriers(true, T_OBJECT, true, is_clone_inst(), BarrierSetC2::Optimization), "can only happen with card marking");
return false;
}

igvn->replace_node(out_mem->raw_out(0), mem);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite follow this piece, why do we try to replace only the first output of out_mem, what happens if out_mem has multiple outputs? If out_mem should only have 1 output, then an assert seems more suitable.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the assumption is that out_mem only has one use. We check that in the big expression above and it was pre-existing. I also think this should be an assertion, but I'm a little hesitant tbh.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah wait, I think the assertion there already answers this: the known case where the expectation outcnt == 1 might not hold true is with card-marking, and we bail out and not decompose the clone to begin with.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I misparsed that.

// Replace the MergeMem (or MemBar if MergeMem was folded) with
// the new memory from the decomposed clone.
Node* replace_target = mem_use->is_MergeMem() ? mem_use : out_mem;
Comment thread
rkennke marked this conversation as resolved.
Outdated
igvn->replace_node(replace_target, mem);

Node* out_ctl = proj_out(TypeFunc::Control);
igvn->replace_node(out_ctl, ctl);
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,104 @@
/*
* Copyright (c) 2026, Oracle and/or its affiliates. All rights reserved.
* DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
*
* This code is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 only, as
* published by the Free Software Foundation.
*
* This code is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
* version 2 for more details (a copy is included in the LICENSE file that
* accompanied this code).
*
* You should have received a copy of the GNU General Public License version
* 2 along with this work; if not, write to the Free Software Foundation,
* Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
*
* Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
* or visit www.oracle.com if you need additional information or have any
* questions.
*/

/*
* @test
* @bug 8380060
* @summary C2 block arraycopy initial-word pick-off folds load to zero
*
* @run main/othervm -Xbatch -XX:+UseCompactObjectHeaders
* -XX:CompileCommand=compileonly,compiler.arraycopy.TestBlockArrayCopyMisaligned::test*
* compiler.arraycopy.TestBlockArrayCopyMisaligned
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* @run main/othervm -Xbatch -XX:+UseCompactObjectHeaders
*                   -XX:CompileCommand=compileonly,${test.main.class}::test*
*                   ${test.main.class}

This should be better.

* @run main/othervm -Xbatch -XX:-UseCompactObjectHeaders
* -XX:CompileCommand=compileonly,compiler.arraycopy.TestBlockArrayCopyMisaligned::test*
* compiler.arraycopy.TestBlockArrayCopyMisaligned
*/

package compiler.arraycopy;

import java.util.Arrays;

public class TestBlockArrayCopyMisaligned {

// Byte array with enough data to exercise the block arraycopy path.
// The clone + substring pattern triggers a tightly-coupled allocation
// where the source is initialized via a raw arraycopy (clone) and the
// destination is a nozero allocation filled by a block arraycopy.
static final byte[] BYTES = new byte[] {
108, 77, 120, 120, 100, 77, 105, 113,
83, 97, 71, 108, 116, 107, 90, 71,
72, 107, 73, 85, 54, 65, 61, 61
};

// Test via String(byte[],hibyte,off,count) + substring.
// The deprecated String constructor clones the byte array via
// Arrays.copyOfRange; substring then does another copyOfRange
// on the clone, triggering the block arraycopy optimization.
static String testStringSubstring() {
String s = new String(BYTES, 0, 0, 24);
return s.substring(0, 23);
}

// Test via explicit clone + Arrays.copyOfRange.
static byte[] testCloneCopyOfRange() {
byte[] clone = BYTES.clone();
return Arrays.copyOfRange(clone, 0, 23);
}

// Test via explicit clone + Arrays.copyOf.
static byte[] testCloneCopyOf() {
byte[] clone = BYTES.clone();
return Arrays.copyOf(clone, 20);
}

public static void main(String[] args) {
String golden = new String(BYTES, 0, 0, 23);

for (int i = 0; i < 10_000; i++) {
// Test 1: String path
String result1 = testStringSubstring();
if (!result1.equals(golden)) {
throw new RuntimeException("testStringSubstring: expected "
+ golden + ", got " + result1);
}

// Test 2: clone + copyOfRange
byte[] result2 = testCloneCopyOfRange();
for (int j = 0; j < 23; j++) {
if (result2[j] != BYTES[j]) {
throw new RuntimeException("testCloneCopyOfRange: mismatch at index " + j
+ ", expected " + BYTES[j] + ", got " + result2[j]);
}
}

// Test 3: clone + copyOf
byte[] result3 = testCloneCopyOf();
for (int j = 0; j < 20; j++) {
if (result3[j] != BYTES[j]) {
throw new RuntimeException("testCloneCopyOf: mismatch at index " + j
+ ", expected " + BYTES[j] + ", got " + result3[j]);
}
}
}
}
}