Skip to content

Commit b22b282

Browse files
Rollup merge of #154043 - RalfJung:simd-min-max, r=Amanieu,calebzulawski,antoyo
simd_fmin/fmax: make semantics and name consistent with scalar intrinsics This is the SIMD version of rust-lang/rust#153343: change the documented semantics of the SIMD float min/max intrinsics to that of the scalar intrinsics, and also make the name consistent. The overall semantic change this amounts to is that we restrict the non-determinism: the old semantics effectively mean "when one input is an SNaN, the result non-deterministically is a NaN or the other input"; the new semantics say that in this case the other input must be returned. For all other cases, old and new semantics are equivalent. This means all users of these intrinsics that were correct with the old semantics are still correct: the overall set of possible behaviors has become smaller, no new possible behaviors are being added. In terms of providers of this API: - Miri, GCC, and cranelift already implement the new semantics, so no changes are needed. - LLVM is adjusted to use `minimumnum nsz` instead of `minnum`, thus giving us the new semantics. In terms of consumers of this API: - Portable SIMD almost certainly wants to match the scalar behavior, so this is strictly a bugfix here. - Stdarch mostly stopped using the intrinsic, except on nvptx, where arguably the new semantics are closer to what we actually want than the old semantics (#2056). Q: Should there be an `f` in the intrinsic name to indicate that it is for floats? E.g., `simd_fminimum_number_nsz`? Also see rust-lang/rust#153395.
2 parents d810ec4 + e700d17 commit b22b282

2 files changed

Lines changed: 9 additions & 9 deletions

File tree

crates/core_arch/src/nvptx/packed.rs

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -99,7 +99,7 @@ pub unsafe fn f16x2_neg(a: f16x2) -> f16x2 {
9999
#[cfg_attr(test, assert_instr(min.f16x2))]
100100
#[unstable(feature = "stdarch_nvptx", issue = "111199")]
101101
pub unsafe fn f16x2_min(a: f16x2, b: f16x2) -> f16x2 {
102-
simd_fmin(a, b)
102+
simd_minimum_number_nsz(a, b)
103103
}
104104

105105
/// Find the minimum of two values, NaNs pass through.
@@ -123,7 +123,7 @@ pub unsafe fn f16x2_min_nan(a: f16x2, b: f16x2) -> f16x2 {
123123
#[cfg_attr(test, assert_instr(max.f16x2))]
124124
#[unstable(feature = "stdarch_nvptx", issue = "111199")]
125125
pub unsafe fn f16x2_max(a: f16x2, b: f16x2) -> f16x2 {
126-
simd_fmax(a, b)
126+
simd_maximum_number_nsz(a, b)
127127
}
128128

129129
/// Find the maximum of two values, NaNs pass through.

crates/core_arch/src/x86/sse.rs

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -208,7 +208,7 @@ pub fn _mm_min_ss(a: __m128, b: __m128) -> __m128 {
208208
#[cfg_attr(test, assert_instr(minps))]
209209
#[stable(feature = "simd_x86", since = "1.27.0")]
210210
pub fn _mm_min_ps(a: __m128, b: __m128) -> __m128 {
211-
// See the `test_mm_min_ps` test why this can't be implemented using `simd_fmin`.
211+
// See the `test_mm_min_ps` test why this can't be implemented using `simd_minimum_number_nsz`.
212212
unsafe { minps(a, b) }
213213
}
214214

@@ -234,7 +234,7 @@ pub fn _mm_max_ss(a: __m128, b: __m128) -> __m128 {
234234
#[cfg_attr(test, assert_instr(maxps))]
235235
#[stable(feature = "simd_x86", since = "1.27.0")]
236236
pub fn _mm_max_ps(a: __m128, b: __m128) -> __m128 {
237-
// See the `test_mm_min_ps` test why this can't be implemented using `simd_fmax`.
237+
// See the `test_mm_min_ps` test why this can't be implemented using `simd_maximum_number_nsz`.
238238
unsafe { maxps(a, b) }
239239
}
240240

@@ -2227,11 +2227,11 @@ mod tests {
22272227
let r = _mm_min_ps(a, b);
22282228
assert_eq_m128(r, _mm_setr_ps(-100.0, 5.0, 0.0, -10.0));
22292229

2230-
// `_mm_min_ps` can **not** be implemented using the `simd_min` rust intrinsic. `simd_min`
2231-
// is lowered by the llvm codegen backend to `llvm.minnum.v*` llvm intrinsic. This intrinsic
2232-
// doesn't specify how -0.0 is handled. Unfortunately it happens to behave different from
2233-
// the `minps` x86 instruction on x86. The `llvm.minnum.v*` llvm intrinsic equals
2234-
// `r1` to `a` and `r2` to `b`.
2230+
// `_mm_min_ps` can **not** be implemented using the `simd_minimum_number_nsz` rust
2231+
// intrinsic. That intrinsic is lowered by the llvm codegen backend to `llvm.minimumnum.v*`
2232+
// llvm intrinsic with the `nsz` attribute. The `nsz` attribute means -0.0 is handled
2233+
// non-deterministically. The `minps` x86 instruction however has a deterministic semantics
2234+
// for signed zeros.
22352235
let a = _mm_setr_ps(-0.0, 0.0, 0.0, 0.0);
22362236
let b = _mm_setr_ps(0.0, 0.0, 0.0, 0.0);
22372237
let r1 = _mm_min_ps(a, b).as_f32x4().to_bits();

0 commit comments

Comments
 (0)