Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simd_shuffle intrinsic: allow argument to be passed as vector #128731

Merged
merged 2 commits into from
Aug 27, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions compiler/rustc_codegen_cranelift/src/intrinsics/simd.rs
Original file line number Diff line number Diff line change
Expand Up @@ -191,6 +191,14 @@ pub(super) fn codegen_simd_intrinsic_call<'tcx>(
})
.try_into()
.unwrap(),
_ if idx_ty.is_simd()
&& matches!(
idx_ty.simd_size_and_type(fx.tcx).1.kind(),
ty::Uint(ty::UintTy::U32)
) =>
{
idx_ty.simd_size_and_type(fx.tcx).0.try_into().unwrap()
}
_ => {
fx.tcx.dcx().span_err(
span,
Expand All @@ -213,6 +221,8 @@ pub(super) fn codegen_simd_intrinsic_call<'tcx>(

let total_len = lane_count * 2;

// FIXME: this is a terrible abstraction-breaking hack.
// Find a way to reuse `immediate_const_vector` from `codegen_ssa` instead.
Comment on lines +224 to +225
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅 I guess this is preexisting...

let indexes = {
use rustc_middle::mir::interpret::*;
let idx_const = match &idx.node {
Expand Down
44 changes: 30 additions & 14 deletions compiler/rustc_codegen_gcc/src/builder.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1923,15 +1923,11 @@ impl<'a, 'gcc, 'tcx> Builder<'a, 'gcc, 'tcx> {
v2: RValue<'gcc>,
mask: RValue<'gcc>,
) -> RValue<'gcc> {
let struct_type = mask.get_type().is_struct().expect("mask should be of struct type");

// TODO(antoyo): use a recursive unqualified() here.
let vector_type = v1.get_type().unqualified().dyncast_vector().expect("vector type");
let element_type = vector_type.get_element_type();
let vec_num_units = vector_type.get_num_units();

let mask_num_units = struct_type.get_field_count();
let mut vector_elements = vec![];
let mask_element_type = if element_type.is_integral() {
element_type
} else {
Expand All @@ -1942,19 +1938,39 @@ impl<'a, 'gcc, 'tcx> Builder<'a, 'gcc, 'tcx> {
#[cfg(not(feature = "master"))]
self.int_type
};
for i in 0..mask_num_units {
let field = struct_type.get_field(i as i32);
vector_elements.push(self.context.new_cast(
self.location,
mask.access_field(self.location, field).to_rvalue(),
mask_element_type,
));
}

let mut mask_elements = if let Some(vector_type) = mask.get_type().dyncast_vector() {
let mask_num_units = vector_type.get_num_units();
let mut mask_elements = vec![];
for i in 0..mask_num_units {
let index = self.context.new_rvalue_from_long(self.cx.type_u32(), i as _);
mask_elements.push(self.context.new_cast(
self.location,
self.extract_element(mask, index).to_rvalue(),
mask_element_type,
));
}
mask_elements
} else {
let struct_type = mask.get_type().is_struct().expect("mask should be of struct type");
let mask_num_units = struct_type.get_field_count();
let mut mask_elements = vec![];
for i in 0..mask_num_units {
let field = struct_type.get_field(i as i32);
mask_elements.push(self.context.new_cast(
self.location,
mask.access_field(self.location, field).to_rvalue(),
mask_element_type,
));
}
mask_elements
};
let mask_num_units = mask_elements.len();

// NOTE: the mask needs to be the same length as the input vectors, so add the missing
// elements in the mask if needed.
for _ in mask_num_units..vec_num_units {
vector_elements.push(self.context.new_rvalue_zero(mask_element_type));
mask_elements.push(self.context.new_rvalue_zero(mask_element_type));
}

let result_type = self.context.new_vector_type(element_type, mask_num_units as u64);
Expand Down Expand Up @@ -1998,7 +2014,7 @@ impl<'a, 'gcc, 'tcx> Builder<'a, 'gcc, 'tcx> {

let new_mask_num_units = std::cmp::max(mask_num_units, vec_num_units);
let mask_type = self.context.new_vector_type(mask_element_type, new_mask_num_units as u64);
let mask = self.context.new_rvalue_from_vector(self.location, mask_type, &vector_elements);
let mask = self.context.new_rvalue_from_vector(self.location, mask_type, &mask_elements);
let result = self.context.new_rvalue_vector_perm(self.location, v1, v2, mask);

if vec_num_units != mask_num_units {
Expand Down
19 changes: 12 additions & 7 deletions compiler/rustc_codegen_gcc/src/intrinsic/simd.rs
Original file line number Diff line number Diff line change
Expand Up @@ -353,19 +353,24 @@ pub fn generic_simd_intrinsic<'a, 'gcc, 'tcx>(
}

if name == sym::simd_shuffle {
// Make sure this is actually an array, since typeck only checks the length-suffixed
// Make sure this is actually an array or SIMD vector, since typeck only checks the length-suffixed
// version of this intrinsic.
let n: u64 = match *args[2].layout.ty.kind() {
let idx_ty = args[2].layout.ty;
let n: u64 = match idx_ty.kind() {
ty::Array(ty, len) if matches!(*ty.kind(), ty::Uint(ty::UintTy::U32)) => {
len.try_eval_target_usize(bx.cx.tcx, ty::ParamEnv::reveal_all()).unwrap_or_else(
|| span_bug!(span, "could not evaluate shuffle index array length"),
)
}
_ => return_error!(InvalidMonomorphization::SimdShuffle {
span,
name,
ty: args[2].layout.ty
}),
_ if idx_ty.is_simd()
&& matches!(
idx_ty.simd_size_and_type(bx.cx.tcx).1.kind(),
ty::Uint(ty::UintTy::U32)
) =>
{
idx_ty.simd_size_and_type(bx.cx.tcx).0
}
_ => return_error!(InvalidMonomorphization::SimdShuffle { span, name, ty: idx_ty }),
};
require_simd!(ret_ty, InvalidMonomorphization::SimdReturn { span, name, ty: ret_ty });

Expand Down
19 changes: 12 additions & 7 deletions compiler/rustc_codegen_llvm/src/intrinsic.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1279,19 +1279,24 @@ fn generic_simd_intrinsic<'ll, 'tcx>(
}

if name == sym::simd_shuffle {
// Make sure this is actually an array, since typeck only checks the length-suffixed
// Make sure this is actually an array or SIMD vector, since typeck only checks the length-suffixed
// version of this intrinsic.
let n: u64 = match args[2].layout.ty.kind() {
let idx_ty = args[2].layout.ty;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"index type"?
isn't it more a "type at this index", so "this type's index"? Doesn't matter much, just curious.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the type of the argument named idx in the signature.

let n: u64 = match idx_ty.kind() {
ty::Array(ty, len) if matches!(ty.kind(), ty::Uint(ty::UintTy::U32)) => {
len.try_eval_target_usize(bx.cx.tcx, ty::ParamEnv::reveal_all()).unwrap_or_else(
|| span_bug!(span, "could not evaluate shuffle index array length"),
)
}
_ => return_error!(InvalidMonomorphization::SimdShuffle {
span,
name,
ty: args[2].layout.ty
}),
_ if idx_ty.is_simd()
&& matches!(
idx_ty.simd_size_and_type(bx.cx.tcx).1.kind(),
ty::Uint(ty::UintTy::U32)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we actually need to enforce signedness? Though the 32-bitness is certainly fixed...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We enforce it in the array version so I figured we should keep enforcing it in the vector version.

) =>
{
idx_ty.simd_size_and_type(bx.cx.tcx).0
}
_ => return_error!(InvalidMonomorphization::SimdShuffle { span, name, ty: idx_ty }),
};

let (out_len, out_ty) = require_simd!(ret_ty, SimdReturn);
Expand Down
2 changes: 1 addition & 1 deletion library/core/src/intrinsics/simd.rs
Original file line number Diff line number Diff line change
Expand Up @@ -232,7 +232,7 @@ extern "rust-intrinsic" {
///
/// `T` must be a vector.
///
/// `U` must be a **const** array of `i32`s. This means it must either refer to a named
/// `U` must be a **const** array or vector of `u32`s. This means it must either refer to a named
/// const or be given as an inline const expression (`const { ... }`).
///
/// `V` must be a vector with the same element type as `T` and the same length as `U`.
Expand Down
27 changes: 25 additions & 2 deletions tests/ui/simd/shuffle.rs
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,20 @@
#![allow(incomplete_features)]
#![feature(adt_const_params)]

use std::marker::ConstParamTy;

extern "rust-intrinsic" {
fn simd_shuffle<T, I, U>(a: T, b: T, i: I) -> U;
}
Comment on lines 11 to 13
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, can't we just, like, import this from core::intrinsics now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this test is older than that.


#[derive(Copy, Clone)]
#[derive(Copy, Clone, ConstParamTy, PartialEq, Eq)]
#[repr(simd)]
struct Simd<T, const N: usize>([T; N]);

pub unsafe fn __shuffle_vector16<const IDX: [u32; 16], T, U>(x: T, y: T) -> U {
unsafe fn __shuffle_vector16<const IDX: [u32; 16], T, U>(x: T, y: T) -> U {
simd_shuffle(x, y, IDX)
}
unsafe fn __shuffle_vector16_v2<const IDX: Simd<u32, 16>, T, U>(x: T, y: T) -> U {
simd_shuffle(x, y, IDX)
}

Expand All @@ -30,6 +35,17 @@ fn main() {
let y: Simd<u8, 2> = simd_shuffle(a, b, I2);
assert_eq!(y.0, [1, 5]);
}
// Test that we can also use a SIMD vector instead of a normal array for the shuffle.
const I1_SIMD: Simd<u32, 4> = Simd([0, 2, 4, 6]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually we avoid direct construction of Simd like this, but I suppose the fact that it occasionally emits illegal LLVMIR doesn't matter since this is const-evaluated. I suppose the other accesses and constructions here are preexisting, too, probably preceded that bug report, so not worth blocking this over since we should really try to scan the entire repo for them...

Copy link
Member Author

@RalfJung RalfJung Aug 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a soundness bug where directly constructing SIMD values can miscompile? We shouldn't work around that in code, we should get codegen fixed...

Is this tracked anywhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scottmcm Can you refresh my memory on what the issue was here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Direct construction, as far as I know, is fine™. Both rust-lang/stdarch#1624 and (soon) #129403 have a bunch of it.

We might ban projecting into a simd type at some point in the future, because as I recall it's awkward for codegen, which doesn't want the lanes to be places, because it wants to extract/insert them instead of addressing them.

But that's not happened at this point, and is already in enough places that a few more is no big deal. And I think it might not even apply to Rvalue::Aggregate like seen here anyway, only to field projections.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(And, come to think of it, it might not even be a problem for array-based simd, because using the array as a place forces it to memory anyway. It makes me wonder if the problem was more about just .1 on a lots-of-fields simd type.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might ban projecting into a simd type at some point in the future, because as I recall it's awkward for codegen, which doesn't want the lanes to be places, because it wants to extract/insert them instead of addressing them.

Makes sense. However, if "awkward" leads to "generate wrong code" we should track that as an I-unsound issue somewhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know any current "wrong code" cases, just recall that cg_clif had to do a bunch of extra work to make it work.

const I2_SIMD: Simd<u32, 2> = Simd([1, 5]);
unsafe {
let x: Simd<u8, 4> = simd_shuffle(a, b, I1_SIMD);
assert_eq!(x.0, [0, 2, 4, 6]);

let y: Simd<u8, 2> = simd_shuffle(a, b, I2_SIMD);
assert_eq!(y.0, [1, 5]);
}

// Test that an indirection (via an unnamed constant)
// through a const generic parameter also works.
// See https://github.com/rust-lang/rust/issues/113500 for details.
Expand All @@ -42,4 +58,11 @@ fn main() {
Simd<u8, 16>,
>(a, b);
}
unsafe {
__shuffle_vector16_v2::<
{ Simd([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]) },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thought as above.

Simd<u8, 16>,
Simd<u8, 16>,
>(a, b);
}
}
Loading