RFC: Add a scalable representation to allow support for scalable vectors #3268
I think a more general definition of an "opaque" type would be useful. This is a type which can exist in a register but not in memory, specifically:
Other than ARM and RISC-V scalable vectors, this would also be useful to represent reference types in WebAssembly. These are opaque references to objects which can only be used as local variables or function arguments and can't be written to WebAssembly memory. |
ARM SVE uses
|
I noticed that seeing the vector length pseudoregister at runtime was considered undefined behavior. For RISC-V, rather than masking out elements that aren't used, it seems to primarily focus on setting the VL register, which is an actual register that needs to be modified when switching between different vector types. It also lets you change the actual "register size" by grouping together multiple physical registers, which is used either to save instructions or to facilitate type conversions (i.e. casting from a u16 vector to a u32 vector puts the result across 2 contiguous vector registers, which can then be used as though they're one register). |
@boomshroom "That vscale is constant -- that the number of elements in a scalable vector does not change during program execution -- is baked into the accepted scalable vector type proposal from top to bottom and in fact was one of the conditions for its acceptance" - https://lists.llvm.org/pipermail/llvm-dev/2019-October/135560.html It might just be a case of changing the wording so that it's more clear that causing @Amanieu Just to be clear though, are you asking me to transform this into a more general RFC for opaque types, or just mention them? |
ARM offers ACLEs, which can read the vscale. I have an array of floats, then I read them with ACLE SVE. Do SVE types ever exist in memory or only in registers? |
I don't think this needs to be a general RFC on opaque types, but more details on how scalable vectors differ from normal types would be nice to have. |
There are SVE registers. The calling convention can probably pass scalable vectors on the stack. Then it will be vscale * 1 bytes. It has to be a fixed size. |
If you have too much time, you can actually play with a SVE box: |
One selling point of SVE is: if you use ARM ACLE SVE intrinsics and you follow the rules, then your program will run on 256-bit and 2048-bit hardware. ARM SVE are plain Cray vectors. I believe the RISC-V scalable vectors are more elaborate. |
I'm honestly a bit confused by this RFC. I understand the benefits of SVE and what it is, but I'm not 100% sure what it's asking. Specifically, it seems like it's suggesting stabilising Like, I'm sold on the idea of having scalable vectors in stdlib, but unsure about both what the RFC is proposing, and the potential implementation. |
> wc -l arm_sve.h
24043 arm_sve.h |
@Amanieu Mostly agree with #3268 (comment), just had a couple notes:
|
@tschuett This is an RFC, not IRC. Please only leave productive comments that advance the state of the conversation instead of non-contributing allusions that have no clear meaning. I can't even tell if your remark is critical or supportive. |
Sorry for my misbehaviour. I am supportive of adding scalable vectors to Rust. Because of type inference you cannot see that the |
The real question is whether you want to make scalable vectors target-dependent (SVE, RISC-V). |
Imho scalable vectors should be target independent, the compiler backend will simply pick a suitable constant for vscale at compile time if not otherwise supported. |
Note that vscale is a LLVM thing and should not be part of the RFC. LLVM assumes the vscale is an unknown but constant value during the execution of the program. The real value is hardware dependent. |
I think it should not be dismissed just because it's a LLVM thing: every other compiler will have a similar constant simply because they need to represent scalable vectors as some multiple of an element count, that multiple is vscale. Also, there should be variants for vectors like llvm's https://reviews.llvm.org/D53695
|
Do you want to expose this in Rust or should it be an implementation detail of the compiler? |
imho @rust-lang/project-portable-simd should expose scalable vector types with vscale, an additional multiplier, and an element type -- perhaps by exposing a wrapper struct that also contains the number of valid elements (like |
One important thing that imho this RFC needs to be usable by portable-simd is for the element type and the multiplier to be able to be generics:
#[repr(simd, scalable(MUL))]
struct ScalableVector<T, const MUL: usize>([T; 0]);
portable-simd's exposed wrapper type might be:
pub struct ScalableSimd<T, const MUL: usize>
where
    T: ElementType,
    ScalableMul<MUL>: SupportedScalableMul,
{
    len: u32, // exposed as usize, but realistically u32 is big enough
    value: ScalableVector<T, MUL>,
} |
How about this notation (without the 4):
#[repr(simd, scalable)]
#[derive(Clone, Copy)]
pub struct svfloat32_t {
    _ty: [f32; 0],
}
It is a target-independent scalable vector of |
@tschuett My intention was that the feature proposed by this RFC would be target independent, and the rustc implementation would be target independent. |
Honestly my RISC-V knowledge is limited. If you say that I agree with your vscale vector examples. Maybe you can query LLVM for information about targets. |
For reference, IBM is also working on a scalable vector ISA: |
Yes, that's the intended meaning. Feel free to suggest better wording in that thread.
According to the RISC-V vector spec:
At the LLVM level, it's just treated as |
Does that really work? Rust normally considers uninitialised values (outside
It's not free; most machine instructions support either zeroing or merging predication, so an extra instruction might be required to implement whichever is not inherently supported. The |
So LLVM has specific intrinsics for Risc-V? Or which LLVM operations are we talking about here? |
One random example: They are namespaced with llvm.riscv. |
LLVM tracks |
Okay, so these are platform-specific intrinsics that have semantics on the LLVM IR level, makes sense. These signatures are hard to read and the function names look like gibberish to the untrained eye. What can I expect these operations to be like, when we express them in Rust? Something like this?
/// Returns the following function applied pointwise:
/// fn add_masked(x: MaybeUninit<T>, y: MaybeUninit<T>, mask: bool) -> MaybeUninit<T> {
///     if mask {
///         MaybeUninit::new(x.assume_init() + y.assume_init())
///     } else {
///         MaybeUninit::uninit()
///     }
/// }
fn simd_add_masked<T, N>(x: Simd<T, N>, y: Simd<T, N>, mask: Mask<T, N>) -> Simd<T, N>
IOW, if any of the input elements inside the mask are uninit, we make it immediate UB? (I'm aware that in LLVM this will be delayed UB via poison/undef, but that is something we avoided in Rust semantics so far. Also with some of the plans LLVM has for the near future, it would probably be a really bad idea to have poison values in Rust. And undef is going away in LLVM.)
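To make the question above concrete, here is a minimal safe model of those semantics. It is only a sketch under stated assumptions: `Option<i32>` stands in for `MaybeUninit<T>`, a fixed-size array stands in for the vector, and a panic stands in for immediate UB. The name `add_masked_model` is hypothetical and is not a proposed intrinsic.

```rust
/// Safe model of a masked lane-wise add. `None` models an uninitialised lane.
/// Panics (standing in for immediate UB) if an active lane reads an
/// uninitialised input.
pub fn add_masked_model<const N: usize>(
    x: [Option<i32>; N],
    y: [Option<i32>; N],
    mask: [bool; N],
) -> [Option<i32>; N] {
    let mut out = [None; N];
    for i in 0..N {
        if mask[i] {
            let a = x[i].expect("active lane of x is uninitialised");
            let b = y[i].expect("active lane of y is uninitialised");
            out[i] = Some(a.wrapping_add(b));
        }
        // Inactive lanes stay `None`, i.e. uninitialised in the result.
    }
    out
}
```

In this model, an uninitialised input is harmless as long as its lane is masked off, and the corresponding output lane is itself uninitialised.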
|
* These types can be loaded and stored to/from memory for spilling to the stack,
  and to follow any calling conventions.
* Can't be stored in a struct, enum, union or compound type.
  * This includes single field structs with `#[repr(transparent)]`.
So what should happen when I do
#[repr(transparent)]
struct Wrap<T>(T);
type MyTy = Wrap<svfloat32_t>;
Are scalable SIMD types not allowed to instantiate generic parameters? Are there new post-monomorphization errors for when a generic instantiation turns out to break rules like this?
|
This new class of type has the following properties:
* Not `Sized`, but it does exist as a value type.
* These can be returned from functions.
This seems to indicate that we need support for "unsized (r)values" to use this feature. Unfortunately the current state of unsized values is "they are a complete mess, and don't even have a consistent MIR-level semantics".
We don't currently have support for returning unsized from functions in Rust. I would like for this RFC to better detail the impact that this will have on implementing scalable vectors in Rust. I hope I can provide some helpful information below.
If we look at this example, we can see that C/C++ can handle:
- Scalable types as function params
- Scalable types as local variables
- Scalable types as return values
How does C/C++ do it?
These types in C/C++ are both sizeless and scalable sized. It seems that they invoke either of these properties where it is convenient. For example if you try to take `sizeof` on a scalable type:
<source>:8:5: error: invalid application of 'sizeof' to sizeless type 'vint32m8_t' (aka '__rvv_int32m8_t')
8 | sizeof(vint32m8_t);
Another example of the scalable type being sizeless is in ASTContext::getTypeInfoImpl:
// Because the length is only known at runtime, we use a dummy value
// of 0 for the static length.
#define SVE_VECTOR_TYPE(Name, MangledName, Id, SingletonId, NumEls, ElBits, \
IsSigned, IsFP, IsBF) \
case BuiltinType::Id: \
Width = 0;
But on the other hand, clang also treats these types as having a scalable size which can be resolved at runtime. There is a function `getBuiltinVectorTypeInfo`. In this function you can see how a `BuiltinVectorTypeInfo` object gets created using `ElementCount::getScalable`:
#define SVE_ELTTY(ELTTY, ELTS, NUMVECTORS) \
{ELTTY, llvm::ElementCount::getScalable(ELTS), NUMVECTORS};
// ... snip
#define RVV_VECTOR_TYPE_INT(Name, Id, SingletonId, NumEls, ElBits, NF, \
IsSigned) \
case BuiltinType::Id: \
return {getIntTypeForBitwidth(ElBits, IsSigned), \
llvm::ElementCount::getScalable(NumEls), NF};
Then in SemaChecking.cpp, there are function calls such as `areCompatibleSveTypes`, `checkRVVTypeSupport`, and `CheckImplicitConversion`, which type-check by treating these types as having a scalable size.
When it comes to code-gen to LLVM IR, Rust `unsized` types have been tricky because it can be difficult to lower unsized types, especially when it comes to return types. But that isn't the case with scalable types. Rust scalable types can be mapped to LLVM scalable types. I think this may allow us to sidestep a lot of the complications that come with supporting general `unsized` types in Rust. Using the godbolt example above we see that the C scalable/sizeless types are lowered as LLVM scalable types:
%7 = load i64, ptr %4, align 8
%8 = call <vscale x 16 x i32> @foo(__rvv_int32m8_t, unsigned long)(<vscale x 16 x i32> %6, i64 noundef %7)
store <vscale x 16 x i32> %8, ptr %5, align 4
Relying on Builtins
One important point I want to make here is that C/C++ is limiting scalable/sizeless types to builtins. For example, you can't define your own scalable type. In addition, you can't define data structures using existing builtin scalable types:
// This is an error
struct foo {
vint32m8_t b;
vint32m8_t a;
};
As a result, the scope of handling these types is greatly reduced. As I pointed out above, functions like `areCompatibleSveTypes` and `checkRVVTypeSupport` know how to type check specifically on these scalable types. There is explicit lowering of intrinsics that operate on these types. I believe that by restricting support to only handling unsized scalable builtins, we may not have to concern ourselves with what a mess general unsized types are in Rust.
What does this mean for Rust
I hope that this RFC can clarify what it will look like to add support for scalable vectors, in the context of `unsized` in Rust. Some questions I would like to clarify:
- Will we support unsized fn params, unsized local variables, and unsized return values in general, or will we limit the scope to scalable types? I am leaning towards the latter, especially because supporting unsized return values might be a massive undertaking, if it is possible at all. I think if you choose the former, then we should have an RFC on adding that feature to the language. I've started inquiring about that topic on this Zulip thread in attempt to understand if any work had been done yet.
- Will scalable types be builtin or can people define their own scalable types in their own Rust programs? If we choose the builtin path, I would like this RFC to discuss adding builtins under Prior Art.
- If we sometimes treat these types as `unsized` and sometimes treat them as having a scalable size, what features will we need to include? Would we require something like `#![feature(unsized_fn_params, unsized_locals, unsized_ret_vals)]`, `#![feature(scalable_types)]`, or both?
* Heap allocation of these types is not possible.
* Can be passed by value, reference and pointer.
* The types can't have a `'static` lifetime.
* These types can be loaded and stored to/from memory for spilling to the stack,
  and to follow any calling conventions.
* Can't be stored in a struct, enum, union or compound type.
This is a wild list of restrictions, and the RFC does not explain why they are needed. Further down it seems like really these types are just "slices where the length is determined by a run-time constant". Slices don't have most of these restrictions, so why do scalable SIMD types need them?
* These can be returned from functions.
* Heap allocation of these types is not possible.
* Can be passed by value, reference and pointer.
* The types can't have a `'static` lifetime.
Wait, so `svfloat32_t: 'static` is not true? But there's no lifetime in this type so this statement must be true. What is this about?
That might be poorly phrased by me; I was referring to the fact that these can't exist as a static variable. I can update the RFC to make that clearer.
@Amanieu This sounds potentially useful except that it is contradicted by the language in this RFC. I don't understand how a type which cannot be in memory can be put in memory for the platform calling convention and can also be passed by pointer. Can you explain what's going on here? |
Honestly, all the arbitrary restrictions (inherited from C) sound like ARM didn't want to bother to implement dynamically-sized types that are usable anywhere a usual type is, so they came up with some restrictions so they didn't have to, except that they arbitrarily chose where they were willing to put in the work and where they decided they didn't want to. I think Rust should be more consistent about where it supports types. |
… r=Amanieu Stabilize Ratified RISC-V Target Features Stabilization PR for the ratified RISC-V target features. This stabilizes some of the target features tracked by #44839. This is also a part of #114544 and eventually needed for the RISC-V part of rust-lang/rfcs#3268. There is a similar PR for the the stdarch crate which can be found at rust-lang/stdarch#1476. This was briefly discussed on Zulip (https://rust-lang.zulipchat.com/#narrow/stream/250483-t-compiler.2Frisc-v/topic/Stabilization.20of.20RISC-V.20Target.20Features/near/394793704). Specifically, this PR stabilizes the: * Atomic Instructions (A) on v2.0 * Compressed Instructions (C) on v2.0 * ~Double-Precision Floating-Point (D) on v2.2~ * ~Embedded Base (E) (Given as `RV32E` / `RV64E`) on v2.0~ * ~Single-Precision Floating-Point (F) on v2.2~ * Integer Multiplication and Division (M) on v2.0 * ~Vector Operations (V) on v1.0~ * Bit Manipulations (B) on v1.0 listed as `zba`, `zbc`, `zbs` * Scalar Cryptography (Zk) v1.0.1 listed as `zk`, `zkn`, `zknd`, `zkne`, `zknh`, `zkr`, `zks`, `zksed`, `zksh`, `zkt`, `zbkb`, `zbkc` `zkbx` * ~Double-Precision Floating-Point in Integer Register (Zdinx) on v1.0~ * ~Half-Precision Floating-Point (Zfh) on v1.0~ * ~Minimal Half-Precision Floating-Point (Zfhmin) on v1.0~ * ~Single-Precision Floating-Point in Integer Register (Zfinx) on v1.0~ * ~Half-Precision Floating-Point in Integer Register (Zhinx) on v1.0~ * ~Minimal Half-Precision Floating-Point in Integer Register (Zhinxmin) on v1.0~ r? `@Amanieu`
On RISC-V, |
Existing SIMD types are tagged with a `repr(simd)` and contain an array or multiple fields to represent the size of the
vector. Scalable vectors have a size known (and constant) at run-time, but unknown at compile time. For this we propose a
new kind of exotic type, denoted by an additional `repr()`, and based on a ZST. This additional representation, `scalable`,
accepts an integer to determine the number of elements per granule. See the definitions in
In LLVM, a scalable type is represented as an `(ElementCount NumElts, Type EltTy)`. An `ElementCount` is represented by `(IsScalable, MinNumElts)`. Maybe it would be good if we called it the minimum number of elements instead of granule?
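As a rough illustration of the `(IsScalable, MinNumElts)` pair described above, here is a toy Rust model. It is a sketch only: the type, field, and method names are illustrative and are not LLVM's C++ API.

```rust
/// Toy model of an element count that is either fixed or scalable.
/// Names are illustrative, not LLVM's API.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ElementCount {
    /// True when the real count is `min_elts * vscale`.
    pub scalable: bool,
    /// Element count when `vscale == 1` (or the exact count if fixed).
    pub min_elts: u32,
}

impl ElementCount {
    pub const fn fixed(n: u32) -> Self {
        Self { scalable: false, min_elts: n }
    }

    pub const fn scalable(n: u32) -> Self {
        Self { scalable: true, min_elts: n }
    }

    /// Number of elements at run time for a given `vscale` (must be >= 1).
    pub fn runtime_elts(self, vscale: u32) -> u32 {
        assert!(vscale >= 1, "vscale is a positive integer");
        if self.scalable {
            self.min_elts * vscale
        } else {
            self.min_elts
        }
    }
}
```

With this model, `ElementCount::scalable(4)` describes a vector holding 4 elements when `vscale` is 1 and 16 elements when `vscale` is 4, while a fixed count ignores `vscale` entirely.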
|

```rust
#[repr(simd, scalable(4))]
pub struct svfloat32_t {
```
I'm a bit confused about where `scalable(4)` comes into play here. I was looking at the `svfloat32_t` type in C, which is really backed by the builtin type `__SVFloat32_t`, and I couldn't find how that type was tied to a minimum element count of 4.
Am I missing where C SVE intrinsics tie `svfloat32_t` to a minimum number of elements? Or is this something that you are proposing Rust does that is missing in C?
This seems to be related to the fact that the LLVM representation of the type is `<vscale x 4 x f32>`, which means that we assume the hardware scales in units of 128 bits (that fit 4 f32). On hardware with a different scaling unit, this will be suboptimal -- or maybe even not work, if the scaling unit is smaller than 128 bits. IOW, this type is pretty non-portable.
That's my understanding based on reading the LLVM LangRef; maybe I got it all wrong. Unfortunately the RFC doesn't explain enough to be able to say -- it assumes a bunch of background on how these scalable vector types work in LLVM / hardware.
This new class of type has the following properties:
* Not `Sized`, but it does exist as a value type.
* These can be returned from functions.
* Heap allocation of these types is not possible.
In C, heap allocation depends on `malloc`, which takes a size. You can't call `sizeof` on an unsized type in C. So it is a compiler error to write `malloc(sizeof(vint8mf8_t))`. In this sense, unsized types may seem non-heap-allocatable.
However, I took a look at the RISC-V "V" C intrinsics trying to understand whether this had to be the case. On RISC-V a vector register has a size, even if it is unknown at compile time (due to the `vscale`). However, the `__riscv_vlenb` C intrinsic could be used to write programs that determine the size of the vector register associated with a type at runtime. As a result, it should be possible to do something like this. Using pseudo-code:
// __riscv_vlenb() returns the vector register length in bytes; LLVM's RISC-V vscale is VLEN (in bits) / 64
vscale = __riscv_vlenb() * 8 / 64;
// helper func that returns the minimum vector size in bits (i.e. size without vscale or multiplied by a vscale of 1)
min_vec_size = get_min_size(vint8mf8_t);
vint8mf8_t *heap_allocated_scalable = malloc(to_bytes_from_bits(vscale * min_vec_size));
So while it may be a little convoluted (and target dependent) to allocate these types on the heap, I think it is possible. Maybe it would be better to drop this as a requirement but note that initially there will not be support for allocating these types on the heap.
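The size arithmetic in that pseudo-code can be sketched in Rust as follows. This is a hedged illustration, not a proposed API: the function names are hypothetical, the register length is passed in as a plain number rather than read from hardware, and the buffer is an ordinary byte `Vec` rather than a real scalable vector value.

```rust
/// Bytes needed for one value of a scalable vector type at a given run-time
/// `vscale`. `min_elts` is the element count at `vscale == 1` and `elt_bits`
/// is the width of one element.
pub fn scalable_size_bytes(vscale: usize, min_elts: usize, elt_bits: usize) -> usize {
    assert!(vscale >= 1, "vscale is a positive integer");
    // Round up so that a sub-byte total still reserves a whole byte.
    (vscale * min_elts * elt_bits + 7) / 8
}

/// Hypothetical helper: derive `vscale` from a register length in bytes,
/// assuming each `vscale` step corresponds to `unit_bits` of register.
pub fn vscale_from_vlenb(vlenb: usize, unit_bits: usize) -> usize {
    (vlenb * 8) / unit_bits
}

/// Allocate a zeroed heap buffer large enough for one such vector.
pub fn alloc_scalable_buffer(vlenb: usize, unit_bits: usize, min_elts: usize, elt_bits: usize) -> Vec<u8> {
    let vscale = vscale_from_vlenb(vlenb, unit_bits);
    vec![0u8; scalable_size_bytes(vscale, min_elts, elt_bits)]
}
```

For example, with a 512-bit register (`vlenb` of 64) and an assumed 64-bit unit, `vscale` is 8, so a type with one 8-bit element per unit needs 8 bytes. The point is only that the size is computable at run time once `vscale` is known.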
Would it make sense to consider the alternative of not exposing these scalable vector types in Rust at all, and instead have them entirely handled by codegen? In other words, when I have a large but statically sized vector |
`vscale` could be 1, 2, 4, 8, 16 which would give register sizes of 128, 256,
512, 1024 and 2048. While SVE now has the power of 2 restriction, `vscale` could
be any value providing it gives a legal vector register size for the
architecture.
This sounds like a pretty bad API for portable efficient vector programming. I thought the point was to not have to know the vector size supported by the hardware, so I could e.g. use `<vscale x i32>` to get a vector of `i32` that's the ideal size for this hardware. But now it seems like I still have to know the hardware I am writing for so that I can use `<vscale x 4 x i32>` on ARM while using e.g. `<vscale x 8 x i32>` on some target where `vscale` measures multiples of 256 bits.
Ideally for Rust we should have a version of this that does not require me to know the hardware's "vector scaling unit" (i.e. the size that corresponds to an LLVM vscale of 1).
> so I could e.g. use `<vscale x i32>` to get a vector of `i32`
A scalable type has a minimum size component. For example `<vscale x 4 x i32>`.
> But now it seems like I still have to know the hardware I am writing for
I'm not sure that's true in all instances. In LLVM, vector types go through type legalization in SelectionDAG or GlobalISel, which are components responsible for translating IR into target specific instructions. In cases where SelectionDAG or GlobalISel see a vector type that is not supported, the legalizer will try to put it into a form that the hardware can support. One example of this is on RISC-V where all fixed vectors are legalized into scalable vectors.
> Ideally for Rust we should have a version of this that does not require me to know the hardware's "vector scaling unit" (i.e. the size that corresponds to an LLVM vscale of 1).
As LLVM scalable types exist today, we don't know what vscale is until runtime. So you are not required to know the hardware's scaling unit at compile time.
> (i.e. the size that corresponds to an LLVM vscale of 1).
This sounds like a suggestion to use fixed sized vectors instead of scalable vectors in cases where you really need it.
Everything I am saying is based on the LangRef: "For scalable vectors, the total number of elements is a constant multiple (called vscale) of the specified number of elements; vscale is a positive integer that is unknown at compile time and the same hardware-dependent constant for all scalable vectors at run time. The size of a specific scalable vector type is thus constant within IR, even if the exact size in bytes cannot be determined until run time."
IOW, this is not a minimum size. `<vscale x 4 x i32>` means "some constant times `4 x i32`". And if you also have a `<vscale x 2 x i32>` then that's "the same constant times `2 x i32`". So, `<vscale x 4 x i32>` will always be exactly twice as large as `<vscale x 2 x i32>`. If the ARM chip has vectors of size 512bit, then vscale=4 and `<vscale x 2 x i32>` will be only 256bit in size, so half the vector width was wasted. One therefore has to carefully pick the unit that is being scaled to match the hardware.
> As LLVM scalable types exist today, we don't know what vscale is until runtime. So you are not required to know the hardware's scaling unit at compile time.
I was talking about the scalable vector unit, not the scalable vector factor. (I am making up terms here as LangRef doesn't give me good terms to work with.) On ARM, the "unit" is 128bit large. The factor then determines the actual size of the vector registers, in units of 128bit. So a factor of 4 means the registers are 512 bit large. With the interface provided by LLVM, one has to know the unit (not the factor!) at compile time to generate optimal code.
Or maybe I got it all wrong. But the LangRef description is not compatible with your claim that the `4` in `vscale x 4 x i32` is a minimum.
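The arithmetic behind the "half the vector width was wasted" example can be checked with a small sketch. The function names are hypothetical, and the 128-bit hardware unit is the figure quoted for ARM in this thread.

```rust
/// Total width in bits of `<vscale x n x iB>` for a given run-time `vscale`,
/// where `elt_bits` is B.
pub fn vector_bits(vscale: u32, n: u32, elt_bits: u32) -> u32 {
    vscale * n * elt_bits
}

/// Percentage of a hardware register occupied by one such vector, assuming
/// the hardware defines `vscale` as register width divided by `hw_unit_bits`.
pub fn register_use_percent(reg_bits: u32, hw_unit_bits: u32, n: u32, elt_bits: u32) -> u32 {
    let vscale = reg_bits / hw_unit_bits;
    100 * vector_bits(vscale, n, elt_bits) / reg_bits
}
```

On 512-bit hardware with a 128-bit unit, `vscale` is 4: `<vscale x 4 x i32>` is 512 bits and fills the register, while `<vscale x 2 x i32>` is 256 bits and fills only half of it.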
> I was talking about the scalable vector unit
Do you mind giving a definition of what a unit is? Is that the fixed component of the vector type? For `<vscale x 4 x i32>` the unit is `4 x i32`?
> ... With the interface provided by LLVM, one has to know the unit (not the factor!) at compile time to generate optimal code.
I'm not so sure about ARM, but I know that RISC-V can generate code for all different "units" regardless of the runtime `vscale` value. You can pick whatever "unit" you'd like to use.
> But the LangRef description is not compatible with your claim that the 4 in `<vscale x 4 x i32>` is a minimum.
It is a minimum because the smallest runtime value of `vscale` is 1.
> It is a minimum because the smallest runtime value of vscale is 1.
If describing it as a minimum is a sufficient description, then `<vscale x 2 x i32>` and `<vscale x 4 x i32>` should both be vectors of size 128bit (if the platform has registers of that size), right? I am asking for "at least 2 (or 4) i32, but ideally as many as the hardware provides".
But that's not correct, according to LangRef. Ergo, saying it is a minimum is misleading. The type is not defined as "at least that big", it is defined as "the hardware-specific scaling factor times that base size". If you pick the base size too small (smaller than the scaling unit of the hardware), you will waste register space. If you pick it too big, presumably LLVM complains.
> Do you mind giving a definition of what a unit is?
It's how much you get when the factor is 1. I am talking about a hardware property here. ARM defines that if vscale is 1 then the registers are 128bit large, ergo the ARM scalable vector unit is 128bit -- IOW, the size of ARM scalable vectors is measured in multiples of 128bit.
LLVM vscale types also have a unit; as you say, it is the part after `vscale x`. If that unit does not have the same size as the hardware unit then things seem weird.
I think ideally we'd avoid tying this lang feature too closely to any specific implementation of scalable vector types. Even if this feature is primarily meant for internal use, we still need to properly document and specify it as part of the Rust language since it is a language extension. I would also not be surprised if some day people ask for this to be directly exposed; why should only stdarch define such types?
So what I'd hope for is to declare a type like
#[repr(simd, scalable)]
pub struct svfloat32_t {
    _ty: [f32],
}
and then it is the compiler's responsibility to figure out how large the vector should be.
Imagine RISC-V did scalable vectors where the smallest possible vector size is 256bits. So `vscale` says how many times 256bits the vectors are in size -- in contrast to ARM where apparently `vscale` denotes a multiple of 128bits. I'd want to declare a single scalable vector type for both targets, but the RFC as-is does not support that. The `svfloat32_t` type shown above would lower to `<vscale x 4 x f32>` on ARM but to `<vscale x 8 x f32>` on this hypothetical RISC-V version of scalable vectors.
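A compiler-side choice like the one described above could be sketched as follows. Everything here is hypothetical: the `Target` struct, its `unit_bits` field, and both functions are made up for illustration, and the 256-bit target is the imagined one from the comment, not real hardware.

```rust
/// Hypothetical per-target description: bits of register per `vscale` step.
pub struct Target {
    pub name: &'static str,
    pub unit_bits: u32,
}

/// Pick the element count per `vscale` step so that one step exactly fills
/// the target's unit. Returns `None` if the element width does not divide it.
pub fn min_elts_for(target: &Target, elt_bits: u32) -> Option<u32> {
    if elt_bits == 0 || target.unit_bits % elt_bits != 0 {
        return None;
    }
    Some(target.unit_bits / elt_bits)
}

/// Render the lowered type in the notation used in this thread.
pub fn lowered_type_name(target: &Target, elt_name: &str, elt_bits: u32) -> Option<String> {
    min_elts_for(target, elt_bits).map(|n| format!("<vscale x {n} x {elt_name}>"))
}
```

Under these assumptions, a single source-level `f32` scalable type maps to 4 elements per step on a 128-bit-unit target and 8 elements per step on a 256-bit-unit target, without the user naming either number.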
The point of these variable-sizes vector extensions is that code is written agnostic to the size of the register, but chooses a
I think this is quite a reasonable idea, but I think it would be a lot of work from rust’s perspective. |
Yeah that's what I thought. But now I learn that one has to generate LLVM that says |
Just to be clear, I don't believe As such, users won't need to worry about figuring out the correct value of |
It is impossible to evaluate this RFC without understanding what all of this stuff actually means. And I had to go read other documents to figure this out as the RFC doesn't explain this. I came in expecting some sort of portable interface where I can just ask for "a vector of Currently the RFC is written in a way that it can only be understood by people that already know how scalable vectors work in detail, all the way down to hardware. That excludes the majority of the community from the discussion (and likely the majority of the lang team as well). That needs to be fixed. |
# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

This will focus on LLVM. No investigation has been done into the alternative codegen back ends. At the time of
This should focus on Rust, not LLVM. In other words, it should fully describe the behavior of these types without mentioning anything LLVM-specific. This is a Rust language RFC after all, so its effect needs to be described in terms of what happens on the level of Rust.
It is okay to also explain how this maps to LLVM, but you cannot expect the reader to know anything about LLVM -- so the text needs to make sense to someone who knows nothing about LLVM.
`Sized` (or both). Once returning of unsized is allowed this part of the rule
would be superseded by that mechanism. It's worth noting that, if any other
types are created that are `Copy` but not `Sized` this rule would apply to
those.
Remember that Rust has generics, so I can e.g. write a function `fn foo<T: Copy>(x: &T) -> T`. The RFC seems to say this is allowed, because the return type is `Copy`. But for most types `T` and most ABIs this can't be implemented.
You can't just say in a sentence that you allow unsized return values. That's a major language feature that needs significant design work on its own.
I think what you actually want is some extremely special cases where specifically these scalable vector types are allowed as return values, but in a non-compositional way. There is no precedent for anything like this in Rust so it needs to be fairly carefully described and discussed.
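For reference, the generic function mentioned above is accepted by today's compiler only because `Copy` types are always `Sized` (`Copy` requires `Clone`, and `Clone` requires `Sized`). The sketch below shows the status quo, not the RFC's proposal:

```rust
/// Compiles in current Rust because `T: Copy` implies `T: Sized`
/// (`Copy: Clone` and `Clone: Sized`), so the size of the returned value
/// is known at compile time for every instantiation.
pub fn foo<T: Copy>(x: &T) -> T {
    *x
}
```

A type that is `Copy` but not `Sized` would break the assumption this function relies on, which is why the rule quoted from the RFC needs a careful design of its own.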
I wonder if the proposal for "claimable" types with automatic claim can be used to overcome the issue of |
The current plan in the implementation PR (rust-lang/rust#118917) is for scalable vector types to not implement either My understanding is that this RFC is going to be rewritten to match the new implementation plan. |
That sounds potentially quite hacky... but in the end it'll be up to @rust-lang/types to decide whether that is acceptable. An interesting part of this will be properly working out the MIR semantics, ideally by implementing them in the interpreter. |
A proposal to add an additional representation to be used with `simd` to allow for scalable vectors to be used.