Inline assembly #2850
This RFC specifies a new syntax for inline assembly which is suitable for eventual stabilization.

The initial implementation of this feature will focus on the ARM, x86 and RISC-V architectures. Support for more architectures will be added based on user demand.
> The initial implementation of this feature will focus on the ARM, x86 and RISC-V architectures.

A reasonable question to ask of this would be "is there anything in the design of this feature that precludes support for additional architectures, or is the feature sufficiently general that we do not (to the best of our ability) foresee any difficulty supporting additional architectures in a backwards-compatible way?" For example, elsewhere the document discusses how registers are highly architecture-specific; are registers the only place we would expect such different behavior, or are there other potential points of divergence? (I'm also not suggesting that we answer this question in the summary; perhaps in the Future Possibilities section.)
Yes, register definitions are basically the only thing needed to add support for a new architecture. This should be fairly straightforward once the basic infrastructure for inline asm is implemented.
> register definitions are basically the only thing needed to add support for a new architecture. This should be fairly straightforward

Unless it's something alien, like the Intel GPU ISA 😄
In practice the backend must also have inline assembly support for said target. Not all of them do.
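To make the "only the register definitions differ" point concrete, here is a minimal sketch that selects an architecture-specific template with `cfg`. The function name `add_native` and the plain-Rust fallback are illustrative assumptions, and the syntax shown is the one `asm!` was later stabilized with, not necessarily the exact form proposed at the time of this thread.

```rust
#[allow(unused_imports)]
use std::arch::asm;

/// x86-64: `add` is a two-operand instruction, so the destination register
/// doubles as the first source and is declared `inout`.
#[cfg(target_arch = "x86_64")]
pub fn add_native(a: u64, b: u64) -> u64 {
    let mut sum = a;
    unsafe { asm!("add {0}, {1}", inout(reg) sum, in(reg) b) };
    sum
}

/// AArch64 and RISC-V: `add` takes a separate destination register.
#[cfg(any(target_arch = "aarch64", target_arch = "riscv64"))]
pub fn add_native(a: u64, b: u64) -> u64 {
    let sum: u64;
    unsafe { asm!("add {0}, {1}, {2}", out(reg) sum, in(reg) a, in(reg) b) };
    sum
}

/// Any other target: plain Rust fallback (an assumption of this sketch).
#[cfg(not(any(
    target_arch = "x86_64",
    target_arch = "aarch64",
    target_arch = "riscv64"
)))]
pub fn add_native(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}
```

The operand specifications (`in`, `out`, `inout`, `reg`) are the same on every target; only the template string and the set of available registers change.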
We can see that `inout` is used to specify an argument that is both input and output.
This is different from specifying an input and output separately in that it is guaranteed to assign both to the same register.

It is also possible to specify different variables for the input and output parts of an `inout` operand:
> It is also possible to specify different variables for the input and output parts of an `inout` operand

What sort of pattern necessitates the existence of this construction? Under what circumstances would one find themselves reaching for this? Does this need to exist if the same behavior can be achieved by either a `mov` inside the assembly or a `let` outside of it?
Similarly, what happens to the "input" end of the variable? Is it considered "move"d? If it is `Copy`, is it copied before the invocation of `asm!` (the same way as function calls would work)?
One of the main uses for this is to indicate an input register which is clobbered. This is represented as `inout(reg) some_val => _`.
Currently only `Copy` types are supported as asm operands. But otherwise the input part is essentially treated the same way as a function argument.
@bstrie Even if you could do it with a `mov` instruction, telling the compiler where you left the variable's value allows the compiler to just treat that as the new location of the variable, which means the resulting assembly won't need a `mov` at all.
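The two uses of split `inout` operands mentioned in this thread can be sketched as follows. The function names and the non-x86 fallbacks are assumptions made for illustration; the operand syntax is the one `asm!` was later stabilized with.

```rust
#[allow(unused_imports)]
use std::arch::asm;

/// `input => output`: the register is loaded from `input`, and whatever it
/// holds after the asm runs is stored to `output`. No extra `mov` is needed.
#[cfg(target_arch = "x86_64")]
pub fn add_five(input: u64) -> u64 {
    let output: u64;
    unsafe { asm!("add {0}, 5", inout(reg) input => output) };
    output
}

/// `b => _`: the register holding `b` is overwritten by the asm, and the
/// final value is discarded. This marks the input register as clobbered.
#[cfg(target_arch = "x86_64")]
pub fn double_then_add(a: u64, b: u64) -> u64 {
    let sum: u64;
    unsafe {
        asm!(
            "add {b}, {b}",           // destroys the register holding `b`
            "lea {sum}, [{a} + {b}]", // sum = a + 2 * b
            a = in(reg) a,
            b = inout(reg) b => _,
            sum = out(reg) sum,
        );
    }
    sum
}

// Plain-Rust fallbacks so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_five(input: u64) -> u64 {
    input.wrapping_add(5)
}

#[cfg(not(target_arch = "x86_64"))]
pub fn double_then_add(a: u64, b: u64) -> u64 {
    a.wrapping_add(b.wrapping_mul(2))
}
```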
The assembler template uses the same syntax as [format strings][format-syntax] (i.e. placeholders are specified by curly braces). The corresponding arguments are accessed in order, by index, or by name. However, implicit named arguments (introduced by [RFC #2795][rfc-2795]) are not supported.

The assembly code syntax used is that of the GNU assembler (GAS). The only exception is on x86 where the Intel syntax is used instead of GCC's AT&T syntax.
> The only exception is on x86 where the Intel syntax is used instead of GCC's AT&T syntax.

Can we see an example of Intel syntax being used with this macro?
All examples in this RFC use Intel syntax.
In order to aid in mechanical translation of code, could we expose the appropriate syntax flags here, so that people can ask for AT&T syntax on x86, for instance? (That would make it easy for others to implement an `asm_att!` macro, for instance.)
And, of course, the alternatives section should mention the possibility of using AT&T syntax on all platforms and providing a flag for Intel syntax, in which case people could easily implement an `asm_intel!` macro for Intel syntax.
This is tricky to support reliably. In particular, GCC requires that all asm code in a compilation unit use the same syntax, so this would at least exclude the possibility of inline asm support with a GCC backend.
I would prefer if we simply chose a single asm syntax and stuck with it. Also note that this only affects x86 which has 2 asm syntaxes. Every other architecture only has a single standardized asm syntax.
Could it be a crate level setting, since each crate is a single translation unit?
I don't think that a crate-level setting will be enough if you take LTO into account. And even then, a crate-level setting won't work with inline functions.
No need for that. You can switch syntaxes on-the-fly using assembler directives, `.att_syntax` and `.intel_syntax noprefix`: example.
Edit: This is supported by both GNU `as` and LLVM's assembler.
The issue is more with the register placeholders, since GCC needs to know whether to emit `eax` or `%eax`.
Oh... good point. For the record, this only affects a hypothetical GCC backend, not LLVM, which supports `inteldialect` per asm block.
I suppose such a backend could always tell GCC to compile in Intel mode, and then just add the `%`s and `$`s itself.
@Amanieu GCC supports prefixed registers even when in Intel syntax mode, and even with `noprefix`; `noprefix` just makes the prefixes optional. So a hypothetical GCC backend can (and should) always emit prefixes like `%` and `$`, always leave GCC in AT&T mode, and just wrap assembly blocks that use Intel syntax in `.intel_syntax noprefix` and `.att_syntax`.
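For readers unfamiliar with the two x86 dialects, the sketch below writes the same register copy twice. It assumes the `options(att_syntax)` flag that the later-stabilized `asm!` provides; that flag is not part of the text quoted in this thread, and the function names and non-x86 fallbacks are illustrative only.

```rust
#[allow(unused_imports)]
use std::arch::asm;

/// Default (Intel) syntax: destination first, registers unprefixed.
#[cfg(target_arch = "x86_64")]
pub fn copy_intel(src: u64) -> u64 {
    let dst: u64;
    unsafe { asm!("mov {dst}, {src}", dst = out(reg) dst, src = in(reg) src) };
    dst
}

/// AT&T syntax: source first; placeholders expand to `%`-prefixed registers.
#[cfg(target_arch = "x86_64")]
pub fn copy_att(src: u64) -> u64 {
    let dst: u64;
    unsafe {
        asm!(
            "movq {src}, {dst}",
            dst = out(reg) dst,
            src = in(reg) src,
            options(att_syntax),
        );
    }
    dst
}

// Plain-Rust fallbacks so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn copy_intel(src: u64) -> u64 {
    src
}

#[cfg(not(target_arch = "x86_64"))]
pub fn copy_att(src: u64) -> u64 {
    src
}
```

Because the dialect is chosen per `asm!` invocation, the compiler knows how to print each placeholder, which is the register-prefix concern raised above.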
```
dir_spec := "in" / "out" / "lateout" / "inout" / "inlateout"
reg_spec := <arch specific register class> / "<arch specific register name>"
operand_expr := expr / "_" / expr "=>" expr / expr "=>" "_"
```
> `operand_expr := expr / "_" / expr "=>" expr / expr "=>" "_"`

Should the `expr`s in this line be `ident`s, or do we really support arbitrary expressions in all these locations?
We do want these to be full expressions. This allows, for example, using `my_struct.field` as an asm operand.
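As a small sketch of why full expressions are useful, the example below passes struct fields directly as operands. The `Counter` type, the `bump` function and the non-x86 fallback are hypothetical names introduced for this illustration.

```rust
#[allow(unused_imports)]
use std::arch::asm;

pub struct Counter {
    pub value: u64,
    pub step: u64,
}

/// Field accesses are used directly as operands: `c.value` is a place
/// expression (read and written), `c.step` is a value expression (read only).
#[cfg(target_arch = "x86_64")]
pub fn bump(c: &mut Counter) {
    unsafe { asm!("add {0}, {1}", inout(reg) c.value, in(reg) c.step) };
}

// Plain-Rust fallback so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn bump(c: &mut Counter) {
    c.value = c.value.wrapping_add(c.step);
}
```

If only identifiers were allowed, the caller would have to copy each field into a local before the `asm!` and write the result back afterwards.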
- `nomem`: The `asm` block does not read or write to any memory. This allows the compiler to cache the values of modified global variables in registers across the `asm` block since it knows that they are not read or written to by the `asm`.
- `readonly`: The `asm` block does not write to any memory. This allows the compiler to cache the values of unmodified global variables in registers across the `asm` block since it knows that they are not written to by the `asm`.
- `preserves_flags`: The `asm` block does not modify the flags register (defined below). This allows the compiler to avoid recomputing the condition flags after the `asm` block.
- `noreturn`: The `asm` block never returns, and its return type is defined as `!` (never). Behavior is undefined if execution falls through past the end of the asm code.
> its return type is defined as `!` (never)

Ordinarily does `asm!` have a return type of `()`?
Also, is there any difference between using `noreturn` and putting a call to `unreachable_unchecked` after the `asm!` block?
Yes, `asm!` has a return type of `()`.

> Also, is there any difference between using noreturn and putting a call to unreachable_unchecked after the asm! block?

If you look at the "Mapping to LLVM IR" section, you'll see that's exactly what it does.
Yes, I saw this:

> If the `noreturn` flag is set then an `unreachable` LLVM instruction is inserted after the asm invocation.

But I think `unreachable_unchecked` will result in the same code (modulo inlining, etc.). Do you think this has a performance impact, or is there another reason other than brevity to prefer the `noreturn` modifier?
I added the `noreturn` flag because I feel that it makes it more explicit that the asm never returns.
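The two spellings compared in this thread can be put side by side. This is a sketch only: the function names are hypothetical, `ud2` is used simply as an x86 instruction that never falls through, and the non-x86 fallbacks call `std::process::abort` so the block still builds elsewhere.

```rust
#[allow(unused_imports)]
use std::arch::asm;
#[allow(unused_imports)]
use std::hint::unreachable_unchecked;

/// With `options(noreturn)` the `asm!` expression itself has type `!`.
#[cfg(target_arch = "x86_64")]
pub fn trap_noreturn() -> ! {
    unsafe { asm!("ud2", options(noreturn)) }
}

/// Without the option, `asm!` has type `()`, so the divergence has to be
/// stated separately after the block.
#[cfg(target_arch = "x86_64")]
pub fn trap_unreachable() -> ! {
    unsafe {
        asm!("ud2");
        unreachable_unchecked()
    }
}

// Fallbacks so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn trap_noreturn() -> ! {
    std::process::abort()
}

#[cfg(not(target_arch = "x86_64"))]
pub fn trap_unreachable() -> ! {
    std::process::abort()
}
```

Calling either function terminates the process, so they are only type-checked here, not executed. In both versions, falling through past the end of the asm would be undefined behavior.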
The main advantage is that the compiler handles name mangling for you.
IMO, the main advantage is that it is inline, which is much easier to read than jumping to another file and makes it easier to deal with things like accessing globals or modifying rust variables.
@roblabla Is this documented somewhere? I've used a lot of naked functions with function calls etc, and it seems to work fine as long as your asm sets up the stack properly.
Btw, I'm glad to discuss naked fn further, but perhaps we should do it in the tracking issue instead?
> IMO, the main advantage is that it is inline, which is much easier to read than jumping to another file

You don't have to place `global_asm!();` in a different file.

> and makes it easier to deal with things like accessing globals or modifying rust variables.

👍

> I've used a lot of naked functions with function calls etc, and it seems to work fine as long as your asm sets up the stack properly.

One example is the Redox interrupt handling.
> it is generally unsafe to write anything but inline assembly inside a naked function. The LLVM language reference describes this feature as having "very system-specific consequences", which the programmer must be aware of.

And

> It is easy to quietly generate wrong code in naked functions, such as by causing the compiler to allocate stack space for temporaries where none were anticipated. There is currently no restriction on writing Rust statements inside a naked function, while most compilers supporting similar features either require or strongly recommend that authors write only inline assembly inside naked functions to ensure no code is generated that assumes a particular stack layout. It may be desirable to place further restrictions on what statements are permitted in the body of a naked function, such as permitting only `asm!` statements.

There is an open PR to amend the naked fn RFC to explicitly deny various common misuses of it. Let's move naked function discussion there.
As someone who hasn't really used inline assembly in any language, I have a stupid question about how this feature works as a whole. I think I inferred this from the RFC text, but I am not sure my understanding is correct (perhaps this is content for the future user-level docs?). This is how I think asm works; is this the right ballpark?

So, what we fundamentally hope to achieve here is the ability to insert arbitrary instruction sequences into the generated machine code. I would naively expect a solution along the lines of the D DSL: we invent a syntax for specifying arbitrary instructions, and the compiler produces the binary for us. From this point of view, specifying assembly as a string literal seems very odd.

Now, the reason why this doesn't work too well is that there are numerous fundamentally different CPU architectures (which are themselves moving targets), with fundamentally different instructions, and it is unreasonable to expect that the compiler would fully support all of them. Instead, we rely on the fact that each architecture has a dedicated external tool, an assembler, which is capable of turning assembler-specific syntax into machine code. The compiler more or less invokes the assembler as a black box. Compiler does not understand what

However (and I think this is not spelled out explicitly, or have I just missed it?) we also ship a specific assembler (namely llvm assembler), with specific syntax, which

Is this all at least a somewhat reasonable description of the reality? :)
My understanding of past inline asm discussions is that the biggest issue was always the stability and portability implications of supporting inline asm at all (and to a lesser extent of exposing LLVM's asm syntax, which thankfully is no longer being proposed). Unless the Rust teams think these issues are now self-evident, we should probably make some explicit statements about them in the RFC. Specifically, I think all of the following are intended to be true:
If I am correct and all of these are intended, I think at least some of them should be made explicit in the RFC. Especially that part in bold.

Also, should we consider how this interacts with the target tier policy? At first blush, I would think any changes to inline asm support on tier 3 targets require no special approval, but any changes to inline asm support on tier 1 and 2 targets probably should get... I dunno, compiler team approval? And be included in release notes?

As for what's actually in the feature proposal, I have no objections, and (as another person who's never used inline asm in anger) @matklad's elaboration of the constraints matches my understanding of the constraints involved.
@matklad Yes, that sounds about right.
In theory, yes, I think, but in practice, I would be pretty shocked if any mature C compiler (e.g. clang) dropped support for inline assembly, and since LLVM is the backend for clang, it's hard to imagine LLVM dropping support altogether. That said, I'm not sure how stable the LLVM asm interface is. I haven't seen it change in the last few years though...
I completely agree that it's "stable in practice", but since this has always been the thing explicitly cited by Rust teams in past inline asm discussions to answer "why isn't inline asm on stable yet?", it seems like something we need an official statement on. I'm hoping the answer is that they were specifically concerned about LLVM changing their inline asm syntax (because independently specifying our own syntax completely solves that), or about muddying the messaging on Rust's stability promise too close to 1.0 (which I'd assume is no longer a concern), but I just don't know for sure.
The intent is that all backends will support
There are two distinct parts to LLVM's inline assembly support:
Finally, LLVM doesn't exist in a vacuum. As @mark-i-m said, clang is a big user, but keep in mind that Rust itself is also a big user of LLVM. Once Rust gets stable inline assembly support, LLVM will have a strong incentive to keep this support working correctly.
## Memory operands

We could support `mem` as an alternative to specifying a register class which would leave the operand in memory and instead produce a memory address when inserted into the asm string. This would allow generating more efficient code by taking advantage of addressing modes instead of using an intermediate register to hold the computed address.
It's very common to have instructions (like `add` or `mov`) that accept inputs as either registers or memory. Specifying this as an `rm` constraint allows the compiler to do better register allocation/spilling.
That being said, LLVM currently ignores any freedom in the constraints and always picks memory:

> Thus, it simply tries to make a choice that’s most likely to compile, not one that will be optimal performance. (e.g., given “rm”, it’ll always choose to use memory, not registers).
Also LLVM only really supports memory addressing modes for inline asm on x86. On ARM/AArch64 it will just perform the address calculation and put the result in a register.
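Since `mem` operands are only a future possibility here, the sketch below shows what is written without them: the address components are passed in registers and the addressing mode is spelled out by hand in the template. The function name `load_element` and the non-x86 fallback are assumptions of this example.

```rust
#[allow(unused_imports)]
use std::arch::asm;

/// Loads `data[index]` with a hand-written x86 addressing mode.
/// The base pointer and index each occupy a register chosen by the compiler.
#[cfg(target_arch = "x86_64")]
pub fn load_element(data: &[u64], index: usize) -> u64 {
    assert!(index < data.len()); // keep the raw load in bounds
    let value: u64;
    unsafe {
        asm!(
            "mov {value}, qword ptr [{base} + {idx}*8]",
            value = out(reg) value,
            base = in(reg) data.as_ptr(),
            idx = in(reg) index,
            // Reads memory but never writes it, does not touch the stack,
            // and `mov` leaves the flags alone.
            options(readonly, nostack, preserves_flags),
        );
    }
    value
}

// Plain-Rust fallback so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn load_element(data: &[u64], index: usize) -> u64 {
    data[index]
}
```

A `mem` operand would let the compiler pick the addressing mode itself instead of the author hard-coding `[base + idx*8]`.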
## Flag outputs

GCC supports a special type of output which allows an asm block to return a `bool` encoded in the condition flags register. This allows the compiler to branch directly on the condition flag instead of materializing the condition as a `bool`.
This would be super useful for Intel ADCX/ADOX instructions for multi-precision integer math like in this assembly multiplication routine. Here it would be useful if I could specify the carry and overflow flags as inputs/outputs.
This is in line with what the LLVM intrinsics for these instructions do. See `core::arch::x86_64::_addcarryx_u64`.
Unfortunately, again it seems like LLVM just ignores it. As mentioned in rust-lang/stdarch#666 (comment), there is an open LLVM issue and it does not seem easy:

> The X86 backend isn't currently set up to model the C flag and O flag separately. We model all of the flags as one register. Because of this we can't interleave the flag dependencies. We would need to do something about that before it makes sense to implement _addcarryx_u64 as anything other than plain adc.

I would have used intrinsics instead of asm if it wasn't for this issue.
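Without flag outputs, a condition has to be materialized inside the asm block. A minimal sketch, assuming x86-64 and using `setc` to copy the carry flag into a byte register; the function name and the non-x86 fallback are illustrative.

```rust
#[allow(unused_imports)]
use std::arch::asm;

/// Adds `a` and `b` and reports whether the addition carried out of bit 63.
#[cfg(target_arch = "x86_64")]
pub fn add_with_carry_out(a: u64, b: u64) -> (u64, bool) {
    let mut sum = a;
    let carry: u8;
    unsafe {
        asm!(
            "add {sum}, {b}",
            "setc {carry}", // materialize CF as 0 or 1 in a byte register
            sum = inout(reg) sum,
            b = in(reg) b,
            carry = out(reg_byte) carry,
            options(pure, nomem, nostack),
        );
    }
    (sum, carry != 0)
}

// Plain-Rust fallback so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_with_carry_out(a: u64, b: u64) -> (u64, bool) {
    a.overflowing_add(b)
}
```

The extra `setc` and the later test of `carry` are the cost that a flag output would remove, since the compiler could branch on the flag directly.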
I’m a little surprised this is already being RFCed. I was expecting we’d come up with a prototype implementation as a procedural macro first, within the working group. This could serve two purposes:
...Well, I say “we”, but I haven’t been contributing other than responding in some GitHub threads. I really appreciate the energy investment in moving this forward! I just think it might be good to start on that before finalizing the design… though I guess that can still happen before stabilization.

Edit: And I haven’t even expressed my wish for an implementation before, so I can’t really complain. I just thought we were still in a somewhat early phase in the working group.
As a general note: I'd like more "this is UB!" explanations in several places. Maybe the guide-level section could use a complete subsection explicitly explaining again that many things can lead to UB.
Currently only in a few places it explicitly says "UB". In most places it just says "you cannot" and "you must" and that the compiler can assume something. While people used to inline assembly will most certainly know that all of this is about UB, newcomers might not and might instead expect a compiler error or something.
- `nomem`: The `asm` block does not read or write to any memory. This allows the compiler to cache the values of modified global variables in registers across the `asm` block since it knows that they are not read or written to by the `asm`.
- `readonly`: The `asm` block does not write to any memory. This allows the compiler to cache the values of unmodified global variables in registers across the `asm` block since it knows that they are not written to by the `asm`.
- `preserves_flags`: The `asm` block does not modify the flags register (defined below). This allows the compiler to avoid recomputing the condition flags after the `asm` block.
- `noreturn`: The `asm` block never returns, and its return type is defined as `!` (never). Behavior is undefined if execution falls through past the end of the asm code.
I was actually confused by the name `noreturn`. I initially assumed this meant that there is no `ret` instruction (or similar) in the assembly. How about using `diverging` or `diverge` as the name instead?
I feel that `noreturn` expresses this better: execution never returns from the asm block.
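As a sketch of how the non-diverging options in the list above are attached to a block, the example below uses `lea`, which neither accesses memory nor modifies the flags, so both `nomem` and `preserves_flags` are truthful for it. The function name and the non-x86 fallback are assumptions of this example; stating an option that the asm does not actually honor would be undefined behavior.

```rust
#[allow(unused_imports)]
use std::arch::asm;

/// Computes `x * 3` with a single `lea`.
#[cfg(target_arch = "x86_64")]
pub fn times_three(x: u64) -> u64 {
    let result: u64;
    unsafe {
        asm!(
            "lea {result}, [{x} + {x}*2]",
            x = in(reg) x,
            result = out(reg) result,
            // `lea` only does address arithmetic: no memory access and no
            // change to the flags register.
            options(nomem, preserves_flags),
        );
    }
    result
}

// Plain-Rust fallback so the sketch builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn times_three(x: u64) -> u64 {
    x.wrapping_mul(3)
}
```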
| All | `bp` (x86), `r11` (ARM), `x29` (AArch64), `x8` (RISC-V) | The frame pointer cannot be used as an input or output. |
| x86 | `ah`, `bh`, `ch`, `dh` | These are poorly supported by compiler backends. Use 16-bit register views (e.g. `ax`) instead. |
| x86 | `k0` | This is a constant zero register which can't be modified. |
| x86 | `ip` | This is the program counter, not a real register. |
Why does `ip` have aliases if it is unusable?
So that the compiler can provide better error messages ("unknown register" vs "disallowed register").
| AArch64 | `xzr` | This is a constant zero register which can't be modified. |
| ARM | `pc` | This is the program counter, not a real register. |
| RISC-V | `x0` | This is a constant zero register which can't be modified. |
| RISC-V | `gp`, `tp` | These registers are reserved and cannot be used as inputs or outputs. |
I am confused about this list. It is written as though it is an exhaustive list, and even goes so far as to mention a register only used in a fairly recent instruction set extension (`k0` for x86).
Wouldn't it make more sense to just list a few examples here and keep an up-to-date list in the documentation?
The documentation for inline assembly will most likely be a verbatim copy of the contents of this RFC.
text/0000-inline-asm.md
In this example we call the `out` instruction to output the content of the `cmd` variable to port `0x64`. Since the `out` instruction only accepts `eax` (and its sub-registers) as an operand, we had to use the `eax` constraint specifier.
One of the problems I have with the current `asm!()` macro is that it does not at all feel like the error reporting story is on the same level of quality as the rest of `rustc`. See e.g. rust-lang/rust#15402
If I accidentally use `reg` here instead of `"eax"`, will I get poor error messages from deep inside the belly of LLVM?
Worth noting that your specific example won't be possible anymore, since as noted in the RFC, the input/output parameters will only accept raw numeric types (and pointers where it makes sense). Your wrapper struct would get rejected by the type checker.
There are other types of weird "low-level" LLVM errors that can get reported when the feature is misused though. Notably, I remember getting really weird aborts when misusing constraints. Rustc will probably need to sanitize them before passing them to LLVM, making sure they make sense. Some things off the top of my head:
- Using the same named register twice as an input parameter (it's UB; LLVM accepts it and either aborts or does something weird)
- Using the same named register as both a clobber and an input or an output (UB; LLVM generates garbage, sometimes aborts)
The specification in this RFC and the validation it requires should eliminate the possibility of any LLVM internal error. For example, the RFC explicitly says this:

> It is a compile-time error to use the same explicit register for two input operands or two output operands.
In this case, the `out` instruction must use the `al`, `ax` or `eax` register per the instruction set reference.
If I accidentally wrote `reg` instead of `"eax"` in the example, would the error message be of the same quality as normal rust error messages (or at least using the same infrastructure wrt highlighting the span in question etc)?
Another concern is that it would potentially compile some of the time, when LLVM happens to allocate `eax` as its register of choice. Would that be possible?
> If I accidentally wrote `reg` instead of `"eax"` in the example, would the error message be of the same quality as normal rust error messages (or at least using the same infrastructure wrt highlighting the span in question etc)?

LLVM has support for sending error messages from inline assembly back to rustc so that they can be displayed through rustc's normal error message functionality. This already works with the current `asm!` macro: https://play.rust-lang.org/?version=nightly&mode=release&edition=2018&gist=e5d9c3c74edc6e02858ec965abfd4d98

> Another concern is that it would potentially compile some of the time, when LLVM happens to allocate `eax` as its register of choice. Would that be possible?

Yes it would sometimes compile fine and sometimes not, depending on the register selected by LLVM. There isn't much we can do about that.
More than cryptic error messages, I fear silent errors without any error messages.
See here: https://godbolt.org/z/5d0MxL
As you can see, if you pass a one-byte variable and request an "r" constraint (instead of a "q" or "Q" constraint), LLVM is all too happy to make a mess out of your assembler (note how LLVM tries to return both y1 and y4 in the same eax register).
Will rustc be able to detect and report that somehow?
That seems to be a bug in LLVM. It should not allocate the *H registers for inline assembly.
The emitted code is correct, though: after the assembly block, it first stores `al` somewhere else, only then `ah` in the entire `eax` register. Technically, since you only asked for the short registers, you shouldn't have a way to e.g. zero-extend one of them and clobber the other. Should this be a strict guarantee, that you don't get the `*h` registers?
Yes, the way this RFC is currently written the `*h` registers are never chosen for operands (and I don't intend to change it).
The reason why I argue that LLVM's behavior is buggy is that Clang's inline asm is designed to emulate GCC's inline asm, and GCC never allocates `*h` for register operands.
I will submit a fix to LLVM as part of the `asm!` work.
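For readers following the byte-register discussion, here is a minimal sketch of passing one-byte operands through a byte-sized register class instead of the general `reg` class. It assumes the `reg_byte` class and the `std::arch::asm` import path from the later stabilized macro, neither of which is shown in this thread; `add_bytes` is a hypothetical helper, and the sketch makes no claim about which specific registers the allocator picks.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Hypothetical helper: adds two bytes with wrapping arithmetic,
/// requesting byte-sized registers for both operands.
#[cfg(target_arch = "x86_64")]
pub fn add_bytes(a: u8, b: u8) -> u8 {
    let mut acc = a;
    // SAFETY: the template only reads and writes the two operand registers
    // (and the flags, which are treated as clobbered by default).
    unsafe {
        asm!(
            "add {acc}, {b}",
            acc = inout(reg_byte) acc,
            b = in(reg_byte) b,
            options(pure, nomem, nostack),
        );
    }
    acc
}

/// Portable fallback so the sketch also builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
pub fn add_bytes(a: u8, b: u8) -> u8 {
    a.wrapping_add(b)
}
```

Calling `add_bytes(200, 100)` wraps around at 256, as the `add` instruction does on an 8-bit register.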
Small comment on source/destination semantics.
The assembler template uses the same syntax as [format strings][format-syntax] (i.e. placeholders are specified by curly braces). The corresponding arguments are accessed in order, by index, or by name. However, implicit named arguments (introduced by [RFC #2795][rfc-2795]) are not supported.

The assembly code syntax used is that of the GNU assembler (GAS). The only exception is on x86 where the Intel syntax is used instead of GCC's AT&T syntax.
Does this imply that `x86` will use the format of `mnemonic destination, source` (like Intel syntax) while every other platform will use `mnemonic source, destination` (like AT&T/GAS)? Because that would badly hurt portability/readability.
Every other architecture already uses `mnemonic destination, source`. x86 is the exception with AT&T syntax that reverses this order.
Okay, so every platform is standardized to `mnemonic dst, src`?
This is a bit confusing, as I'm not sure if the Rust-Source is being specified, or the IR which is being transferred to the compiler.
So does this imply that on non-x86 platforms `rustc` will be responsible for re-ordering `mnemonic` arguments to the backend?
There is no reordering needed, all non-x86 platforms already use the correct ordering. The Intel/ATT syntax split is only a thing on x86.
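To make the operand-order point concrete, here is a small sketch of the same register copy written twice on x86-64: once in the default Intel syntax (destination first) and once with the `att_syntax` option (source first). It assumes the later stabilized `std::arch::asm` macro rather than the nightly macro under discussion; `copy_intel` and `copy_att` are hypothetical names used only for illustration.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::asm;

/// Intel syntax (the default on x86): `mov destination, source`.
#[cfg(target_arch = "x86_64")]
pub fn copy_intel(src: u64) -> u64 {
    let dst: u64;
    // SAFETY: a register-to-register move with no memory or stack access.
    unsafe {
        asm!(
            "mov {dst}, {src}",
            dst = out(reg) dst,
            src = in(reg) src,
            options(pure, nomem, nostack, preserves_flags),
        );
    }
    dst
}

/// AT&T syntax (opt-in via `att_syntax`): `movq source, destination`.
#[cfg(target_arch = "x86_64")]
pub fn copy_att(src: u64) -> u64 {
    let dst: u64;
    // SAFETY: same operation as above, only the textual operand order differs.
    unsafe {
        asm!(
            "movq {src}, {dst}",
            dst = out(reg) dst,
            src = in(reg) src,
            options(att_syntax, pure, nomem, nostack, preserves_flags),
        );
    }
    dst
}

/// Portable fallbacks so the sketch also builds on other architectures.
#[cfg(not(target_arch = "x86_64"))]
pub fn copy_intel(src: u64) -> u64 {
    src
}

#[cfg(not(target_arch = "x86_64"))]
pub fn copy_att(src: u64) -> u64 {
    src
}
```

Both functions return their argument unchanged. Only the template text differs, and rustc does not reorder operands in either case.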
Unfortunately we can't implement this with just a proc macro that wraps the existing |
Implement new asm! syntax from RFC 2850

This PR implements the new `asm!` syntax proposed in rust-lang/rfcs#2850.

# Design

A large part of this PR revolves around taking an `asm!` macro invocation and plumbing it through all of the compiler layers down to LLVM codegen. Throughout the various stages, an `InlineAsm` generally consists of 3 components:

- The template string, which is stored as an array of `InlineAsmTemplatePiece`. Each piece represents either a literal or a placeholder for an operand (just like format strings).

  ```rust
  pub enum InlineAsmTemplatePiece {
      String(String),
      Placeholder { operand_idx: usize, modifier: Option<char>, span: Span },
  }
  ```

- The list of operands to the `asm!` (`in`, `[late]out`, `in[late]out`, `sym`, `const`). These are represented differently at each stage of lowering, but follow a common pattern:
  - `in`, `out` and `inout` all have an associated register class (`reg`) or explicit register (`"eax"`).
  - `inout` has 2 forms: one with a single expression that is both read from and written to, and one with two separate expressions for the input and output parts.
  - `out` and `inout` have a `late` flag (`lateout` / `inlateout`) to indicate that the register allocator is allowed to reuse an input register for this output.
  - `out` and the split variant of `inout` allow `_` to be specified for an output, which means that the output is discarded. This is used to allocate scratch registers for assembly code.
  - `sym` is a bit special since it only accepts a path expression, which must point to a `static` or a `fn`.

- The options set at the end of the `asm!` macro. The only one that is particularly of interest to rustc is `NORETURN` which makes `asm!` return `!` instead of `()`.

  ```rust
  bitflags::bitflags! {
      pub struct InlineAsmOptions: u8 {
          const PURE = 1 << 0;
          const NOMEM = 1 << 1;
          const READONLY = 1 << 2;
          const PRESERVES_FLAGS = 1 << 3;
          const NORETURN = 1 << 4;
          const NOSTACK = 1 << 5;
      }
  }
  ```

## AST

`InlineAsm` is represented as an expression in the AST:

```rust
pub struct InlineAsm {
    pub template: Vec<InlineAsmTemplatePiece>,
    pub operands: Vec<(InlineAsmOperand, Span)>,
    pub options: InlineAsmOptions,
}

pub enum InlineAsmRegOrRegClass {
    Reg(Symbol),
    RegClass(Symbol),
}

pub enum InlineAsmOperand {
    In {
        reg: InlineAsmRegOrRegClass,
        expr: P<Expr>,
    },
    Out {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        expr: Option<P<Expr>>,
    },
    InOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        expr: P<Expr>,
    },
    SplitInOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        in_expr: P<Expr>,
        out_expr: Option<P<Expr>>,
    },
    Const {
        expr: P<Expr>,
    },
    Sym {
        expr: P<Expr>,
    },
}
```

The `asm!` macro is implemented in librustc_builtin_macros and outputs an `InlineAsm` AST node. The template string is parsed using libfmt_macros, positional and named operands are resolved to explicit operand indices. Since target information is not available to macro invocations, validation of the registers and register classes is deferred to AST lowering.

## HIR

`InlineAsm` is represented as an expression in the HIR:

```rust
pub struct InlineAsm<'hir> {
    pub template: &'hir [InlineAsmTemplatePiece],
    pub operands: &'hir [InlineAsmOperand<'hir>],
    pub options: InlineAsmOptions,
}

pub enum InlineAsmRegOrRegClass {
    Reg(InlineAsmReg),
    RegClass(InlineAsmRegClass),
}

pub enum InlineAsmOperand<'hir> {
    In {
        reg: InlineAsmRegOrRegClass,
        expr: Expr<'hir>,
    },
    Out {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        expr: Option<Expr<'hir>>,
    },
    InOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        expr: Expr<'hir>,
    },
    SplitInOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        in_expr: Expr<'hir>,
        out_expr: Option<Expr<'hir>>,
    },
    Const {
        expr: Expr<'hir>,
    },
    Sym {
        expr: Expr<'hir>,
    },
}
```

AST lowering is where `InlineAsmRegOrRegClass` is converted from `Symbol`s to an actual register or register class. If any modifiers are specified for a template string placeholder, these are validated against the set allowed for that operand type. Finally, explicit registers for inputs and outputs are checked for conflicts (same register used for different operands).

## Type checking

Each register class has a whitelist of types that it may be used with. After the types of all operands have been determined, the `intrinsicck` pass will check that these types are in the whitelist. It also checks that split `inout` operands have compatible types and that `const` operands are integers or floats. Suggestions are emitted where needed if a template modifier should be used for an operand based on the type that was passed into it.

## HAIR

`InlineAsm` is represented as an expression in the HAIR:

```rust
crate enum ExprKind<'tcx> {
    // [..]
    InlineAsm {
        template: &'tcx [InlineAsmTemplatePiece],
        operands: Vec<InlineAsmOperand<'tcx>>,
        options: InlineAsmOptions,
    },
}

crate enum InlineAsmOperand<'tcx> {
    In {
        reg: InlineAsmRegOrRegClass,
        expr: ExprRef<'tcx>,
    },
    Out {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        expr: Option<ExprRef<'tcx>>,
    },
    InOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        expr: ExprRef<'tcx>,
    },
    SplitInOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        in_expr: ExprRef<'tcx>,
        out_expr: Option<ExprRef<'tcx>>,
    },
    Const {
        expr: ExprRef<'tcx>,
    },
    SymFn {
        expr: ExprRef<'tcx>,
    },
    SymStatic {
        expr: ExprRef<'tcx>,
    },
}
```

The only significant change compared to HIR is that `Sym` has been lowered to either a `SymFn` whose `expr` is a `Literal` ZST of the `fn`, or a `SymStatic` whose `expr` is a `StaticRef`.

## MIR

`InlineAsm` is represented as a `Terminator` in the MIR:

```rust
pub enum TerminatorKind<'tcx> {
    // [..]

    /// Block ends with an inline assembly block. This is a terminator since
    /// inline assembly is allowed to diverge.
    InlineAsm {
        /// The template for the inline assembly, with placeholders.
        template: &'tcx [InlineAsmTemplatePiece],

        /// The operands for the inline assembly, as `Operand`s or `Place`s.
        operands: Vec<InlineAsmOperand<'tcx>>,

        /// Miscellaneous options for the inline assembly.
        options: InlineAsmOptions,

        /// Destination block after the inline assembly returns, unless it is
        /// diverging (InlineAsmOptions::NORETURN).
        destination: Option<BasicBlock>,
    },
}

pub enum InlineAsmOperand<'tcx> {
    In {
        reg: InlineAsmRegOrRegClass,
        value: Operand<'tcx>,
    },
    Out {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        place: Option<Place<'tcx>>,
    },
    InOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        in_value: Operand<'tcx>,
        out_place: Option<Place<'tcx>>,
    },
    Const {
        value: Operand<'tcx>,
    },
    SymFn {
        value: Box<Constant<'tcx>>,
    },
    SymStatic {
        value: Box<Constant<'tcx>>,
    },
}
```

As part of HAIR lowering, `InOut` and `SplitInOut` operands are lowered to a split form with a separate `in_value` and `out_place`.

Semantically, the `InlineAsm` terminator is similar to the `Call` terminator except that it has multiple output places where a `Call` only has a single return place output.

The constant promotion pass is used to ensure that `const` operands are actually constants (using the same logic as `#[rustc_args_required_const]`).

## Codegen

Operands are lowered one more time before being passed to LLVM codegen:

```rust
pub enum InlineAsmOperandRef<'tcx, B: BackendTypes + ?Sized> {
    In {
        reg: InlineAsmRegOrRegClass,
        value: OperandRef<'tcx, B::Value>,
    },
    Out {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        place: Option<PlaceRef<'tcx, B::Value>>,
    },
    InOut {
        reg: InlineAsmRegOrRegClass,
        late: bool,
        in_value: OperandRef<'tcx, B::Value>,
        out_place: Option<PlaceRef<'tcx, B::Value>>,
    },
    Const {
        string: String,
    },
    SymFn {
        instance: Instance<'tcx>,
    },
    SymStatic {
        def_id: DefId,
    },
}
```

The operands are lowered to LLVM operands and constraint codes as follows:

- `out` and the output part of `inout` operands are added first, as required by LLVM. Late output operands have a `=` prefix added to their constraint code, non-late output operands have a `=&` prefix added to their constraint code.
- `in` operands are added normally.
- `inout` operands are tied to the matching output operand.
- `sym` operands are passed as function pointers or pointers, using the `"s"` constraint.
- `const` operands are formatted to a string and directly inserted in the template string.

The template string is converted to LLVM form:

- `$` characters are escaped as `$$`.
- `const` operands are converted to strings and inserted directly.
- Placeholders are formatted as `${X:M}` where `X` is the operand index and `M` is the modifier character. Modifiers are converted from the Rust form to the LLVM form.

The various options are converted to clobber constraints or LLVM attributes, refer to the [RFC](https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#mapping-to-llvm-ir) for more details.

Note that LLVM is sometimes rather picky about what types it accepts for certain constraint codes so we sometimes need to insert conversions to/from a supported type. See the target-specific ISelLowering.cpp files in LLVM for details.

# Adding support for new architectures

Adding inline assembly support to an architecture is mostly a matter of defining the registers and register classes for that architecture. All the definitions for register classes are located in `src/librustc_target/asm/`. Additionally you will need to implement lowering of these register classes to LLVM constraint codes in `src/librustc_codegen_llvm/asm.rs`.
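As an illustration of the template conversion rules listed in the Codegen section, the following is a minimal sketch and not the actual rustc code. `TemplatePiece` is a hypothetical stand-in for `InlineAsmTemplatePiece` with the `span` field omitted, and `to_llvm_template` is an invented helper name. The sketch assumes that a placeholder without a modifier is written as `${X}`, and it skips both the Rust-to-LLVM modifier translation and any operand index remapping.

```rust
/// Hypothetical, simplified stand-in for rustc's `InlineAsmTemplatePiece`.
pub enum TemplatePiece {
    String(String),
    Placeholder { operand_idx: usize, modifier: Option<char> },
}

/// Converts template pieces to an LLVM-style template string:
/// literal `$` becomes `$$`, placeholders become `${X:M}` (or `${X}`).
pub fn to_llvm_template(pieces: &[TemplatePiece]) -> String {
    let mut out = String::new();
    for piece in pieces {
        match piece {
            // Literal text: escape every `$` so LLVM does not treat it
            // as the start of an operand reference.
            TemplatePiece::String(s) => out.push_str(&s.replace('$', "$$")),
            // Placeholder with a modifier character.
            TemplatePiece::Placeholder { operand_idx, modifier: Some(m) } => {
                out.push_str(&format!("${{{}:{}}}", operand_idx, m));
            }
            // Placeholder without a modifier (assumed form).
            TemplatePiece::Placeholder { operand_idx, modifier: None } => {
                out.push_str(&format!("${{{}}}", operand_idx));
            }
        }
    }
    out
}
```

For example, a literal piece `"add $4, "` followed by a placeholder for operand 1 with modifier `w` would come out as `add $$4, ${1:w}`.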
Stabilize asm! and global_asm!

Tracking issue: rust-lang#72016

It's been almost 2 years since the original [RFC](rust-lang/rfcs#2850) was posted and we're finally ready to stabilize this feature!

The main changes in this PR are:

- Removing `asm!` and `global_asm!` from the prelude as per the decision in rust-lang#87228.
- Stabilizing the `asm` and `global_asm` features.
- Removing the unstable book pages for `asm` and `global_asm`. The contents are moved to the [reference](rust-lang/reference#1105) and [rust by example](rust-lang/rust-by-example#1483).
  - All links to these pages have been removed to satisfy the link checker. In a later PR these will be replaced with links to the reference or rust by example.
- Removing the automatic suggestion for using `llvm_asm!` instead of `asm!` if you're still using the old syntax, since it doesn't work anymore with `asm!` no longer being in the prelude. This only affects code that predates the old LLVM-style `asm!` being renamed to `llvm_asm!`.
- Updating `stdarch` and `compiler-builtins`.
- Updating all the tests.

r? `@joshtriplett`
A redesigned `asm!` macro, with a path to stabilization.

Rendered
Thanks to everyone involved in the inline asm project group for their feedback which helped make this RFC possible!
The discussion in this thread has grown rather large, so a new thread has been opened at #2873 where the discussion should continue.