x64: use `ByteSink` more liberally in the `rex` module #9505

abrown · 2024-10-23T22:32:45Z

In looking at adding auto-generated assembler code to Cranelift, I've noticed that we pass MachBuffer down when it is not always necessary. The ByteSink trait (which MachBuffer implements) is a simplified view to put* bytes into the buffer and it is a bit simpler to use when testing since it is also implemented by Vec<u8>. This change propagates the trait to the rex module.

In looking at adding auto-generated assembler code to Cranelift, I've noticed that we pass `MachBuffer` down when it is not always necessary. The `ByteSink` trait (which `MachBuffer` implements) is a simplified view to `put*` bytes into the buffer and it is a bit simpler to use when testing since it is also implemented by `Vec<u8>`. This change propagates the trait to the `rex` module.

abrown · 2024-10-23T22:35:05Z

Similar to #9504, this is just an attempt at trying to simplify what is needed to actually assemble an instruction. We still need MachBuffer for instructions that touch memory; thinking about it, it may just be unavoidable.

abrown · 2024-10-23T22:36:07Z

cranelift/codegen/src/isa/x64/encoding/rex.rs

+                self.emit_two_op(sink, enc_g, 0);
+            }
+        }
+    }


Oh, actually let me remove this: this is for the new assembler but isn't used yet.

abrown · 2024-10-23T22:38:47Z

cranelift/codegen/src/isa/x64/encoding/rex.rs

+    pub fn must_always_emit(&self) -> bool {
+        (self.0 & 2) != 0
+    }
+


...and a little bit of code movement.

cfallin

This looks fine, thanks!

Re: handling Amodes and the helper that briefly appeared here: I think we'll need MachBuffer unavoidably once we have amodes that can refer to labels; that's the semantic thing that MachBuffer gives us above an arbitrary byte sequence. But totally fine to trim the close coupling at lower levels.

I'd love to talk more about the assembler design w.r.t. dependencies -- intuitively I would have thought that MachBuffer would be an integral part of it, i.e. it's the assembler's way of providing its output (given that assemblers need label references, and to flow through other metadata like trap info and relocs from insts to the code blob), but maybe there's another use-case or need for generalization I'm missing.

abrown · 2024-10-23T22:48:31Z

intuitively I would have thought that MachBuffer would be an integral part of it

I mean, that's how it has to be today. I'm kind of wondering if it should, though: it's nice to be able to "just emit" an instruction into a byte buffer during testing and that's what I've been thinking about. The setup in emit_tests.rs just to be able to emit an instruction is a lot, e.g. If we're proptest-ing or fuzzing we also want the "emit a single instruction" path to be relatively quick.

On the other hand, sometimes it is unavoidable: I've notice that we register traps and labels using a MachBuffer in the encoding logic and that simply cannot be done with ByteSink. Maybe ByteSink can gain extra trait functions that do nothing for the Vec<u8> case or something like that...

cfallin · 2024-10-23T23:00:21Z

Maybe ByteSink can gain extra trait functions that do nothing for the Vec case or something like that...

Ah! Yeah, that's an interesting idea: lift the whole interface of MachBuffer into a trait, and provide an implementation of that trait for a Vec<u8>.

There are interesting semantic implementations of this though when label references are involved: what should the assembled bytes from a mov rax, [rip+label] instruction be? Zeroes in the RIP offset (i.e. labels not resolved)? One aspect of the MachBuffer that is nice is that its types essentially force one to be aware one is referencing addresses in the buffer: you have to ask for a label to refer to it, the various amodes require a label, and if you don't bind it, (IIRC?) it should panic when finalizing.

Maybe another way of reading this friction is that we need some convenience APIs? It should be possible to write an assemble(inst: Inst) -> Vec<u8> that is something like:

fn assemble(inst: Inst) -> Vec<u8> {
  let mut buffer = MachBuffer::new();
  inst.emit(&mut buffer, ...);
  buffer.finalize(&VCodeConstants::default(), &ControlPlane::default())
    .apply_base_srcloc(SourceLoc::new(0))
    .data
    .into() // SmallVec into Vec
}

(whew, that is a bit of a mouthful, I see why you want it!)

abrown requested a review from a team as a code owner October 23, 2024 22:32

abrown requested review from cfallin and removed request for a team October 23, 2024 22:32

abrown commented Oct 23, 2024

View reviewed changes

review: remove unused code

563d601

abrown commented Oct 23, 2024

View reviewed changes

cfallin approved these changes Oct 23, 2024

View reviewed changes

cfallin enabled auto-merge October 23, 2024 22:41

cfallin added this pull request to the merge queue Oct 23, 2024

Merged via the queue into bytecodealliance:main with commit 9e1be64 Oct 23, 2024
51 checks passed

abrown deleted the more-byte-sink branch October 23, 2024 23:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

x64: use `ByteSink` more liberally in the `rex` module #9505

x64: use `ByteSink` more liberally in the `rex` module #9505

abrown commented Oct 23, 2024

abrown commented Oct 23, 2024

abrown Oct 23, 2024

abrown Oct 23, 2024

cfallin left a comment

abrown commented Oct 23, 2024 •

edited

Loading

cfallin commented Oct 23, 2024 •

edited

Loading

x64: use ByteSink more liberally in the rex module #9505

x64: use ByteSink more liberally in the rex module #9505

Conversation

abrown commented Oct 23, 2024

abrown commented Oct 23, 2024

abrown Oct 23, 2024

Choose a reason for hiding this comment

abrown Oct 23, 2024

Choose a reason for hiding this comment

cfallin left a comment

Choose a reason for hiding this comment

abrown commented Oct 23, 2024 • edited Loading

cfallin commented Oct 23, 2024 • edited Loading

x64: use `ByteSink` more liberally in the `rex` module #9505

x64: use `ByteSink` more liberally in the `rex` module #9505

abrown commented Oct 23, 2024 •

edited

Loading

cfallin commented Oct 23, 2024 •

edited

Loading