Refactor some SPI transactions #988

2bndy5 · 2024-07-15T10:16:35Z

Refactor 1-byte SPI commands

I came across this tactic when writing the rust implementation.

By telling read_register() to read 0 bytes from a register that is actually a reserved SPI command byte, we don't need to use a special form of write_register(uint8_t, uint8_t). The same transaction still occurs with less code executed to do it.

This only affects get_status(), flush_tx(), flush_rx(), and reUseTX().
Luckily, those SPI commands already have 0x20 bit asserted, so there's no need to mask the command byte with W_REGISTER as is done in write_register().
This needs testing on the various supported platforms to ensure a 1-byte buffer behaves as expected.

Removed mnemonics bit-wise operations

R_REGISTER is defined as 0. So, reg | 0 always results in reg. We don't need to compute this at runtime.
REGISTER_MASK is defined as 0x1F. All register offsets are all under 0x1F. Any register offset masked with REGISTER_MASK results in the same register offset value, so we don't need to compute this at runtime.

Note, all register offsets used are a known number. There is no public API that allows users to read a register offset larger than 0x1F.

Additional SPI refactoring

When writing an ACK payload to TX FIFO, static payload size is not applicable. So, bypass the additional checks in write_payload() about ensuring static payload size is satisfied; instead let writeAckPayload() use write_register() directly.

Note, the W_ACK_PAYLOAD command already has 0x20 bit (W_REGISTER) asserted.

I came across this tactic when writing the rust implementation. By telling `read_register()` to read 0 bytes from a register that is actually a reserved SPI command byte, we don't need to use a special form of `write_register(uint8_t, uint8_t)`. The same transaction still occurs with less code executed to do it. This only affects `get_status()`, `flush_tx()`, and `flush_rx()`. Luckily, those SPI commands already have 0x20 bit asserted, so there's no need to mask the command byte with `W_REGISTER` as is done in `write_register()`. This needs testing on the various supported platforms to ensure a 1-byte buffer behaves as expected.

github-actions · 2024-07-15T10:24:04Z

Memory usage change @ 0829b87

Board	flash	%	RAM for global variables	%
`arduino:avr:nano`	❔ -16 - +2	-0.05 - +0.01	0 - 0	0.0 - 0.0
`arduino:samd:mkrzero`	💚 -104 - -28	-0.04 - -0.01	0 - 0	0.0 - 0.0

Click for full report table

Board	`examples/GettingStarted` flash	%	`examples/GettingStarted` RAM for global variables	%	`examples/AcknowledgementPayloads` flash	%	`examples/AcknowledgementPayloads` RAM for global variables	%	`examples/ManualAcknowledgements` flash	%	`examples/ManualAcknowledgements` RAM for global variables	%	`examples/StreamingData` flash	%	`examples/StreamingData` RAM for global variables	%	`examples/MulticeiverDemo` flash	%	`examples/MulticeiverDemo` RAM for global variables	%	`examples/InterruptConfigure` flash	%	`examples/InterruptConfigure` RAM for global variables	%	`examples/scanner` flash	%	`examples/scanner` RAM for global variables	%	`examples/encodeRadioDetails` flash	%	`examples/encodeRadioDetails` RAM for global variables	%
`arduino:avr:nano`	2	0.01	0	0.0	-16	-0.05	0	0.0	-2	-0.01	0	0.0	2	0.01	0	0.0	2	0.01	0	0.0	-8	-0.03	0	0.0	-16	-0.05	0	0.0	-12	-0.04	0	0.0
`arduino:samd:mkrzero`	-64	-0.02	0	0.0	-60	-0.02	0	0.0	-64	-0.02	0	0.0	-60	-0.02	0	0.0	-28	-0.01	0	0.0	-68	-0.03	0	0.0	-64	-0.02	0	0.0	-104	-0.04	0	0.0

Click for full report CSV

Board,examples/GettingStarted<br>flash,%,examples/GettingStarted<br>RAM for global variables,%,examples/AcknowledgementPayloads<br>flash,%,examples/AcknowledgementPayloads<br>RAM for global variables,%,examples/ManualAcknowledgements<br>flash,%,examples/ManualAcknowledgements<br>RAM for global variables,%,examples/StreamingData<br>flash,%,examples/StreamingData<br>RAM for global variables,%,examples/MulticeiverDemo<br>flash,%,examples/MulticeiverDemo<br>RAM for global variables,%,examples/InterruptConfigure<br>flash,%,examples/InterruptConfigure<br>RAM for global variables,%,examples/scanner<br>flash,%,examples/scanner<br>RAM for global variables,%,examples/encodeRadioDetails<br>flash,%,examples/encodeRadioDetails<br>RAM for global variables,%
arduino:avr:nano,2,0.01,0,0.0,-16,-0.05,0,0.0,-2,-0.01,0,0.0,2,0.01,0,0.0,2,0.01,0,0.0,-8,-0.03,0,0.0,-16,-0.05,0,0.0,-12,-0.04,0,0.0
arduino:samd:mkrzero,-64,-0.02,0,0.0,-60,-0.02,0,0.0,-64,-0.02,0,0.0,-60,-0.02,0,0.0,-28,-0.01,0,0.0,-68,-0.03,0,0.0,-64,-0.02,0,0.0,-104,-0.04,0,0.0

2bndy5 · 2024-07-15T10:39:14Z

Hmm, flash size on ATmega328P increased by 2 bytes. Its probably because a pointer data type is slightly larger than a single byte. I still expect speed-ups though because write_register(uint8_t, uint8_t) doesn't have to check if it is a 1-byte vs 2-byte transaction anymore. Maybe the speed-up would be negligible since all 2-byte transactions are used to configure the radio, clear the status flags, or read FIFO_STATUS/RPD registers.

I could try using read_register(uint8_t) instead with a special check like

// only the following 1-byte SPI commands have 0xE0 bits asserted
// FLUSH_TX, FLUSH_RX, RF24_NOP, REUSE_TX_PL
if ((reg & 0xE0) != 0xE0) {
    result = _SPI.transfer(0xff);
}

2bndy5 · 2024-07-15T21:35:35Z

I could try using read_register(uint8_t) instead with a special check

I don't particularly like this idea. It basically goes back to having a specialized ***_register() that's tailored to 1 byte transactions. Additionally, we'd have to explicitly ignore the returned result.

While there is a 2 byte increase in flash on ATmega328p. there is also a not-insignificant decrease in flash on ATSAMD21. I think this change is still worth it considering coretx-M chips are a much more desirable purchase now-a-days; the price is almost the same as the older ATmega328 but with a lot more benefits (like extra RAM and native USB support).

2bndy5 · 2024-07-15T23:38:29Z

I'd also like to refactor write_payload() into startFastWrite()'s definition because there would be no special need to have another private SPI write function. Both W_TX_PAYLOAD and W_TX_PAYLOAD_NO_ACK SPI commands already have the 0x20 (W_REGISTER) bit asserted. I already changed the writeAckPayload() to use write_register() directly (see 7abc73d).

Although, I'm afraid such a refactor might have an undesirable change in compile size. At this point, I'm only concerned with possible speed-ups without increasing the compile size (preferably reducing it).

TMRh20 · 2024-07-16T08:46:10Z

I would add my opinion here, but I don't have one yet. I need to look over the code etc and understand what you are doing. 😄

I haven't looked at the RF24 code base for quite some time now, so gimme some time to review here.

2bndy5 · 2024-07-16T09:55:42Z

I'm not adamant about these changes. I just opened the PR as a draft to see compile size reports. Compile size aside, I believe these changes should speed up the SPI transactions a little.

2bndy5 · 2024-07-18T04:46:45Z

Testing looks good on Linux.

I'm seeing a noticeable speed-up during the streamingData example. The ackPayload example seems to be running about the same speed. Its hard to measure because the speed-up may be a matter nanoseconds (dependent on CPU cycles).

TMRh20 · 2024-07-20T09:04:59Z

This is just weird. How you figured out to use a read instead of a write is beyond me. It seems to work too, so I am again very impressed.

2bndy5 · 2024-07-20T09:16:16Z

It was just from taking a fresh look. Its kinda fun to write something from scratch, especially in a new/different language.

When I first joined up with you here, I just finished writing the pure python implementation, so I brought my "improvement" ideas with me then. Now that I wrote a pure rust implementation, I have some work (related to these changes) to do in the CirPy lib...

The "read" and "write" terms are more of a human perception when every transaction over the SPI bus is full duplex (both read and write). I lucked out with the 0x20 bit being asserted in the 1-byte SPI commands.

TMRh20 · 2024-07-20T09:25:08Z

The "read" and "write" terms are more of a human perception when every transaction over the SPI bus is full duplex (both read and write).

Yes, but one needs a complete understanding of SPI and RF24 behaviors along with putting two-and-two together to pull something like this off!

`R_REGISTER` is defined as 0. `reg | 0` always results in `reg`

`REGISTER_MASK` is defined as 0x1F. All register offsets are all under 0x1F. Any register offset masked with REGISTER_MASK results in the same register offset value, so we don't need to compute this at runtime. Note: all register offsets used are a known number. There is no public API that allows users to read a register offset larger than 0x1F.

When writing an ACK payload to TX FIFO, static payload size is not applicable. So, bypass the additional checks in `write_payload()` about ensuring static payload size is satisfied. Note, the `W_ACK_PAYLOAD` command already has 0x20 bit (`W_REGISTER`) asserted.

TMRh20 · 2024-07-27T10:20:33Z

Finally had some time to test this more, and my testing shows a speed-up of at least 4-6 KB/s transferring data from a Nano to a Due at 1MBPS. That's pretty significant!

2bndy5 · 2024-07-27T10:28:34Z

I want to check that this still works with the Pico SDK. I doubt there will be any problems, but it doesn't hurt to verify anyway.

2bndy5 · 2024-08-02T06:30:10Z

This works fine with Pico SDK. Seen as how I've tested Linux and PicoSDK (and Arduino-pico core for good measure), are there any objections to merging this?

TMRh20 · 2024-08-02T09:19:08Z

Nope, I’m pretty happy about these changes

2bndy5 · 2024-10-06T00:40:40Z

We should probably publish a new release to get these optimizations out into the Arduino world. This idea also beckons a release crusade because of the change SERIAL_DEBUG -> RF24_DEBUG (and corresponding changes up the RF24 stack).

TMRh20 · 2024-10-06T10:20:04Z

Sounds good to me!

In looking at it, there are a lot of changes all the way through the stack that probably should have been pushed out sooner.

I haven't done much coding through the summertime though... been slacking lol

2bndy5 · 2024-10-06T10:32:34Z

It is a bit difficult to remember what changes are unreleased; we currently need to look at the git history since the last tag.

If the immanent release crusade goes well, then I could set up a CI workflow to output unreleased changes in a job summary (like what I did here for a different project).

The unreleased changes are not going to be kept in the CHANGELOGs because that would be rather cumbersome on local git clones.

TMRh20 · 2024-10-06T10:48:26Z

Hmm, well Its not that hard to do a compare as in v1.4.9...master
but automating things further is pretty nice.

2bndy5 · 2024-10-06T10:51:57Z

It is a little annoying to get to that page. Usually I have to type it manually or first compare a release and change the comparison once the page loads. I'm sure there's a git command too...

2bndy5 · 2024-10-06T13:54:41Z

Release crusade completed

There's going to be some mis-categorization on behalf of git-cliff because we're not using conventional commit messages. For example, in RF24Ethernet, I noticed

But it should really be categorized as "Added". This because the merge commit (nRF24/RF24Ethernet@9b78743 from PR nRF24/RF24Ethernet#49) included the word "Remove" in the commit message's body:

Add non-blocking connect MQTT example

Remove unused variable test

Put timer calculations outside loops

Add new example file to docs

I'm still not sure why the word "Add" (or the PR label ) didn't cause it to be categorized properly into "Added". 🤷🏼‍♂️

2bndy5 force-pushed the refactor-spi-cmds branch from 9c0ba58 to ba12195 Compare July 15, 2024 11:18

2bndy5 changed the title ~~Refactor 1-byte SPI commands~~ Refactor some SPI transactions Jul 18, 2024

2bndy5 marked this pull request as ready for review July 18, 2024 04:48

TMRh20 approved these changes Jul 20, 2024

View reviewed changes

This comment was marked as resolved.

Sign in to view

2bndy5 added 4 commits July 20, 2024 02:53

replace reg | R_REGISTER with just reg

7a53781

`R_REGISTER` is defined as 0. `reg | 0` always results in `reg`

reviewed debug output

0829b87

2bndy5 force-pushed the refactor-spi-cmds branch from 7abc73d to 0829b87 Compare July 20, 2024 09:53

2bndy5 merged commit 1acf968 into master Aug 2, 2024
80 checks passed

2bndy5 deleted the refactor-spi-cmds branch August 2, 2024 09:21

2bndy5 mentioned this pull request Oct 11, 2024

[feature request] support ESP-IDF platform #925

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor some SPI transactions #988

Refactor some SPI transactions #988

2bndy5 commented Jul 15, 2024 •

edited

Loading

github-actions bot commented Jul 15, 2024 •

edited

Loading

2bndy5 commented Jul 15, 2024 •

edited

Loading

2bndy5 commented Jul 15, 2024 •

edited

Loading

2bndy5 commented Jul 15, 2024 •

edited

Loading

TMRh20 commented Jul 16, 2024

2bndy5 commented Jul 16, 2024 •

edited

Loading

2bndy5 commented Jul 18, 2024

TMRh20 commented Jul 20, 2024

2bndy5 commented Jul 20, 2024

TMRh20 commented Jul 20, 2024

This comment was marked as resolved.

TMRh20 commented Jul 27, 2024

2bndy5 commented Jul 27, 2024

2bndy5 commented Aug 2, 2024

TMRh20 commented Aug 2, 2024 via email

2bndy5 commented Oct 6, 2024

TMRh20 commented Oct 6, 2024

2bndy5 commented Oct 6, 2024

TMRh20 commented Oct 6, 2024

2bndy5 commented Oct 6, 2024

2bndy5 commented Oct 6, 2024

Refactor some SPI transactions #988

Refactor some SPI transactions #988

Conversation

2bndy5 commented Jul 15, 2024 • edited Loading

Refactor 1-byte SPI commands

Removed mnemonics bit-wise operations

Additional SPI refactoring

github-actions bot commented Jul 15, 2024 • edited Loading

2bndy5 commented Jul 15, 2024 • edited Loading

2bndy5 commented Jul 15, 2024 • edited Loading

2bndy5 commented Jul 15, 2024 • edited Loading

TMRh20 commented Jul 16, 2024

2bndy5 commented Jul 16, 2024 • edited Loading

2bndy5 commented Jul 18, 2024

TMRh20 commented Jul 20, 2024

2bndy5 commented Jul 20, 2024

TMRh20 commented Jul 20, 2024

This comment was marked as resolved.

TMRh20 commented Jul 27, 2024

2bndy5 commented Jul 27, 2024

2bndy5 commented Aug 2, 2024

TMRh20 commented Aug 2, 2024 via email

2bndy5 commented Oct 6, 2024

TMRh20 commented Oct 6, 2024

2bndy5 commented Oct 6, 2024

TMRh20 commented Oct 6, 2024

2bndy5 commented Oct 6, 2024

2bndy5 commented Oct 6, 2024

Release crusade completed

2bndy5 commented Jul 15, 2024 •

edited

Loading

github-actions bot commented Jul 15, 2024 •

edited

Loading

2bndy5 commented Jul 15, 2024 •

edited

Loading

2bndy5 commented Jul 15, 2024 •

edited

Loading

2bndy5 commented Jul 15, 2024 •

edited

Loading

2bndy5 commented Jul 16, 2024 •

edited

Loading