k_pipe: add-ons to the k_pipe API rewrite #84052

Open
wants to merge 6 commits into base: main
Conversation

@npitre (Collaborator) commented Jan 15, 2025

This implements various cleanups and optimizations I saw the potential for
when I reviewed PR #83283. The original author did quite a lot of good work
already, and I didn't want to block the merging of his own PR with these any
longer. Besides, at some point it becomes easier to convey those things
directly in the code.

Then it was pretty easy to add back the ability to do direct writer-to-reader
transfers and no-ring-buffer operations as the deprecated code did.

@zephyrbot (Collaborator) commented Jan 15, 2025

The following west manifest projects have changed revision in this Pull Request:

Name Old Revision New Revision Diff

All manifest checks OK

Note: This message is automatically posted and updated by the Manifest GitHub Action.


	if (K_TIMEOUT_EQ(timeout, K_NO_WAIT)) {
		return -EAGAIN;
	}

	pipe->waiting++;
	SYS_PORT_TRACING_OBJ_FUNC_BLOCKING(k_pipe, read, pipe, timeout);
-	z_pend_curr(&pipe->lock, *key, waitq, timeout);
+	rc = z_pend_curr(&pipe->lock, *key, waitq, timeout);
Contributor commented:

I didn't utilize this because of this discussion with @andyross:
#83247 (comment)

Contributor replied:

To clarify: I'm not a fan of promiscuous use of z_thread_return_value_set, as it is usually pretty confusing, wastes bytes in the thread struct, and tends to just recapitulate a bunch of state that the IPC mechanisms need to keep anyway (i.e. you generally need some sideband to tell you "why you woke up", and this is just a duplicate). My hope in the long term is that pend() ends up just flagging a boolean result: either you were deliberately awoken (currently a return value of zero), or you bounced for some async reason (and need to return -EAGAIN to your caller).

And in fact that's all this is doing: it just does the wakeup with a result code of zero, and checks for the -EAGAIN which is the default value. So while this is using a feature I personally dislike, it's doing it in a way that would port cleanly to the eventual utopia I imagine.

kernel/pipe.c (outdated)
@@ -110,7 +100,7 @@ int z_impl_k_pipe_write(struct k_pipe *pipe, const uint8_t *data, size_t len, k_
	}

	if (pipe_empty(pipe)) {
-		notify_waiter(&pipe->data);
+		z_sched_wake(&pipe->data, 0, NULL);
Contributor commented:

What is the difference between z_sched_wake() and the z_unpend_*() functions?

Contributor replied:

Layering. The former is synchronized and manages the timeout value and some other bits of state you need to worry about when you wake a thread up.

@andyross previously approved these changes Jan 16, 2025 and left a comment:

This looks clean. Preemptive +1 which people can assume is refreshed post-merge of the parent PR.



@peter-mitsis (Collaborator) left a comment:

I may have missed some subtleties (it won't be the first or last time), but pending additional info, I think there are a few issues with this proposed patch set.

  1. Shouldn't use of z_sched_wake() necessitate a call to z_reschedule() or one of its variants? The thread that is being awakened could be of higher priority than the current thread, so I would think that we would want to switch to it as soon as we can safely do so rather than rely upon some schedule point external to the pipe operation.
  2. The way I read the pipe reading code, it looks like we can wind up in a situation where we have a non-full pipe buffer and waiting writers. Imagine a full pipe buffer (say 100 bytes) with two waiting writers (that each want to write 1 byte) and we read exactly 100 bytes. The first writer will wake, the pipe buffer will be read, eventually the awakened writer will write its 1 byte to the pipe buffer, but no one wakes the 2nd writer.

Nicolas Pitre added 4 commits January 17, 2025 16:22
Simplify the logic, avoid repeated conditionals, avoid superfluous
scheduler calls, make the code more efficient and easier to read.

Signed-off-by: Nicolas Pitre <[email protected]>
Dispense with the call to sys_timepoint_expired() by leveraging
swap_retval to distinguish between notifications and timeouts when
z_pend_curr() returns.

Signed-off-by: Nicolas Pitre <[email protected]>
If there are pending readers, it is best to perform a single data copy
directly into their final destination buffer rather than doing one copy
into the ring buffer just to immediately copy the same data out of it.

Incidentally, this allows for supporting pipes with no ring buffer at all.

The pipe implementation being deprecated has a similar capability, so it is
better to have it here too.

Signed-off-by: Nicolas Pitre <[email protected]>
... now that the new pipe implementation supports it.

Signed-off-by: Nicolas Pitre <[email protected]>
@npitre (Collaborator, Author) commented Jan 17, 2025 via email

@peter-mitsis (Collaborator) commented:

I think that we are almost there. One more thing that I noticed is that the pipe_buf_spec is stored on the stack. This can be an issue on SMP platforms that do not have a coherent cache for the stack (xtensa). This is why the old pipes implementation had a copy of the pipe descriptor embedded within the thread structure (and swap_data pointed to it). I have some more thinking to do, but there might be some potential behavioral issues or asymmetries if k_pipe_read() is called from within an ISR and there are multiple writers waiting.

@npitre (Collaborator, Author) commented Jan 18, 2025

> One more thing that I noticed is that the pipe_buf_spec is stored on the
> stack. This can be an issue on SMP platforms that do not have a coherent
> cache for the stack (xtensa).

Can it be made coherent manually? The thread owning the stack is guaranteed
not to execute while other threads/CPUs access it.

This also means Xtensa cannot read from a pipe to a buffer on the stack either.

One possibility is to simply disable the direct copy path on Xtensa.

> I have some more thinking to do, but there might be some potential
> behavioral issues or asymmetries if k_pipe_read() is called from within
> an ISR and there are multiple writers waiting.

k_pipe_read() from within an ISR cannot sleep. It may only consume data from
the ring buffer, if any; if the ring buffer was full, it wakes up any waiting
writers and returns immediately.

@andyross (Contributor) commented:

FWIW it's not Xtensa as a whole, it's specifically intel_adsp's SMP configurations. The architecture actually has a tunable for snoop-enabled cache, but I'm not aware of anyone that's used it.

And yes: it's best to just gate the feature on !defined(CONFIG_KERNEL_COHERENCE). It's possible to manually manage the cache, but very non-trivial. Not only is it incoherent, there's no bytewise tagging: so if one side writes one byte of a 64 byte cache line and ignores the other 63, a future flush can still clobber work other contexts have done. So you have to align and pad everything. And the cache line size changes between cores (the MTK ones I'm working with right now have 128 byte lines!). Not really worth it.

Honestly that's sort of my feeling for all the zero-copy work in our existing IPC primitives. Feels like a lot of code for not a lot of benefit. Apps that care about performance on that level should probably be passing pointers and queuing them with semaphores. Using a pipe almost always means you're willing to deal with some inefficiency in order to get a clean API.

Nicolas Pitre added 2 commits January 18, 2025 13:14
We were waking up threads but failing to let them run if they are
higher priority. Add the missing calls to z_reschedule().

Also wake up all pending writers as we don't know how many there might
be. It is more efficient to wake them all when the ring buffer is full
before reading from it rather than waking them one by one whenever there is
more room in it.

Thanks to Peter Mitsis for noticing those issues.

Signed-off-by: Nicolas Pitre <[email protected]>
Systems that enabled this option don't have their stacks in coherent
memory. Given our pipe_buf_spec is stored on the stack, and readers may
also have their destination buffer on their stack too, it is not worth
going to the trouble of supporting direct-to-readers copy with them.

Signed-off-by: Nicolas Pitre <[email protected]>
@npitre (Collaborator, Author) commented Jan 18, 2025 via email

@Mattemagikern (Contributor) commented:

> add back the ability to do direct writer-to-reader transfers and no-ring-buffer operations as the deprecated code did.

Do we really want to do this?
Traditionally, pipes serve as buffers between producers and consumers, ensuring a clear separation of concerns. A copy-based implementation guarantees consistent semantics regardless of producer/consumer timing or priority, making it the expected behavior of a pipe.

While the NULL buffer option gives users more flexibility, we should ask whether this optimization is truly worth it and widely applicable. Would it see enough real-world use to justify the added complexity? More importantly, are there better alternatives that should be used instead?

By not offering this option, we might actually guide users toward a more suitable data structure for zero-copy communication, rather than complicating the pipe abstraction.

I'm not necessarily opposed to this change; I just want to make sure we've fully considered the implications.
