-
Notifications
You must be signed in to change notification settings - Fork 6.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
k_pipe: add-ons to the k_pipe API rewrite #84052
base: main
Are you sure you want to change the base?
Conversation
The following west manifest projects have changed revision in this Pull Request:
✅ All manifest checks OK Note: This message is automatically posted and updated by the Manifest GitHub Action. |
|
||
if (K_TIMEOUT_EQ(timeout, K_NO_WAIT)) { | ||
return -EAGAIN; | ||
} | ||
|
||
pipe->waiting++; | ||
SYS_PORT_TRACING_OBJ_FUNC_BLOCKING(k_pipe, read, pipe, timeout); | ||
z_pend_curr(&pipe->lock, *key, waitq, timeout); | ||
rc = z_pend_curr(&pipe->lock, *key, waitq, timeout); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't utalize this because of this discussion with @andyross:
#83247 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify: I'm not a fan of promiscuous use of z_thread_return_value_set as it usually pretty confusing, wastes bytes in the thread struct, and tends to just recapitulate a bunch of state that the IPC mechanisms need to keep anyway (i.e. you generally need to have some sideband to tell you "why you woke up" and this is just a duplicate). My hope in the long term is that pend() ends up just flagging a boolean result: either you were deliberately awoken (currently a return value of zero), or you bounced for some async reason (and need to return -EAGAIN to your caller).
And in fact that's all this is doing: it just does the wakeup with a result code of zero, and checks for the -EAGAIN which is the default value. So while this is using a feature I personally dislike, it's doing it in a way that would port cleanly to the eventual utopia I imagine.
kernel/pipe.c
Outdated
@@ -110,7 +100,7 @@ int z_impl_k_pipe_write(struct k_pipe *pipe, const uint8_t *data, size_t len, k_ | |||
} | |||
|
|||
if (pipe_empty(pipe)) { | |||
notify_waiter(&pipe->data); | |||
z_sched_wake(&pipe->data, 0, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between the z_sched_wake
& z_upend_*
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Layering. The former is synchronized and manages the timeout value and some other bits of state you need to worry about when you wake a thread up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks clean. Preemptive +1 which people can assume is refreshed post-merge of the parent PR.
|
||
if (K_TIMEOUT_EQ(timeout, K_NO_WAIT)) { | ||
return -EAGAIN; | ||
} | ||
|
||
pipe->waiting++; | ||
SYS_PORT_TRACING_OBJ_FUNC_BLOCKING(k_pipe, read, pipe, timeout); | ||
z_pend_curr(&pipe->lock, *key, waitq, timeout); | ||
rc = z_pend_curr(&pipe->lock, *key, waitq, timeout); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify: I'm not a fan of promiscuous use of z_thread_return_value_set as it usually pretty confusing, wastes bytes in the thread struct, and tends to just recapitulate a bunch of state that the IPC mechanisms need to keep anyway (i.e. you generally need to have some sideband to tell you "why you woke up" and this is just a duplicate). My hope in the long term is that pend() ends up just flagging a boolean result: either you were deliberately awoken (currently a return value of zero), or you bounced for some async reason (and need to return -EAGAIN to your caller).
And in fact that's all this is doing: it just does the wakeup with a result code of zero, and checks for the -EAGAIN which is the default value. So while this is using a feature I personally dislike, it's doing it in a way that would port cleanly to the eventual utopia I imagine.
kernel/pipe.c
Outdated
@@ -110,7 +100,7 @@ int z_impl_k_pipe_write(struct k_pipe *pipe, const uint8_t *data, size_t len, k_ | |||
} | |||
|
|||
if (pipe_empty(pipe)) { | |||
notify_waiter(&pipe->data); | |||
z_sched_wake(&pipe->data, 0, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Layering. The former is synchronized and manages the timeout value and some other bits of state you need to worry about when you wake a thread up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may have missed some subtleties (won't be the first or last time), but pending additional info there are I think a few issues with this proposed patch set.
- Shouldn't use of z_sched_wake() necessitate a call to z_reschedule() or one of its variants? The thread that is being awakened could be of higher priority than the current thread, so I would think that we would want to switch to it as soon as we can safely do so rather than rely upon some schedule point external to the pipe operation.
- The way I read the pipe reading code, it looks like we can wind up in a situation where we have a non-full pipe buffer and waiting writers. Imagine a full pipe buffer (say 100 bytes) with two waiting writers (that each want to write 1 byte) and we read exactly 100 bytes. The first writer will wake, the pipe buffer will be read, eventually the awakened writer will write its 1 byte to the pipe buffer, but no one wakes the 2nd writer.
Simplify the logic, avoid repeated conditionals, avoid superfluous scheduler calls, make the code more efficient and easier to read. Signed-off-by: Nicolas Pitre <[email protected]>
Dispense with the call to sys_timepoint_expired() by leveraging swap_retval to distinguish between notifications and timeouts when z_pend_curr() returns. Signed-off-by: Nicolas Pitre <[email protected]>
If there are pending readers, it is best to perform a single data copy directly into their final destination buffer rather than doing one copy into the ring buffer just to immediately copy the same data out of it. Incidentally, this allows for supporting pipes with no ring buffer at all. The pipe implementation being deprecated has a similar capability so better have it here too. Signed-off-by: Nicolas Pitre <[email protected]>
... now that the new pipe implementation supports it. Signed-off-by: Nicolas Pitre <[email protected]>
both issues were real and should now be fixed.
|
I think that we are almost there. One more thing that I noticed is that the pipe_buf_spec is stored on the stack. This can be an issue on SMP platforms that do not have a coherent cache for the stack (xtensa). This is why the old pipes implementation had a copy of the pipe descriptor embedded within the thread structure (and swap_data pointed to it). I have some more thinking to do, but there might be some potential behavioral issues or asymmetries if k_pipe_read() is called from within an ISR and there are multiple writers waiting. |
Can it be made coherent manually? The thread owning the stack is guaranteed This also means Xtensa cannot read from a pipe to a buffer on the stack either. One possibility is to simply disable the direct copy path on Xtensa.
|
FWIW it's not Xtensa as a whole, it's specifically intel_adsp's SMP configurations. The architecture actually has a tunable for snoop-enabled cache, but I'm not aware of anyone that's used it. And yes: it's best to just gate the feature on Honestly that's sort of my feeling for all the zero-copy work in our existing IPC primitives. Feels like a lot of code for not a lot of benefit. Apps that care about performance on that level should probably be passing pointers and queuing them with semaphores. Using a pipe almost always means you're willing to deal with some inefficiency in order to get a clean API. |
We are waking up threads but failed to let them run if they are higher priority. Add missing calls to z_reschedule(). Also wake up all pending writers as we don't know how many there might be. It is more efficient to wake them all when the ring buffer is full before reading from it rather than waking them one by one whenever there is more room in it. Thanks to Peter Mitsis for noticing those issues. Signed-off-by: Nicolas Pitre <[email protected]>
Systems that enabled this option don't have their stacks in coherent memory. Given our pipe_buf_spec is stored on the stack, and readers may also have their destination buffer on their stack too, it is not worth going to the trouble of supporting direct-to-readers copy with them. Signed-off-by: Nicolas Pitre <[email protected]>
CONFIG_KERNEL_COHERENCE awareness added.
|
Do we really want to do this? While the NULL buffer option gives users more flexibility, we should ask whether this optimization is truly worth it and widely applicable. Would it see enough real-world use to justify the added complexity? More importantly, are there better alternatives that should be used instead? By not offering this option, we might actually guide users toward a more suitable data structure for zero-copy communication, rather than complicating the pipe abstraction. I'm not necessarily opposed to this change—I just want to make sure we've fully considered the implications. |
This implements various cleanups and optimizations I saw the potential for
when I reviewed PR #83283. The original author did quite a lot of good work
already and I didn't want to block the merging of his own PR with these any
longer. Furthermore, it becomes easier to convey those things directly in
the code at some point.
Then it was pretty easy to add back the ability to do direct writer-to-reader
transfers and no-ring-buffer operations as the deprecated code did.