-
Notifications
You must be signed in to change notification settings - Fork 6.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kernel: k_pipe: Rewrite k_pipe API #83283
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a pretty solid path to me. Some notes after a very quick read-through, the big ones being "fix the races" and "use ring_buf if possible". You're also going to (obviously) need test coverage for all the cases here, including ideally a stress that will bang pipes with variant read/write/buffer sizes from multiple threads on each end.
The process and architecture people will want to chime in as to migration strategy. As you'll discover I'm a lot (like... a lot) more tolerant of churn and cowboy evolution than most of the project. My guess is that people are going to demand a more conservative rollout. Maybe give the two APIs distinct names and let them live in the tree simultaneously for a version before flagging the old stuff deprecated. Put each under kconfigs such that they can be independently en/disabled, etc...
Also recognize that treewide deprecation means combing through all the in-tree code to make sure that nothing else is using the old layers, deleting or filtering test cases using the wrong stuff, etc... It's tedious and annoying and everyone will hate you when their precious out-of-tree code stops building; just be prepared. :)
kernel/pipe.c
Outdated
pipe->tail = 0U; | ||
pipe->count = 0U; | ||
|
||
// or -ECONNRESET ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh. You do seem to be trawling pretty deep into errno.h to try to find stuff I haven't seen before. FWIW, I think that's usually more harm than good unless you're extremely careful about documenting what each code means and how a user is expected to react. 99%+ of the time -EINVAL or -EAGAIN is all that's required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ve made sure to document the behavior carefully to address potential ambiguity.
In this case, the use of this specific error code is intended for a scenario where we want to signal to a consumer thread that the current work order or process is no longer valid, without needing to close the pipe or disrupt the threads that rely on it. I believe this is a valid use case that justifies going beyond the more common -EINVAL
or -EAGAIN
.
e46dad2
to
a7b74f2
Compare
Thanks for your initial review @andyross! I see an oportunity here I'd like to explore; you mention to use the |
That seems like an eminently reasonable choice too. Though FWIW if you have a pipe based on pre-existing primitives then its should probably live in lib/ somewhere and not kernel/. Whether the overhead is too high is sort of an app question. As it happens k_sem is really pretty lean and optimal, and I'd guess that kind of tool would benchmark faster than the pipe we have already. But surely you could do better at the bottom of the stack. I love optimization, but don't know that we should demand it. |
In addition to the points that Andy raised, there are a couple of implications that are worth pointing out related to the copying of data.
|
a7b74f2
to
17a7e7e
Compare
I’ve really enjoyed the learning experience of working with the scheduler, particularly with the
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments:
This was mentioned already, but you should consider using the ring_buffer
facility rather than open coding your own. This will save some binary
footprint not to have 2 different implementations linked in. Furthermore,
the most costly part in yours is by far the modulus operator on some targets
which the generic code dispense with.
You say:
I looked into the
ring_buffer.c
and it looks like it is two things in
one? Both an object queue and a ring_buffer.c for bytes.
Not sure what you mean here, however if it is the ring_buf_item_*
stuff
that looks confusing then you may safely ignore it. And better look in
ring_buffer.h
first instead.
The size of the ring_buffer struct alone would almost double the size of
the k_pipe object so I opted to not use it.
Given those fields you will no longer need, the k_pipe object would grow by
only 16 bytes. Unless you are using a gazillon of pipes, the tradeoff is in
favor of code reuse both in terms of binary size and maintainability.
(Since I almost had all the logic in place for the ring-buffer already).
Sorry but such arguments don't count. ;-)
Next, the pipe implementation itself is lacking. If you want to write
10 bytes but there is room for only 5 bytes then you should write those 5
bytes, flag any waiters, and wait up to the timeout to write the remaining 5
bytes. If suddenly you're woken up and there is only room for 3 bytes then
you should write those 3 bytes, flag potential waiters, readjust the timeout
for what remains of the original one, and wait again for the last 2 bytes.
Same logic goes for the reader. Currently you return early when there is not
enough room despite the timeout not being reached.
Timeout readjusting is tricky. You may look at sys_timepoint_timeout()
usage
in kheap.c for an example of how to do it simply.
17a7e7e
to
1357469
Compare
The distinction you’ve pointed out between this implementation and other kernel objects is intentional. By waking up as soon as there’s a chance to send or receive data, it aims to maximize concurrency, reduce blocking, and decrease complexity. I understand that this behavior deviates from existing patterns, which might cause confusion or misaligned expectations. However, this new behavior aligns closely with the concept of a byte-stream, which I believe makes it more intuitive for certain use cases.
The issue of indeterminate time elapsing between wakeup and execution is indeed a valid concern, especially when considering scenarios where the timeout might expire during this window. The current design assumes that the timeout applies to the operation's initiation, not its completion. Achieving precise timing for both initiation and completion would be challenging. A possible middle ground might be to document the timeout behavior explicitly in the API to set clear expectations for users. Regarding data transfers larger than the pipe buffer, this scenario is effectively handled by allowing operations to perform partial reads and writes. This mechanism ensures that even when the data size exceeds the buffer capacity, the transfer can proceed incrementally until completion. |
Ahaha Fair! I’ve taken another look at the implementation and made revisions to utilize the ring_buffer as suggested. While I initially hesitated due to the additional overhead, I agree that the tradeoff in terms of binary size and maintainability justifies the change. Reusing existing infrastructure is the better long-term choice. Regarding the
While I understand the reasoning behind partial writes and adjusting the timeout dynamically, I believe that returning early when there is insufficient space aligns better with the design goals of simplicity and efficiency. Continuously retrying with adjusted timeouts adds complexity (and possibly overhead), which might not be necessary for most use cases. Instead, handling partial writes and managing timeouts at a higher level can keep the implementation cleaner, more predictable and easier to test.
Thank you for the advice, I'll keep it in mind depending on the outcome of this discussion :) |
Something that struck me when I made the move to The
I recommend keeping
What are your thoughts on this? |
There's already a CONFIG_PIPES that gets the pipe code, just have that "select RING_BUFFER". It's true that our kconfigs are sometimes finer grained than they should be, but there's no shame in having the kernel depend on code in lib/ that's what it's there for. |
What are your thoughts on this?
Many things rely on ring_buffer.c already, so I'm rather surprized it is
not always compiled. I'd say it is light and quick to compile so no real
gain in leaving it out, given that bitarray.c is even bigger and always
built.
|
I propose removing the Including these functions in the interface suggests that their results can be relied upon for decision-making, which risks subtle bugs or misuse. If their intended purpose is to provide a "snapshot" for non-critical use cases, their utility seems limited. For a more robust and intuitive interface, I recommend removing these functions entirely, as doing so would simplify the API and reduce potential confusion for users. That said, these functions are currently used in |
aa754c0
to
814edfd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took another pass through. Looking cleaner now. The ring_buf usage looks correct and straightforward. To repeat though: I'm absolutely not above appreciating a rewritten data structure, but if you do it you need to show clear and obvious advantages. :)
As others are pointing out, the semantics of short counts are a little confusing. Zephyr IPC almost always takes timeout arguments, so ideally this code should sit in a loop trying to iteratively read/write until the timeout expires instead of returning on the first byte.
Code that actually does want to see the partial results (which is usually a rare edge case kind of thing, the whole idea of "file descriptor" style I/O is that it synthesizes a synchronous interface over async hardware behavior) can always try to read one byte as a sentinel and follow it with a call using K_NO_WAIT to grab the rest.
814edfd
to
c1f63fa
Compare
The following west manifest projects have changed revision in this Pull Request:
✅ All manifest checks OK Note: This message is automatically posted and updated by the Manifest GitHub Action. |
Add tracing support for the reworked k_pipe API. Signed-off-by: Måns Ansgariusson <[email protected]>
The struct ring_buf is renamed to struct ring_buffer to be able to coexist with the sys/ring_buffer.h header file. Signed-off-by: Måns Ansgariusson <[email protected]>
The struct ring_buf is renamed to struct ring_buffer to be able to coexist with the sys/ring_buffer.h header file. Signed-off-by: Måns Ansgariusson <[email protected]>
The smp_shell module uses k_fifo, but does not include the kernel header file that defines it. This commit adds the missing include. Signed-off-by: Måns Ansgariusson <[email protected]>
Replaced the k_pipe-based implementation in sockpair with ring_buffer based implementation instead. The move to ring_buffer is done to avoid overhead of k_pipe and to align with the new k_pipe API. This does not pose any added risk to concurrency as the read and write functions are protected by semaphores for both spairs. Signed-off-by: Måns Ansgariusson <[email protected]>
Update tests to use the reworked k_pipe API. Signed-off-by: Måns Ansgariusson <[email protected]>
Good evening, everyone! Thanks to @eriktamlin at Percepio, the integration tests for the I believe all feedback has been addressed at this stage. I’m switching the PR from Draft to Ready for Review. Please let me know if there’s anything further that needs attention. Thank you for your time and input! |
@Mattemagikern please update #80539 to track deprecation |
I approved too soon I think. I realized something after reviewing the extension to this by @npitre (see #84052). At some point after the waiter is notified, we need to invoke z_reschedule() to force a schedule point. Otherwise we are relying upon some external action to force a reschedule (such as an interrupt, pending on an object, ...) |
I'll add the `z_reschedule()` to my follo-up patches.
|
The
k_pipe_*
API has been reworked to provide a more consistent andintuitive interface. The new API aims to provide a simple to use byte
stream interface that is more in line with the POSIX pipe API.
The previous API has been deprecated and will be removed in a future
release.
Signed-off-by: Måns Ansgariusson [email protected]