-
Notifications
You must be signed in to change notification settings - Fork 322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] IPC4 fails to set up stream metadata correctly #8639
Comments
@marcinszkudlinski can you comment on a solution ? |
Yes, I know there are inconsistences at number of channels. that's why I introduced a WA to mtl 007 branch:
problems are being investigated |
Add @andrula-song as assignee given expert-seeming comments in #8571 |
@ujfalusi can you give this some priority. Thanks. @plbossart fyi, driver may be sending incorrect IPC4 data, we are not sure yet. |
@andyross, can you attach kernel and firmware logs with pointers where things go south? I'm not sure if I follow the description. pipelines are prepared in sequence which means that one side of the module likely will not prepared when it is prepared. I'm not sure what is the direction in this case. The code for |
Adding @cujomalainey @johnylin76 since @andyross out till Friday. |
@lkoenig is the other person working on this whom might have logs. |
I am afraid there is nothing in the kernel that deals with the number of channels. All information on configurations are provided by topology and sent to firmware as is. I do remember talking with @cujomalainey on the AEC/DMIC integration. The premise for DMICs since GLK is that we ALWAYS deal with 4 channels, and userspace/UCM extract the useful channels based on platform-specific channel maps. But if the DMIC input is provided to AEC, then how would the firmware know which channels are to be dealt with? Conversely the echo reference is two channels, so this really begs the question on how the echo reference is mapped to which input channels for processing. So yes there's clearly a 2:4 mismatch in the AEC integration... |
@marcinszkudlinski @plbossart it sounds like we could add a topology tuple(s) for AEC DMIC channel mapping. i.e. AEC echo ref is channels mask 0x3 |
To clarify: the DMIC source to AEC does indeed represent itself correctly as 4 channels. And that's indeed a problem that needs solving (right now AEC simply drops the second two). But at the source/sink metadata level, everything is correct. It's promising us 4x s16 @48khz and delivering the expected 8 bytes per 1/48000 seconds. This bug is that the reference input (demuxed from the playback pipeline by the copier) is also listing 4 channels, even though it's clearly only providing two. And AFAICT the topology shows two there. I don't know where the "4" is coming from. |
@johnylin76 has been doing some excellent work in CRAS to enable easy dynamic configuration of SOF. The main target is DRC/EQ configs from CRAS format. That being said, it would not be hard to extend that to a channel map which we have on hand already and write that down to the firmware. We could even update it for WF vs UF mics. |
Actually I can't! :) The symptom if you're missing the alignment_constants calls is just ring buffer madness downstream of the component. The only symptom is that it sounds like junk. The proximate cause there seems to be (from code reading, I won't have a chance to try this until I get back to the box on Friday night) just uninitialized data in the struct. The constructor for the audio_stream in the buffer never sets the values, leaving them either at the calloc() zeros or heap garbage, and neither is correct somehow. (The actual implementation of set_alignment_constants computes values from the arguments, it can't just be left at a zero). I'm hopeful that just making sure the initialization code calls this with reasonable values is going to be enough. Then we can go around and remove all the seemingly-stubbed calls to this function[1] from all the components I see have had them added. :) The symptom from the incorrect channel count is just short bytes. The AEC falls behind trying to read 10ms of data from the reference pipe for every 10ms of capture, so it iterates only once in 20ms, producing a flood of errors at the upstream DAI and otherwise being easier to diagnose. [1] Most of them pass "1" as the needed alignment, which I can guarantee is incorrect for existing formats as Xtensa won't do a misaligned 16/32 bit load except in a handful of instructions with special addressing. |
That would work, and/or we could trivially do that internal to the component too with a control (somewhat oddly tuples seem harder to get to in component code than controls, or I just haven't found the right API yet). I think the bigger question is policy: there's a software constraint (AEC can only process two channels) and a hardware behavior (there are four mic channels). We need a policy decision as to which channels to process and how to expose that to host code via PCMs. Bluntly: why are there four microphones and what are we supposed to do with them? |
That's just a(nother![1]) workaround at the component level though. The component isn't responsible for configuring its input sources, right? The copier/DAI for the reference input clearly is two channel, the sof_source created from it should show the same metadata. How do we fix this so the pipeline code does it correctly? [1] I'll absolutely pull this in to the current workaround patch, obviously. It's cleaner than detecting the mismatch and calling source_set_channels(), as at least it's sourced from the module base_cfg (which I assume comes out of topology?) |
Just to restate my perspective here: the incorrect channel count and wrong alignment data are fields in a |
IPC4 fails to set up stream metadata on the buffer objects at pipeline creation time (see Issue thesofproject#8639). Traditionally components have been responsible for doing this. The way this works here is that during the init() call, we cache the relevant ipc4_audio_format records from the module "extended config" (which is untyped byte-packed data, I think it's raw bytes from the host?) that correspond to our connected streams. This config struct gets freed after initialization, so it has to be done then. Then at prepare time, we must use ipc4_update_source/sink_format() to set values on the (incompletely initialized) streams. Similarly, we have to call audio_stream_init_alignment_constants(), otherwise needed values on the audio stream remain uninitialized. It's not really documented what happens if our settings disagree with the component on the other side of the buffer! Presumably that would be a fatal topology bug, but nothing is set up to detect it. In fact in practice most other components DON'T do this, so our settings win (and thus we must do this, or else the rest of the stream sees garbage and misbehaves, generally with buffer overruns or short reads). Finally, there is somewhat... unique handling needed to get correct IBS/OBS buffer sizing. We have two inputs with different stream formats, and thus different minimum block sizes. The "base" module config only has space for one IBS value. So the pipeline code has an "extended" config with separate IBS per pin, but it will only use it if it knows about it, which it doesn't by default because module initialization throws away the data! So it falls to us to duplicate a copy and store it ourselves, in a separate field from where we found it. Also ntoe: The "pin" index (the value of the pin_index field of the ipc4_in/output_pin_format structs seen in module config at init() time) and the "source" index (the ordering of the sources[] argument to prepare()/process()) ARE PRESENTED BACKWARDS! And I don't see any API to tell me which is which (note that the ordering of the connected buffers in the comp_dev sink/source lists is a THIRD convention, and matches sources[]/sinks[]). Signed-off-by: Andy Ross <[email protected]>
I spent most of today reverse-engineering the changes in the most recent mtl-007-drop-stable and can duplicate results on my tree. Honestly, this bug has to stay open. Here's the isolated workaround patch on top of my temporary tree (I'll get it rebased on main, retested on mt8195, and submitted tomorrow): andyross@e3345b7 First: please read the commit message (duplicated in comments) carefully. The details here do seem extremely complicated. Second: please check my work. It's certainly possible I've missed something. The root causes here were, in full:
My position is that none of that code should be required to be present in a working component. Those tasks are all the job of the IPC/module/pipeline setup code. If I'm wrong, are any of those requirements documented? This all seems ridiculous to me, though the workarounds are real and do seem to work. |
IPC4 fails to set up stream metadata on the buffer objects at pipeline creation time (see Issue thesofproject#8639). Traditionally components have been responsible for doing this. The way this works here is that during the init() call, we cache the relevant ipc4_audio_format records from the module "extended config" (which is untyped byte-packed data, I think it's raw bytes from the host?) that correspond to our connected streams. This config struct gets freed after initialization, so it has to be done then. Then at prepare time, we must use ipc4_update_source/sink_format() to set values on the (incompletely initialized) streams. Similarly, we have to call audio_stream_init_alignment_constants(), otherwise needed values on the audio stream remain uninitialized. It's not really documented what happens if our settings disagree with the component on the other side of the buffer! Presumably that would be a fatal topology bug, but nothing is set up to detect it. In fact in practice most other components DON'T do this, so our settings win (and thus we must do this, or else the rest of the stream sees garbage and misbehaves, generally with buffer overruns or short reads). Finally, there is somewhat... unique handling needed to get correct IBS/OBS buffer sizing. We have two inputs with different stream formats, and thus different minimum block sizes. The "base" module config only has space for one IBS value. So the pipeline code has an "extended" config with separate IBS per pin, but it will only use it if it knows about it, which it doesn't by default because module initialization throws away the data! So it falls to us to duplicate a copy and store it ourselves, in a separate field from where we found it. Also ntoe: The "pin" index (the value of the pin_index field of the ipc4_in/output_pin_format structs seen in module config at init() time) and the "source" index (the ordering of the sources[] argument to prepare()/process()) ARE PRESENTED BACKWARDS! And I don't see any API to tell me which is which (note that the ordering of the connected buffers in the comp_dev sink/source lists is a THIRD convention, and matches sources[]/sinks[]). Signed-off-by: Andy Ross <[email protected]>
Audio stream objects are generally allocated out of zero'd heap memory, but the computed alignment fields need non-zero values to have correct behavior. These were being left as garbage. Set them to a byte and frame alignmnt of 1, as this corresponds to pervasive convention in the tree. But note that a byte alignment of 1 is actually technically incorrect, as all existing DSP targets are on architectures which don't allow misaligned loads. But existing component code is universally coded correctly anyway. (It's worth pointing out that "set_alignment_constants()" is now exposed as an API call on abstracted source/sink objects. But the only current implementation is on audio_stream. Future implementations will need to correctly initialize themselves.) This is a partial fix for Issue thesofproject#8639 Signed-off-by: Andy Ross <[email protected]>
Audio stream objects are generally allocated out of zero'd heap memory, but the computed alignment fields need non-zero values to have correct behavior. These were being left as garbage. Set them to a byte and frame alignment of 1, as this corresponds to pervasive convention in the tree. But note that a byte alignment of 1 is actually technically incorrect, as all existing DSP targets are on architectures which don't allow misaligned loads. But existing component code is universally coded correctly anyway. (It's worth pointing out that "set_alignment_constants()" is now exposed as an API call on abstracted source/sink objects. But the only current implementation is on audio_stream. Future implementations will need to correctly initialize themselves.) This is a partial fix for Issue thesofproject#8639 Signed-off-by: Andy Ross <[email protected]>
Audio stream objects are generally allocated out of zero'd heap memory, but the computed alignment fields need non-zero values to have correct behavior. These were being left as garbage. Set them to a byte and frame alignmnt of 1, as this corresponds to pervasive convention in the tree. But note that a byte alignment of 1 is actually technically incorrect, as all existing DSP targets are on architectures which don't allow misaligned loads. But existing component code is universally coded correctly anyway. (It's worth pointing out that "set_alignment_constants()" is now exposed as an API call on abstracted source/sink objects. But the only current implementation is on audio_stream. Future implementations will need to correctly initialize themselves.) This is a partial fix for Issue thesofproject#8639 Signed-off-by: Andy Ross <[email protected]>
Been working on that, actually. See #8685 for IMHO a much cleaner take on the "base_cfg_ext" problem that doesn't require module code do weird gymnastics managing memory in the core module layers. It is, admittedly, pretty opinionated. But it's a lot smaller, much simpler for component code, and as mentioned I do feel like we've gotten stuck with a really bad API paradigm here. With that, the big workaround patch in the AEC series linked above is down to just one line that looks like:
And I'm pretty sure I see where the problem is there and hope to have a fix out tomorrow. |
Audio stream objects are generally allocated out of zero'd heap memory, but the computed alignment fields need non-zero values to have correct behavior. These were being left as garbage. Set them to a byte and frame alignmnt of 1, as this corresponds to pervasive convention in the tree. But note that a byte alignment of 1 is actually technically incorrect, as all existing DSP targets are on architectures which don't allow misaligned loads. But existing component code is universally coded correctly anyway. (It's worth pointing out that "set_alignment_constants()" is now exposed as an API call on abstracted source/sink objects. But the only current implementation is on audio_stream. Future implementations will need to correctly initialize themselves.) This is a partial fix for Issue thesofproject#8639 Signed-off-by: Andy Ross <[email protected]>
As specified, this function was a bit of a booby trap: it has to be called exactly once, after all other setup that modifies frame size has been done. If it is called in the wrong order, or not at all, any consumers of the frame alignment API on this stream will see garbage data in the computed fields. That's way too much complexity for the component code that needs to use this tool, especially as the impact is not in the component itself (e.g. it's a downstream copier widget with SIMD code that fails). Instead, preserve the two requirements set by this function in the audio_stream struct and recompute them any time any of the upstream values change. That allows this to be safely used from any context. There's a mild complexity with audio_sink/source layer, which is written to directly modify the format (it keeps a pointer into audio_stream of its own) without going through the audio_stream API itself. There, the framework gives us "on_audio_format_set()" callbacks on source and sink, so implement it there. This is a partial fix for Issue thesofproject#8639 Signed-off-by: Andy Ross <[email protected]>
As specified, this function was a bit of a booby trap: it has to be called exactly once, after all other setup that modifies frame size has been done. If it is called in the wrong order, or not at all, any consumers of the frame alignment API on this stream will see garbage data in the computed fields. That's way too much complexity for the component code that needs to use this tool, especially as the impact is not in the component itself (e.g. it's a downstream copier widget with SIMD code that fails). Instead, preserve the two requirements set by this function in the audio_stream struct and recompute them any time any of the upstream values change. That allows this to be safely used from any context. There's a mild complexity with audio_sink/source layer, which is written to directly modify the format (it keeps a pointer into audio_stream of its own) without going through the audio_stream API itself. There, the framework gives us "on_audio_format_set()" callbacks on source and sink, so implement it there. This is a partial fix for Issue thesofproject#8639 Signed-off-by: Andy Ross <[email protected]>
IPC4 fails to set up stream metadata on the buffer objects at pipeline creation time (see Issue thesofproject#8639). Traditionally components have been responsible for doing this at prepare time via calling ipc4_update_source/sink_format() to set values on the (incompletely initialized) streams. In particular, on MTL the reference input stream is presented with an incorrect channel count (4, matching the capture stream; but the reference/playback pipeline has only two channels) and needs to be corrected. Also note: The "pin" index (the value of the pin_index field of the ipc4_in/output_pin_format structs seen in module config at init() time) and the "source" index (the ordering of the sources[] argument to prepare()/process()) ARE PRESENTED BACKWARDS! And I don't see any API to tell me which is which (note that the ordering of the connected buffers in the comp_dev sink/source lists is a THIRD convention, and matches sources[]/sinks[]). Signed-off-by: Andy Ross <[email protected]> [DNM] Google AEC: minimize workaround This is down to just one line now with existing (still in review) PRs applied
As specified, this function was a bit of a booby trap: it has to be called exactly once, after all other setup that modifies frame size has been done. If it is called in the wrong order, or not at all, any consumers of the frame alignment API on this stream will see garbage data in the computed fields. That's way too much complexity for the component code that needs to use this tool, especially as the impact is not in the component itself (e.g. it's a downstream copier widget with SIMD code that fails). Instead, preserve the two requirements set by this function in the audio_stream struct and recompute them any time any of the upstream values change. That allows this to be safely used from any context. There's a mild complexity with audio_sink/source layer, which is written to directly modify the format (it keeps a pointer into audio_stream of its own) without going through the audio_stream API itself. There, the framework gives us "on_audio_format_set()" callbacks on source and sink, so implement it there. This is a partial fix for Issue #8639 Signed-off-by: Andy Ross <[email protected]>
@andyross is this issue good to close now? |
Let's leave this open for another week or two while I get the AEC rework rebased. The actual stream format for multi-output copier objects is still being set incorrectly, because the copier doesn't get the cfg_ext data that would contain the correct value, because it has its own config struct in the protocol where cfg_ext would go. It's a protocol bug basically, the topology isn't transmitted to the firmware in a single format or a single place, every component is rolling its own. Basically I need to write up the protocol stuff in a new bug before closing this. |
Component stream state in IPC4 doesn't seem to be set up reliably. When the sof-mtl-max98357a-rt5682-ssp2-ssp0 topology (built form the mtl-007-drop-stable branch), the Google AEC component is being presented with a reference/playback input source that claims 4 channels, but in fact is only providing enough data for two.
Similarly, the computed "alignment constants" on a buffer's audio stream start out with (I think) garbage, and require that something somewhere call
source/sink_set_alignment_constants()
with arbitrary (and even incorrect, e.g. a byte alignment of 1 was being used!) arguments just to get the fields initialized.The component has always had code present (see this commit for the current form: bab124d) to fix this at prepare() time. But it's fragile, undocumented, got dropped during refactoring and took me DAYS to figure out; the symptoms were just ring buffer overruns in the downstream components that produced "mild distortion".
This just isn't right. Components don't own their sinks/sources and shouldn't ever be responsible for configuring them. The pipeline/IPC layer needs to do this correctly from topology, and in the event that the data is wrong or mismatched, it needs to fail gracefully. We shouldn't ever be in a situation like this where a source promises 8 byte frames per sample period but delivers 4.
The text was updated successfully, but these errors were encountered: