Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] IPC4 fails to set up stream metadata correctly #8639

Open
andyross opened this issue Dec 16, 2023 · 21 comments
Open

[BUG] IPC4 fails to set up stream metadata correctly #8639

andyross opened this issue Dec 16, 2023 · 21 comments
Assignees
Labels
bug Something isn't working as expected IPC4 Issues observed with IPC4 (same IPC as Windows)

Comments

@andyross
Copy link
Contributor

andyross commented Dec 16, 2023

Component stream state in IPC4 doesn't seem to be set up reliably. When the sof-mtl-max98357a-rt5682-ssp2-ssp0 topology (built form the mtl-007-drop-stable branch), the Google AEC component is being presented with a reference/playback input source that claims 4 channels, but in fact is only providing enough data for two.

Similarly, the computed "alignment constants" on a buffer's audio stream start out with (I think) garbage, and require that something somewhere call source/sink_set_alignment_constants() with arbitrary (and even incorrect, e.g. a byte alignment of 1 was being used!) arguments just to get the fields initialized.

The component has always had code present (see this commit for the current form: bab124d) to fix this at prepare() time. But it's fragile, undocumented, got dropped during refactoring and took me DAYS to figure out; the symptoms were just ring buffer overruns in the downstream components that produced "mild distortion".

This just isn't right. Components don't own their sinks/sources and shouldn't ever be responsible for configuring them. The pipeline/IPC layer needs to do this correctly from topology, and in the event that the data is wrong or mismatched, it needs to fail gracefully. We shouldn't ever be in a situation like this where a source promises 8 byte frames per sample period but delivers 4.

@andyross andyross added the bug Something isn't working as expected label Dec 16, 2023
@lgirdwood
Copy link
Member

@marcinszkudlinski can you comment on a solution ?

@marcinszkudlinski
Copy link
Contributor

Yes, I know there are inconsistences at number of channels.
Other parameters are set properly

that's why I introduced a WA to mtl 007 branch:

/* TODO - it does not work, to be checked before merging!!

problems are being investigated

@andyross
Copy link
Contributor Author

Add @andrula-song as assignee given expert-seeming comments in #8571

@lgirdwood
Copy link
Member

@ujfalusi can you give this some priority. Thanks. @plbossart fyi, driver may be sending incorrect IPC4 data, we are not sure yet.

@lgirdwood lgirdwood added the P1 Blocker bugs or important features label Dec 19, 2023
@ujfalusi
Copy link
Contributor

@andyross, can you attach kernel and firmware logs with pointers where things go south? I'm not sure if I follow the description.
If the sof-mtl-max98357a-rt5682-ssp2-ssp0 represents the Google AEC with 4 channels and it should not be like that then the issue is in topology?

pipelines are prepared in sequence which means that one side of the module likely will not prepared when it is prepared. I'm not sure what is the direction in this case.

The code for src/audio/google/google_rtc_audio_processing.c differs quite a bit between main, mtl-007-drop-stable and wherever bab124d is.

@lgirdwood
Copy link
Member

Adding @cujomalainey @johnylin76 since @andyross out till Friday.

@cujomalainey
Copy link
Contributor

@lkoenig is the other person working on this whom might have logs.

@plbossart
Copy link
Member

I am afraid there is nothing in the kernel that deals with the number of channels. All information on configurations are provided by topology and sent to firmware as is.

I do remember talking with @cujomalainey on the AEC/DMIC integration. The premise for DMICs since GLK is that we ALWAYS deal with 4 channels, and userspace/UCM extract the useful channels based on platform-specific channel maps.

But if the DMIC input is provided to AEC, then how would the firmware know which channels are to be dealt with? Conversely the echo reference is two channels, so this really begs the question on how the echo reference is mapped to which input channels for processing.

So yes there's clearly a 2:4 mismatch in the AEC integration...

@lgirdwood
Copy link
Member

@marcinszkudlinski @plbossart it sounds like we could add a topology tuple(s) for AEC DMIC channel mapping. i.e. AEC echo ref is channels mask 0x3

@andyross
Copy link
Contributor Author

To clarify: the DMIC source to AEC does indeed represent itself correctly as 4 channels. And that's indeed a problem that needs solving (right now AEC simply drops the second two). But at the source/sink metadata level, everything is correct. It's promising us 4x s16 @48khz and delivering the expected 8 bytes per 1/48000 seconds.

This bug is that the reference input (demuxed from the playback pipeline by the copier) is also listing 4 channels, even though it's clearly only providing two. And AFAICT the topology shows two there. I don't know where the "4" is coming from.

@cujomalainey
Copy link
Contributor

But if the DMIC input is provided to AEC, then how would the firmware know which channels are to be dealt with? Conversely the echo reference is two channels, so this really begs the question on how the echo reference is mapped to which input channels for processing.

@johnylin76 has been doing some excellent work in CRAS to enable easy dynamic configuration of SOF. The main target is DRC/EQ configs from CRAS format. That being said, it would not be hard to extend that to a channel map which we have on hand already and write that down to the firmware. We could even update it for WF vs UF mics.

@andyross
Copy link
Contributor Author

andyross commented Dec 20, 2023

@andyross, can you attach kernel and firmware logs with pointers where things go south?

Actually I can't! :) The symptom if you're missing the alignment_constants calls is just ring buffer madness downstream of the component. The only symptom is that it sounds like junk. The proximate cause there seems to be (from code reading, I won't have a chance to try this until I get back to the box on Friday night) just uninitialized data in the struct. The constructor for the audio_stream in the buffer never sets the values, leaving them either at the calloc() zeros or heap garbage, and neither is correct somehow. (The actual implementation of set_alignment_constants computes values from the arguments, it can't just be left at a zero). I'm hopeful that just making sure the initialization code calls this with reasonable values is going to be enough. Then we can go around and remove all the seemingly-stubbed calls to this function[1] from all the components I see have had them added. :)

The symptom from the incorrect channel count is just short bytes. The AEC falls behind trying to read 10ms of data from the reference pipe for every 10ms of capture, so it iterates only once in 20ms, producing a flood of errors at the upstream DAI and otherwise being easier to diagnose.

[1] Most of them pass "1" as the needed alignment, which I can guarantee is incorrect for existing formats as Xtensa won't do a misaligned 16/32 bit load except in a handful of instructions with special addressing.

@andyross
Copy link
Contributor Author

it sounds like we could add a topology tuple(s) for AEC DMIC channel mapping. i.e. AEC echo ref is channels mask 0x3

That would work, and/or we could trivially do that internal to the component too with a control (somewhat oddly tuples seem harder to get to in component code than controls, or I just haven't found the right API yet).

I think the bigger question is policy: there's a software constraint (AEC can only process two channels) and a hardware behavior (there are four mic channels). We need a policy decision as to which channels to process and how to expose that to host code via PCMs. Bluntly: why are there four microphones and what are we supposed to do with them?

@yongzhi1
Copy link
Contributor

FYI, for this issue "the Google AEC component is being presented with a reference/playback input source that claims 4 channels, but in fact is only providing enough data for two."

Here is the orginal fix #8452 and the channel work around was removed by e6a3538 on mtl-007-drop-stable.

@andyross
Copy link
Contributor Author

Here is the orginal fix #8452 and the channel work around was removed by e6a3538 on mtl-007-drop-stable.

That's just a(nother![1]) workaround at the component level though. The component isn't responsible for configuring its input sources, right? The copier/DAI for the reference input clearly is two channel, the sof_source created from it should show the same metadata. How do we fix this so the pipeline code does it correctly?

[1] I'll absolutely pull this in to the current workaround patch, obviously. It's cleaner than detecting the mismatch and calling source_set_channels(), as at least it's sourced from the module base_cfg (which I assume comes out of topology?)

@andyross
Copy link
Contributor Author

andyross commented Dec 22, 2023

That's just a(nother![1]) workaround at the component level though.

Just to restate my perspective here: the incorrect channel count and wrong alignment data are fields in a struct audio_stream stream inside a struct comp_buffer. Those objects are not "owned" by the component code (and can't be, since they connect two different components). They are created by the IPC/pipeline layer on the components' behalf, based on topology. They need to be initialized correctly.

andyross added a commit to andyross/sof that referenced this issue Dec 23, 2023
IPC4 fails to set up stream metadata on the buffer objects at pipeline
creation time (see Issue thesofproject#8639).  Traditionally components have been
responsible for doing this.  The way this works here is that during
the init() call, we cache the relevant ipc4_audio_format records from
the module "extended config" (which is untyped byte-packed data, I
think it's raw bytes from the host?) that correspond to our connected
streams.  This config struct gets freed after initialization, so it
has to be done then.

Then at prepare time, we must use ipc4_update_source/sink_format() to
set values on the (incompletely initialized) streams.  Similarly, we
have to call audio_stream_init_alignment_constants(), otherwise needed
values on the audio stream remain uninitialized.

It's not really documented what happens if our settings disagree with
the component on the other side of the buffer!  Presumably that would
be a fatal topology bug, but nothing is set up to detect it.  In fact
in practice most other components DON'T do this, so our settings win
(and thus we must do this, or else the rest of the stream sees garbage
and misbehaves, generally with buffer overruns or short reads).

Finally, there is somewhat... unique handling needed to get correct
IBS/OBS buffer sizing.  We have two inputs with different stream
formats, and thus different minimum block sizes.  The "base" module
config only has space for one IBS value.  So the pipeline code has an
"extended" config with separate IBS per pin, but it will only use it
if it knows about it, which it doesn't by default because module
initialization throws away the data!  So it falls to us to duplicate a
copy and store it ourselves, in a separate field from where we found
it.

Also ntoe: The "pin" index (the value of the pin_index field of the
ipc4_in/output_pin_format structs seen in module config at init()
time) and the "source" index (the ordering of the sources[] argument
to prepare()/process()) ARE PRESENTED BACKWARDS!  And I don't see any
API to tell me which is which (note that the ordering of the connected
buffers in the comp_dev sink/source lists is a THIRD convention, and
matches sources[]/sinks[]).

Signed-off-by: Andy Ross <[email protected]>
@andyross
Copy link
Contributor Author

I spent most of today reverse-engineering the changes in the most recent mtl-007-drop-stable and can duplicate results on my tree. Honestly, this bug has to stay open. Here's the isolated workaround patch on top of my temporary tree (I'll get it rebased on main, retested on mt8195, and submitted tomorrow): andyross@e3345b7

First: please read the commit message (duplicated in comments) carefully. The details here do seem extremely complicated.

Second: please check my work. It's certainly possible I've missed something.

The root causes here were, in full:

  1. IPC4 doesn't initialize the audio_streams in comp_buffers completely. Where IPC3 components seem to have code to call init_alignment_constants() themselves, newer components don't. Really I think this is an old bug in audio_stream: the fields in question can't be set to zero, they need working defaults at creation time. I'll submit a patch.

  2. IPC4 doesn't set module stream data at prepare() time, or does it incompletely. Components are responsible for calling source/sink_update_audio_format() themselves.

  3. After initialization, IPC4 throws away (!) the data needed to determine IBS/OBS values on components with more than one source or sink. Components need to make a copy themselves if they want the pipeline setup to use it, which is... very weird. This breaks the AEC reference stream, which has half the channels of the capture stream and thus needs smaller buffers. This was the source of the persistent falling behind I reported in [BUG] DP scheduler introducing 10ms+ latency to AEC capture streams #8640

My position is that none of that code should be required to be present in a working component. Those tasks are all the job of the IPC/module/pipeline setup code. If I'm wrong, are any of those requirements documented? This all seems ridiculous to me, though the workarounds are real and do seem to work.

andyross added a commit to andyross/sof that referenced this issue Dec 23, 2023
IPC4 fails to set up stream metadata on the buffer objects at pipeline
creation time (see Issue thesofproject#8639).  Traditionally components have been
responsible for doing this.  The way this works here is that during
the init() call, we cache the relevant ipc4_audio_format records from
the module "extended config" (which is untyped byte-packed data, I
think it's raw bytes from the host?) that correspond to our connected
streams.  This config struct gets freed after initialization, so it
has to be done then.

Then at prepare time, we must use ipc4_update_source/sink_format() to
set values on the (incompletely initialized) streams.  Similarly, we
have to call audio_stream_init_alignment_constants(), otherwise needed
values on the audio stream remain uninitialized.

It's not really documented what happens if our settings disagree with
the component on the other side of the buffer!  Presumably that would
be a fatal topology bug, but nothing is set up to detect it.  In fact
in practice most other components DON'T do this, so our settings win
(and thus we must do this, or else the rest of the stream sees garbage
and misbehaves, generally with buffer overruns or short reads).

Finally, there is somewhat... unique handling needed to get correct
IBS/OBS buffer sizing.  We have two inputs with different stream
formats, and thus different minimum block sizes.  The "base" module
config only has space for one IBS value.  So the pipeline code has an
"extended" config with separate IBS per pin, but it will only use it
if it knows about it, which it doesn't by default because module
initialization throws away the data!  So it falls to us to duplicate a
copy and store it ourselves, in a separate field from where we found
it.

Also ntoe: The "pin" index (the value of the pin_index field of the
ipc4_in/output_pin_format structs seen in module config at init()
time) and the "source" index (the ordering of the sources[] argument
to prepare()/process()) ARE PRESENTED BACKWARDS!  And I don't see any
API to tell me which is which (note that the ordering of the connected
buffers in the comp_dev sink/source lists is a THIRD convention, and
matches sources[]/sinks[]).

Signed-off-by: Andy Ross <[email protected]>
andyross added a commit to andyross/sof that referenced this issue Dec 23, 2023
Audio stream objects are generally allocated out of zero'd heap
memory, but the computed alignment fields need non-zero values to have
correct behavior.  These were being left as garbage.

Set them to a byte and frame alignmnt of 1, as this corresponds to
pervasive convention in the tree.  But note that a byte alignment of 1
is actually technically incorrect, as all existing DSP targets are on
architectures which don't allow misaligned loads.  But existing
component code is universally coded correctly anyway.

(It's worth pointing out that "set_alignment_constants()" is now
exposed as an API call on abstracted source/sink objects.  But the
only current implementation is on audio_stream.  Future
implementations will need to correctly initialize themselves.)

This is a partial fix for Issue thesofproject#8639

Signed-off-by: Andy Ross <[email protected]>
andyross added a commit to andyross/sof that referenced this issue Dec 23, 2023
Audio stream objects are generally allocated out of zero'd heap
memory, but the computed alignment fields need non-zero values to have
correct behavior.  These were being left as garbage.

Set them to a byte and frame alignment of 1, as this corresponds to
pervasive convention in the tree.  But note that a byte alignment of 1
is actually technically incorrect, as all existing DSP targets are on
architectures which don't allow misaligned loads.  But existing
component code is universally coded correctly anyway.

(It's worth pointing out that "set_alignment_constants()" is now
exposed as an API call on abstracted source/sink objects.  But the
only current implementation is on audio_stream.  Future
implementations will need to correctly initialize themselves.)

This is a partial fix for Issue thesofproject#8639

Signed-off-by: Andy Ross <[email protected]>
@kv2019i
Copy link
Collaborator

kv2019i commented Jan 2, 2024

In addition to #8672 @andyross submitted, there's #8575 to address point (3) in upstream.

@yongzhi1 yongzhi1 added the IPC4 Issues observed with IPC4 (same IPC as Windows) label Jan 2, 2024
andyross added a commit to andyross/sof that referenced this issue Jan 3, 2024
Audio stream objects are generally allocated out of zero'd heap
memory, but the computed alignment fields need non-zero values to have
correct behavior.  These were being left as garbage.

Set them to a byte and frame alignmnt of 1, as this corresponds to
pervasive convention in the tree.  But note that a byte alignment of 1
is actually technically incorrect, as all existing DSP targets are on
architectures which don't allow misaligned loads.  But existing
component code is universally coded correctly anyway.

(It's worth pointing out that "set_alignment_constants()" is now
exposed as an API call on abstracted source/sink objects.  But the
only current implementation is on audio_stream.  Future
implementations will need to correctly initialize themselves.)

This is a partial fix for Issue thesofproject#8639

Signed-off-by: Andy Ross <[email protected]>
@andyross
Copy link
Contributor Author

andyross commented Jan 3, 2024

Been working on that, actually. See #8685 for IMHO a much cleaner take on the "base_cfg_ext" problem that doesn't require module code do weird gymnastics managing memory in the core module layers. It is, admittedly, pretty opinionated. But it's a lot smaller, much simpler for component code, and as mentioned I do feel like we've gotten stuck with a really bad API paradigm here.

With that, the big workaround patch in the AEC series linked above is down to just one line that looks like:

#ifdef CONFIG_IPC_MAJOR_4
	ipc4_update_source_format(sources[0], &mod->priv.cfg.input_pins[1].audio_fmt);
#endif

And I'm pretty sure I see where the problem is there and hope to have a fix out tomorrow.

andyross added a commit to andyross/sof that referenced this issue Jan 3, 2024
Audio stream objects are generally allocated out of zero'd heap
memory, but the computed alignment fields need non-zero values to have
correct behavior.  These were being left as garbage.

Set them to a byte and frame alignmnt of 1, as this corresponds to
pervasive convention in the tree.  But note that a byte alignment of 1
is actually technically incorrect, as all existing DSP targets are on
architectures which don't allow misaligned loads.  But existing
component code is universally coded correctly anyway.

(It's worth pointing out that "set_alignment_constants()" is now
exposed as an API call on abstracted source/sink objects.  But the
only current implementation is on audio_stream.  Future
implementations will need to correctly initialize themselves.)

This is a partial fix for Issue thesofproject#8639

Signed-off-by: Andy Ross <[email protected]>
andyross added a commit to andyross/sof that referenced this issue Jan 8, 2024
As specified, this function was a bit of a booby trap: it has to be
called exactly once, after all other setup that modifies frame size
has been done.  If it is called in the wrong order, or not at all, any
consumers of the frame alignment API on this stream will see garbage
data in the computed fields.  That's way too much complexity for the
component code that needs to use this tool, especially as the impact
is not in the component itself (e.g. it's a downstream copier widget
with SIMD code that fails).

Instead, preserve the two requirements set by this function in the
audio_stream struct and recompute them any time any of the upstream
values change.  That allows this to be safely used from any context.

There's a mild complexity with audio_sink/source layer, which is
written to directly modify the format (it keeps a pointer into
audio_stream of its own) without going through the audio_stream API
itself.  There, the framework gives us "on_audio_format_set()"
callbacks on source and sink, so implement it there.

This is a partial fix for Issue thesofproject#8639

Signed-off-by: Andy Ross <[email protected]>
andyross added a commit to andyross/sof that referenced this issue Jan 9, 2024
As specified, this function was a bit of a booby trap: it has to be
called exactly once, after all other setup that modifies frame size
has been done.  If it is called in the wrong order, or not at all, any
consumers of the frame alignment API on this stream will see garbage
data in the computed fields.  That's way too much complexity for the
component code that needs to use this tool, especially as the impact
is not in the component itself (e.g. it's a downstream copier widget
with SIMD code that fails).

Instead, preserve the two requirements set by this function in the
audio_stream struct and recompute them any time any of the upstream
values change.  That allows this to be safely used from any context.

There's a mild complexity with audio_sink/source layer, which is
written to directly modify the format (it keeps a pointer into
audio_stream of its own) without going through the audio_stream API
itself.  There, the framework gives us "on_audio_format_set()"
callbacks on source and sink, so implement it there.

This is a partial fix for Issue thesofproject#8639

Signed-off-by: Andy Ross <[email protected]>
andyross added a commit to andyross/sof that referenced this issue Jan 9, 2024
IPC4 fails to set up stream metadata on the buffer objects at pipeline
creation time (see Issue thesofproject#8639).  Traditionally components have been
responsible for doing this at prepare time via calling
ipc4_update_source/sink_format() to set values on the (incompletely
initialized) streams.

In particular, on MTL the reference input stream is presented with an
incorrect channel count (4, matching the capture stream; but the
reference/playback pipeline has only two channels) and needs to be
corrected.

Also note: The "pin" index (the value of the pin_index field of the
ipc4_in/output_pin_format structs seen in module config at init()
time) and the "source" index (the ordering of the sources[] argument
to prepare()/process()) ARE PRESENTED BACKWARDS!  And I don't see any
API to tell me which is which (note that the ordering of the connected
buffers in the comp_dev sink/source lists is a THIRD convention, and
matches sources[]/sinks[]).

Signed-off-by: Andy Ross <[email protected]>

[DNM] Google AEC: minimize workaround

This is down to just one line now with existing (still in review) PRs
applied
lgirdwood pushed a commit that referenced this issue Jan 9, 2024
As specified, this function was a bit of a booby trap: it has to be
called exactly once, after all other setup that modifies frame size
has been done.  If it is called in the wrong order, or not at all, any
consumers of the frame alignment API on this stream will see garbage
data in the computed fields.  That's way too much complexity for the
component code that needs to use this tool, especially as the impact
is not in the component itself (e.g. it's a downstream copier widget
with SIMD code that fails).

Instead, preserve the two requirements set by this function in the
audio_stream struct and recompute them any time any of the upstream
values change.  That allows this to be safely used from any context.

There's a mild complexity with audio_sink/source layer, which is
written to directly modify the format (it keeps a pointer into
audio_stream of its own) without going through the audio_stream API
itself.  There, the framework gives us "on_audio_format_set()"
callbacks on source and sink, so implement it there.

This is a partial fix for Issue #8639

Signed-off-by: Andy Ross <[email protected]>
@abonislawski
Copy link
Member

@andyross is this issue good to close now?

@andyross
Copy link
Contributor Author

Let's leave this open for another week or two while I get the AEC rework rebased. The actual stream format for multi-output copier objects is still being set incorrectly, because the copier doesn't get the cfg_ext data that would contain the correct value, because it has its own config struct in the protocol where cfg_ext would go. It's a protocol bug basically, the topology isn't transmitted to the firmware in a single format or a single place, every component is rolling its own.

Basically I need to write up the protocol stuff in a new bug before closing this.

@abonislawski abonislawski removed the P1 Blocker bugs or important features label Mar 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working as expected IPC4 Issues observed with IPC4 (same IPC as Windows)
Projects
None yet
Development

No branches or pull requests

10 participants