-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
snd_pcm_delay() reporting unexpected values #4781
Comments
@RanderWang Can you check? Anythinig I'm missing in the test procedure? |
I am working on it. The reason is related to both fw and kerne driver. We change driver logic : send ipc start first or start host dma first |
[83694.972961] snd_sof 0000:00:1f.3: ipc tx 0x13000004|0x1: GLB_SET_PIPELINE_STATE [data size: 12] [83694.984310] copier: copier_comp_trigger: comp:0 0x4 copier_comp_trigger() cmd 7 FW start to calculate position for HDA host dma stream at 83694.984376 for stream start msg, but HDA stream was started at 83694.993907 in host kernel, almost 10ms later, so at least 480 samples error |
the position in the firmware should be updated when data is received, not when the pipeline is started? |
this is for stream_start_offset which is to remove the dai link count before host stream is started like mixer case or dai pipeline is started before host pipeline. Yes, your idea is in my shop list. Current design follows REF FW Sleep 10 MS after the stream is started in FW should be the root reason /* wait for IPCs to complete on other cores and be nice to any LL work */
static int ipc_wait_for_compound_msg(void)
{
int try_count = 30; /* timeout out is 30 x 10ms so 300ms for IPC */
while (atomic_read(&msg_data.delayed_reply)) {
k_sleep(Z_TIMEOUT_MS(10));
if (!try_count--) {
atomic_set(&msg_data.delayed_reply, 0);
ipc_cmd_err(&ipc_tr, "ipc4: failed to wait schedule thread");
return IPC4_FAILURE;
}
}
return IPC4_SUCCESS;
} |
Let's discuss this commit: thesofproject/sof@909a327 This commit results to a batch of host warning since host kernel is blocked to start HDA DMA
|
The link start offset and link position is in frames from the hardware, there is no need for conversion. At the start of the stream the link position can be lower than the start offset as the firmware provides a calculated estimate of it by taking the link DMA position and adding a calculated latency through the audio processing. Fixes: 3937a76 ("ASoC: SOF: ipc4-pcm: add delay function support") Link: thesofproject#4781 Reported-by: Kai Vehmanen <[email protected]> Signed-off-by: Peter Ujfalusi <[email protected]>
@kv2019i, @RanderWang, can you try this patch: |
I added comments in your commit. The issue is a little tricky, we need to refine both kernel and FW. |
is https://github.com/thesofproject/linux/blob/topic/sof-dev/sound/soc/sof/intel/hda-stream.c#L1078 correct ? if (pos >= hstream->bufsize)
pos = 0; // pos -= hstream->bufsize; ? |
I add some log and find that pos <= hstream->bufsize, never > hstream->bufsize, so no problem |
@RanderWang, it surely looks like that we have different units, the stream_start_offset do appear to be set in bytes by the firmware. |
PPLCLLPL/U is in frame, I checked the log. Thanks very much! |
PPLCLLPL/U is incremented by the value written to DGLLPI, no? There is something also odd about the (host) DMA position reporting. We are reading and returning the value of DPIB value in hda_dsp_stream_get_position(), but according to the documentation the DPIB is in bytes and not in frames? |
Yeah, the code is pretty confusing. The DPIB is in bytes but gets converted to frames for PCM (and remains bytes for the compressed position) We should avoid using 'pos' as a snd_pcm_uframe_t and a u64 with this sort of code:
a patch would certainly improve the code hint hint wink wink. |
The link start offset and link position is in frames from the hardware, there is no need for conversion. At the start of the stream the link position can be lower than the start offset as the firmware provides a calculated estimate of it by taking the link DMA position and adding a calculated latency through the audio processing. Fixes: 3937a76 ("ASoC: SOF: ipc4-pcm: add delay function support") Link: thesofproject#4781 Reported-by: Kai Vehmanen <[email protected]> Signed-off-by: Peter Ujfalusi <[email protected]>
PPLCLLPL/U are meant to be coupled with the GP DMA for tracking the linear link position associated with the DMA channel, so it has nothing to do with DGLLPI which is for hda dma |
@RanderWang I can confirm thesofproject/sof#8756 makes the reporting values sane again:
This doesn't yet tell how accurate the delay estimate is, but at least the values look reasonable now. I'll add some more options to the debug tool to analyze the delay behaviour. |
The alsa-conformance-tool branch at https://github.com/kv2019i/cros-audiotest/tree/topic/pcm-delay-debug has been updated with one more patch to add delay-corrected playback position to debug output. With @RanderWang you fix to FW side (thesofproject/sof#8756), result now looks like (two columns on the right, PLAYED_ADJ and DIFF):
|
@kv2019i thanks ! I can't understand PLAYED_ADJ and DIFF, what are they ? And what's the correct data ? |
I download the conformance test and debug it. |
@RanderWang And just to clarify, the results look good with your fix. The PLAYED_ADJ is the actual playback position (appl_ptr - delay), and the DIFF is the difference between two updates. In correct operation, the DIFF should match the time difference recorded in the first column. So when host see time advance by 1ms, the actual playback should advance by 1ms. One example line:
So the hw_ptr moves by 48 samples (=1ms) and as measured by host system clock, this happened 0.001005210sec since last update (so roughly equals 1ms). The PLAYED field shows that 960 samples has been transferred to the DSP. Using the delay reported by SOF driver, we can now calculate that the actual sample played out to the codec at this point is 693, so there is 960-693=267 delay caused by DSP internal pipelines and link DMA buffering. What this test cannot validate is whether the 267 sample delay is accurate, this we have to test with other means, but at least we can check the updates are roughly correct and e.g. the PLAYED_ADJ progresses as expected. Specifically, due to DMA bursts, hw-ptr can advance much faster than real-time, but the PLAYED_ADJ should stay stable even during bursts. |
My findings so far:
Furthermore: I don't get the logic to pause/resume: I start to think that we should disable the reporting either by setting the fw register area's ABI version to 0 (ipc4_abi_ver_offset) or to unconditionally set teh stream_start_offset to |
Not following. The delay is the difference between the DPIB position for the host DMA and the hardware counters on the link DMA. Why would we care about the "PPHCLDPL/U for the host DMA" ? |
PPHCLDPL/U is the host DMA position - it is the Linear DMA Position, the same counter as LLP (Linear Link Position), just on the other side of the DSP. |
If you report the hw_ptr using DPIB and the delay with another register, you are asking for trouble... |
imho, we would use the correct tool for the task. The delay is after all: LDP-(LLP+start_offset), iow, the number of frames sent to DSP minus the number of frames left the DSP (on playback) |
the ALSA buffer cannot be smaller than the DSP buffer, it would make no sense to do so powerwise... And while the logic of keeping symmetry between host and link makes sense at a high level, it also adds an open: how do we know if the LDP and DPIB report the same thing? It's possible that they do, but it's also possible that they don't. On some platforms the DPIB stuff is problematic. Where I am going is that we have to be consistent and report the number of bytes read with the SAME mechanism for hw_ptr and delay. Using separate registers adds an uncertainty on how the hardware actually works. |
@plbossart, the delay has nothing to do with the DSP buffer. Well it has in some degree. How can we use DPIB to report in the 5ms buffer case 8ms delay when we wrap with the buffer size? We cannot. The pause/resume and underrun makes this whole delay reporting defunct in the first place, so it should be disabled for now, it is unreliable no matter what method we are using to track the host ptr. The firmware uses 'unrelated' dai_posn in the start offset calculation and ignores the comp_posn, which is the LLP. |
If the ALSA buffer is based on 5ms, it means you care about the latency, and adding an additional 5ms in the DSP processing is just plain silly. It makes no sense to add more wake-ups on the host for a given DSP latency, or more buffers on the DSP for a given host latency. The DPIB is converted anyways to a linear value when it wraps around. |
How does PA/PW knows the number of components in the DSP for a given PCM? It is possible that the next tplg version will include more things or less for the same path.
I'm fine using the DPIB as proposed by #4791 but you have objected that solution. It is a good indirect approximation. Yes, we have code for DPIB, it does some calculation and stuff. The way the hw_ptr is calculated is a bit hairy for me, it is not hw_ptr+=pointer ( imho, the ALSA buffer size has no direct relation to the delay. The delay is the number of frames entered to the DSP minus what left it. The other burning issue is the whole uncertainty of what and how many the link transfers around start, pause, resume. I can try to run few DPIB vs LDP tests to see how they compare, but as soon as you start to do pause/resume or underrun (mpv does that even w/o the delay reporting) the whole delay shoots out in some direction and become unusable. |
I only asked a question on how the change from boundary to buffer_size work-around would work. I don't know the answer, I just found this was a controversial move without any justification/explanations.
Not really, it's the difference between the hw_ptr and the link position. If there is no DSP, we can still compute the delay as LPIB-DPIB. That was true 10 years ago before all the SKL+ changes. That said, I asked precisions on how DPIB and LDP values are related. I am not trying to block for the sake of blocking, just making sure we're not adding more issues into an already complicated mix. |
@plbossart, on the other hand there is another side of complication with using LDP which just occurred to me: Let me see the DPIB (#4791) in action. |
This problem is generic to all IPC4 platforms, not just MTL |
Disable delay reporting as a workaround to bug thesofproject/linux#4781 SOF driver checks the ABI version and if it's not 1, delay reporting is disabled. Signed-off-by: Kai Vehmanen <[email protected]>
Disable delay reporting as a workaround to bug thesofproject/linux#4781 SOF driver checks the ABI version and if it's not 1, delay reporting is disabled. Signed-off-by: Kai Vehmanen <[email protected]>
Disable delay reporting as a workaround to bug thesofproject/linux#4781 SOF driver checks the ABI version and if it's not 1, delay reporting is disabled. Signed-off-by: Kai Vehmanen <[email protected]>
Disable delay reporting as a workaround to bug thesofproject/linux#4781 SOF driver checks the ABI version and if it's not 1, delay reporting is disabled. Signed-off-by: Kai Vehmanen <[email protected]>
Disable delay reporting as a workaround to bug thesofproject/linux#4781 SOF driver checks the ABI version and if it's not 1, delay reporting is disabled. Signed-off-by: Kai Vehmanen <[email protected]>
Disable delay reporting as a workaround to bug thesofproject/linux#4781 SOF driver checks the ABI version and if it's not 1, delay reporting is disabled. Signed-off-by: Kai Vehmanen <[email protected]>
Disable delay reporting as a workaround to bug thesofproject/linux#4781 SOF driver checks the ABI version and if it's not 1, delay reporting is disabled. Signed-off-by: Kai Vehmanen <[email protected]>
Disable delay reporting as a workaround to bug thesofproject/linux#4781 SOF driver checks the ABI version and if it's not 1, delay reporting is disabled. Signed-off-by: Kai Vehmanen <[email protected]>
Disable delay reporting as a workaround to bug thesofproject/linux#4781 SOF driver checks the ABI version and if it's not 1, delay reporting is disabled. Signed-off-by: Kai Vehmanen <[email protected]>
While testing a new extension to alsa-conformance-test to report PCM delay values, I noticed MTL on hda-generic tplg, is reporting odd seeming values (negative delays).
Environment
kernel: 66ee247 (sof-dev)
SOF: 6c188298b958 (SOF main)
platform: MTL
topoplogy: sof-hda-generic-4ch.tplg
Expected outcome
Definition of the PCM delay in alsa-lib is as follows:
Actual outcome
Negative values reported even if no underrun/overrun condition is seen on the stream. It seems the link position is ahead of the host position, which can't be correct for playback.
Ways to test
Example error
aplay -Dhw:0,0 -traw -fdat /dev/zero
Output from the kernel RD patch:
Output from alsa_conformance_test ("alsa_conformance_test -Phw:0,0 --debug")
The text was updated successfully, but these errors were encountered: