Revert "ipc4: relax the IPC timeout checks and be nicer to other thre… #8756

RanderWang · 2024-01-18T05:16:21Z

…ads"

When a stream is triggered to start, host kernel first sendis trigger start ipc message to fw and then start host dma for this stream. Ipc_wait_for_compound_msg is used to wait for all pipelines in the stream to be complete and need to be done fast since it blocks host to start hda dma. The reverted commit adds a 10 ms delay and results to host copier xrun warning for it can't get data from host dma. And this commit also makes a negative effect for stream_start_offset calculation, so revert it.

This reverts commit 909a327.

one fix for thesofproject/linux#4781

RanderWang · 2024-01-18T05:22:05Z

fw warning with reverted commit:

[ 4580.800270] <inf> copier: copier_comp_trigger: comp:0 0x4 copier_comp_trigger() cmd 7
[ 4580.800293] <inf> copier: copier_comp_trigger: comp:0 0x4 copier_comp_trigger() cmd 1
[ 4580.800328] <err> copier: copier_comp_trigger: comp:0 0x4 ------- dai pos 4288, dai link 527
[ 4580.800350] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.801273] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.802271] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.803271] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.804271] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.805271] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.806273] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.807271] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.808271] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.809253] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.810258] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0, free_samples: 192
[ 4580.811248] <inf> host_comp: host_get_copy_bytes_normal: comp:0 0x4 no bytes to copy, available samples: 0,

…ads" When a stream is triggered to start, host kernel first sendis trigger start ipc message to fw and then start host dma for this stream. Ipc_wait_for_compound_msg is used to wait for all pipelines in the stream to be complete and need to be done fast since it blocks host to start hda dma. The reverted commit adds a 10 ms delay and results to host copier xrun warning for it can't get data from host dma. And this commit also makes a negative effect for stream_start_offset calculation which is for the difference between dai gateway and host gateway. We calculate it in host start function but host dma is started 10ms later, so there will be at least 10ms data error. This reverts commit 909a327. Signed-off-by: Rander Wang <[email protected]>

kv2019i

Thanks @RanderWang , this can help potentially in other cases as well. I don't think this was the original intent of the change, but this will delay any IPC that requires to be handled in pipeline context, by 10ms.

kv2019i · 2024-01-18T11:48:35Z

Note, the original patch has been referred to fix #7774 so we need to track test results carefully.

RanderWang · 2024-01-19T01:05:37Z

Note, the original patch has been referred to fix #7774 so we need to track test results carefully.

As I was working on the issue #7774 at that time, I tested the original patch but made no effect.

RanderWang · 2024-01-19T01:09:32Z

#7774 should be fixed by #8143

btian1 · 2024-01-23T08:06:49Z

src/ipc/ipc4/handler.c


 	while (atomic_read(&msg_data.delayed_reply)) {
-		k_sleep(Z_TIMEOUT_MS(10));
+		k_sleep(Z_TIMEOUT_US(250));


btian1

250 --> 10ms? it is 40 times enlarge? may miss something during sleep? how about 1ms?

above is the comments I added when the change first submitted, my question is: how 250us * 30 = 7.5ms comes from?

is it based on experience value to make total wait time < 10ms?

it is set by experiments with test and log. Most waiting time is less than 1ms. I don't want to use 1ms since it will miss one schedule period in FW.

What context is this called from ? My expectation is the IPC thread and if so not sure how we can block LL from running ?

@lgirdwood This is the IPC thread. It doesn't block the LL thread, but it will block processing of IPC messages until the LL timer fires next time, runs the IPC processing (that needs to happen in LL context), acks it as done, and then the IPC thread can continue. Now this adds a mandatory 10ms delay to each such handover from IPC thread to LL thread.

ok, so we have a timing dependency between LL thread and IPC thread and this is convergence. This need to be spelled out in the inline comments alongside setting a better tries count as 30 means about 7.5 LL ticks. i.e. we need to say > 1.5 LL ticks would be a significant failure

@kv2019i , @RanderWang is it better if 150 * 50? to make IPC response more quick?

@lgirdwood This was the mechanism added by @lyakh to synchronize pipeline trigger processing with LL thread (so that pipeline state changes are not done async with the pipeline exection). @lgirdwood @btian1 We can improve the logic (or even devise a different way to wait for next ll tick, I think the original concern of trying to avoid unnecessary thread context switches is as valid as ever), but can we do that as a follow-up and merge this plain revert first? This helps with multiple pressing issues (AV sync and this seems to help with multicore scenarios), and we have done significant testing of this version with @RanderWang . If we want a modified new mechanism, that needs to be tested well, pushing the merge out.

I'm fine with a revert, but we need a folloup PR that spells out the relationships with some inline commentary.

kv2019i · 2024-01-25T17:51:22Z

One fail in https://sof-ci.01.org/sofpr/PR8756/build1997/devicetest/index.html?model=MTLP_RVP_NOCODEC&testcase=multiple-pause-resume-50 -- no FW/drv errors in the logs. This looks like a case of #8770 .

I'll proceed wit merge

marc-hb · 2024-01-27T00:36:20Z

According to @keqiaozhang this revert causes the following regression:

[BUG] FW reported error: 7 - Unsupported operation requested when testing Bluetooth offload tplg #8801

RanderWang requested review from lgirdwood, plbossart, mmaka1, lbetlej, dbaluta and kv2019i as code owners January 18, 2024 05:16

RanderWang requested a review from ujfalusi January 18, 2024 05:21

RanderWang force-pushed the pipeline_wait branch from f029d92 to c142b7d Compare January 18, 2024 05:37

kv2019i approved these changes Jan 18, 2024

View reviewed changes

kv2019i mentioned this pull request Jan 18, 2024

idc: zephyr: set of fixes to IDC in multicore scenario #8753

Closed

kv2019i requested review from lyakh, abonislawski and serhiy-katsyuba-intel January 19, 2024 10:19

This was referenced Jan 19, 2024

host_get_copy_bytes_normal: "no bytes to copy" log spam #8761

Closed

snd_pcm_delay() reporting unexpected values thesofproject/linux#4781

Closed

btian1 reviewed Jan 23, 2024

View reviewed changes

This was referenced Jan 23, 2024

deepbuffer PCM not following period size setting thesofproject/linux#4795

Open

[BUG] stream_start_offset set incorrectly leading to av sync errors #8779

Closed

RanderWang mentioned this pull request Jan 24, 2024

Fix delay error thesofproject/linux#4791

Closed

serhiy-katsyuba-intel approved these changes Jan 25, 2024

View reviewed changes

lgirdwood approved these changes Jan 25, 2024

View reviewed changes

kv2019i merged commit 08cf3d2 into thesofproject:main Jan 25, 2024
38 of 44 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Revert "ipc4: relax the IPC timeout checks and be nicer to other thre… #8756

Revert "ipc4: relax the IPC timeout checks and be nicer to other thre… #8756

RanderWang commented Jan 18, 2024

RanderWang commented Jan 18, 2024

kv2019i left a comment

kv2019i commented Jan 18, 2024

RanderWang commented Jan 19, 2024

RanderWang commented Jan 19, 2024

btian1 Jan 23, 2024

RanderWang Jan 23, 2024

lgirdwood Jan 24, 2024

kv2019i Jan 24, 2024

lgirdwood Jan 24, 2024

btian1 Jan 25, 2024

kv2019i Jan 25, 2024

lgirdwood Jan 25, 2024

kv2019i commented Jan 25, 2024

marc-hb commented Jan 27, 2024

Revert "ipc4: relax the IPC timeout checks and be nicer to other thre… #8756

Revert "ipc4: relax the IPC timeout checks and be nicer to other thre… #8756

Conversation

RanderWang commented Jan 18, 2024

RanderWang commented Jan 18, 2024

kv2019i left a comment

Choose a reason for hiding this comment

kv2019i commented Jan 18, 2024

RanderWang commented Jan 19, 2024

RanderWang commented Jan 19, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kv2019i commented Jan 25, 2024

marc-hb commented Jan 27, 2024