[BUG] pause resume PID **** had non-zero exit status when checking multiple-pause/resume #8770
Comments
@serhiy-katsyuba-intel @abonislawski @lyakh @teburd I think I found the reason why the DMA stops moving: the GEN bit is not set for some reason. When the problem occurs, bit 26 of DGCS is not set. I saw this after adding debug code that calls the existing intel_adsp_hda_dbg() macro on error.
So "dgcs: 0x80810000" is wrong: GEN (bit 26) is clear, which explains why data is not moving and why we get a flood of "no bytes to copy" messages. Without GEN set, the DMA cannot write to DSP memory. It seems that with multiple cores using the host DMA engine, writes to DGCS are at least not immediately reflected in the read-back values. Based on this, I made a few fix attempts, pushed to https://github.com/kv2019i/zephyr/commits/202402-hdadma-gen-bit-fix-attempts, but these don't completely solve the problem. Even when I added a 10 ms busy loop polling for the GEN bit, it still read back as 0, so something is blocking the bit from being set.
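The check described in the comment above can be sketched as a small log-triage helper. This is only an illustration, not code from the Zephyr driver: it assumes the register value appears in the log as "dgcs: 0x<hex>" (as quoted above) and that GEN is bit 26 (as stated above). The `read_dgcs` callable passed to `wait_for_gen` is a hypothetical stand-in for whatever reads the register; the 10 ms default mirrors the busy loop mentioned in the comment.

```python
import re
import time

GEN_BIT = 26  # bit position of GEN in DGCS, as stated in the comment above

# Assumed log format: the value is printed as "dgcs: 0x<hex>".
DGCS_RE = re.compile(r"dgcs:\s*0x([0-9a-fA-F]+)")


def gen_is_set(dgcs: int) -> bool:
    """Return True if bit 26 (GEN) is set in a DGCS register value."""
    return bool((dgcs >> GEN_BIT) & 1)


def find_stalled_dgcs(log_text: str) -> list[int]:
    """Return every DGCS value found in the log whose GEN bit is clear."""
    values = (int(m.group(1), 16) for m in DGCS_RE.finditer(log_text))
    return [v for v in values if not gen_is_set(v)]


def wait_for_gen(read_dgcs, timeout_s: float = 0.010, interval_s: float = 0.0005) -> bool:
    """Poll a (hypothetical) register reader until GEN is set or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if gen_is_set(read_dgcs()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

For the value quoted above, `gen_is_set(0x80810000)` returns `False`, which matches the observation that GEN is clear while the stream is stalled.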
Ignore early exit to allow running tests for longer without stopping in case thesofproject/sof#8770 occurs.
This will be fixed by zephyrproject-rtos/zephyr#69480. Assigning to @serhiy-katsyuba-intel for follow-up, as you are the fix author. Let's not close this SOF issue until the fix is integrated into SOF main.
Closing as fix merged. Please reopen if seen again.
Describe the bug
This issue occurs in the multiple-pause/resume test case. The console log shows "pause resume PID **** had non-zero exit status", but there is no error in the kernel log or mtrace. The reproduction rate is low, about 20%. The issue can be reproduced on both MTL and LNL NOCODEC platforms.
Inner test ID:
LNLM_RVP_NOCODEC: 37146
MTLP_RVP_NOCODEC: 37016
To Reproduce
~/sof-test/test-case/multiple-pause-resume.sh -r 50
Reproduction Rate
20%
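With a reproduction rate this low, a single clean run says little about whether a fix works. As a rough sketch, assuming each test run fails independently with the reported probability of about 0.2 (independence is an assumption, not something the report states), the number of consecutive clean runs needed before a pass is unlikely to be luck can be estimated as follows. The function name and the 5% threshold are illustrative choices.

```python
import math


def clean_runs_needed(failure_rate: float, max_false_pass: float) -> int:
    """Smallest n such that (1 - failure_rate) ** n <= max_false_pass.

    failure_rate:   assumed per-run probability of hitting the bug.
    max_false_pass: accepted chance that n clean runs happen with the bug still present.
    """
    if not (0.0 < failure_rate < 1.0 and 0.0 < max_false_pass < 1.0):
        raise ValueError("both arguments must be strictly between 0 and 1")
    return math.ceil(math.log(max_false_pass) / math.log(1.0 - failure_rate))


# With a 20% reproduction rate and a 5% accepted false-pass chance:
print(clean_runs_needed(0.2, 0.05))  # prints 14
```

Under these assumptions, about 14 consecutive clean runs of the test are needed before the chance of a false pass drops below 5%.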
Environment
Screenshots or console output
dmesg.txt
mtrace.txt