Bluetooth: Mesh: Allow to suspend mesh from bt_mesh_send_cb callbacks #68735

Merged Feb 26, 2024 (4 commits)

Conversation

PavelVPV
Collaborator

@PavelVPV PavelVPV commented Feb 8, 2024

This PR allows the mesh stack to be suspended from bt_mesh_send_cb callbacks by removing the deadlock caused by k_work_flush.

For the extended advertiser there are two cases:

  • bt_mesh_adv_disable is called from one of the bt_mesh_send_cb callbacks, which are called from the advertiser work item, or
  • it is called from any other context.

When it is called from bt_mesh_send_cb callbacks, since these callbacks are called from the delayable work which is running on the system workqueue, the advertiser can check the current context and its work state. If the function is called from the advertiser work, it can disable the advertising set straight away because all BLE host APIs have already been called in the adv_start function. Before sending anything else, the advertiser checks the instance value in the adv_start function, which is also reset to NULL in the bt_mesh_adv_disable call, and aborts all subsequent advertisements. The ADV_FLAG_SUSPENDING flag tells the advertiser work to abort processing while the bt_mesh_adv_disable function has not yet finished stopping the advertising set. This can happen if the work has already been scheduled and the scheduler runs it while bt_mesh_adv_disable is sleeping inside the bt_le_ext_adv_stop or bt_le_ext_adv_disable functions.

When bt_mesh_adv_disable is called from any other context or from the system workqueue but not from the advertiser work, then k_work_flush can be called safely as it won't cause any deadlocks.

The adv_sent function is called inside the bt_mesh_adv_disable function to schedule the advertiser work (send_pending_adv) and abort all pending advertisements that have already been added to the pool.
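The decision described above for the extended advertiser can be sketched as a small, self-contained model. This is not the code from the PR and uses no Zephyr headers: the `model_ext_*` names, the thread enum and the struct fields are all hypothetical stand-ins, and the flush, stop and delete steps are reduced to flag updates. It only illustrates when flushing is skipped and how the `instance` and suspending checks stop further transmissions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; none of these are Zephyr APIs. */
enum model_ext_thread { MODEL_THREAD_APP, MODEL_THREAD_SYS_WORKQ };

struct model_ext_adv {
	int instance_storage; /* dummy object the instance pointer refers to */
	int *instance;        /* NULL once the advertiser is disabled */
	bool work_running;    /* true while the advertiser work item executes */
	bool suspending;      /* models ADV_FLAG_SUSPENDING */
	bool flushed;         /* records whether a flush was performed */
};

static void model_ext_adv_init(struct model_ext_adv *adv)
{
	adv->instance_storage = 0;
	adv->instance = &adv->instance_storage;
	adv->work_running = false;
	adv->suspending = false;
	adv->flushed = false;
}

/* Flushing waits for the work item to finish, so it can only deadlock when
 * the caller is the workqueue thread that is running that very work item.
 */
static bool model_ext_flush_is_safe(enum model_ext_thread caller,
				    const struct model_ext_adv *adv)
{
	return caller != MODEL_THREAD_SYS_WORKQ || !adv->work_running;
}

static void model_ext_adv_disable(enum model_ext_thread caller,
				  struct model_ext_adv *adv)
{
	assert(adv != NULL);

	/* Tell a concurrently scheduled work item to abort processing. */
	adv->suspending = true;

	if (model_ext_flush_is_safe(caller, adv)) {
		/* Stands in for k_work_flush(): the work has finished afterwards. */
		adv->flushed = true;
		adv->work_running = false;
	}

	/* Stands in for stopping the advertising set and dropping the instance. */
	adv->instance = NULL;
	adv->suspending = false;
}

/* Models the check made before anything else is sent. */
static bool model_ext_adv_may_send(const struct model_ext_adv *adv)
{
	return adv->instance != NULL && !adv->suspending;
}
```

In this model, disabling from the running work item skips the flush but still leaves `model_ext_adv_may_send()` returning false, which is the property the description relies on.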

In the case of the legacy advertiser, if bt_mesh_adv_disable is called from the advertiser thread (this happens when it is called from the bt_mesh_send_cb.start or bt_mesh_send_cb.end callbacks), then k_thread_join returns -EDEADLK. However, the enabled flag is already set to false, so the thread will abort the current advertisement and the pending advertisements.
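The legacy advertiser behaviour can be illustrated with a similar self-contained model. The `model_legacy_*` names, the integer thread IDs and the pending counter are hypothetical. Treating `-EDEADLK` as success inside the disable function is an assumption made for this sketch, not something stated in the description above. The join helper only mimics the one property that matters here: a thread joining itself gets `-EDEADLK`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the legacy advertiser; not the Zephyr implementation. */
struct model_legacy_adv {
	int adv_thread_id; /* ID of the advertiser thread */
	bool enabled;      /* models the enabled flag */
	int pending;       /* advertisements still queued */
	int aborted;       /* advertisements dropped because the advertiser is disabled */
	int sent;          /* advertisements actually transmitted */
};

/* Mimics the self-join case: joining the calling thread reports -EDEADLK. */
static int model_legacy_thread_join(int caller_id, int target_id)
{
	return caller_id == target_id ? -EDEADLK : 0;
}

static int model_legacy_adv_disable(struct model_legacy_adv *adv, int caller_id)
{
	int err;

	assert(adv != NULL);

	/* Clear the flag first so the thread stops even if the join fails. */
	adv->enabled = false;

	err = model_legacy_thread_join(caller_id, adv->adv_thread_id);

	/* Assumption: when called from the advertiser thread itself, the thread
	 * exits on its own once the callback returns, so -EDEADLK is ignored.
	 */
	return err == -EDEADLK ? 0 : err;
}

/* Models the advertiser thread emptying its queue after the callback returns. */
static void model_legacy_thread_drain(struct model_legacy_adv *adv)
{
	while (adv->pending > 0) {
		adv->pending--;
		if (adv->enabled) {
			adv->sent++;
		} else {
			adv->aborted++;
		}
	}
}
```

With the flag cleared before the join, every queued advertisement in the model ends up aborted rather than sent, regardless of the join result.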

This commit allows the mesh stack to be suspended from `bt_mesh_send_cb`
callbacks by removing the deadlock caused by `k_work_flush` in the
extended advertiser.

For the extended advertiser there are two cases:
- `bt_mesh_adv_disable` is called from one of the `bt_mesh_send_cb`
  callbacks, which are called from the advertiser work item, or
- it is called from any other context.

When it is called from `bt_mesh_send_cb` callbacks, since these
callbacks are called from the delayable work which is running on the
system workqueue, the advertiser can check the current context and its
work state. If the function is called from the advertiser work, it can
disable the advertising set straight away because all BLE host APIs have
already been called in the `adv_start` function. Before sending anything
else, the advertiser checks the `instance` value in the `adv_start`
function, which is also reset to NULL in the `bt_mesh_adv_disable` call,
and aborts all subsequent advertisements. The `ADV_FLAG_SUSPENDING` flag
tells the advertiser work to abort processing while the
`bt_mesh_adv_disable` function has not yet finished stopping the
advertising set. This can happen if the work has already been scheduled
and the scheduler runs it while `bt_mesh_adv_disable` is sleeping inside
the `bt_le_ext_adv_stop` or `bt_le_ext_adv_disable` functions.

When `bt_mesh_adv_disable` is called from any other context or from the
system workqueue but not from the advertiser work, then `k_work_flush`
can be called safely as it won't cause any deadlocks.

The `adv_sent` function is called inside the `bt_mesh_adv_disable`
function to schedule the advertiser work (`send_pending_adv`) and abort
all pending advertisements that have already been added to the pool.

In the case of the legacy advertiser, if `bt_mesh_adv_disable` is called
from the advertiser thread (this happens when it is called from the
`bt_mesh_send_cb.start` or `bt_mesh_send_cb.end` callbacks), then
`k_thread_join` returns `-EDEADLK`. However, the `enabled` flag is
already set to false, so the thread will abort the current advertisement
and the pending advertisements.

Signed-off-by: Pavel Vasilyev <[email protected]>
Now that the deadlock has been removed from the `bt_mesh_adv_disable`
function, the advertiser can be disabled from the `bt_mesh_send_cb`
callbacks.

Signed-off-by: Pavel Vasilyev <[email protected]>
This reverts commit c1bbd48.

Signed-off-by: Pavel Vasilyev <[email protected]>
The stack manages to suspend the advertiser before it finishes
transmitting the Outbound PDU Report message to confirm the transmission
of a Provisioning PDU. The test requires the server to become
unresponsive when the Provisioning PDU is sent to the unprovisioned
device to test timeout of the provisioning protocol.

Signed-off-by: Pavel Vasilyev <[email protected]>
@PavelVPV PavelVPV force-pushed the adv_suspend_test_async branch from e19ba93 to a78ad4e February 8, 2024 11:52
@fabiobaltieri fabiobaltieri added this to the v3.7.0 milestone Feb 13, 2024
@aescolar aescolar merged commit d1da071 into zephyrproject-rtos:main Feb 26, 2024
20 checks passed