Skip to content

Conversation

cvinayak
Copy link
Contributor

Fix control Tx buffer leak into data Tx pool that happens
after a cross-over control procedure response was paused due
to currently active encryption setup procedure, and a new
control Tx PDU in addition to the paused one is enqueued
thereafter.

When the control tx PDUs is resumed but not yet enqueued
towards the radio, if there is a new control Tx PDU enqueued
then the paused control Tx PDU is not set as the head of the
control PDUs in the Tx queue. This caused the paused control
Tx PDU to be associated with data Tx pool, hence causing the
incorrect release into data Tx pool.

Relates to the commit bff76b4 ("Bluetooth: controller:
split: Fix control tx queue handling") and to the
commit 6991d09 ("Bluetooth: controller: Fix control tx
queue handling").

Fixes #32898.

Signed-off-by: Vinayak Kariappa Chettimada [email protected]

Fix control Tx buffer leak into data Tx pool that happens
after a cross-over control procedure response was paused due
to currently active encryption setup procedure, and a new
control Tx PDU in addition to the paused one is enqueued
thereafter.

When the control tx PDUs is resumed but not yet enqueued
towards the radio, if there is a new control Tx PDU enqueued
then the paused control Tx PDU is not set as the head of the
control PDUs in the Tx queue. This caused the paused control
Tx PDU to be associated with data Tx pool, hence causing the
incorrect release into data Tx pool.

Relates to the commit bff76b4 ("Bluetooth: controller:
split: Fix control tx queue handling") and to the
commit 6991d09 ("Bluetooth: controller: Fix control tx
queue handling").

Fixes zephyrproject-rtos#32898.

Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
@cvinayak
Copy link
Contributor Author

@LeBlue Please verify the fix here

@cvinayak cvinayak added bug The issue is a bug, or the PR is fixing a bug backport v1.14-branch labels Mar 13, 2021
@LeBlue
Copy link

LeBlue commented Mar 15, 2021

@cvinayak With the first quick test the bug is not reproducable anymore. Seems to be fixed.

Still, I am currently running the test (see issue) with the plain hci_usb sample in a loop and additionally a test with a custom configured built FW with a lot more peripherals connected (8, +1 that is setup to trigger the bug as before with invalid keys).

Both test run with no problems. I will report in a couple of hours again if it is still everything OK.

@cvinayak
Copy link
Contributor Author

@LeBlue Thank you for the quick verification and the help with the sniffer log. Yes, do revert back after your long duration tests.

@LeBlue
Copy link

LeBlue commented Mar 15, 2021

@cvinayak Is this patch applicable with no changes to v2.5-branch? If so I can/will run the same tests against v2.5-branch as I am probably will be using the v2.5-branch with our next FW version.

@cvinayak
Copy link
Contributor Author

@LeBlue yes, it should, just cherry pick it.

i have labelled it for backport, when approved and merged GitHub bot will make a PR for the v2.5-branch.

I am off work now, but I could check locally if the commit applies clean later in the night.

@LeBlue
Copy link

LeBlue commented Mar 15, 2021

@cvinayak Just checked. Applies cleanly on the up-to-date v2.5-branch.

@LeBlue
Copy link

LeBlue commented Mar 15, 2021

I just checked the tests. On the v2.5-branch the tests were kind of OK, but with the plain hci_usb sample I hit the following assertion after 1h:

ASSERTION FAIL [conn_lll] @ WEST_TOPDIR/zephyr/subsys/bluetooth/controller/ll_sw/ull.c:1036

Control and data pool were correct. Seemed like something else. I will provide more info tomorrow and meanwhile rerun the test.

@cvinayak
Copy link
Contributor Author

I just checked the tests. On the v2.5-branch the tests were kind of OK, but with the plain hci_usb sample I hit the following assertion after 1h:

ASSERTION FAIL [conn_lll] @ WEST_TOPDIR/zephyr/subsys/bluetooth/controller/ll_sw/ull.c:1036

Control and data pool were correct. Seemed like something else. I will provide more info tomorrow and meanwhile rerun the test.

@LeBlue This is a new bug. Are you by any chance seeing Create Connection Cancel issued over the hci_usb? This could be related to my comments here: https://github.com/zephyrproject-rtos/zephyr/pull/32565/files#r586041675

Please create a new GH issue. (Forgot to create an issue myself).

Thank you once again for reporting the bug. Appreciate your thorough analysis and help.

@cvinayak
Copy link
Contributor Author

cvinayak commented Mar 16, 2021

@LeBlue I have made a draft backport PR #33369 for the v2.5-branch with fixes that should be backported, this also has a fix for the race condition when connection establishment is cancelled. Do verify the fixes and let me know if it solves the issues you have been facing lately.

@LeBlue
Copy link

LeBlue commented Mar 16, 2021

All tests have run for a long time now and the issue this PR is addressing is not reproducable any more.

@carlescufi carlescufi requested a review from mtpr-ot March 18, 2021 12:04
Copy link
Contributor

@mtpr-ot mtpr-ot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nashif nashif merged commit 7f727ac into zephyrproject-rtos:master Mar 18, 2021
@zephyrbot
Copy link

The backport to v1.14-branch failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-v1.14-branch v1.14-branch
# Navigate to the new working tree
cd .worktrees/backport-v1.14-branch
# Create a new branch
git switch --create backport-33310-to-v1.14-branch
# Cherry-pick the merged commits of this pull request and resolve the conflicts
git cherry-pick 7f727ac322ab99084c3444bc2546e5777e2e5a3c~1..7f727ac322ab99084c3444bc2546e5777e2e5a3c
# Push it to GitHub
git push --set-upstream origin backport-33310-to-v1.14-branch
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-v1.14-branch

Then, create a pull request where the base branch is v1.14-branch and the compare/head branch is backport-33310-to-v1.14-branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Bluetooth Controller area: Bluetooth bug The issue is a bug, or the PR is fixing a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bluetooth: controller: Control PDU buffer leak into Data PDU buffer pool
6 participants