Skip to content

Conversation

azdagron
Copy link
Member

PR #2305 fixed spurious notifications of workloads when nothing had changed but unfortunately introduced a regression wherein a workload is no longer notified by the cache when an entry for that workload is removed.

The bug is caused by false sharing of the selRem temporary selector set. Previously selRem was used to build the selectors for entries being removed and the the contents of selRem were merged into the single notification. When multiple notification sets were introduced, selRem was added as a notification set. Unfortunately, selRem is cleared while processing entries, causing the notification set to be empty.

Existing unit-tests did not catch this because the update that removes the existing entry did not have additional entries to be processed (that would cause selRem to be cleared).

This PR fixes the bug by allocating a new selector set to be appended to the notification set instead of using selRem. It also cleans up some selector set usage and adds some additional logic to the unit-test so this condition can be caught in the future.

Fixes: #3922

PR spiffe#2305 fixed spurious notifications of workloads when nothing had
changed but unfortunately introduced a regression wherein a workload is
no longer notified by the cache when an entry for that workload is
removed.

The bug is caused by false sharing of the selRem temporary selector set.
Previously selRem was used to build the selectors for entries being
removed and the the contents of selRem were merged into the single
notification. When multiple notification sets were introduced, selRem
was added as a notification set. Unfortunately, selRem is cleared while
processing entries, causing the notification set to be empty.

Existing unit-tests did not catch this because the update that removes
the existing entry did not have additional entries to be processed (that
would cause selRem to be cleared).

This PR fixes the bug by allocating a new selector set to be appended to
the notification set instead of using selRem. It also cleans up some
selector set usage and adds some additional logic to the unit-test so
this condition can be caught in the future.

Fixes: spiffe#3922

Signed-off-by: Andrew Harding <[email protected]>
Copy link
Member

@amartinezfayo amartinezfayo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @azdagron for tracking this down ❤️
LGTM!

@evan2645 evan2645 added this to the 1.6.1 milestone Feb 28, 2023
@azdagron azdagron merged commit d42471b into spiffe:main Mar 1, 2023
@azdagron azdagron deleted the fix-workload-notification-on-entry-removal branch March 1, 2023 04:29
@amartinezfayo amartinezfayo modified the milestones: 1.6.2, 1.6.3 Apr 5, 2023
stevend-uber pushed a commit to stevend-uber/spire that referenced this pull request Oct 16, 2023
PR spiffe#2305 fixed spurious notifications of workloads when nothing had
changed but unfortunately introduced a regression wherein a workload is
no longer notified by the cache when an entry for that workload is
removed.

The bug is caused by false sharing of the selRem temporary selector set.
Previously selRem was used to build the selectors for entries being
removed and the the contents of selRem were merged into the single
notification. When multiple notification sets were introduced, selRem
was added as a notification set. Unfortunately, selRem is cleared while
processing entries, causing the notification set to be empty.

Existing unit-tests did not catch this because the update that removes
the existing entry did not have additional entries to be processed (that
would cause selRem to be cleared).

This PR fixes the bug by allocating a new selector set to be appended to
the notification set instead of using selRem. It also cleans up some
selector set usage and adds some additional logic to the unit-test so
this condition can be caught in the future.

Fixes: spiffe#3922

Signed-off-by: Andrew Harding <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Workload x509 watcher does not notify on identity entry deletion
3 participants