Closed
Description
Describe the bug
Commit ccc22cb changed the id format of the children in the mirrored supervisor rabbit_shovel_dyn_worker_sup. However, the child spec of a mirrored supervisor is stored in Mnesia and survives a rolling restart. During an upgrade with existing dynamic shovels, the crash below was observed on the first upgraded node, because the new code encounters the old id format.
BOOT FAILED
===========
Error during startup: {error,
{rabbitmq_shovel,
{{shutdown,
{failed_to_start_child,
rabbit_shovel_dyn_worker_sup_sup,
{'EXIT',
{function_clause,
[{rabbit_shovel_dyn_worker_sup_sup,id,
[<<"shovel1">>],
[{file,"rabbit_shovel_dyn_worker_sup_sup.erl"},
{line,100}]},
{rabbit_shovel_dyn_worker_sup_sup,
'-cleanup_specs/0-fun-2-',2,
[{file,"rabbit_shovel_dyn_worker_sup_sup.erl"},
{line,90}]},
{sets,fold_bucket,3,
[{file,"sets.erl"},{line,503}]},
{sets,fold_seg,4,[{file,"sets.erl"},{line,499}]},
{sets,fold_segs,4,
[{file,"sets.erl"},{line,495}]},
{rabbit_shovel_dyn_worker_sup_sup,
cleanup_specs,0,
[{file,"rabbit_shovel_dyn_worker_sup_sup.erl"},
{line,93}]},
{rabbit_shovel_dyn_worker_sup_sup,start_child,
2,
[{file,"rabbit_shovel_dyn_worker_sup_sup.erl"},
{line,42}]},
{rabbit_shovel_dyn_worker_sup_sup,
'-start_link/0-lc$^0/1-0-',1,
[{file,"rabbit_shovel_dyn_worker_sup_sup.erl"},
{line,28}]}]}}}},
{rabbit_shovel,start,[normal,[]]}}}}
For the record on another node:
1> mirrored_supervisor:which_children(rabbit_shovel_dyn_worker_sup_sup).
[{{<<"/">>,<<"shovel1">>},
<49058.4184.0>,worker,
[rabbit_shovel_dyn_worker_sup]}]
I upgraded from 3.11.24 but I think one can start from any version prior to 3.12.8.
EDIT: I believe it only happens on multi-node clusters.
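The `function_clause` in the trace above can be illustrated in isolation. The module below is only a sketch: `id_clause_demo`, its `id/1`, and `try_id/1` are hypothetical stand-ins, not the actual code in rabbit_shovel_dyn_worker_sup_sup.erl, and the id shape that `id/1` returns is an assumption. It shows how a function whose only clause expects a `{VHost, Name}` pair fails when it is handed a bare binary such as `<<"shovel1">>`.

```erlang
-module(id_clause_demo).
-export([id/1, try_id/1]).

%% Hypothetical stand-in for an id/1 that accepts only a {VHost, Name} pair.
%% The returned shape is an assumption made for illustration.
id({VHost, Name}) when is_binary(VHost), is_binary(Name) ->
    {[VHost, Name], {VHost, Name}}.

%% Calls id/1 and reports a function_clause error instead of crashing.
try_id(Arg) ->
    try
        {ok, id(Arg)}
    catch
        error:function_clause ->
            {error, {function_clause, Arg}}
    end.
```

Calling `id_clause_demo:try_id(<<"shovel1">>)` returns `{error,{function_clause,<<"shovel1">>}}`, whereas `id_clause_demo:try_id({<<"/">>, <<"shovel1">>})` returns an `{ok, ...}` tuple.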
Reproduction steps
- Create a multi-node cluster running a version prior to 3.12.8 (e.g. 3.11.24)
- Create a dynamic shovel that is not auto-deleted (e.g. shovelling between 2 local queues)
- Upgrade the first node to 3.12.8 and restart it.
- Observe the shovel plugin crash on the first node, which prevents boot
Expected behavior
Existing shovels should still work after upgrade to 3.12.8, possibly by executing a DB migration converting the child IDs.
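One way such a conversion could look is sketched below. This is not the fix that was applied upstream. It assumes the old child id is the `{VHost, Name}` tuple shown in the `which_children` output above, and that the new id has the shape `{[VHost, Name], {VHost, Name}}`, which is an assumption not confirmed in this report. The module name `child_id_migration` and both functions are hypothetical.

```erlang
-module(child_id_migration).
-export([normalize_id/1, normalize_ids/1]).

%% ASSUMPTION: new-style ids look like {[VHost, Name], {VHost, Name}}.
%% An id that is already in that shape is returned unchanged.
normalize_id({[VHost, Name], {VHost, Name}} = NewId)
  when is_binary(VHost), is_binary(Name) ->
    NewId;
%% Old-style ids are plain {VHost, Name} tuples; convert them.
normalize_id({VHost, Name}) when is_binary(VHost), is_binary(Name) ->
    {[VHost, Name], {VHost, Name}}.

%% Takes a list shaped like the result of which_children/1, that is
%% {Id, Pid, Type, Modules} tuples, and returns the normalized ids.
normalize_ids(Children) ->
    [normalize_id(element(1, Child)) || Child <- Children].
```

With a helper like this, code that walks the stored children can treat both formats uniformly instead of assuming that every id was written by the new version.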
Additional context
No response