Commit a1dd111

fix(nns-recovery): do not fix DFINITY-owned node twice (#6606)
The NNS recovery test is flaky. #6554 fixed a first edge case, and this change should fix a second one. In some cases, the DFINITY-owned node would be fixed twice: once by `ic-recovery` and once by the `guestos-recovery-upgrader`/`guestos-recovery-engine`. This is not a problem per se; the bug is a "test" bug. When simulating the actions of node providers, we overwrite `BOOT_ARGS_A` in `/boot/boot_args` [here](https://github.com/dfinity/ic/pull/6606/files#diff-c26ca29faacddc5919f46b7b6d2d7c503af940fe2e7b14038964accb17d0bebdL256). This works fine if the node is currently using partition A. But the DFINITY-owned node has already upgraded as part of `ic-recovery` and is therefore using partition B. As a result, its state was wiped and it could not make consensus progress. As a short-term flakiness fix, this PR no longer runs the simulated NP actions on the DFINITY-owned node. The functionality can be reintroduced more carefully in the future as part of the effort to test more recovery scenarios.
1 parent 87a22d5 commit a1dd111

1 file changed: +14 -2 lines changed

rs/tests/nested/src/lib.rs

Lines changed: 14 additions & 2 deletions
@@ -614,10 +614,22 @@ pub fn nns_recovery_test(env: TestEnv) {
     )
     .unwrap();
 
-    info!(logger, "Simulate node provider action on 2f+1 nodes");
+    // The DFINITY-owned node is already recovered as part of the recovery tool, so we only need to
+    // trigger the recovery on 2f other nodes.
+    info!(logger, "Simulate node provider action on 2f nodes");
     block_on(join_all(
         get_host_vm_names(SUBNET_SIZE)
-            .choose_multiple(&mut rand::thread_rng(), 2 * f + 1)
+            .iter()
+            .filter(|vm_name| {
+                env.get_nested_vm(vm_name)
+                    .unwrap()
+                    .get_nested_network()
+                    .unwrap()
+                    .guest_ip
+                    != dfinity_owned_node.get_ip_addr()
+            })
+            .collect::<Vec<_>>()
+            .choose_multiple(&mut rand::thread_rng(), 2 * f)
             .map(|vm_name| {
                 simulate_node_provider_action(
                     &logger,
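
The selection logic in this hunk can be sketched in isolation. The following is a minimal, dependency-free illustration and not the code from `rs/tests/nested/src/lib.rs`: the `HostVm` struct and the `select_np_targets` function are hypothetical names introduced here, and the sketch takes the first `2 * f` remaining nodes with `take` where the actual test picks them at random with `choose_multiple` from the `rand` crate.

```rust
use std::net::Ipv6Addr;

/// Hypothetical stand-in for a host VM in the test environment:
/// a name plus the IP address of the guest running on it.
#[derive(Debug, Clone, PartialEq)]
pub struct HostVm {
    pub name: String,
    pub guest_ip: Ipv6Addr,
}

/// Returns up to `2 * f` VMs on which to simulate the node provider action,
/// skipping the node whose guest IP equals `already_recovered` (the node that
/// the recovery tool has already fixed).
///
/// Assumption: the selection is deterministic here (`take`) to avoid an
/// external dependency; the real test chooses the nodes at random.
pub fn select_np_targets(
    vms: &[HostVm],
    already_recovered: Ipv6Addr,
    f: usize,
) -> Vec<&HostVm> {
    vms.iter()
        .filter(|vm| vm.guest_ip != already_recovered)
        .take(2 * f)
        .collect()
}
```

For example, with four nodes and `f = 1` (assuming a subnet of `3f + 1` nodes), the function returns two nodes, neither of which is the already-recovered one. Together with that node, `2f + 1` nodes end up recovered, which matches the count in the old log message.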
