Skip to content

aarch64: tst-sampler.cc crashes sometimes in SMP mode #1158

@wkozaczuk

Description

@wkozaczuk

Quite often, once in 3-4 times, this test crashes like so:

./scripts/run.py -e '/tests/tst-sampler.so'
OSv v0.55.0-295-g371ff15c
eth0: 192.168.122.15
Booted up in 84.25 ms
Cmdline: /tests/tst-sampler.so
Starting
Stopping
Starting
Stopping
Done
Assertion failed: sched::preemptable() (arch/aarch64/mmu.cc: page_fault: 46)

[backtrace]
0x00000000400e8f70 <__assert_fail+32>
0x000000004020bb10 <page_fault+220>
0x000000004020b828 <???+1075886120>

Attaching with gdb reveals this:

(gdb) osv info threads
  id  address             name            cpu  status         vruntime  total_cpu_time location
   1 (0xffffa000407e7768) reclaimer       cpu0 waiting     1.70606e-24            2542 condvar::wait(lockfree::mutex*, sched::timer*) at core/condvar.cc:43
   2 (0xffffa000408c08a8) page-access-sca cpu0 waiting     9.62713e-20         4020006 sched::thread::sleep<long, std::ratio<1l, 1000000000l> >(std::chrono::duration<long, std::ratio<1l, 1000000000l> >) at /usr/include/c++/10/chrono:867
   3 (0xffffa000408d3470) page_pool_l2    cpu0 waiting     1.67268e-21         2071583 memory::page_pool::l2::fill_thread() at core/mempool.cc:1440
   4 (0xffffa000408eb138) itimer-real     cpu? unstarted             0               0 ?? at arch/aarch64/entry.S:114
   5 (0xffffa000408eba90) itimer-virt     cpu? unstarted             0               0 ?? at arch/aarch64/entry.S:114
   6 (0xffffa0004092d9a0) balancer0       cpu0 waiting     7.19337e-22           82541 ?? at arch/aarch64/entry.S:114
   7 (0xffffa0004092e878) rcu0            cpu0 waiting       2.214e-23           22417 osv::rcu::cpu_quiescent_state_thread::do_work() at core/rcu.cc:153
   8 (0xffffa00040940858) page_pool_l1_0  cpu0 waiting     5.12787e-22          617832 memory::page_pool::l1::fill_thread() at core/mempool.cc:1352
   9 (0xffffa00040954848) percpu0         cpu0 waiting     5.94536e-25             917 workman::call_of_duty() at core/percpu-worker.cc:92
  10 (0xffffa00040978018) async_worker0   cpu0 waiting     1.07275e-24            1625 async::async_worker::run() at core/async.cc:158
  11 (0xffffa00040978a48) idle0           cpu0 running             inf       983558495 std::_Function_handler<void(), <lambda()> >::_M_invoke(const std::_Any_data &) at /usr/include/c++/10/bits/std_function.h:291
  12 (0xffffa0004098a908) idle1           cpu1 running             inf      1000211333 std::_Function_handler<void(), <lambda()> >::_M_invoke(const std::_Any_data &) at /usr/include/c++/10/bits/std_function.h:291
  13 (0xffffa000409ae8e8) idle2           cpu2 running             inf        18757958 ?? at arch/aarch64/sched.S:54
  14 (0xffffa000409d68e8) idle3           cpu3 running             inf      1000317541 std::_Function_handler<void(), <lambda()> >::_M_invoke(const std::_Any_data &) at /usr/include/c++/10/bits/std_function.h:291
  15 (0xffffa000409fe858) rcu1            cpu1 waiting     1.92449e-23           16667 osv::rcu::cpu_quiescent_state_thread::do_work() at core/rcu.cc:153
  16 (0xffffa00040a14868) page_pool_l1_1  cpu1 waiting     1.27696e-23            8625 memory::page_pool::l1::fill_thread() at core/mempool.cc:1352
  17 (0xffffa00040a28848) percpu1         cpu1 waiting     6.19093e-24             875 workman::call_of_duty() at core/percpu-worker.cc:92
  18 (0xffffa00040a4c018) async_worker1   cpu1 waiting     6.32018e-24            1125 async::async_worker::run() at core/async.cc:158
  19 (0xffffa00040a4ca78) rcu2            cpu2 waking      6.57867e-24            1541 osv::rcu::cpu_quiescent_state_thread::do_work() at core/rcu.cc:178
  20 (0xffffa00040a5e878) page_pool_l1_2  cpu2 waking      6.15216e-24             791 memory::page_pool::l1::fill_thread() at core/mempool.cc:1349
  21 (0xffffa00040a72848) percpu2         cpu2 waiting     6.16508e-24             792 workman::call_of_duty() at core/percpu-worker.cc:92
  22 (0xffffa00040a96018) async_worker2   cpu2 waiting     6.08753e-24             667 async::async_worker::run() at core/async.cc:158
  23 (0xffffa00040a96a38) >init           cpu0 waiting     7.78067e-24            1292 std::_Function_handler<void(), sched::thread::reaper::reaper()::<lambda()> >::_M_invoke(const std::_Any_data &) at /usr/include/c++/10/bits/std_function.h:291
  24 (0xffffa00040aa8a68) rcu3            cpu3 waiting      1.3558e-23            9875 osv::rcu::cpu_quiescent_state_thread::do_work() at core/rcu.cc:178
  25 (0xffffa00040abe790) thread taskq    cpu0 waiting     1.18002e-23            7083 synch_port::_msleep(void*, mtx*, int, char const*, int) at bsd/porting/synch.cc:101
  26 (0xffffa00040abf8d0) callout         cpu0 waiting     3.33669e-21          104376 _callout_thread() at bsd/porting/callout.cc:138
  27 (0xffffa00040ad87c0) page_pool_l1_3  cpu3 waiting     1.00554e-22           63208 memory::page_pool::l1::fill_thread() at core/mempool.cc:1352
  28 (0xffffa00040b3b200) percpu3         cpu3 waiting     6.51405e-24            1209 workman::call_of_duty() at core/percpu-worker.cc:92
  29 (0xffffa00040b3b400) async_worker3   cpu3 waiting     6.51405e-24            1209 async::async_worker::run() at core/async.cc:158
  30 (0xffffa00040fb9200) system_taskq_0  cpu0 waiting     7.97454e-24            1417 synch_port::_msleep(void*, mtx*, int, char const*, int) at bsd/porting/synch.cc:101
  31 (0xffffa00040fb9600) system_taskq_1  cpu0 waiting     7.81944e-24            1208 synch_port::_msleep(void*, mtx*, int, char const*, int) at bsd/port--Type <RET> for more, q to quit, c to continue without paging--c
ing/synch.cc:101
  32 (0xffffa00040fb9800) system_taskq_2  cpu0 waiting     7.76774e-24            1125 synch_port::_msleep(void*, mtx*, int, char const*, int) at bsd/porting/synch.cc:101
  33 (0xffffa00040fb9a00) system_taskq_3  cpu0 waiting     7.61265e-24             916 synch_port::_msleep(void*, mtx*, int, char const*, int) at bsd/porting/synch.cc:101
  34 (0xffffa00040fb9c00) system_taskq_4  cpu0 waiting     7.90991e-24            1333 synch_port::_msleep(void*, mtx*, int, char const*, int) at bsd/porting/synch.cc:101
  35 (0xffffa00040fb9e00) system_taskq_5  cpu0 waiting     7.67727e-24            1000 synch_port::_msleep(void*, mtx*, int, char const*, int) at bsd/porting/synch.cc:101
  36 (0xffffa00040fb2200) system_taskq_6  cpu0 waiting      7.5868e-24             875 synch_port::_msleep(void*, mtx*, int, char const*, int) at bsd/porting/synch.cc:101
  37 (0xffffa00040fb2400) system_taskq_7  cpu0 waiting     7.52217e-24             791 synch_port::_msleep(void*, mtx*, int, char const*, int) at bsd/porting/synch.cc:101
  38 (0xffffa00040f8ce00) solthread-0x401 cpu0 waiting     6.95917e-22           12167 condvar::wait(lockfree::mutex*, sched::timer*) at core/condvar.cc:43
  39 (0xffffa00040fd6200) solthread-0x401 cpu0 waiting     3.71404e-22            6292 condvar::wait(lockfree::mutex*, sched::timer*) at core/condvar.cc:43
  40 (0xffffa00040fd6400) netisr          cpu0 waiting     2.03435e-23            5791 netisr_osv_thread_wrapper(void (*)(void*), void*) at bsd/sys/net/netisr1.cc:26
  41 (0xffffa00040fd6e00) >init           cpu0 waiting     2.67443e-20        34458539 sched::thread::join() at core/sched.cc:1370
	joining on <optimized out>
  42 (0xffffa0004109a400) virtio-net-rx   cpu0 waiting     1.90125e-28           13875 virtio::virtio_driver::wait_for_queue(virtio::vring*, bool (virtio::vring::*)() const) at drivers/virtio.cc:158
  43 (0xffffa0004109a600) virtio-tx-0     cpu0 waiting     9.33158e-29            2459 osv::xmitter<virtio::net::txq, 4096u, std::function<bool ()>, boost::iterators::function_output_iterator<osv::xmitter_functor<virtio::net::txq> > >::poll_until() at include/osv/percpu_xmit.hh:399
  44 (0xffffa0004109a800) virtio-tx-1     cpu1 waiting     1.73191e-29            2750 osv::xmitter<virtio::net::txq, 4096u, std::function<bool ()>, boost::iterators::function_output_iterator<osv::xmitter_functor<virtio::net::txq> > >::poll_until() at include/osv/percpu_xmit.hh:399
  45 (0xffffa0004109aa00) virtio-tx-2     cpu2 waiting     9.04781e-30            1417 osv::xmitter<virtio::net::txq, 4096u, std::function<bool ()>, boost::iterators::function_output_iterator<osv::xmitter_functor<virtio::net::txq> > >::lock_running() at include/osv/percpu_xmit.hh:549
  46 (0xffffa0004109ac00) virtio-tx-3     cpu3 waiting     9.56318e-30            1542 osv::xmitter<virtio::net::txq, 4096u, std::function<bool ()>, boost::iterators::function_output_iterator<osv::xmitter_functor<virtio::net::txq> > >::poll_until() at include/osv/percpu_xmit.hh:399
  47 (0xffffa00040dcf200) virtio-blk      cpu0 waiting     2.95588e-23           24999 virtio::virtio_driver::wait_for_queue(virtio::vring*, bool (virtio::vring::*)() const) at drivers/virtio.cc:158
  48 (0xffffa00040dcf400) virtio-rng      cpu0 waiting     1.32284e-20          675418 condvar::wait(lockfree::mutex*, sched::timer*) at core/condvar.cc:43
  49 (0xffffa00040dcfa00) pl011-input     cpu0 waiting     1.32866e-23            5792 console::LineDiscipline::read_poll(console::console_driver*) at drivers/line-discipline.cc:86
  50 (0xffffa00040dcfc00) rand_harvestq   cpu0 waiting     3.83739e-21          219333 synch_port::_msleep(void*, mtx*, int, char const*, int) at bsd/porting/synch.cc:101
  51 (0xffffa00040dc6200) dhcp            cpu0 waiting     2.33071e-22          264668 dhcp::dhcp_worker::dhcp_worker_fn() at core/dhcp.cc:828
  55 (0xffffa00040cf9400) >>init          cpu2 waking                0               0 ?? at arch/aarch64/entry.S:114
Number of threads: 52
(gdb) bt
#0  processor::wfi () at arch/aarch64/processor.hh:52
#1  processor::halt_no_interrupts () at arch/aarch64/processor.hh:52
#2  arch::halt_no_interrupts () at arch/aarch64/arch.hh:48
#3  operator() (__closure=0xffffa000408bfc00) at arch/aarch64/smp.cc:109
#4  std::__invoke_impl<void, <lambda()>&> (__f=...) at /usr/include/c++/10/bits/invoke.h:60
#5  std::__invoke_r<void, <lambda()>&> (__fn=...) at /usr/include/c++/10/bits/invoke.h:153
#6  std::_Function_handler<void(), <lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/10/bits/std_function.h:291
#7  0x000000004020c7a4 in std::function<void ()>::operator()() const (this=<optimized out>) at /usr/include/c++/10/bits/std_function.h:617
#8  interrupt_table::invoke_interrupt (this=this@entry=0x40770a60 <idt>, id=id@entry=4) at arch/aarch64/exceptions.cc:141
#9  0x000000004020c95c in interrupt (frame=<optimized out>) at arch/aarch64/exceptions.cc:167
#10 <signal handler called>
#11 sched::cpu::do_idle (this=this@entry=0xffff8000407ee040) at core/sched.cc:454
#12 0x00000000402e6cb8 in sched::cpu::idle (this=0xffff8000407ee040) at core/sched.cc:472
#13 0x00000000402e6ce0 in operator() (__closure=<optimized out>, __closure=<optimized out>) at core/sched.cc:165
#14 std::__invoke_impl<void, sched::cpu::init_idle_thread()::<lambda()>&> (__f=...) at /usr/include/c++/10/bits/invoke.h:60
#15 std::__invoke_r<void, sched::cpu::init_idle_thread()::<lambda()>&> (__fn=...) at /usr/include/c++/10/bits/invoke.h:153
#16 std::_Function_handler<void(), sched::cpu::init_idle_thread()::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
    at /usr/include/c++/10/bits/std_function.h:291
#17 0x00000000402e73f4 in sched::thread::main (this=0xffffa00040978a48) at core/sched.cc:1267
#18 sched::thread_main_c (t=0xffffa00040978a48) at arch/aarch64/arch-switch.hh:162
#19 0x000000004020b788 in thread_main () at arch/aarch64/entry.S:114
(gdb) osv thread 55
(gdb) bt
#0  thread_main () at arch/aarch64/entry.S:114

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions