CC hangs if the number of batches to send is too large #5

@Victor-C-Zhang

The issue

At the beginning of a CC, we need to flush the buffers and update the supernodes. This step seems to hang when there are not enough updates buffered. The likely cause is a bug in the code that dispatches a message to the MPI worker. We might need a force_send option to handle this case.
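One possible shape for a force_send option is sketched below. This is only an illustration of the suspected failure mode, not the project's code: the `Update` struct, the `BatchBuffer` class, and its `insert`, `flush`, and `sent` members are all hypothetical, and the `sent_` vector stands in for the real dispatch to an MPI worker. It assumes the hang comes from a partially filled batch that is never sent, which has not been confirmed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one buffered update; not the project's type.
struct Update {
  int src;
  int dst;
};

// Hypothetical batching buffer. Batches appended to `sent_` represent
// messages that would be dispatched to an MPI worker in the real system.
class BatchBuffer {
 public:
  explicit BatchBuffer(std::size_t batch_size) : batch_size_(batch_size) {
    assert(batch_size > 0);
  }

  // Buffer one update and dispatch automatically once a batch is full.
  void insert(const Update& u) {
    pending_.push_back(u);
    if (pending_.size() >= batch_size_) send_pending();
  }

  // Flush before a query. Without force_send, a partial batch stays
  // buffered, so a caller that waits for it would wait forever.
  // With force_send, the partial batch is dispatched anyway.
  // Returns the number of updates still buffered afterwards.
  std::size_t flush(bool force_send) {
    if (force_send && !pending_.empty()) send_pending();
    return pending_.size();
  }

  // Batches dispatched so far, in order.
  const std::vector<std::vector<Update>>& sent() const { return sent_; }

 private:
  void send_pending() {
    sent_.push_back(pending_);
    pending_.clear();
  }

  std::size_t batch_size_;
  std::vector<Update> pending_;
  std::vector<std::vector<Update>> sent_;
};
```

In this sketch, a buffer with a batch size of 4 that holds only 3 updates dispatches nothing on `flush(false)` and dispatches one 3-element batch on `flush(true)`.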

How to replicate

  • Run the distributed tests with WorkerCluster::num_batches = 512
  • On the dist_query branch, run the speed experiment: -np 4 ./speed_expr 50000 50012 [output file]

Labels: bug (Something isn't working)
