Skip to content

Heartbeat sender uses Time.now which is unreliable #676

@blowfishpro

Description

@blowfishpro

The Bunny heartbeat sender uses Time.now to track how long it's been since the last heartbeat (or other connection activity) and determine how long to wait before sending another heartbeat.

In one of our test suites, we discovered that the connection to the RabbitMQ server was unexpectedly closing due to missed heartbeats. We discovered that this tends to happen with tests that stub time.

So it seems like the heartbeat sender's accounting of when to send the next heartbeat is getting messed up by Time.now getting stubbed and jumping around.

It turns out there are other legitimate reasons not to use Time.now to measure elapsed time. This blog post explains some of the details there.

The suggested alternative in that blog post is to use Process.clock_gettime(Process::CLOCK_MONOTONIC) (which returns a strictly increasing count of time since some arbitrary starting point, as a float number of second by default). That seems to work well enough elsewhere, although one thing that's unclear to me is whether that's available on all of the platforms Bunny wants to support.

From a quick code search, Bunny::Concurrent::ContinuationQueue seems to also use Time.now to count elapsed time. There are a couple of other cases where Time.now is used to generate pseudo-random consumer tags, although those have other components that will vary so they're unlikely to be an issue in practice.

As an aside, we're definitely wondering ourselves whether it's a good idea to stub time in higher level tests that actually interact with e.g. a real RabbitMQ server. But that won't be trivial to address.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions