Proposal for an alternative algorithm for events-based cache event tracking #5624

@azdagron

Description

The experimental events-based cache relies on tracking change events for registration entries and attested nodes. The current algorithm relies on the monotonically increasing property of the event ID. Incorrect assumptions about event arrival order and event ID increment stepping have driven a series of fixes to the current algorithm that have complicated the codebase (e.g. skipped-event discovery and tracking) and added pessimistic database query load, even though one of the primary goals of the events-based cache is to reduce database load in the steady state. Organizations whose deployments tickle the conditions above are not realizing as much benefit from the events-based cache as should be possible.

I propose the following algorithm change to the events-based cache:

  • SPIRE Server retains a list of events named processed_events, which is initially empty.
  • Every cache poll cycle:
    1. all events over the last N minutes are retrieved, based on the event CreatedAt column, into a list recent_events
    2. a diff is performed against processed_events and recent_events
      • events in recent_events but not in processed_events (i.e. new events) are processed and added to processed_events
      • events in processed_events but not in recent_events (i.e. old events) are removed from processed_events
      • events in both processed_events and recent_events (i.e. already processed events) are ignored

We can handle failures to talk to the database in a given poll cycle by either:

  1. polling for max(N, time since the last successful poll), or
  2. rehydrating the entire cache if it has been longer than some threshold since the last poll

This algorithm assumes a few things:

  • Number of events in the last N minutes is manageably sized:
    Seems reasonable if we keep N small, e.g. 3 minutes
  • No mutating transaction lives longer than N minutes:
    As far as I know, the only mutating operation that has the potential for a long-lived transaction is that which prunes registration entries, which executes a single statement to delete all registration entries older than some threshold. When the registration entry set is large, this operation can take some time. We can likely reduce the per-transaction time by doing the deletion in small batches and more frequently.
  • Event lookup by timestamp is efficient:
    The created_at column is not currently indexed, but can be.

PROS:

  • No more coupling with the event ID values. The event ID is now a simple lookup key and only has to be unique within the N-minute window.
  • Reduced database load for deployments that currently tickle large numbers of skipped events in the current algorithm.
  • Reaches the goal of low-to-no database load when there is no churn over registration entries/agents.
  • Code is much simpler.

CONS:

  • Database load is not immediately reduced when there is registration entry/agent churn, since we'll still poll for events within the N-minute window, but the reduction takes full effect once the window has elapsed.

I have prototyped the above algorithm and so far have not witnessed split brain in the cache, though more testing is required.

Metadata

Labels

  • priority/backlog — Issue is approved and in the backlog
  • unscoped — The issue needs more design or understanding in order for the work to progress
