fix(dataflow): do not create new KafkaStreams app for existing pipelines
This fixes a dataflow-engine bug triggered when the scheduler sends or
re-sends pipeline creation messages after a restart (it does so because
it is not aware of the pipelines' status across the various components).
Previous behaviour:
Dataflow-engine, on receiving a command to create a pipeline, would first
create a new Kafka Streams application for this pipeline, before checking
whether one already exists and is running.
Because of this, triggering a control-plane restart of the scheduler in 2.8.1
would result in dataflow errors for pipelines that kept internal state
(mostly pipelines making use of triggers/joins). Kafka Streams would complain
about an existing application using the same state directory, fail the newly
created pipeline and inform the scheduler about this.
However, the old pipeline, if it was previously running ok, would in fact
continue to run inside dataflow. This created a mismatch between the actual
state of dataflow-engine and what the scheduler knew about it.
New behaviour:
The introduced changes mean that dataflow-engine first checks if a pipeline
with the same id is already running. If its state is ok, dataflow simply
informs the scheduler that the pipeline is created, without taking further
action.
If a pipeline with the same id already exists but is in a failed state,
it is first stopped (local Kafka Streams state is cleaned), then an attempt
is made to re-create it, with the corresponding status being sent to the
scheduler.
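
The decision flow described above can be sketched roughly as follows. This is
an illustrative, self-contained model and not the dataflow-engine code: the
names PipelineApp, PipelineRegistry, ReportedStatus, handleCreate and the
startOk flag are all hypothetical, and the Kafka Streams application is
replaced by a stub so the example runs without any dependencies.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Kafka Streams application owned by one pipeline.
class PipelineApp {
    enum State { RUNNING, FAILED }

    State state;

    PipelineApp(State state) {
        this.state = state;
    }

    // Assumption: a real implementation would close the streams app and clean
    // its local state directory here (e.g. KafkaStreams#close and #cleanUp).
    void stopAndCleanLocalState() {
        // no-op in this sketch
    }
}

// Status reported back to the scheduler (hypothetical type).
enum ReportedStatus { CREATED, FAILED }

class PipelineRegistry {
    private final Map<String, PipelineApp> pipelines = new HashMap<>();

    // Counts how many new applications this registry attempted to start.
    int appsStarted = 0;

    // startOk simulates whether starting a fresh application succeeds.
    ReportedStatus handleCreate(String pipelineId, boolean startOk) {
        PipelineApp existing = pipelines.get(pipelineId);
        if (existing != null) {
            if (existing.state == PipelineApp.State.RUNNING) {
                // Already running ok: report "created" and take no further action.
                return ReportedStatus.CREATED;
            }
            // Exists but failed: stop it and clean local state before re-creating.
            existing.stopAndCleanLocalState();
            pipelines.remove(pipelineId);
        }
        appsStarted++;
        PipelineApp.State newState =
                startOk ? PipelineApp.State.RUNNING : PipelineApp.State.FAILED;
        pipelines.put(pipelineId, new PipelineApp(newState));
        return startOk ? ReportedStatus.CREATED : ReportedStatus.FAILED;
    }

    // Test helper: simulate a running pipeline entering a failed state.
    void markFailed(String pipelineId) {
        PipelineApp app = pipelines.get(pipelineId);
        if (app != null) {
            app.state = PipelineApp.State.FAILED;
        }
    }
}
```

In this model, a repeated create command for a healthy pipeline leaves
appsStarted unchanged, which is the property that avoids the state directory
conflict described under "Previous behaviour".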