-
Notifications
You must be signed in to change notification settings - Fork 14.6k
MINOR: retry upon missing source topic #20284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly looking good to me. I left a few comments to cleanup/simplify the code.
@@ -371,6 +372,9 @@ public boolean isStartingRunningOrPartitionAssigned() { | |||
private volatile KafkaFutureImpl<Uuid> restoreConsumerInstanceIdFuture = new KafkaFutureImpl<>(); | |||
private volatile KafkaFutureImpl<Uuid> producerInstanceIdFuture = new KafkaFutureImpl<>(); | |||
|
|||
// Missing source topic timeout tracking | |||
private long firstMissingSourceTopicTime = -1L; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it would make things slighly more easy to read if we'd use
org.apache.kafka.common.utils.Timer
for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, can we rename this to a more generic topicsReadyTimer
? I think we may want to reuse the timer to also time out when internal topics are not created in time.
@@ -371,6 +372,9 @@ public boolean isStartingRunningOrPartitionAssigned() { | |||
private volatile KafkaFutureImpl<Uuid> restoreConsumerInstanceIdFuture = new KafkaFutureImpl<>(); | |||
private volatile KafkaFutureImpl<Uuid> producerInstanceIdFuture = new KafkaFutureImpl<>(); | |||
|
|||
// Missing source topic timeout tracking |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you describe a member, I'd use a javadoc comment. But this comment isn't adding anything on top of the variable name, so maybe we can drop it altogether?
handleMissingSourceTopicsWithTimeout(missingTopicsDetail); | ||
} else { | ||
// Reset timeout tracking when no missing source topics are reported | ||
firstMissingSourceTopicTime = -1L; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if you use org.apache.kafka.common.utils.Timer and call reset here, you don't need the inline comment.
streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamThread.java
Show resolved
Hide resolved
...ts/src/test/java/org/apache/kafka/streams/integration/KStreamRepartitionIntegrationTest.java
Show resolved
Hide resolved
|
||
// Advance time beyond max.poll.interval.ms (default is 300000ms) to trigger timeout | ||
mockTime.sleep(300001); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: advance time less than 5 min and check if 2nd call throws exception and also check the log message (if easy) and then next step advancing time beyond 5 min as you did!
|
||
// First call should not throw exception (within timeout) | ||
thread.runOnceWithoutProcessingThreads(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same suggestion (below) here.
@RaidenE1 Thank you, the PR looks good to me now. I had a suggestion, but it’s not necessary to address since you’re already checking the condition in a later test. |
log.error(errorMsg); | ||
|
||
// Reset timer for next timeout cycle | ||
topicsReadyTimer.updateAndReset(maxPollTimeMs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to update the timer? We throw MissingSourceTopicException
below, and this should shut down the thread?
log.error(errorMsg); | ||
throw new MissingSourceTopicException(errorMsg); | ||
throw new TopologyException(errorMsg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this case is newly added, but we did not add a new test for it?
Implements a timeout mechanism (using maxPollTimeMs) that waits for
missing source topics to be created before failing, instead of
immediately throwing exceptions in the new Streams protocol.
Additionally, throw TopologyException when partition count mismatch is
detected.
Reviewers: Lucas Brutschy [email protected], Alieh Saeedi
[email protected], Matthias J. Sax [email protected]