Flink fix terminal streaming events #2768
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Problem
Marquez creates new job version for streaming jobs whenever hash of a job version changes. We introduced this assumption as it makes sense most of the time. However, this does not make much sense for terminal events. In other words, a terminal event for streaming job like
complete
with no input nor output datasets contained, should mean only a job has finished. It shouldn't mean creating a new job version which is current behaviour.Closes: #2767
Solution
Please describe your change as it relates to the problem, or bug fix, as well as any dependencies. If your change requires a database schema migration, please describe the schema modification(s) and whether it's a backwards-incompatible or backwards-compatible change.
One-line summary:
Checklist
CHANGELOG.md
(Depending on the change, this may not be necessary)..sql
database schema migration according to Flyway's naming convention (if relevant)