Reducer improvements #849
Conversation
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## main #849 +/- ##
==========================================
+ Coverage 72.27% 72.28% +0.01%
==========================================
Files 471 471
Lines 31069 31137 +68
Branches 877 877
==========================================
+ Hits 22454 22507 +53
- Misses 8522 8535 +13
- Partials 93 95 +2
Flags with carried forward coverage won't be shown.
@@ -291,7 +291,7 @@ def prepare_write(time_nsec_since_epoch, data_length, redis_topic = nil, redis_o
      elsif @cycle_size and ((@file_size + data_length) > @cycle_size)
        Logger.debug("Log writer start new file due to cycle size #{@cycle_size}")
        start_new_file() if allow_new_file
-     elsif @enforce_time_order and @previous_time_nsec_since_epoch and (@previous_time_nsec_since_epoch > time_nsec_since_epoch)
+     elsif process_out_of_order and @enforce_time_order and @previous_time_nsec_since_epoch and (@previous_time_nsec_since_epoch > time_nsec_since_epoch)
process_out_of_order and @enforce_time_order
... which is it?
Different purposes. @enforce_time_order controls whether we care about time order at all; process_out_of_order controls whether we care about time order for this particular entry (we don't for the non-packet entry types).
if entry_type == :RAW_PACKET or entry_type == :JSON_PACKET
  process_out_of_order = true
else
  process_out_of_order = false
entry_type can be :TARGET_DECLARATION, :PACKET_DECLARATION, :RAW_PACKET, :JSON_PACKET, :OFFSET_MARKER, :KEY_MAP. Can you add a comment about why we process :RAW_PACKET and :JSON_PACKET out of order but not the rest?
Will do
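A minimal sketch of the commented check the reviewer asked for, based on the entry types listed above. The method name `process_out_of_order?` is hypothetical; the actual code assigns a local variable inline:

```ruby
# Of the entry types (:TARGET_DECLARATION, :PACKET_DECLARATION, :RAW_PACKET,
# :JSON_PACKET, :OFFSET_MARKER, :KEY_MAP), only the packet entries carry
# telemetry timestamps that are expected to be monotonic. Declarations,
# offset markers, and key maps are metadata written as encountered, so
# time-order enforcement does not apply to them.
def process_out_of_order?(entry_type)
  # Only actual packet data (raw or JSON) is subject to time-order checks
  entry_type == :RAW_PACKET or entry_type == :JSON_PACKET
end
```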
@@ -106,14 +107,28 @@ def initialize(name)
      end
    end

-   @buffer_depth = 10 unless @buffer_depth
+   @buffer_depth = 60 unless @buffer_depth
This matches the default in log_microservice, good change.
end
processed # return to yield
So the only way this stays false is if nothing is returned from all_files?
Correct. It only updates the metric if work was done, not if it just realized there was nothing to do. This removes the more common super-small time metrics.
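A sketch of the pattern described above, with hypothetical names (`reduce_pass`, `metric`): the duration metric is only recorded when at least one file was actually processed, so idle polls don't flood the metrics with tiny timings.

```ruby
# Only record the elapsed-time metric when work was done. An empty file
# list leaves `processed` false and the metric untouched.
def reduce_pass(files, metric)
  processed = false
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  files.each do |_file|
    # ... reduce the file here ...
    processed = true
  end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  metric << elapsed if processed
  processed # return to yield
end
```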
  end
end

def process_file(filename, type, entry_nanoseconds, file_nanoseconds)
  throttle = OpenC3::Throttle.new(@max_cpu_utilization)
  file = BucketFile.new(filename)
  file.retrieve
  unless file.local_path
    @logger.warn("Reducer Warning: #(unknown): Could not be retrieved")
Previously we just warned and then carried on? That makes no sense.
Yeah - Not correct before
raise "Local file does not exist after get_object: #{client_result.inspect}"
end
rescue => err
  # Try to retrieve the file three times
Do you think this will address real world issues? We're asking for a file with get_object and for some reason it fails so retry and it works?
Could happen. That wasn't what was causing the reducer issues, but I thought it was worth adding some retries which should improve resiliency.
I think the root cause of the EOFErrors was Rufus scheduler running two instances of process_file at the same time. The new mutex added to ReducerMicroservice prevents that.
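A sketch of the two fixes described above, under assumed names (`retrieve_with_retries`, `process_file_exclusive`, `PROCESS_MUTEX` are all hypothetical): a bounded retry around the bucket fetch, and a mutex so overlapping scheduler ticks can't run process_file concurrently.

```ruby
PROCESS_MUTEX = Mutex.new
MAX_RETRIES = 3

# Retry the bucket fetch a few times before giving up; a transient
# get_object failure should not abort the whole reduction pass.
def retrieve_with_retries(file)
  attempts = 0
  begin
    attempts += 1
    file.retrieve
    raise "Local file does not exist after get_object" unless file.local_path
  rescue => err
    retry if attempts < MAX_RETRIES
    raise err
  end
end

def process_file_exclusive(file)
  # Rufus-scheduler can fire a new tick while the previous one is still
  # running; the mutex serializes the two invocations.
  PROCESS_MUTEX.synchronize { retrieve_with_retries(file) }
end
```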
Kudos, SonarCloud Quality Gate passed!
No description provided.