Ensure that span start <= end #1096
@jgpruitt do we know why we're receiving spans with end < start? I wonder whether it's safe to just swap start and end; it could end up hiding something weird or buggy in the user's tracing implementation.
@JamesGuthrie I don't know why yet. I want to try to track it down later today. I think these are our possible solutions:
IMO, 4 is the best approach, but they're all icky. WDYT?
@JamesGuthrie I've spent a couple of hours trying to track down the source of the problem. I *think* I'm using the Python tracing libraries correctly. I referenced multiple tutorials.
I think this could be an upstream issue. Based on the previous comment, it seems to be caused by the Python library, so I think we should open an issue there and report it.
I am not confident in making our own fix, as it hides the cause, which is basically data corruption that the user would never expect. Hence, I don't think we should have this swapping fix; we should rather drop the entire span batch. We should be strict in ingesting data to avoid problems down the road.
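For comparison, the strict alternative suggested here (reject the whole batch when any span has end < start) could look roughly like the sketch below. The `span` type, the nanosecond fields, and `validateBatch` are hypothetical names for illustration only, not Promscale's actual types or API.

```go
package main

import (
	"errors"
	"fmt"
)

// span is a hypothetical, simplified stand-in for a trace span.
// Timestamps are assumed to be Unix nanoseconds.
type span struct {
	Name       string
	StartNanos uint64
	EndNanos   uint64
}

var errEndBeforeStart = errors.New("span end time precedes start time")

// validateBatch returns an error for the first span whose end time is
// earlier than its start time, so the caller can drop the whole batch.
func validateBatch(batch []span) error {
	for i, s := range batch {
		if s.EndNanos < s.StartNanos {
			return fmt.Errorf("span %d (%q): %w", i, s.Name, errEndBeforeStart)
		}
	}
	return nil
}

func main() {
	batch := []span{
		{Name: "ok", StartNanos: 100, EndNanos: 200},
		{Name: "bad", StartNanos: 500, EndNanos: 300},
	}
	if err := validateBatch(batch); err != nil {
		fmt.Println("dropping batch:", err)
	}
}
```

The trade-off is the one discussed in this thread: rejecting surfaces the upstream bug loudly, but valid spans in the same batch are lost along with the bad one.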
I agree with @Harkishen-Singh. I think we should try to get to the bottom of this. Does it only happen with the Python libs? (Seems like a race condition?)
I created another issue, #1127, to track my efforts at identifying a source. IMO, having this "fix" in place is of benefit in case of future bugs.
I like this solution as it follows the robustness principle.
Adds code to detect when a span's end time is less than its start time. Without this check, such a span is caught by a database constraint, which prevents its insertion. The new code swaps the start and end times so that the span can be inserted.
As a team, we decided to implement the fix as-is and follow up at a later date to track down the issue. The most important thing ATM is dealing with data issues.
Description
I have seen instances in which promscale attempts to insert spans that have an end time < start time. There is a database check constraint that catches this error, but it results in dropped spans. This patch catches the issue in Go and swaps the start and end times so that the spans will be inserted successfully.
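As a rough illustration of the swap described above (not the code in this patch), a minimal sketch might look like the following. The `span` type, its Unix-nanosecond fields, and `ensureStartBeforeEnd` are assumptions made for the example; the real implementation works on Promscale's own span representation.

```go
package main

import "fmt"

// span is a hypothetical, simplified stand-in for a trace span.
// Timestamps are assumed to be Unix nanoseconds.
type span struct {
	Name       string
	StartNanos uint64
	EndNanos   uint64
}

// ensureStartBeforeEnd swaps the two timestamps when end < start and
// reports whether a swap happened, so the caller can log or count it.
func ensureStartBeforeEnd(s *span) bool {
	if s.EndNanos < s.StartNanos {
		s.StartNanos, s.EndNanos = s.EndNanos, s.StartNanos
		return true
	}
	return false
}

func main() {
	s := span{Name: "example", StartNanos: 500, EndNanos: 300}
	if ensureStartBeforeEnd(&s) {
		// Logging the swap keeps the underlying problem visible
		// instead of silently hiding it.
		fmt.Printf("swapped start/end for span %q\n", s.Name)
	}
	fmt.Println(s.StartNanos <= s.EndNanos)
}
```

Returning a flag from the helper is one way to address the concern raised in review: the span is still inserted, but the caller can record how often the correction is applied.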