Fix lf summary no abstain 1446 #1447
Conversation
@garaud great find and great fix, thanks!! Looks like it's just a formatting issue. You can fix it by running tox -e fix, then verify by running tox.
Thanks so much for posting the issue and the PR! I made one suggestion for simplifying the calculation. Then once we clarify the naming of lfa_bis in the unit test, looks great to me!
Codecov Report
@@           Coverage Diff            @@
##           master    #1447   +/-   ##
=========================================
  Coverage        ?   97.55%
=========================================
  Files           ?       55
  Lines           ?     2002
  Branches        ?      328
=========================================
  Hits            ?     1953
  Misses          ?       22
  Partials        ?       27
Force-pushed from 75dbd02 to 8b2d3af (Compare).
Review comment on test/labeling/test_analysis.py (Outdated):
Last nit, I promise: can we add a 3 to L_wo_abstain and a 4 to the Y used here? This will make sure that any future changes account for non-identical label sets between L and Y, which we handle correctly in this change.
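A minimal sketch of what the suggested test data could look like (illustrative only: L_wo_abstain and Y are the names used in the test, but the exact shapes and values beyond the added 3 and 4 are assumptions):

```python
import numpy as np

# Hypothetical label matrix with no ABSTAIN (-1) entries; the 3 makes
# L's label set differ from Y's.
L_wo_abstain = np.array(
    [
        [0, 1, 0],
        [1, 2, 1],
        [3, 0, 2],
    ]
)

# Hypothetical gold labels; the 4 never appears in L_wo_abstain, so the
# confusion-matrix logic must handle non-identical label sets in L and Y.
Y = np.array([0, 1, 4])
```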
Good idea :) I've updated the test accordingly.
Force-pushed from 8b2d3af to 95ca47f (Compare). The commit message matches the PR description below (fixes snorkel-team#1446).
@garaud awesome!! Let's ship it. Feel free to merge whenever you want.
@garaud we usually leave it to authors to merge, but we're pushing a patch today, so I'm going to go ahead and merge for you. Thanks again, this is great!!
Don't worry. Thank you for the review. Weak supervision is quite new to me, and Snorkel is a great project! We use weak supervision to label massive 3D point clouds (LiDAR scans of natural environments such as rocks, cliffs, vegetation, etc.), and it's really helpful.
Description of proposed changes
Fix the computation of correct/incorrect values from the confusion matrix when the L input data doesn't contain the ABSTAIN label. We should only slice the confusion matrix on [1:, 1:] when there are ABSTAIN labels, i.e. -1 values; when there are no -1 values, the full confusion matrix should be taken into account.
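As a rough illustration of the conditional slicing (a sketch only, not the actual implementation in Snorkel; correct_incorrect is a hypothetical helper):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def correct_incorrect(Y, lf_outputs):
    """Hypothetical helper: count correct/incorrect votes for one LF."""
    cm = confusion_matrix(Y, lf_outputs)
    if (lf_outputs == -1).any():
        # sklearn sorts labels, so ABSTAIN (-1) occupies the first
        # row/column; slice it off so abstains count as neither
        # correct nor incorrect.
        cm = cm[1:, 1:]
    # With no -1 values there is no ABSTAIN row/column, so the full
    # matrix is used as-is.
    correct = np.diagonal(cm).sum()
    incorrect = cm.sum() - correct
    return correct, incorrect
```

Before this change, the [1:, 1:] slice was applied unconditionally, so an LF that never abstained silently lost the counts for its first label.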
Related issue(s)
Fixes #1446
Test plan
Checklist
Need help on these? Just ask!
I had some trouble with pyspark in the test suite, but I think it's not related to this fix.