Fix classifier bug #585
Conversation
Fix minor printing error in sprint_statistics.
Revert "Fix#460"
* AutoSklearnClassifier/Regressor's fit, refit, fit_ensemble now return self.
* Initial commit. Work in Progress.
* Fix minor printing error in sprint_statistics.
* Revert "Fix #460"
* Resolve rebase conflict
* Combined unittests to reduce travis runtime
Thanks a lot. Could you please also fix the conflict?
    # train ensemble
    ensemble = self.fit_ensemble(selected_keys=selected_models)
Please revert changes in this file as there are no changes to the actual code in this file.
    def predict(self, predictions):
        non_null_weights = (weight for weight in self.weights_ if weight > 0)
        for i, weight in enumerate(non_null_weights):
        #non_null_weights = (weight for weight in self.weights_ if weight > 0)
Could you please remove these comments?
        # predictions[i] *= weight
        for i, weight in enumerate(self.weights_):
            predictions[i] *= weight
        return np.sum(predictions, axis=0)
Could this code maybe be simplified a lot more to return np.average(predictions, axis=0, weights=self.weights_)?
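As a sanity check of the reviewer's suggestion, here is a minimal sketch (with made-up random data, not the actual auto-sklearn code) showing that the weight-and-sum loop and `np.average` agree whenever the ensemble weights already sum to 1. Note that `np.average` divides by the sum of the weights, so the two would differ for unnormalized weights.

```python
import numpy as np

# Hypothetical setup: 3 models, each predicting probabilities
# for 4 samples over 2 classes.
rng = np.random.default_rng(0)
predictions = rng.random((3, 4, 2))
weights = np.array([0.5, 0.0, 0.5])  # sum to 1, as ensemble weights do

# Original loop: scale each model's predictions by its weight, then sum.
scaled = predictions.copy()
for i, w in enumerate(weights):
    scaled[i] *= w
loop_result = np.sum(scaled, axis=0)

# Suggested one-liner: weighted average over the model axis.
avg_result = np.average(predictions, axis=0, weights=weights)

# Equal because np.average normalizes by sum(weights) == 1 here.
assert np.allclose(loop_result, avg_result)
```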
autosklearn/estimators.py
Outdated
        X, batch_size=batch_size, n_jobs=n_jobs)
    assert np.allclose(np.sum(pred_proba, axis=1),
                       np.ones_like(pred_proba[:, 0])), \
        "prediction probability does not sum up to 1!"
Could you please change the formatting to
assert np.allclose(
    np.sum(pred_proba, axis=1),
    np.ones_like(pred_proba[:, 0])
), "prediction probability does not sum up to 1!"
Yep, I will make that modification. Currently some unittests fail due to this line. I'm working on the fix and I will remove [WIP] as soon as it is done.
Codecov Report
@@ Coverage Diff @@
## development #585 +/- ##
===============================================
- Coverage 78.63% 78.59% -0.04%
===============================================
Files 130 130
Lines 10119 10129 +10
===============================================
+ Hits 7957 7961 +4
- Misses 2162 2168 +6
Continue to review full report at Codecov.
This PR fixes the problem of the ensemble of classifiers returning prediction values larger than 1. This happens because predict() in ensemble_selection.py sometimes receives model predictions from which the zero-weight models have already been excluded, and sometimes receives predictions that still include the zero-weight models. predict() now distinguishes these two cases so that the model weights are applied correctly in both.
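The two-case logic described above can be sketched as follows. This is a hedged illustration, not the actual auto-sklearn implementation: the standalone `predict` function and its `weights` parameter are hypothetical stand-ins for the method and `self.weights_` in ensemble_selection.py.

```python
import numpy as np

def predict(predictions, weights):
    """Weighted ensemble prediction handling both input shapes:
    one entry per model, or only entries for non-zero-weight models."""
    predictions = np.asarray(predictions, dtype=float)
    non_null_weights = [w for w in weights if w > 0]

    if len(predictions) == len(weights):
        # One prediction per model: apply every weight
        # (zero weights simply contribute nothing).
        effective_weights = weights
    elif len(predictions) == len(non_null_weights):
        # Zero-weight models were dropped upstream:
        # apply only the non-zero weights, in order.
        effective_weights = non_null_weights
    else:
        raise ValueError(
            "prediction count matches neither all models "
            "nor the non-zero-weight models")

    # np.average normalizes by sum(weights); ensemble weights sum to 1.
    return np.average(predictions, axis=0, weights=effective_weights)
```

With weights [0.5, 0.0, 0.5], passing three per-model predictions or only the two non-zero-weight predictions yields the same result, which is exactly the inconsistency the PR resolves.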