Conversation
@ahn1340 ahn1340 commented Nov 21, 2018

This PR fixes a bug where the ensemble of classifiers could return prediction values larger than 1. The bug occurs because predict() in ensemble_selection.py sometimes receives only the predictions of models with non-zero weights (zero-weight models already excluded), and sometimes receives the predictions of all models, including the zero-weight ones. predict() now distinguishes these two cases so that each weight is applied to the correct model's predictions.
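To see why the mismatch produced values above 1, here is a toy NumPy reproduction (the weights and probabilities are made up for illustration, not taken from the PR):

```python
import numpy as np

# Toy setup: three models, the middle one has zero ensemble weight.
weights = np.array([0.5, 0.0, 0.5])
preds = np.array([
    [[0.8, 0.2]],   # model 0: class probabilities for one sample
    [[0.1, 0.9]],   # model 1 (weight 0.0)
    [[0.3, 0.7]],   # model 2
])

# Pre-fix behaviour when ALL models' predictions are passed in: the loop
# iterates only over the non-zero weights, so the first two arrays get
# scaled while model 2's probabilities are summed in completely unscaled.
buggy = preds.copy()
for i, weight in enumerate(w for w in weights if w > 0):
    buggy[i] *= weight
buggy_out = np.sum(buggy, axis=0)   # row sums exceed 1

# Pairing each weight with its own model keeps the rows summing to 1.
fixed_out = np.average(preds, axis=0, weights=weights)
```

With these numbers the buggy path yields a row that sums to 2.0, while the correct weighting keeps the probabilities normalized.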

Jin Woo Ahn and others added 30 commits May 3, 2018 18:10
* Fix minor printing error in sprint_statistics.
* .

* .

* AutoSklearnClassifier/Regressor's fit, refit, fit_ensemble now return self.

* Initial commit. Work in Progress.

* Fix minor printing error in sprint_statistics.

* Revert "Fix#460"

* Resolve rebase conflict

* combined unittests to reduce travis runtime

* .

* .

* .

* .

* .

@mfeurer mfeurer left a comment


Thanks a lot. Could you please also fix the conflict?

# train ensemble
ensemble = self.fit_ensemble(selected_keys=selected_models)


Please revert changes in this file as there are no changes to the actual code in this file.

def predict(self, predictions):
    non_null_weights = (weight for weight in self.weights_ if weight > 0)
    for i, weight in enumerate(non_null_weights):
        #non_null_weights = (weight for weight in self.weights_ if weight > 0)

Could you please remove these comments?

    # predictions[i] *= weight
    for i, weight in enumerate(self.weights_):
        predictions[i] *= weight
    return np.sum(predictions, axis=0)

Could this code maybe be simplified a lot more to return np.average(predictions, axis=0, weights=self.weights_)?
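The suggested one-liner is equivalent to the multiply-and-sum loop whenever the weights are normalized, since np.average divides by the weight sum (which is 1 for ensemble selection weights). A small sketch with made-up numbers:

```python
import numpy as np

# Made-up ensemble weights that sum to 1, as ensemble selection produces.
weights = np.array([0.4, 0.0, 0.6])
predictions = np.array([
    [[0.9, 0.1]],
    [[0.5, 0.5]],
    [[0.2, 0.8]],
])

# Loop form from the PR: scale each model's predictions, then sum.
scaled = predictions.copy()
for i, weight in enumerate(weights):
    scaled[i] *= weight
loop_result = np.sum(scaled, axis=0)

# Reviewer's one-liner: np.average normalizes by sum(weights), which is 1
# here, so both forms agree whenever the weights are normalized.
avg_result = np.average(predictions, axis=0, weights=weights)
```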

@ahn1340 ahn1340 changed the title [WIP] Fix classifier bug Fix classifier bug Nov 30, 2018
@ahn1340 ahn1340 changed the title Fix classifier bug [WIP]Fix classifier bug Nov 30, 2018
X, batch_size=batch_size, n_jobs=n_jobs)
assert np.allclose(np.sum(pred_proba, axis=1),
np.ones_like(pred_proba[:, 0])),\
"prediction probability does not sum up to 1!"

Could you please change the formatting to

assert (
    np.allclose(
        np.sum(pred_proba, axis=1),
        np.ones_like(pred_proba[:, 0]))
), "prediction probability does not sum up to 1!"

@ahn1340 ahn1340 (author) left a comment

Yep, I will make that modification. Currently some unittests fail due to this line. I'm working on the fix and I will remove [WIP] as soon as it is done.
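For intuition, a minimal sketch of a predict() that accepts both input shapes (a hypothetical standalone helper for illustration, not the code that was merged):

```python
import numpy as np

def ensemble_predict(predictions, weights):
    """Weighted-average model predictions, whether `predictions` holds all
    models or only those with non-zero weight (sketch, hypothetical helper)."""
    predictions = np.asarray(predictions)
    weights = np.asarray(weights)
    non_null = weights[weights > 0]
    if predictions.shape[0] == non_null.shape[0]:
        # Zero-weight models were already filtered out upstream.
        return np.average(predictions, axis=0, weights=non_null)
    if predictions.shape[0] == weights.shape[0]:
        # All models are present; zero weights contribute nothing.
        return np.average(predictions, axis=0, weights=weights)
    raise ValueError("number of prediction arrays does not match weights")

weights = [0.5, 0.0, 0.5]
all_preds = [[[0.8, 0.2]], [[0.1, 0.9]], [[0.3, 0.7]]]
filtered_preds = [[[0.8, 0.2]], [[0.3, 0.7]]]   # zero-weight model dropped
out_all = ensemble_predict(all_preds, weights)
out_filtered = ensemble_predict(filtered_preds, weights)
```

Both call paths give the same normalized probabilities, so the assertion on row sums holds in either case.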

@codecov-io

codecov-io commented Dec 5, 2018

Codecov Report

Merging #585 into development will decrease coverage by 0.03%.
The diff coverage is 92.85%.


@@               Coverage Diff               @@
##           development     #585      +/-   ##
===============================================
- Coverage        78.63%   78.59%   -0.04%     
===============================================
  Files              130      130              
  Lines            10119    10129      +10     
===============================================
+ Hits              7957     7961       +4     
- Misses            2162     2168       +6
Impacted Files Coverage Δ
autosklearn/ensembles/ensemble_selection.py 58.18% <100%> (+0.77%) ⬆️
autosklearn/estimators.py 90.65% <85.71%> (-0.44%) ⬇️
..._preprocessing/select_percentile_classification.py 82.75% <0%> (-6.9%) ⬇️
...e/components/feature_preprocessing/select_rates.py 84.61% <0%> (-1.54%) ⬇️
...ipeline/components/classification/decision_tree.py 93.75% <0%> (ø) ⬆️
...rn/pipeline/components/regression/decision_tree.py 94.64% <0%> (ø) ⬆️
autosklearn/evaluation/train_evaluator.py 93.77% <0%> (+0.02%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 66d9f09...b6336e3.

@ahn1340 ahn1340 changed the title [WIP]Fix classifier bug Fix classifier bug Dec 6, 2018
@mfeurer mfeurer merged commit b53c7e1 into automl:development Dec 6, 2018
@ahn1340 ahn1340 deleted the classifier_bug branch December 6, 2018 15:46