franchuterivera
Contributor

  • Code cleanup for _read_np_fn

  • Fix the precision type when reading files, in _read_np_fn

  • Score the predictions without reading them. Only read the actual file when we are sure it is the best prediction.
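
A minimal sketch of one plausible reading of the last two bullets: each prediction file is loaded only long enough to score it, only the score is retained, and the winning file is read again once it is known to be the best. The names `read_predictions`, `accuracy`, `score_files`, and `load_best` are hypothetical and are not taken from `ensemble_builder.py`; the `.npy` file format and the accuracy metric are likewise assumptions made for illustration.

```python
import numpy as np

# Hypothetical mapping from a precision setting to a NumPy float type.
_DTYPES = {16: np.float16, 32: np.float32, 64: np.float64}


def read_predictions(path, precision=32):
    """Load a .npy prediction file and cast it to the requested float width.

    `precision` may be given as an int or as a string (32 or "32").
    """
    bits = int(precision)
    if bits not in _DTYPES:
        raise ValueError(f"unsupported precision: {precision!r}")
    return np.load(path).astype(_DTYPES[bits], copy=False)


def accuracy(y_true, y_prob):
    """Fraction of rows whose highest-probability class matches the label."""
    return float(np.mean(np.argmax(y_prob, axis=1) == y_true))


def score_files(paths, y_true, precision=32):
    """Score every file, keeping only {path: score} and never the arrays."""
    scores = {}
    for path in paths:
        y_prob = read_predictions(path, precision)
        scores[path] = accuracy(y_true, y_prob)
        del y_prob  # release the array before the next file is loaded
    return scores


def load_best(paths, y_true, precision=32):
    """Return (path, score, predictions) for the best-scoring file only."""
    scores = score_files(paths, y_true, precision)
    best = max(scores, key=scores.get)
    return best, scores[best], read_predictions(best, precision)
```

With this layout, peak memory is bounded by one prediction array at a time during scoring, at the cost of reading the best file twice.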

@franchuterivera
Contributor Author

I do have one extra open question: whether we need the complex data structure with the "loaded" key.

That is, I don't see this "loaded" key being used at all in the ensemble structure.

Contributor

@mfeurer mfeurer left a comment

I left some comments throughout the changes. I really like that this PR even reduces the lines of code.

@codecov-commenter

codecov-commenter commented Jun 11, 2020

Codecov Report

Merging #870 into development will increase coverage by 0.02%.
The diff coverage is 83.33%.

@@               Coverage Diff               @@
##           development     #870      +/-   ##
===============================================
+ Coverage        84.02%   84.04%   +0.02%     
===============================================
  Files              127      127              
  Lines             9458     9435      -23     
===============================================
- Hits              7947     7930      -17     
+ Misses            1511     1505       -6     
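
The figures in the diff above can be checked with a few lines of arithmetic. This is only a sketch: it assumes coverage is hits divided by lines and that the percentage is truncated, not rounded, to two decimals. Truncation is an assumption chosen because it reproduces the reported 84.04% (rounding to nearest would give 84.05%). The function name `coverage_pct` is hypothetical.

```python
def coverage_pct(hits, lines):
    """Coverage percentage, truncated (not rounded) to two decimals."""
    return (hits * 10000 // lines) / 100


base = coverage_pct(7947, 9458)   # development
head = coverage_pct(7930, 9435)   # this PR
delta = round(head - base, 2)

# Misses are the lines that were not hit.
base_misses = 9458 - 7947
head_misses = 9435 - 7930

print(base, head, delta, base_misses, head_misses)
```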
| Impacted Files | Coverage Δ |
|---|---|
| autosklearn/ensemble_builder.py | 71.08% <82.92%> (+1.18%) ⬆️ |
| autosklearn/automl.py | 81.75% <100.00%> (+0.06%) ⬆️ |
| ...mponents/feature_preprocessing/nystroem_sampler.py | 85.29% <0.00%> (-5.89%) ⬇️ |
| ..._preprocessing/select_percentile_classification.py | 86.20% <0.00%> (-3.45%) ⬇️ |
| .../metalearning/metalearning/kNearestDatasets/kND.py | 94.11% <0.00%> (-0.62%) ⬇️ |
| autosklearn/estimators.py | 90.36% <0.00%> (ø) |
| ...eline/components/feature_preprocessing/fast_ica.py | 91.30% <0.00%> (+2.17%) ⬆️ |
| ...e/components/feature_preprocessing/select_rates.py | 84.61% <0.00%> (+3.07%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c695989...1187a11.

@mfeurer mfeurer merged commit d313f26 into automl:development Jun 13, 2020
charlesfu4 pushed a commit to charlesfu4/auto-sklearn that referenced this pull request Jun 17, 2020
* Do not read predictions in memory, only after score

* Precision support for string/int
mfeurer added a commit that referenced this pull request Jul 3, 2020
* PEP8 (#718)

* multioutput_regression

* multioutput_regression

* multioutput_regression

* multioutput regression

* multioutput regression

* multioutput regression

* multioutput regression

* multioutput regression

* #782 showcase pipeline components iteration

* Fixed flake-8 violations

* multi_output regression v1

* fix y_shape in multioutput regression

* fix xy_data_manager change due to merge

* automl.py missing import

* Release note 070 (#842)

* First version of 070 release notes

* Missed a bugfix

* Vim added unexpected space -- fix

* prepare new release (#846)

* Clip predict values to [0-1] in classification

* Fix for 3.5 python!

* Sensible default value of 'score_func' for SelectPercentileRegression (#843)

Currently, the default value of 'score_func' for SelectPercentileRegression
is "f_classif", which is an invalid value and will be rejected.

* More robust tmp file naming (#854)

* More robust tmp file naming

* UUID approach

* 771 worst possible result (#845)

* Initial Commit

* Make worst result a function

* worst possible result in metric

* Fixing the name of the scorers

* Add exceptions to log file, not just stdout (#863)

* Add exceptions to log file, not just stdout

* Removing dummy pred as trys is not needed

* Add prediction with models trained with cross-validation (#864)

* add the possibility to predict with cross-validation

* fix unit tests

* test new feature, too

* 715 ml memory (#865)

* #715 Support for no ml memory limit

* API update

* Docs enhancement (#862)

* Improved docs

* Fixed example typos

* Beautify examples

* cleanup examples

* fixed rsa equal

* Move to minmax scaler (#866)

* Do not read predictions in memory, only after score (#870)

* Do not read predictions in memory, only after score

* Precision support for string/int

* Removal of competition manager (#869)

* Removal of competition manager

* Removed additional unused methods/files and moved metrics to estimator

* Fix meta data generation

* Make sure pytest is older newer than 4.6

* Unit test fixing

* flake8 fixes in examples

* Fix metadata gen metrics

* Fix dataprocessing get params (#877)

* Fix dataprocessing get params

* Add clone-test to regression pipeline

* Allow 1-D threshold binary predictions (#879)

* fix single output regression not working

* regression need no _enusre_prediction_array_size_prediction_array_sizess

* #782 showcase pipeline components iteration

* Fixed flake-8 violations

* Release note 070 (#842)

* First version of 070 release notes

* Missed a bugfix

* Vim added unexpected space -- fix

* prepare new release (#846)

* Clip predict values to [0-1] in classification

* Fix for 3.5 python!

* Sensible default value of 'score_func' for SelectPercentileRegression (#843)

Currently, the default value of 'score_func' for SelectPercentileRegression
is "f_classif", which is an invalid value and will be rejected.

* More robust tmp file naming (#854)

* More robust tmp file naming

* UUID approach

* 771 worst possible result (#845)

* Initial Commit

* Make worst result a function

* worst possible result in metric

* Fixing the name of the scorers

* Add exceptions to log file, not just stdout (#863)

* Add exceptions to log file, not just stdout

* Removing dummy pred as trys is not needed

* Add prediction with models trained with cross-validation (#864)

* add the possibility to predict with cross-validation

* fix unit tests

* test new feature, too

* 715 ml memory (#865)

* #715 Support for no ml memory limit

* API update

* Docs enhancement (#862)

* Improved docs

* Fixed example typos

* Beautify examples

* cleanup examples

* fixed rsa equal

* Move to minmax scaler (#866)

* Do not read predictions in memory, only after score (#870)

* Do not read predictions in memory, only after score

* Precision support for string/int

* Removal of competition manager (#869)

* Removal of competition manager

* Removed additional unused methods/files and moved metrics to estimator

* Fix meta data generation

* Make sure pytest is older newer than 4.6

* Unit test fixing

* flake8 fixes in examples

* Fix metadata gen metrics

* Fix dataprocessing get params (#877)

* Fix dataprocessing get params

* Add clone-test to regression pipeline

* Allow 1-D threshold binary predictions (#879)

* multioutput_regression

* multioutput_regression

* multioutput_regression

* multioutput_regression

* multioutput_regression

* multioutput_regression

* multioutput regression

* multioutput regression

* multioutput regression

* multioutput regression

* multi_output regression v1

* fix y_shape in multioutput regression

* fix xy_data_manager change due to merge

* fix single output regression not working

* regression need no _enusre_prediction_array_size_prediction_array_sizess

* Add prediction with models trained with cross-validation (#864)

* add the possibility to predict with cross-validation

* fix unit tests

* test new feature, too

* multioutput_regression

* multioutput_regression

* multioutput_regression

* Removal of competition manager (#869)

* Removal of competition manager

* Removed additional unused methods/files and moved metrics to estimator

* Fix meta data generation

* Make sure pytest is older newer than 4.6

* Unit test fixing

* flake8 fixes in examples

* Fix metadata gen metrics

* multioutput after rebased to 0.7.0

Problem:

Cause:

Solution:

* Regressor target y shape index out of range

* Revision for make tester

* Revision: Cancel Multiclass-MultiOutput

* Resolve automl.py metrics(__init__) reg_gb reg_svm

* Fix Flake8 errors

* Fix automl.py flake8

* Preprocess w/ multiout reg, automl self._n_outputs

* test_estimator.py changed back

* cancel multioutput multiclass for multi reg

* Fix automl self._n_output update placement

* fix flake8

* Kernel PCA cancelled multiout reg

* Kernel PCA test skip python <3.8

* Add test unit for multioutput reg and fix.

* Fix flake8 error

* Kernel PCA multioutput regression

* default kernel to cosine, dodge sklearn=0.22 error

* Kernel PCA should be updated to 0.23

* Kernel PCA uses rbf kernel

* Kernel Pca

* Modify labels in reg, class, perpro in examples

* Kernel PCA

* Add missing supports to mincoal and truncateSVD

Co-authored-by: Matthias Feurer <[email protected]>
Co-authored-by: chico <[email protected]>
Co-authored-by: Francisco Rivera Valverde <[email protected]>
Co-authored-by: Xiaodong DENG <[email protected]>
franchuterivera added a commit to franchuterivera/auto-sklearn that referenced this pull request Aug 21, 2020
* Do not read predictions in memory, only after score

* Precision support for string/int
franchuterivera added a commit to franchuterivera/auto-sklearn that referenced this pull request Aug 21, 2020
3 participants