Release Auto-sklearn 0.7 #847

mfeurer · 2020-05-07T16:01:02Z

No description provided.

* working version of the nested pipeline
* first moves in the direction of a column transformer autosklearn pipeline
* a working pipeline
* working and tested pipeline
* automl in progress
* mod gitignore
* more work on automl
* more work on automl
* more work on automl
* more work on automl
* more work on automl
* automl seems to be working
* Removed some unnecessary testing files
* Added some docstrings
* merged CategoryShift with CategoricalImputation to get a cleaner solution on the categorical data preprocessing pipeline
* doc string corrections
* fixed some unittests
* Unmerged category shift and categorical imputation
* Implemented some of Matthias' comments
* corrected some unit tests
* added a CategoryShift implementation
* Added an OHE implementation for sparse datasets. Fixed a couple of OHE tests
* Code for the minority coalescer choice
* OHE now returns only sparse matrices (keeping the original behavior)
* Corrected some OHE unit tests
* Use the new preprocessing pipeline inside the SimpleRegressionPipeline
* fixed some unit tests
* OHE unit test adjustments
* readded dataset.pkl
* makes sure the input of the feature_type_splitter is dense
* Modifications on the FeatureTypeSplitter code due to a sklearn ColumnTransformer bug (see sklearn issue #15627)
* Added tests for the SparseOneHotEncoder
* added tests for the CategoryShift implementation
* Added tests for the MinorityCoalescer implementation
* Added tests for CategoricalImputation
* small test adjustments
* category_shift.transform(X) now works on a copy of X
* fixed unittest
* metalearning test fixed
* metalearning test fixed
* updated all metalearning configuration.csv tables
* use of more convenient names
* cleaned last dependencies on the old 1HE
* renaming
* small fixes on test_metalearning_features
* removed the utils.datapreprocessing and corrected some unit tests
* PEP8
* OneHotEncoder now uses handle_unknown='ignore'
* PEP8
* PEP8
* added some new unit tests
* PEP8
* added missing __init__ file
* PEP8
* added tests for data_preprocessing_numerical
* added unit tests for data_preprocessing.py
* added unit tests for data_preprocessing
* PEP8
* corrected fit and transform behavior in the MinorityCoalescer implementation
* removed method fit_transformer from NumericalPreprocessingPipeline and CategoricalPreprocessingPipeline
* minor modifications suggested in @mfeurer's PR review
* minor modifications suggested by @mfeurer in his PR review
* small code simplification in DataPreprocessor
* more modifications suggested by @mfeurer in his PR review
* more modifications suggested by @mfeurer in his PR review
* PEP8 fixes
* Improvement on PreprocessingPipelineTest
* PEP8 fixes
* making sure new components return the correct data type
* fix unit test test_pca_95percent

…#739 (#762)
* PEP8 (#718)
* remove warning "No models better than random"
* warning when other models not better than dummy
* Add model number details in warning

Co-authored-by: Matthias Feurer <[email protected]>

* initial commit
* fix some unittests
* Edit configurations.csv to reflect the new value of the hyperparameter max_bins in the GradientBoostingClassifier. The by-the-book approach would be to generate new metalearning data, but the impact of not doing so, in this case, should be very small
* fixed some more unittests
* fixed some more unit tests
* changed assert values of the unit test of the extra_tree_regression feature preprocessing
* small variable renaming
* trying to fix the CI issue that makes all tests be executed before the examples
* Revert "trying to fix the CI issue that makes all tests be executed before the examples". This reverts commit 0bb7f2a.
* try again to fix the CI issue
* try again to fix the CI issue
* trying to fix CI issue
* corrects typo

* initial commit
* fix some unittests
* Edit configurations.csv to reflect the new value of the hyperparameter max_bins in the GradientBoostingClassifier. The by-the-book approach would be to generate new metalearning data, but the impact of not doing so, in this case, should be very small
* fixed some more unittests
* fixed some more unit tests
* changed assert values of the unit test of the extra_tree_regression feature preprocessing
* small variable renaming
* trying to fix the CI issue that makes all tests be executed before the examples
* Revert "trying to fix the CI issue that makes all tests be executed before the examples". This reverts commit 0bb7f2a.
* try again to fix the CI issue
* try again to fix the CI issue
* trying to fix CI issue
* corrects typo
* replaces the deprecated .ix with .loc

…785)
* fixing huge file size when saving model: Issue #421
* PEP8 and some variable renaming
* Bug fixes in the file removal logic
* fixes some unit tests
* Improvements on the model removal logic
* adds flag to activate/deactivate deletion of non-winning models
* unit test fixes
* fix typo
* improved new unit test; improved (again) model deletion logic
* PEP8
* small bug fix on getting the correct ensemble models
* PEP8
* Fix example_parallel_manual_spawning
* PEP8
* delete model files just after training
* delete after predict
* unit test adjustments
* unit test adjustments
* PEP8
* implements some of @mfeurer's PR review comments
* Adds assert to verify that at least one model file has been deleted

Co-authored-by: ml-mhs <As1596309384290136>

* intermediate commit
* add budget + subsample successive halving
* fix bug for holdout-iterative-fit
* dump progress
* update budget evaluator more
* many new unit tests
* add example for SH
* fix cv example
* ADD get_max_iter for all iterative estimators (#798)
* ADD get_max_iter for all iterative estimators. Also, make naive bayes estimators no longer iterative
* Fix unittests
* FIX unittest
* FIX unittest
* FIX unittest
* Make SGD/PassiveAggressive use 1024 steps
* Add get max iter (#797)
* ADD get_max_iter for all iterative estimators. Also, make naive bayes estimators no longer iterative
* Fix unittests
* FIX unittest
* FIX unittest
* FIX unittest
* Make SGD/PassiveAggressive use 1024 steps
* combine evaluator with budget retrieval
* update
* update docstring, revert unnecessary changes
* fix test
* fix bug in test evaluator

Co-authored-by: Katharina Eggensperger <[email protected]>

* rename ensemble_nbest
* check whether max_keep_best is float or integer
* minor fixes
* minor
* ADD unittest
* ADD threshold on performance range; rename variable
* flake8
* flake8
* skip tests not working for python 3.5
* fix unittests
* Now correctly skip unittests for Python 3.5
* consider mfeurer's comments
* fix
* Update ensemble_builder.py
* flake8

Co-authored-by: Matthias Feurer <[email protected]>

…ons (#807)
* Add deletion of model files
* Add deletion of test and validation files
* remove test that verifies the deletion validation files
* PEP8
* Reverses logic: loop through the directory instead of list of candidates
* PEP8
* Correct error message
* data structure changes
* Improve readability
* rewrite AbstractEvaluator.file_output() without changing its functionality
* implement locks on AbstractEvaluator.file_output()
* bug fix
* simplify lock naming
* Adapt unittest
* fix AbstractEvaluatorTest
* Add some nosetest byproducts to .gitignore
* Fix FunctionsTest (test_train_evaluator.py)
* Fix a couple of unit tests from TestTrainEvaluator
* PEP8
* delete unnecessary line of code

* Add deletion of model files
* Add deletion of test and validation files
* remove test that verifies the deletion validation files
* PEP8
* Reverses logic: loop through the directory instead of list of candidates
* PEP8
* Correct error message
* data structure changes
* Improve readability
* rewrite AbstractEvaluator.file_output() without changing its functionality
* implement locks on AbstractEvaluator.file_output()
* bug fix
* simplify lock naming
* Adapt unittest
* fix AbstractEvaluatorTest
* Add some nosetest byproducts to .gitignore
* Fix FunctionsTest (test_train_evaluator.py)
* Fix a couple of unit tests from TestTrainEvaluator
* PEP8
* delete unnecessary line of code
* catch an invalid setting
* fix unit tests

Co-authored-by: Gui Miotto <[email protected]>

* make components/feature_preprocessing PEP8 compliant
* make components/data_preprocessing PEP8 compliant
* make components/classification PEP8 compliant
* make components/regression PEP8 compliant
* make pipeline/* PEP8 compliant
* implement PR review requested changes

* Fix race condition in ensemble builder. Due to a recent change, the ensemble builder can load gzipped files with the ending .npy.gz. The glob statement to find these files is `.npy*`, which can also match lock files like `.npy.lock`, which must not be confused with prediction files.
* Update ensemble_builder.py
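The pattern problem described in this commit message can be reproduced in isolation. The following is a minimal sketch, not the actual fix applied in ensemble_builder.py: the file names and the helper `is_prediction_file` are hypothetical, and `fnmatch` is used in place of `glob` so that no files need to exist on disk (both apply the same shell-style wildcard rules).

```python
import fnmatch

# Hypothetical file names, for illustration only.
files = [
    "predictions_1.npy",
    "predictions_2.npy.gz",
    "predictions_3.npy.lock",
]

# A wildcard after ".npy" matches the gzipped file, but also the lock file.
too_broad = fnmatch.filter(files, "*.npy*")


def is_prediction_file(name):
    """Accept only the two known prediction-file endings (hypothetical helper)."""
    return name.endswith((".npy", ".npy.gz"))


# Filtering on explicit endings excludes the lock file.
prediction_files = [name for name in files if is_prediction_file(name)]

print(too_broad)
print(prediction_files)
```

Matching on an explicit list of endings is one possible way to keep lock files out of the result; the commit itself may have solved this differently.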

* Make examples PEP8 compliant
* Make examples autosklearn/ (level 0) PEP8 compliant
* Make autosklearn/evaluation PEP8 compliant
* Make autosklearn/ensembles PEP8 compliant
* Add some noqa justifications
* Remove flake8_diff.sh
* Remove two unnecessary lines

* First version of models in disc
* Added test for max models in disc
* Origin/py 3 8 warns rebase changes (#834)
* Python 3.8 Warnings cleanup
* Flake 8 bug fixing
* Fixing flake errors (Co-authored-by: chico <[email protected]>)
* update PR
* fix replace-all error
* fix unit test
* change sorting and add hidden argument
* update unit tests

Co-authored-by: chico <[email protected]>
Co-authored-by: Matthias Feurer <[email protected]>

gui-miotto and others added 30 commits September 17, 2019 15:32

fix some typos in comments

e1282b1

Fix internal and external links

591c6e3

Add dot

00a49c7

Merge branch 'examples-links-fix' of https://github.com/numberly/auto…

fe67807

…-sklearn into numberly-examples-links-fix

Merge branch 'numberly-examples-links-fix' into development

9551783

Replace the use of the deprecated time.clock. (#778)

9485421

It no longer exists on 3.8, and has been deprecated since 3.3. See e.g. https://docs.python.org/3.5/library/time.html#time.clock
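For context, the Python documentation points to `time.perf_counter()` and `time.process_time()` as replacements for `time.clock()`. The sketch below shows the wall-clock variant; the helper `timed` is hypothetical, and which replacement this commit actually chose is not stated here.

```python
import time


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed seconds). Hypothetical helper."""
    # perf_counter() is a monotonic, high-resolution clock for measuring intervals.
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


result, elapsed = timed(sum, range(1000))
print(result, elapsed)
```

If CPU time rather than elapsed time is wanted, `time.process_time()` can be substituted for both calls.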

Was previously dividing by 10 even though num of cross val is 5. Now …

cbb32a3

…dividing by 5 (#769)

Merge branch 'comments' of https://github.com/gui-miotto/auto-sklearn …

426cd1c

…into gui-miotto-comments

Merge branch 'gui-miotto-comments' into development

0d2be4e

Update .travis.yml

16ca2c5

Replace nosetests by pytest (#793)

8f102d5

* replace nosetests by pytest
* actually use pytest
* change coverage arguments
* debug output
* debug
* debug
* debug
* debug

Bump SMAC version (#791)

7ee603d

* bump SMAC version
* fix that example
* reduce the number of warnings

ADD iterative fit for gradient boosting (#794)

6561d15

* ADD iterative fit for gradient boosting
* increase default n_iter to 2
* replace max_iter with 512 everywhere
* remove restriction on max_iter
* fix

Fix metalearning with same dataset name (#799)

e5206bb

* Fix a bug where re-using a dataset name from the meta-data would crash Auto-sklearn
* add log message
* PEP8

Better budget handling (#804)

f754f90

* add budget to output file names
* fix warning which made debugging incredibly hard
* fix bug in ensemble builder
* do not remove the dummy predictor
* add budgets to refit
* update PR/self-review
* PEP8
* PEP8
* fix sorting issue in ensemble file reading

allow polynomial feature expansion on sparse data

e12182f

reduce number of warnings

4682a7b

add travis environments for python 3.8 (#808)

96f46b8

speed up minority coalescer (#810)

b571351

add new status type converged (#812)

f699b59

* add new status type converged
* pep8
* pep8 and example

allow iterative evaluation and iteration as budgets

b4794d7

allow running with ensemble size == 0 again, allow iterative fit and …

0d432a6

…iterations as budget

fix bug in the train evaluator for mixed budgets

b09e439

mfeurer and others added 24 commits March 31, 2020 21:12

remove unnecessary exception

d82b7e7

fix unit tests

f807cfb

Add iterative cv (#815)

6c875ba

* add iterative-fit-cv
* pep8
* pep8
* update example

Fix passive aggressive (#816)

02480fc

* fix passive aggressive iterative fit
* pep8

fix bugs in iterative cv and fast ica

e1a26bb

700 balanced accuracy (#814)

3bff740

* #700: New sklearn.metrics.balanced_accuracy_score
* Removing the missing pac score

Co-authored-by: chico <[email protected]>
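As background on the metric named in this commit: balanced accuracy, in its default unadjusted form, is the mean of the per-class recalls. The sketch below computes it in plain Python to show the idea; the function `balanced_accuracy` and the sample labels are illustrative assumptions, not code from Auto-sklearn or scikit-learn.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (illustrative re-implementation)."""
    correct = defaultdict(int)  # correctly predicted samples per true class
    total = defaultdict(int)    # number of samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Imbalanced toy labels: four samples of class 0, two of class 1.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]

score = balanced_accuracy(y_true, y_pred)
print(score)
```

Here class 0 has recall 1.0 and class 1 has recall 0.5, so the balanced accuracy is 0.75, while plain accuracy on the same predictions would be 5/6.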

Make autosklearn/util modules PEP8 compliant (#822)

da7a766

* Grant PEP8 compliance for util modules
* Remove __init__.py imports
* PEP8

Update metadata 0.7.0 (#823)

a6b7a3d

* update meta-data
* add missing test files
* fix two more unit tests

Make autosklearn/data and autosklearn/metrics PEP8 compliant (#824)

14d0747

* Make autosklearn/data PEP8 compliant
* Make autosklearn/metrics PEP8 compliant
* PEP8

Fix sort files for ensemble (#821)

1217cfe

* Read gzip; return preds; sort files
* FIX valid/test being ordered differently
* flake8
* consider comments, fix unittests
* don't throw an error, but sleep and continue
* flake8
* FIX unittest: ensemble files start from 1 (and not from 0)
* fix

FIX bug in unit test

56b84ec

The bug was introduced when the test was not updated for commit e12182f, which introduced polynomial feature expansion for sparse data.

Make autosklearn/metalearning PEP8 compliant (#825)

4601c9e

* Make autosklearn/data PEP8 compliant
* Make autosklearn/metrics PEP8 compliant
* Make autosklearn/metalearning PEP8 compliant
* PEP8

fix unit test for meta-data generation (#827)

8a7e564

Make tests PEP8 compliant (#829)

b5275b1

* Make tests PEP8 compliant
* bug fix
* Implement PR review comments
* Implement PR review suggestions
* fix small bug

remove duplicates (#833)

bde939f

Origin/py 3 8 warns rebase changes (#834)

56df73a

* Python 3.8 Warnings cleanup
* Flake 8 bug fixing
* Fixing flake errors

Co-authored-by: chico <[email protected]>

Release note 070 (#842)

3ddb1e5

* First version of 070 release notes
* Missed a bugfix
* Vim added unexpected space -- fix

prepare new release (#846)

60f7b89

mfeurer merged commit bb8396b into master May 7, 2020
