Unrecognized "cost_for_crash" keyword to sklearn resampling strategies

I am trying to use StratifiedKFold (and also RepeatedStratifiedKFold) as my resampling strategy, but with either one every run crashes.

Here is a sample script based on the cancer dataset in the documentation:

```
import autosklearn.classification
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold, train_test_split

# Load the breast cancer dataset and hold out a test split.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)

clf = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=30,
    per_run_time_limit=10,
    n_jobs=1,
    ml_memory_limit=2**13,
    seed=5,
    # Pass the sklearn splitter class itself; its constructor arguments go below.
    resampling_strategy=StratifiedKFold,
    resampling_strategy_arguments={'n_splits': 5, 'shuffle': True, 'random_state': 0},
    # Keep the tmp and output folders so the logs can be inspected afterwards.
    delete_output_folder_after_terminate=False,
    delete_tmp_folder_after_terminate=False,
    tmp_folder='./tmp/',
    output_folder='./output/'
)
clf.fit(X_train, y_train)

print(clf.sprint_statistics())
```

The result is:
```
/home/nam4/local/anaconda2/envs/automl/lib/python3.7/site-packages/pyparsing.py:3190: FutureWarning: Possible set intersection at position 3
  self.re = re.compile(self.reString)
auto-sklearn results:
  Dataset name: b1863778bbca963da927ae292545f722
  Metric: accuracy
  Number of target algorithm runs: 29
  Number of successful target algorithm runs: 0
  Number of crashed target algorithm runs: 29
  Number of target algorithms that exceeded the time limit: 0
  Number of target algorithms that exceeded the memory limit: 0
```

The output indicates that all runs crashed; changing the memory or time limits had no effect. The logs in the tmp/ folder suggest that a keyword called "cost_for_crash" is being passed to the resampling strategy's constructor, which does not recognize it. For example, my tmp/AutoML(5):b1863778bbca963da927ae292545f722.log file contains entries like:

```
[DEBUG] [2020-07-15 11:04:51,506:AutoMLSMBO(5)::b1863778bbca963da927ae292545f722] Return: Status: <StatusType.CRASHED: 3>, cost: 1.000000, time: 0.023193, additional: {'traceback': 'Traceback (most recent call last):\n  File "/home/nam4/local/anaconda2/envs/automl/lib/python3.7/site-packages/autosklearn/evaluation/__init__.py", line 29, in fit_predict_try_except_decorator\n    return ta(queue=queue, **kwargs)\n  File "/home/nam4/local/anaconda2/envs/automl/lib/python3.7/site-packages/autosklearn/evaluation/train_evaluator.py", line 1236, in eval_cv\n    budget_type=budget_type,\n  File "/home/nam4/local/anaconda2/envs/automl/lib/python3.7/site-packages/autosklearn/evaluation/train_evaluator.py", line 179, in __init__\n    self.splitter = self.get_splitter(self.datamanager)\n  File "/home/nam4/local/anaconda2/envs/automl/lib/python3.7/site-packages/autosklearn/evaluation/train_evaluator.py", line 951, in get_splitter\n    cv = copy.deepcopy(self.resampling_strategy)(**init_dict)\nTypeError: __init__() got an unexpected keyword argument \'cost_for_crash\'\n', 'error': 'TypeError("__init__() got an unexpected keyword argument \'cost_for_crash\'")', 'configuration_origin': 'Initial design'}
[INFO] [2020-07-15 11:04:51,508:smac.intensification.intensification.Intensifier] Wallclock time limit for intensification reached (used: 0.174835 sec, available: 0.000010 sec)
[INFO] [2020-07-15 11:04:51,508:smac.intensification.intensification.Intensifier] Wallclock time limit for intensification reached (used: 0.174835 sec, available: 0.000010 sec)
```
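To illustrate what the traceback appears to show, here is a minimal sketch of the failure and of one possible workaround. It does not use auto-sklearn or scikit-learn: `StandInSplitter` is a hypothetical stand-in class whose constructor takes the same three parameters I pass to StratifiedKFold, `filter_init_kwargs` is a helper I made up for illustration, and the value `1.0` for `cost_for_crash` is an assumption. I have not checked how auto-sklearn actually builds the dictionary it passes to the splitter.

```python
import inspect


class StandInSplitter:
    """Hypothetical stand-in with StratifiedKFold-like constructor parameters."""

    def __init__(self, n_splits=5, shuffle=False, random_state=None):
        self.n_splits = n_splits
        self.shuffle = shuffle
        self.random_state = random_state


def filter_init_kwargs(cls, kwargs):
    """Keep only the keys that cls.__init__ accepts by name (hypothetical helper)."""
    accepted = inspect.signature(cls.__init__).parameters
    return {key: value for key, value in kwargs.items() if key in accepted}


# Illustrative dictionary: my resampling arguments plus the extra key named
# in the traceback. The value 1.0 is an assumption.
init_dict = {"n_splits": 5, "shuffle": True, "random_state": 0, "cost_for_crash": 1.0}

# Passing the unfiltered dictionary raises the same kind of TypeError as in the log.
try:
    StandInSplitter(**init_dict)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Dropping the keys the constructor does not know about avoids the error.
cv = StandInSplitter(**filter_init_kwargs(StandInSplitter, init_dict))
print(error_message)
print(cv.n_splits, cv.shuffle, cv.random_state)
```

A filter like this would also drop keys for a class that only accepts them through `**kwargs`, so it is meant as a way to reproduce and reason about the error rather than as a proposed fix.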
