Description
I have a dataset X_train (a pandas.DataFrame) whose column 'store_type_pdist' has dtype 'category', with numerical values like 0, 1, 2.
When running code like:
automl = AutoML(mode="Perform")
automl.fit(X_train, y_train);
I get the following error. Please help me resolve it, thanks.
## Error for 3_Default_CatBoost
features data: pandas.DataFrame column 'store_type_pdist' has dtype 'category' but is not in cat_features list
Traceback (most recent call last):
File "/home/user/workspace/aitiaexplorer/uplift/automl/lib/python3.8/site-packages/supervised/base_automl.py", line 1074, in _fit
trained = self.train_model(params)
File "/home/user/workspace/aitiaexplorer/uplift/automl/lib/python3.8/site-packages/supervised/base_automl.py", line 363, in train_model
self.keep_model(mf, model_subpath)
File "/home/user/workspace/aitiaexplorer/uplift/automl/lib/python3.8/site-packages/supervised/base_automl.py", line 262, in keep_model
self._base_predict(self._one_sample, model)
File "/home/user/workspace/aitiaexplorer/uplift/automl/lib/python3.8/site-packages/supervised/base_automl.py", line 1265, in _base_predict
predictions = model.predict(X)
File "/home/user/workspace/aitiaexplorer/uplift/automl/lib/python3.8/site-packages/supervised/model_framework.py", line 387, in predict
y_p = learner.predict(X_data)
File "/home/user/workspace/aitiaexplorer/uplift/automl/lib/python3.8/site-packages/supervised/algorithms/catboost.py", line 275, in predict
return self.model.predict(X, ntree_end=self.best_ntree_limit)
File "/home/user/workspace/aitiaexplorer/uplift/automl/lib/python3.8/site-packages/catboost/core.py", line 4894, in predict
return self._predict(data, prediction_type, ntree_start, ntree_end, thread_count, verbose, 'predict')
File "/home/user/workspace/aitiaexplorer/uplift/automl/lib/python3.8/site-packages/catboost/core.py", line 1978, in _predict
data, data_is_single_object = self._process_predict_input_data(data, parent_method_name, thread_count)
File "/home/user/workspace/aitiaexplorer/uplift/automl/lib/python3.8/site-packages/catboost/core.py", line 1958, in _process_predict_input_data
data = Pool(
File "/home/user/workspace/aitiaexplorer/uplift/automl/lib/python3.8/site-packages/catboost/core.py", line 455, in __init__
self._init(data, label, cat_features, text_features, embedding_features, pairs, weight, group_id, group_weight, subgroup_id, pairs_weight, baseline, feature_names, thread_count)
File "/home/user/workspace/aitiaexplorer/uplift/automl/lib/python3.8/site-packages/catboost/core.py", line 966, in _init
self._init_pool(data, label, cat_features, text_features, embedding_features, pairs, weight, group_id, group_weight, subgroup_id, pairs_weight, baseline, feature_names, thread_count)
File "_catboost.pyx", line 3550, in _catboost._PoolBase._init_pool
File "_catboost.pyx", line 3597, in _catboost._PoolBase._init_pool
File "_catboost.pyx", line 3438, in _catboost._PoolBase._init_features_order_layout_pool
File "_catboost.pyx", line 2433, in _catboost._set_features_order_data_pd_data_frame
_catboost.CatBoostError: features data: pandas.DataFrame column 'store_type_pdist' has dtype 'category' but is not in cat_features list
Please set a GitHub issue with above error message at: https://github.com/mljar/mljar-supervised/issues/new
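A possible workaround, sketched under assumptions: the traceback shows CatBoost rejecting a 'category' column that was not listed in cat_features, so casting such columns to a plain dtype before calling fit may sidestep the check. The helper name cast_category_columns and the sample data below are hypothetical, and this has not been verified against mljar-supervised 0.10.3. Note also that casting to integers makes the model treat the column as numeric rather than categorical, which may or may not be what is wanted.

```python
import pandas as pd


def cast_category_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with every 'category' column cast to a plain dtype.

    Hypothetical helper, not part of mljar-supervised or CatBoost.
    """
    out = df.copy()
    for col in out.select_dtypes(include="category").columns:
        try:
            # Categories such as 0, 1, 2 become an ordinary numeric column.
            out[col] = pd.to_numeric(out[col].astype(object))
        except (ValueError, TypeError):
            # Non-numeric categories fall back to strings
            # (missing values would become the text "nan").
            out[col] = out[col].astype(str)
    return out


# Hypothetical stand-in for the X_train from the report.
X_train = pd.DataFrame(
    {
        "store_type_pdist": pd.Categorical([0, 1, 2, 1]),
        "sales": [10.5, 3.2, 7.7, 1.0],
    }
)
X_fixed = cast_category_columns(X_train)
print(X_fixed.dtypes)
```

With the real data, the cast frame would then be passed on as automl.fit(cast_category_columns(X_train), y_train). The same cast would need to be applied to any data later given to predict.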
Software versions:
Package Version
alembic 1.5.8
attrs 20.3.0
backcall 0.2.0
catboost 0.24.4
category-encoders 2.2.2
cliff 3.7.0
cloudpickle 1.3.0
cmaes 0.8.2
cmd2 1.5.0
colorama 0.4.4
colorlog 5.0.1
colour 0.1.5
cycler 0.10.0
decorator 5.0.7
dill 0.3.3
dtreeviz 1.0
graphviz 0.16
greenlet 1.0.0
iniconfig 1.1.1
ipykernel 5.5.3
ipython 7.22.0
ipython-genutils 0.2.0
jedi 0.18.0
joblib 1.0.1
jupyter-client 6.2.0
jupyter-core 4.7.1
kiwisolver 1.3.1
lightgbm 3.0.0
llvmlite 0.36.0
Mako 1.1.4
MarkupSafe 1.1.1
matplotlib 3.4.1
mljar-supervised 0.10.3
nest-asyncio 1.5.1
numba 0.53.1
numpy 1.19.5
optuna 2.6.0
packaging 20.9
pandas 1.2.0
parso 0.8.2
patsy 0.5.1
pbr 5.5.1
pexpect 4.8.0
pickleshare 0.7.5
Pillow 8.2.0
pip 21.0.1
plotly 4.14.3
pluggy 0.13.1
prettytable 2.1.0
prompt-toolkit 3.0.18
ptyprocess 0.7.0
py 1.10.0
pyarrow 3.0.0
pyfunctional 1.4.3
Pygments 2.8.1
pyparsing 2.4.7
pyperclip 1.8.2
pytest 6.2.3
python-dateutil 2.8.1
python-editor 1.0.4
pytz 2021.1
PyYAML 5.4.1
pyzmq 22.0.3
retrying 1.3.3
scikit-learn 0.24.1
scipy 1.6.1
seaborn 0.10.1
setuptools 47.1.0
shap 0.36.0
six 1.15.0
slicer 0.0.7
SQLAlchemy 1.4.11
statsmodels 0.12.2
stevedore 3.3.0
tabulate 0.8.7
threadpoolctl 2.1.0
toml 0.10.2
tornado 6.1
tqdm 4.60.0
traitlets 5.0.5
wcwidth 0.2.5
wordcloud 1.8.1
xgboost 1.3.3