Add MLFLOW_FLATTEN_PARAMS support in MLflowCallback #17148

orieg · 2022-05-09T22:47:50Z

What does this PR do?

This PR adds support for the environment variable MLFLOW_FLATTEN_PARAMS.

When a first-level parameter holds a dictionary value, it is logged to MLflow as a string. Currently, the callback skips that parameter when the string exceeds 250 characters. This is often the case with task_specific_params, which can end up being a long string.

The current warning message looks like this:

Trainer is attempting to log a value of "{'summarization': {'length_penalty': 1.0, 'max_length': 128, 'min_length': 12, 'num_beams': 4}, 'summarization_cnn': {'length_penalty': 2.0, 'max_length': 142, 'min_length': 56, 'num_beams': 4}, 'summarization_xsum': {'length_penalty': 1.0, 'max_length': 62, 'min_length': 11, 'num_beams': 6}}" for key "task_specific_params" as a parameter. MLflow's log_param() only accepts values no longer than 250 characters so we dropped this attribute.

With this PR, the warning message is updated to:

Trainer is attempting to log a value of "{'summarization': {'length_penalty': 1.0, 'max_length': 128, 'min_length': 12, 'num_beams': 4}, 'summarization_cnn': {'length_penalty': 2.0, 'max_length': 142, 'min_length': 56, 'num_beams': 4}, 'summarization_xsum': {'length_penalty': 1.0, 'max_length': 62, 'min_length': 11, 'num_beams': 6}}" for key "task_specific_params" as a parameter. MLflow's log_param() only accepts values no longer than 250 characters so we dropped this attribute. You can use MLFLOW_FLATTEN_PARAMS environment variable to flatten the parameters and avoid this message.

When a user sets the environment variable with os.environ['MLFLOW_FLATTEN_PARAMS'] = "True", the nested parameters are flattened before being sent to MLflow and logged as follows:

task_specific_params.summarization.length_penalty	1.0
task_specific_params.summarization.max_length	128
task_specific_params.summarization.min_length	12
task_specific_params.summarization.num_beams	4
task_specific_params.summarization_cnn.length_penalty	2.0
task_specific_params.summarization_cnn.max_length	142
task_specific_params.summarization_cnn.min_length	56
task_specific_params.summarization_cnn.num_beams	4
task_specific_params.summarization_xsum.length_penalty	1.0
task_specific_params.summarization_xsum.max_length	62
task_specific_params.summarization_xsum.min_length	11
task_specific_params.summarization_xsum.num_beams	6
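
The flattening described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the name `flatten_dict_sketch` and the constant `MAX_PARAM_LENGTH` are hypothetical, and the real `flatten_dict` helper may differ in signature and edge-case handling. The sketch shows why the nested value is dropped (its string form exceeds 250 characters) and how dotted keys keep every individual value short.

```python
from collections.abc import Mapping

# Limit quoted in the warning message above (hypothetical constant name).
MAX_PARAM_LENGTH = 250


def flatten_dict_sketch(d, parent_key="", delimiter="."):
    """Flatten nested mappings into one level, joining keys with `delimiter`."""
    flat = {}
    for key, value in d.items():
        # Keys are coerced to str so that non-string keys can still be joined.
        full_key = f"{parent_key}{delimiter}{key}" if parent_key else str(key)
        if isinstance(value, Mapping) and value:
            # Recurse into non-empty nested mappings.
            flat.update(flatten_dict_sketch(value, full_key, delimiter))
        else:
            flat[full_key] = value
    return flat


params = {
    "task_specific_params": {
        "summarization": {"length_penalty": 1.0, "max_length": 128, "min_length": 12, "num_beams": 4},
        "summarization_cnn": {"length_penalty": 2.0, "max_length": 142, "min_length": 56, "num_beams": 4},
        "summarization_xsum": {"length_penalty": 1.0, "max_length": 62, "min_length": 11, "num_beams": 6},
    }
}

# Logged as a single string, the nested value is over the limit and would be dropped.
too_long = len(str(params["task_specific_params"])) > MAX_PARAM_LENGTH

# After flattening, each entry is a short scalar under a dotted key.
flat_params = flatten_dict_sketch(params)
for name, value in flat_params.items():
    print(name, value)
```

Each flattened entry is a scalar, so no single value comes close to the length limit, which is why the flattened form avoids the warning.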

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@sgugger I added a flatten_dict function in .utils/py_utils.py as it didn't seem right to add this in integrations.py.

HuggingFaceDocBuilderDev · 2022-05-09T23:03:30Z

The documentation is not available anymore as the PR was closed or merged.

orieg · 2022-05-09T23:43:01Z

@sgugger the script utils/check_inits.py used for CI has a bug. Line 252:

submodule = short_path.replace(os.path.sep, ".").replace(".py", "")

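This causes files whose names start with "py" to be misnamed, because the path separator before the file name has already become "." and the leading ".py" is then stripped along with the extension. For example, utils/py_utils.py is interpreted as utils_utils.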

A better replace() usage may be:

submodule = short_path.replace(".py", "").replace(os.path.sep, ".")
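
A small reproduction of the two orderings, for illustration only. The variable names `buggy`, `fixed`, and `suffix_only` are hypothetical, and the `os.path.splitext` variant is an alternative not proposed in this PR; it strips only the trailing extension rather than every ".py" occurrence.

```python
import os

# Path relative to the package root, as in the example above.
short_path = os.path.join("utils", "py_utils.py")

# Original order: separators become "." first, so ".py_utils" also loses ".py".
buggy = short_path.replace(os.path.sep, ".").replace(".py", "")

# Proposed order: remove ".py" first, then convert separators.
fixed = short_path.replace(".py", "").replace(os.path.sep, ".")

# Alternative (assumption, not part of the PR): strip only the file extension.
suffix_only = os.path.splitext(short_path)[0].replace(os.path.sep, ".")

print(buggy)        # utils_utils
print(fixed)        # utils.py_utils
print(suffix_only)  # utils.py_utils
```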

sgugger

Thanks a lot for your PR! Could you move the new flatten_dict function to the utils.generic module? Also could you add a test for it?

src/transformers/utils/py_utils.py

sgugger

Thanks for addressing the comments!

tests/utils/test_generics.py

utils/check_inits.py

utils/tests_fetcher.py

sgugger · 2022-05-10T17:19:47Z

Looks like you messed up the rebase a little bit and there are now commits in this PR that shouldn't be here.

orieg · 2022-05-10T17:21:20Z

Yep. I messed up that rebase...

sgugger · 2022-05-10T18:29:22Z

Thanks again!

* add support for MLFLOW_FLATTEN_PARAMS
* ensure key is str
* fix style and update warning msg
* Empty commit to trigger CI
* fix bug in check_inits.py
* add unittest for flatten_dict utils
* fix 'NoneType' object is not callable on __del__
* add generic flatten_dict unittest to SPECIAL_MODULE_TO_TEST_MAP
* fix style

sgugger reviewed May 10, 2022

sgugger approved these changes May 10, 2022

sgugger reviewed May 10, 2022

orieg added 10 commits May 10, 2022 10:25

add support for MLFLOW_FLATTEN_PARAMS

670e92d

ensure key is str

4671662

fix style and update warning msg

337ac5d

Empty commit to trigger CI

8bf593d

fix bug in check_inits.py

8ff4afc

add unittest for flatten_dict utils

7c1939c

fix 'NoneType' object is not callable on __del__

779dc4e

add generic flatten_dict unittest to SPECIAL_MODULE_TO_TEST_MAP

a00bcd8

fix style

c2aab41

Merge branch 'huggingface:main' into mlflowcallback-nested-params-sup…

04b77cc

…port

sgugger merged commit e99f0ef into huggingface:main May 10, 2022
