[logging] Add centralized logging - Bump-up cache loads to warnings #538
Conversation
Nice addition !
```diff
@@ -1 +1 @@
{"write_array2d": 0.07093274600629229, "read_unformated after write_array2d": 0.03530075500020757, "read_formatted_as_numpy after write_array2d": 0.10929270699853078, "read_batch_unformated after write_array2d": 0.03727920600795187, "read_batch_formatted_as_numpy after write_array2d": 0.018853643006877974, "read_col_unformated after write_array2d": 0.05644163000397384, "read_col_formatted_as_numpy after write_array2d": 0.011610292000113986, "write_nested_sequence": 1.6535991109994939, "read_unformated after write_nested_sequence": 0.3739209540071897, "read_formatted_as_numpy after write_nested_sequence": 0.40762836500653066, "read_batch_unformated after write_nested_sequence": 0.3337586460111197, "read_batch_formatted_as_numpy after write_nested_sequence": 0.054717567007173784, "read_col_unformated after write_nested_sequence": 0.3173944180016406, "read_col_formatted_as_numpy after write_nested_sequence": 0.004956340009812266, "write_flattened_sequence": 1.4975415869994322, "read_unformated after write_flattened_sequence": 0.26713552299770527, "read_formatted_as_numpy after write_flattened_sequence": 0.07673935199272819, "read_batch_unformated after write_flattened_sequence": 0.25450974798877724, "read_batch_formatted_as_numpy after write_flattened_sequence": 0.009374254994327202, "read_col_unformated after write_flattened_sequence": 0.25912448299641255, "read_col_formatted_as_numpy after write_flattened_sequence": 0.004277604995877482}
```
are these files supposed to be part of the PR ?
we don't care that much I guess but let me remove them indeed
src/nlp/arrow_dataset.py (Outdated)
```diff
 if os.path.exists(indices_cache_file_name) and load_from_cache_file:
     if verbose:
-        logger.info("Loading cached shuffled indices for dataset at %s", indices_cache_file_name)
+        logger.warn("Loading cached shuffled indices for dataset at %s", indices_cache_file_name)
```
use logger.warning instead ? iirc warn is deprecated
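The reviewer is right: in the standard library, `Logger.warn` is a deprecated alias of `Logger.warning` that emits a `DeprecationWarning` when called. A quick stdlib-only check (the logger name and message here are illustrative, not from the PR):

```python
import logging
import warnings

logger = logging.getLogger("demo")
logger.addHandler(logging.NullHandler())  # avoid "no handler" noise on stderr

# Logger.warn is kept only for backwards compatibility; calling it
# emits a DeprecationWarning before delegating to Logger.warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    logger.warn("old spelling")

deprecated = any(issubclass(w.category, DeprecationWarning) for w in caught)

# Logger.warning is the supported method, with the same %-style signature
# used throughout this PR.
logger.warning("Loading cached shuffled indices for dataset at %s", "/path/to/cache")
```

Since the two methods share a signature, the fix is a pure rename.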
src/nlp/arrow_dataset.py
Outdated
| train_indices_cache_file_name, | ||
| test_indices_cache_file_name, | ||
| ) | ||
| logger.warn( |
same here
src/nlp/arrow_dataset.py (Outdated)
```diff
 if os.path.exists(indices_cache_file_name) and load_from_cache_file:
     if verbose:
-        logger.info("Loading cached sorted indices for dataset at %s", indices_cache_file_name)
+        logger.warn("Loading cached sorted indices for dataset at %s", indices_cache_file_name)
```
same here
src/nlp/arrow_dataset.py (Outdated)
```diff
 if os.path.exists(cache_file_name) and load_from_cache_file:
     if verbose:
-        logger.info("Loading cached processed dataset at %s", cache_file_name)
+        logger.warn("Loading cached processed dataset at %s", cache_file_name)
```
same here
src/nlp/utils/logging.py (Outdated)
```diff
+def enable_propagation() -> None:
+    """Enable propagation of the library log outputs.
+    Please disable the HuggingFace Transformers's default handler to prevent double logging if the root logger has
```
what would be the issue with transformers exactly ?
copy-paste error
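The docstring's warning about double logging is a general `logging` behavior, independent of which library's name ends up in the text. A stdlib-only sketch (the logger name `mylib` and the list-collecting handler are illustrative, not the actual `nlp` code): if the library logger has its own handler and propagation to the root logger is enabled, each record is emitted twice.

```python
import logging

records = []

class ListHandler(logging.Handler):
    """Collect emitted messages in a list so the duplication is visible."""
    def emit(self, record):
        records.append(record.getMessage())

# Root logger with a handler, as an application typically configures.
root = logging.getLogger()
root.addHandler(ListHandler())

# Library logger with its own default handler.
lib_logger = logging.getLogger("mylib")
lib_logger.addHandler(ListHandler())
lib_logger.setLevel(logging.INFO)

# Roughly what an enable_propagation() helper would flip on.
lib_logger.propagate = True
lib_logger.info("hello")

# The record hits the library handler, then propagates to the root handler,
# so it is logged twice unless the library's default handler is removed.
print(records.count("hello"))  # -> 2
```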
Add a `nlp.logging` module to set the global logging level easily. The verbosity level also controls the tqdm bars (disabled when the level is set higher than INFO). You can use:

And use the levels:
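A minimal sketch of what such a centralized verbosity helper can look like, built only on the standard library `logging` module; the function names (`set_verbosity`, `get_verbosity`) and the `"nlp"` logger name are assumptions based on the description above, so see the PR diff for the real API:

```python
import logging

# Hypothetical sketch of a centralized logging module for a library
# whose loggers all live under the "nlp" namespace.
_LIB_ROOT = "nlp"

def _get_library_root_logger() -> logging.Logger:
    # All loggers created via logging.getLogger("nlp.<submodule>")
    # inherit their effective level from this root.
    return logging.getLogger(_LIB_ROOT)

def set_verbosity(level: int) -> None:
    """Set the verbosity level for the whole library at once."""
    _get_library_root_logger().setLevel(level)

def get_verbosity() -> int:
    return _get_library_root_logger().getEffectiveLevel()

# Usage: keep only warnings and above (the level this PR bumps
# cache-load messages to, so they stay visible by default).
set_verbosity(logging.WARNING)
```

Because child loggers inherit the root's level, a single `setLevel` call controls every module in the library.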