Closed
Version / Environment
Ubuntu 18.04
Python 3.6.8
nlp 0.4.0
Description
Loading the imdb dataset works fine when I don't specify a download_config argument. When I create a custom DownloadConfig object and pass it to the nlp.load_dataset function, loading fails with an IsADirectoryError.
How to reproduce
Example without DownloadConfig --> works
import os
os.environ["HF_HOME"] = "/data/hf-test-without-dl-config-01/"
import logging
import nlp
logging.basicConfig(level=logging.INFO)
if __name__ == "__main__":
    imdb = nlp.load_dataset(path="imdb")
Example with DownloadConfig --> doesn't work
import os
os.environ["HF_HOME"] = "/data/hf-test-with-dl-config-01/"
import logging
import nlp
from nlp.utils import DownloadConfig
logging.basicConfig(level=logging.INFO)
if __name__ == "__main__":
    download_config = DownloadConfig()
    imdb = nlp.load_dataset(path="imdb", download_config=download_config)
Error traceback:
Traceback (most recent call last):
  File "/.../example_with_dl_config.py", line 13, in <module>
    imdb = nlp.load_dataset(path="imdb", download_config=download_config)
  File "/.../python3.6/python3.6/site-packages/nlp/load.py", line 549, in load_dataset
    download_config=download_config, download_mode=download_mode, ignore_verifications=ignore_verifications,
  File "/.../python3.6/python3.6/site-packages/nlp/builder.py", line 463, in download_and_prepare
    dl_manager=dl_manager, verify_infos=verify_infos, **download_and_prepare_kwargs
  File "/.../python3.6/python3.6/site-packages/nlp/builder.py", line 518, in _download_and_prepare
    split_generators = self._split_generators(dl_manager, **split_generators_kwargs)
  File "/.../python3.6/python3.6/site-packages/nlp/datasets/imdb/76cdbd7249ea3548c928bbf304258dab44d09cd3638d9da8d42480d1d1be3743/imdb.py", line 86, in _split_generators
    arch_path = dl_manager.download_and_extract(_DOWNLOAD_URL)
  File "/.../python3.6/python3.6/site-packages/nlp/utils/download_manager.py", line 220, in download_and_extract
    return self.extract(self.download(url_or_urls))
  File "/.../python3.6/python3.6/site-packages/nlp/utils/download_manager.py", line 158, in download
    self._record_sizes_checksums(url_or_urls, downloaded_path_or_paths)
  File "/.../python3.6/python3.6/site-packages/nlp/utils/download_manager.py", line 108, in _record_sizes_checksums
    self._recorded_sizes_checksums[url] = get_size_checksum_dict(path)
  File "/.../python3.6/python3.6/site-packages/nlp/utils/info_utils.py", line 79, in get_size_checksum_dict
    with open(path, "rb") as f:
IsADirectoryError: [Errno 21] Is a directory: '/data/hf-test-with-dl-config-01/datasets/extracted/b6802c5b61824b2c1f7dbf7cda6696b5f2e22214e18d171ce1ed3be90c931ce5'
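The last frame of the traceback shows get_size_checksum_dict calling open(path, "rb") on a path under datasets/extracted/, which is a directory rather than a file. The failure mode itself can be shown in isolation with the standard library only. This is a rough sketch, not the library's code: size_and_checksum and reproduce are hypothetical names, and the helper's body is an assumption about what a size-and-checksum routine typically does.

```python
import hashlib
import os
import tempfile


def size_and_checksum(path):
    """Hypothetical stand-in for a helper that sizes and hashes a single file."""
    sha256 = hashlib.sha256()
    # open() on a directory raises an OSError subclass
    # (IsADirectoryError on Linux), which is what the traceback reports.
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha256.update(chunk)
    return {"num_bytes": os.path.getsize(path), "checksum": sha256.hexdigest()}


def reproduce():
    """Pass a directory where a file path is expected; return the error name."""
    with tempfile.TemporaryDirectory() as extracted_dir:
        try:
            size_and_checksum(extracted_dir)
        except OSError as err:
            return type(err).__name__
    return None


if __name__ == "__main__":
    print(reproduce())
```

If this is what happens inside nlp, it would suggest that with the custom DownloadConfig the download step hands back an already-extracted directory instead of the downloaded archive file, but that is a guess from the traceback and not something confirmed here.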