@lhoestq
Member

@lhoestq lhoestq commented Sep 4, 2020

As discussed in #554, we should use a module cache directory outside of the Python packages directory, since we may not have write permissions there.

I added a new HF_MODULES_PATH directory that is added to the Python path when running import nlp.
In this directory, a module nlp_modules is created so that datasets can be added to nlp_modules.datasets and metrics to nlp_modules.metrics. nlp_modules doesn't exist on PyPI.

If someone using cloudpickle still wants the downloaded dataset/metric scripts to live inside the nlp directory, they can set the environment variable HF_MODULES_CACHE to a path inside the nlp lib.
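
For illustration, here is a minimal sketch of the mechanism described above, not the code from this PR. It creates an nlp_modules package with datasets and metrics subpackages in a cache directory, appends that directory to the Python path, and imports a script dropped into it. The helper name init_dynamic_modules, the fallback temp directory, and the my_dataset script are all hypothetical; only HF_MODULES_CACHE and the nlp_modules layout come from the description.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path


def init_dynamic_modules(modules_cache: str, package_name: str = "nlp_modules") -> Path:
    """Hypothetical helper: create `package_name` with `datasets` and `metrics`
    subpackages under `modules_cache`, and make the package importable."""
    package_dir = Path(modules_cache) / package_name
    for sub in ("datasets", "metrics"):
        (package_dir / sub).mkdir(parents=True, exist_ok=True)
        (package_dir / sub / "__init__.py").touch()
    (package_dir / "__init__.py").touch()
    # The cache directory itself goes on the path, so `import nlp_modules` resolves.
    if modules_cache not in sys.path:
        sys.path.append(modules_cache)
    importlib.invalidate_caches()
    return package_dir


# Assumption: fall back to a fresh temp directory when HF_MODULES_CACHE is unset.
modules_cache = os.environ.get("HF_MODULES_CACHE") or tempfile.mkdtemp()
package_dir = init_dynamic_modules(modules_cache)

# Simulate a downloaded dataset script and import it from the cache.
script = package_dir / "datasets" / "my_dataset.py"
script.write_text("NAME = 'my_dataset'\n")
importlib.invalidate_caches()
module = importlib.import_module("nlp_modules.datasets.my_dataset")
print(module.NAME)
```

Because the cache directory is chosen from the environment variable, pointing HF_MODULES_CACHE at a path inside the installed library would place the scripts there instead, which is the cloudpickle use case mentioned above.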

@lhoestq lhoestq requested a review from thomwolf September 4, 2020 16:30
@lhoestq
Member Author

lhoestq commented Sep 4, 2020

All the tests pass on my side. Not sure if it is a cache issue, a pytest issue, or a CircleCI issue.
EDIT: I have the same error on Google Colab. Trying to fix that.

@thomwolf
Member

thomwolf commented Sep 4, 2020

I think I fixed it (sorry, I didn't notice you were on it as well).

Member

@thomwolf thomwolf left a comment

Perfect!

@lhoestq lhoestq merged commit ff272c2 into master Sep 7, 2020
@lhoestq lhoestq deleted the add-modules-cache branch September 7, 2020 09:01