@guillaume-chevalier guillaume-chevalier commented Jul 18, 2022

Improve services parallelization

What it is

This pull request allows:

  • Running pickling checks on the parallelized services, so that every service is picklable and cannot deadlock the multithreading queues.
  • Repairing several bugs that impaired the parallelism and sometimes caused deadlocks.
  • Moving some modules around and removing some dependencies between modules.
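
To illustrate the first point, here is a minimal sketch of what a pickling check on services could look like. It is not the code from this pull request: the function name `find_unpicklable_services` and the example service dictionary are hypothetical, and only the standard library is used. The idea is to attempt a pickle round-trip on each service up front and report failures early, rather than letting an unpicklable object stall a queue later.

```python
import pickle
import threading
from typing import Any, Dict, List


def find_unpicklable_services(services: Dict[str, Any]) -> List[str]:
    """Return the names of the services that fail a pickle round-trip.

    Hypothetical helper for illustration only, not part of Neuraxle's API.
    """
    failures: List[str] = []
    for name, service in services.items():
        try:
            # Serialize and deserialize, as sending through a process queue would.
            pickle.loads(pickle.dumps(service))
        except Exception:
            failures.append(name)
    return failures


# Stand-in "services": a plain dict is picklable, a thread lock is not.
services = {
    "config": {"batch_size": 32},
    "lock": threading.Lock(),
}
unpicklable = find_unpicklable_services(services)
print(unpicklable)
```

Running the check before starting any workers turns a silent hang into an explicit list of offending services.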

How it works

Behavior is unchanged, except that some repositories now have a wrapper and the context no longer has a global lock, which is good.
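
As a rough sketch of the general pattern described above (a lock held by a wrapper around a repository, instead of one global lock on the context), consider the following. The class names `InMemoryRepository` and `SynchronizedRepositoryWrapper` are hypothetical and are not the classes introduced by this pull request; the example only shows how per-repository locking can be expressed with the standard library.

```python
import threading
from typing import Any, Dict


class InMemoryRepository:
    """Hypothetical repository that stores records in a dict."""

    def __init__(self) -> None:
        self.records: Dict[str, Any] = {}

    def save(self, key: str, value: Any) -> None:
        self.records[key] = value

    def load(self, key: str) -> Any:
        return self.records[key]


class SynchronizedRepositoryWrapper:
    """Hypothetical wrapper that guards one repository with its own lock."""

    def __init__(self, wrapped: InMemoryRepository) -> None:
        self.wrapped = wrapped
        self._lock = threading.RLock()

    def save(self, key: str, value: Any) -> None:
        with self._lock:
            self.wrapped.save(key, value)

    def load(self, key: str) -> Any:
        with self._lock:
            return self.wrapped.load(key)


repo = SynchronizedRepositoryWrapper(InMemoryRepository())

# Several threads write through the wrapper; only this repository is locked.
threads = [
    threading.Thread(target=repo.save, args=(f"trial_{i}", i))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

saved_count = len(repo.wrapped.records)
```

With this design, only code touching the wrapped repository waits on the lock, so unrelated services in the context are not serialized behind it.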


Checklist before merging PR.

Things to check each time you contribute:

  • If this is your first contribution to Neuraxle, please read the guide to contributing to the Neuraxle framework.
  • Your local Git username is set to your GitHub username, and your local Git email is set to your GitHub email. This is important to avoid breaking the cla-bot and for your contributions to be linked to your profile. More info: https://github.com/settings/emails
  • Argument dimensions and types are specified for new steps (important), with examples in docstrings when needed.
  • Class names and argument / API variables are very clear: there is no possible ambiguity. They also respect the existing code style (avoid duplicating words for the same concept) and are intuitive.
  • Use typing like variable: Typing = ... as much as possible. Also use typing for function arguments and return values like def my_func(self, my_list: Dict[int, List[str]]) -> 'OrderedDict[int, str]':.
  • Classes are documented: their behavior is explained beyond just the title of the class. You may even use the description written in your pull request above to fill some docstrings accurately.
  • If a numpy array is used, remember that these arrays are a special type that must be documented accordingly, and that numpy arrays should not be overused, because Neuraxle is not limited to transforming numpy arrays. To this effect, numpy steps should be located in the existing numpy Python files as much as possible, rather than scattered across the codebase. The same applies to Pandas DataFrames.
  • Unit test code coverage is above 90% for the added code.
  • The above description of the pull request in natural language was used to document the new code inside the code's docstrings so as to have complete documentation, with examples.
  • Respect the Unit Testing status check
  • Respect the Codacy status check
  • Respect the cla-bot status check (unless the cla-bot is truly broken - please try to debug it first)
  • Code files that were edited were reformatted automatically using PyCharm's Ctrl+Alt+L shortcut. You may have reorganized imports as well.

@guillaume-chevalier guillaume-chevalier self-assigned this Jul 18, 2022
@cla-bot cla-bot bot added the cla-signed label Jul 18, 2022
@Neuraxio Neuraxio deleted a comment from pull-checklist bot Jul 22, 2022
@guillaume-chevalier guillaume-chevalier merged commit c355500 into Neuraxio:master Jul 22, 2022