Hi,
I am trying to run DiffDock on ~70k ligands against the same receptor. My current runtime on an A100 GPU is about 7 hours per 1,000 ligands, and I am trying to improve this.
First, I have looked at caching the receptor's ESM embedding and loading it for each ligand so it isn't recalculated every time.
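For what it's worth, here is a minimal sketch of the caching pattern I mean. The `compute_embedding` function is a hypothetical stand-in for the real ESM forward pass, and `esm_cache/` is just an assumed cache directory; only the cache-on-disk logic is the point:

```python
import hashlib
import pickle
from pathlib import Path

CACHE_DIR = Path("esm_cache")  # hypothetical cache location
CACHE_DIR.mkdir(exist_ok=True)

def compute_embedding(sequence):
    # Stand-in for the real ESM forward pass; replace with the actual model call.
    return [float(ord(c)) for c in sequence]

def cached_embedding(sequence):
    """Compute the receptor embedding once, then reuse it from disk."""
    key = hashlib.sha256(sequence.encode()).hexdigest()
    path = CACHE_DIR / f"{key}.pkl"
    if path.exists():
        # Cache hit: skip the expensive forward pass entirely.
        with path.open("rb") as f:
            return pickle.load(f)
    emb = compute_embedding(sequence)
    with path.open("wb") as f:
        pickle.dump(emb, f)
    return emb
```

Since the receptor is fixed across all ~70k ligands, the expensive call runs once and every subsequent ligand reads the pickled result.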
I have also tried increasing the --batch_size argument in the inference call, but it still processes one ligand at a time. How do I actually get it to batch? Would batching by SMILES length or molecular weight work, given that batching is based on graph input size?
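To illustrate the bucketing idea I'm asking about: a sketch that groups SMILES of similar length so graphs within a batch have roughly matching sizes. SMILES length is only a crude proxy for graph size; a heavy-atom count from RDKit would presumably be a better sort key if that is what the batcher cares about:

```python
def bucket_by_size(smiles_list, batch_size):
    """Group SMILES of similar length so graph sizes within a batch match.

    Sorting before chunking means each batch holds molecules of
    comparable size, which limits padding waste on the GPU.
    """
    ordered = sorted(smiles_list, key=len)
    return [ordered[i:i + batch_size]
            for i in range(0, len(ordered), batch_size)]
```

Whether this actually helps depends on how DiffDock collates graphs internally, which is exactly what I'm unsure about.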
Finally, how can I run these inference calls in parallel? When I use GNU parallel to spawn multiple Docker/Singularity shells and micromamba environments, I see a lockup.
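One workaround I've been considering (a sketch under my own assumptions, not a known DiffDock recipe): shard the ligand list up front into one file per worker, then launch a single container per shard, so parallel isn't spawning a shell and environment per ligand. The launch command in the comment is hypothetical:

```python
from pathlib import Path

def shard_ligands(input_path, out_dir, n_shards):
    """Split a ligand list into n_shards files, one per worker/GPU.

    Launching one long-lived container per shard avoids the overhead
    (and contention) of spawning a container per ligand.
    """
    lines = Path(input_path).read_text().splitlines()
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    shards = []
    for i in range(n_shards):
        shard = lines[i::n_shards]  # round-robin split keeps shard sizes balanced
        shard_path = out_dir / f"shard_{i}.txt"
        shard_path.write_text("\n".join(shard) + "\n")
        shards.append(shard_path)
    return shards

# Each shard would then be launched once, e.g. (hypothetical command line):
#   singularity exec --nv diffdock.sif python inference.py --ligand_file shard_0.txt ...
```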
Does anyone have experience with this type of implementation and best practices?
Thanks for any support or advice.