
Batching and parallelising inference calls for one receptor. #289

@BenWallner7

Description


Hi,

I am trying to run DiffDock on ~70k ligands against the same receptor. My current runtime on an A100 GPU is about 7 hours per 1000 ligands, and I am trying to improve this.

Firstly, I have looked at caching the receptor's ESM embedding and loading it for each ligand so it isn't recomputed every time.
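For reference, the caching pattern I have in mind looks roughly like this (just a sketch; `embed_receptor` is a made-up stand-in for the real ESM forward pass, which is the expensive step):

```python
from pathlib import Path
import pickle

def embed_receptor(sequence: str):
    # Hypothetical placeholder for the real ESM forward pass;
    # returns a deterministic dummy "embedding" for illustration.
    return [ord(c) % 7 for c in sequence]

def cached_embedding(sequence: str, cache_path: Path):
    # Compute the receptor embedding once, then reuse it from disk
    # for every subsequent ligand instead of recomputing it.
    if cache_path.exists():
        with cache_path.open("rb") as f:
            return pickle.load(f)
    emb = embed_receptor(sequence)
    with cache_path.open("wb") as f:
        pickle.dump(emb, f)
    return emb
```

Since the receptor is fixed, the cache is written once and every later call is a cheap disk read.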

I have tried increasing the --batch_size argument in the inference call, but I am still only processing one ligand at a time. How do I actually get it to batch? Also, will batching by SMILES length or molecular weight work, given that batch composition depends on graph input size?
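My current thought is to give all ligands to one inference call via a --protein_ligand_csv input, so batching can happen inside a single process. A sketch of generating that file, assuming the columns are complex_name, protein_path, ligand_description, protein_sequence (please correct me if the column names differ in the current repo):

```python
import csv
from pathlib import Path

def write_protein_ligand_csv(receptor_pdb: str, smiles_list, out_csv: Path):
    # One row per ligand, all pointing at the same receptor file, so a
    # single inference run sees the whole library and can batch internally.
    # Column names are an assumption; verify against your DiffDock version.
    with out_csv.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["complex_name", "protein_path",
                         "ligand_description", "protein_sequence"])
        for i, smi in enumerate(smiles_list):
            writer.writerow([f"ligand_{i}", receptor_pdb, smi, ""])
```

The same receptor path on every row should also play well with a cached ESM embedding, since the sequence never changes.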

How can I run these inference calls in parallel? When I use GNU parallel to launch processes concurrently, I see a lockup caused by spawning multiple Docker/Singularity shells and micromamba environments.
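To avoid spawning one container per job, I am considering splitting the library into a fixed number of shards up front and running one worker per shard inside a single container. A sketch (file naming is my own invention):

```python
from pathlib import Path

def shard(items, n_shards):
    # Round-robin split: each shard gets a similar mix of short and
    # long SMILES, which should roughly balance graph sizes per worker.
    return [items[i::n_shards] for i in range(n_shards)]

def write_shards(smiles_list, n_shards, out_dir: Path):
    # One input file per worker; each worker then runs its own
    # inference process inside the already-running container.
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, chunk in enumerate(shard(smiles_list, n_shards)):
        p = out_dir / f"shard_{i}.smi"
        p.write_text("\n".join(chunk) + "\n")
        paths.append(p)
    return paths
```

The idea is that the container and environment start-up cost is paid once, and parallelism happens at the process level over pre-built shard files.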

Does anyone have experience with this type of implementation and best practices?

Thanks for any support or advice.
