Closed
Labels
enhancement: New feature or request
Description
For large read datasets, the reads (especially together with a large countmin table) may not fit in device memory. However, because each read is independent, they can be loaded asynchronously in blocks.
This change will require:
- Remove the read interleaving and the current loading into shared memory. A sync block load can be used to load all the reads for a warp into shared memory directly and more simply.
- Pin the host memory in the DeviceReads class.
- Add a loadBlock method to DeviceReads which loads up to a specified size using cudaMemcpyAsync on a non-default stream.
- Launch the kernel on a second non-default stream.
- Call loadBlock repeatedly until all reads have been processed, waiting on kernel completion between iterations.
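
The iteration in the last step could be planned roughly as follows. This is a minimal host-side sketch in plain C++ with no CUDA calls, so it only models the schedule. `BlockPlan` and `plan_blocks` are hypothetical names, not part of the existing DeviceReads class, and the sketch assumes blocks are measured in whole reads and that two device buffers are used alternately.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical description of one block transfer (not an existing type).
struct BlockPlan {
    std::size_t first_read;  // index of the first read in this block
    std::size_t n_reads;     // number of reads in this block (<= reads_per_block)
    int buffer;              // which of the two device buffers receives it (0 or 1)
};

// Split total_reads into blocks of at most reads_per_block, alternating
// between two buffers so that one block can be copied while the kernel
// is still working on the other.
std::vector<BlockPlan> plan_blocks(std::size_t total_reads,
                                   std::size_t reads_per_block) {
    std::vector<BlockPlan> plan;
    if (reads_per_block == 0) return plan;  // nothing sensible to schedule
    int buffer = 0;
    for (std::size_t first = 0; first < total_reads; first += reads_per_block) {
        const std::size_t n = std::min(reads_per_block, total_reads - first);
        plan.push_back({first, n, buffer});
        // In the real code, this is where loadBlock would issue the async copy
        // into `buffer` on the copy stream, and the kernel would be launched
        // on the compute stream once that copy has finished.
        buffer = 1 - buffer;
    }
    return plan;
}
```

For example, `plan_blocks(10, 4)` yields three blocks of 4, 4 and 2 reads, assigned to buffers 0, 1 and 0. Before a buffer is reused, the kernel that last read from it must have completed, which is where the wait on kernel completion comes in.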
 
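The block size will be (device memory - size of countmin table) / 2 - epsilon, so that two blocks (one being processed while the next is loaded) fit alongside the countmin table with a small safety margin.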
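
The block size formula can be written out as a small helper. This is a sketch under assumptions: `block_bytes` and its parameter names are hypothetical, all sizes are in bytes, and the value of epsilon is left to the caller since the issue does not specify it. In real code the device memory figure could be obtained from something like `cudaMemGetInfo`, which is not called here.

```cpp
#include <cstddef>

// Bytes available for one read block:
//   (device memory - countmin table) / 2 - epsilon
// Returns 0 if the table and the margin leave no room for a block.
std::size_t block_bytes(std::size_t device_bytes,
                        std::size_t countmin_bytes,
                        std::size_t epsilon_bytes) {
    if (countmin_bytes >= device_bytes) return 0;  // table alone fills the device
    const std::size_t half = (device_bytes - countmin_bytes) / 2;
    return half > epsilon_bytes ? half - epsilon_bytes : 0;
}
```

The explicit checks avoid unsigned underflow when the countmin table or the margin is larger than the memory that remains.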