Description
Reading SRA data with the C VDB API (following the strategy the fasterq-dump utility uses to access SRA records) can consume a significant amount of RAM while reading a single SRA record. This is an issue when attempting to minimize the amount of Cloud computing resources (i.e. instance RAM) needed to process a large number of SRA records.
The maximum amount of RAM used while reading (as measured with /usr/bin/time -v) depends on the record:
Periodically calling VCursorRelease() and VCursorOpen() to force the VDB interface to deallocate RAM gives only a minor reduction in the maximum amount of RAM used (about 25%), and this strategy significantly slows down the rate at which an SRA record is read.
Is it possible/feasible to limit memory consumption using the VDB C API to sub-gigabyte levels, independent of the number of reads? The goal is to read through an SRA record once, as quickly as possible and using as little RAM as possible.
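For anyone trying to reproduce the measurements: the peak figure that /usr/bin/time -v prints as "Maximum resident set size" can also be sampled from inside the reading process, which makes it easier to see at which point in a record the memory grows. The following is a minimal sketch, not part of the VDB API; the helper name peak_rss_kb is hypothetical, and it assumes Linux, where getrusage() reports ru_maxrss in kilobytes (macOS reports bytes).

```c
#include <sys/resource.h>

/* Hypothetical helper (not part of the VDB API): returns the peak
 * resident set size of the calling process, or -1 on failure.
 * Assumption: Linux, where ru_maxrss is expressed in kilobytes. */
static long peak_rss_kb(void)
{
    struct rusage usage;

    if (getrusage(RUSAGE_SELF, &usage) != 0)
        return -1;

    return usage.ru_maxrss;
}
```

Calling this helper every few million rows inside the read loop and printing the result to stderr should show whether the peak climbs steadily with the row count or jumps at particular points in the record.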