Description
When directsketch tried to find the download link for GCA_000193795.2, it returned the following link: https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/193/795/GCA_000193795.1_ASM19379v1/GCA_000193795.1_ASM19379v1_genomic.fna.gz. Note that the folder https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/193/795/GCA_000193795.1_ASM19379v1/ does exist, but this .1 assembly is suppressed, so the download fails.
When I looked up the genome via NCBI, I found that the v2 genome is available at: https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/193/795/GCA_000193795.2_ASM19379v2/GCA_000193795.2_ASM19379v2_genomic.fna.gz
So the script picked the v1 folder instead of the v2 folder, which caused the download to fail.
To fix this, look into the link + version check here: https://github.com/sourmash-bio/sourmash_plugin_directsketch/blob/main/src/directsketch.rs#L106-L125
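One possible shape for a stricter check, as a rough sketch only: I have not confirmed how the linked lines currently select the folder, so the function name `select_versioned_link` and the `hrefs` input (standing in for the hrefs scraped from the NCBI directory listing) are hypothetical. The idea is to match on the full versioned accession followed by an underscore, rather than on the accession without its version.

```rust
/// Hypothetical helper: pick the directory link whose folder name starts with
/// the exact versioned accession, e.g. "GCA_000193795.2_".
/// `hrefs` stands in for the links scraped from the NCBI directory listing.
fn select_versioned_link<'a>(accession: &str, hrefs: &[&'a str]) -> Option<&'a str> {
    // Include the trailing underscore so that "GCA_000193795.2" cannot match
    // a folder such as "GCA_000193795.21_...".
    let prefix = format!("{}_", accession);
    hrefs.iter().copied().find(|href| {
        href.trim_end_matches('/') // directory hrefs usually end with '/'
            .rsplit('/')           // keep only the last path component
            .next()
            .map_or(false, |name| name.starts_with(&prefix))
    })
}
```

With the two folders from this issue as input, asking for `GCA_000193795.2` would select `GCA_000193795.2_ASM19379v2/`, and asking for a version that is not listed would return `None` instead of silently falling back to another version.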