Skip to content

Error when running with too few reads/very low coverage #91

@Coppini

Description

@Coppini

When running Artic on a sample with too few reads, resulting in too small coverage, Longshot automatically calculates the maximum coverage as 0, and fails. We're having issues with that in an automatic pipeline, where some of our samples keep failing due to that, since they have too few reads (as expected, since they were negative samples).

Running: longshot -P 0 -F -A --no_haps --bam barcode91.primertrimmed.rg.sorted.bam --ref primer_schemes/SARS-CoV-2/V1200/SARS-CoV-2.reference.fasta --out barcode91.merged.vcf --potential_variants barcode91.merged.vcf.gz
Command failed:longshot -P 0 -F -A --no_haps --bam barcode91.primertrimmed.rg.sorted.bam --ref primer_schemes/SARS-CoV-2/V1200/SARS-CoV-2.reference.fasta --out barcode91.merged.vcf --potential_variants barcode91.merged.vcf.gz

Already reported it to them at pjedge/longshot#72, but I agree with them that the best solution would be to simply not use the -A flag (or, alternatively, to ignore the error in the Artic pipeline when it happens, and create an empty file instead).


I tried running the branch in 1.3.0-dev as well, but end up with an error in the artic-tools check_vcf step:

Command failed:artic-tools check_vcf --summaryOut barcode91.vcfreport.txt --vcfOut barcode91.merged.filtered.vcf barcode91.merged.vcf.gz primer_schemes/SARS-CoV-2/V1200/SARS-CoV-2.scheme.bed 2> barcode91.vcfcheck.log

It seems the problem appears when the VCF file is empty:

bcftools view barcode91.merged.vcf.gz
##fileformat=VCFv4.1
##FILTER=<ID=PASS,Description="All filters passed">
##medaka_version=1.2.3
##INFO=<ID=DP,Number=1,Type=Integer,Description="Depth of reads at pos">
##INFO=<ID=DPS,Number=2,Type=Integer,Description="Depth of reads at pos by strand (fwd, rev)">
##INFO=<ID=DPSP,Number=1,Type=Integer,Description="Depth of reads spanning pos +-25">
##INFO=<ID=SR,Number=.,Type=Integer,Description="Depth of spanning reads by strand which best align to each allele (ref fwd, ref rev, alt1 fwd, alt1 rev, etc.)">
##INFO=<ID=AR,Number=2,Type=Integer,Description="Depth of ambiguous spanning reads by strand which align equally well to all alleles (fwd, rev)">
##INFO=<ID=SC,Number=.,Type=Integer,Description="Total alignment score to each allele of spanning reads by strand (ref fwd, ref rev, alt1 fwd, alt1 rev, etc.) aligned with parasail match 5, mismatch -4, open 5, extend 3">
##INFO=<ID=Pool,Number=1,Type=String,Description="The pool name">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Medaka genotype.">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Medaka genotype quality score">
##bcftools_viewVersion=1.10.2+htslib-1.10.2
##bcftools_viewCommand=view barcode91.merged.vcf.gz; Date=Mon Aug 23 12:36:40 2021
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	SAMPLE

Running artic-tools check_vcf ends up on a Segmentation fault (core dumped) when reading the empty VCF:

$ artic-tools check_vcf --summaryOut barcode91.vcfreport.txt --vcfOut barcode91.merged.filtered.vcf barcode91.merged.vcf.gz primer_schemes/SARS-CoV-2/V1200/SARS-CoV-2.scheme.bed
[12:35:34] [artic-tools::check_vcf] starting VCF checker
[12:35:34] [artic-tools::check_vcf] reading scheme
[12:35:34] [artic-tools::check_vcf] collecting scheme stats
[12:35:34] [artic-tools::check_vcf] 	primer scheme file:	primer_schemes/SARS-CoV-2/V1200/SARS-CoV-2.scheme.bed
[12:35:34] [artic-tools::check_vcf] 	reference sequence:	MN908947.3
[12:35:34] [artic-tools::check_vcf] 	number of pools:	2
[12:35:34] [artic-tools::check_vcf] 	number of primers:	58 (includes 0 alts)
[12:35:34] [artic-tools::check_vcf] 	minimum primer size:	22
[12:35:34] [artic-tools::check_vcf] 	maximum primer size:	30
[12:35:34] [artic-tools::check_vcf] 	number of amplicons:	29
[12:35:34] [artic-tools::check_vcf] 	mean amplicon size:	1091
[12:35:34] [artic-tools::check_vcf] 	maximum amplicon size:	1177
[12:35:34] [artic-tools::check_vcf] 	scheme ref. span:	30-29790
[12:35:34] [artic-tools::check_vcf] 	scheme overlaps:	6.5591393%
[12:35:34] [artic-tools::check_vcf] setting parameters
[12:35:34] [artic-tools::check_vcf] 	output report: barcode91.vcfreport.txt
[12:35:34] [artic-tools::check_vcf] 	filtering variants: true
[12:35:34] [artic-tools::check_vcf] 	output VCF: barcode91.merged.filtered.vcf
[12:35:34] [artic-tools::check_vcf] 	minimum quality threshold: 10.0
[12:35:34] [artic-tools::check_vcf] reading VCF file
Segmentation fault (core dumped)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions