Description
Greetings,
I think I hit a difficult issue with sambamba, and I wondered whether it would be of interest, as it is not a hard blocker. In doubt, here it is.
Describe the bug
I observe occasional segmentation faults when running sambamba depth base with many threads:
$ sambamba depth base --min-coverage=0 reads_sort.bam -L target.bed --nthreads=12 2>/dev/null
REF POS COV A C G T DEL REFSKIP SAMPLE
ref 6 1 0 0 0 1 0 0 *
ref 7 1 0 0 0 1 0 0 *
ref 8 3 3 0 0 0 0 0 *
ref 9 3 0 0 3 0 0 0 *
[…]
ref2 32 3 3 0 0 0 0 0 *
ref2 33 3 0 3 0 0 0 0 *
ref2 34 2 0 0 0 2 0 0 *
ref2 35 1 1 0 0 0 0 0 *
Segmentation fault
The backtrace does not look very useful, as this is D code, but I add it for reference, especially as I had difficulty capturing it:
#0 0x00007ffff7a9abe0 in object.ModuleInfo.tlsctor() const () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
#1 0x00007ffff7aa97d1 in ?? () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
#2 0x00007ffff7aaa759 in rt.sections_elf_shared.DSO.opApply(scope int(ref rt.sections_elf_shared.DSO) delegate) () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
#3 0x00007ffff7a93320 in thread_entryPoint () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
#4 0x00007ffff773d083 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#5 0x00007ffff77bb7b8 in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
I'm unsure how often the problem occurs. Initially I saw about one crash per 5 or 6 runs, but further tests showed far fewer occurrences. I never encountered the problem on single-threaded runs, which suggests a race condition.
Software versions were:
- sambamba 1.0.1
- ldc 1.40.0
To Reproduce
Steps to reproduce the behavior:
- Run sambamba depth base at least a dozen times with a high thread count:
$ sambamba depth base --min-coverage=0 reads_sort.bam -L target.bed --nthreads=12 2>/dev/null
- See the Segmentation fault in some of the outputs.
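Since the crash is intermittent, the steps above can be wrapped in a loop that counts how many runs crash. This is only a sketch: the helper name count_segfaults is hypothetical, and it assumes the shell reports a process killed by SIGSEGV with exit status 139 (128 + signal 11), as common Linux shells do.

```shell
# Hypothetical helper: run a command N times and print how many runs
# ended with exit status 139 (assumed here to mean "killed by SIGSEGV").
count_segfaults() {
    runs=$1
    shift
    crashes=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's output; only its exit status matters here.
        if "$@" >/dev/null 2>&1; then
            status=0
        else
            status=$?
        fi
        if [ "$status" -eq 139 ]; then
            crashes=$((crashes + 1))
        fi
        i=$((i + 1))
    done
    echo "$crashes"
}

# Example invocation with the command from this report (not run here):
# count_segfaults 50 sambamba depth base --min-coverage=0 reads_sort.bam -L target.bed --nthreads=12
```

Running the same loop with --nthreads=1 would give a baseline for comparison with the multi-threaded runs.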
Expected behavior
I would expect the program to always exit successfully after printing its results.
Additional context
The test data was preprocessed using routines from the autopkgtest of the Debian package nanosv. The raw data is the file toy.sam from the samtools examples:
@SQ SN:ref LN:45
@SQ SN:ref2 LN:40
r001 163 ref 7 30 8M4I4M1D3M = 37 39 TTAGATAAAGAGGATACTG * XX:B:S,12561,2,20,112
r002 0 ref 9 30 1S2I6M1P1I1P1I4M2I * 0 0 AAAAGATAAGGGATAAA *
r003 0 ref 9 30 5H6M * 0 0 AGCTAA *
r004 0 ref 16 30 6M14N1I5M * 0 0 ATAGCTCTCAGC *
r003 16 ref 29 30 6H5M * 0 0 TAGGC *
r001 83 ref 37 30 9M = 7 -39 CAGCGCCAT *
x1 0 ref2 1 30 20M * 0 0 aggttttataaaacaaataa ????????????????????
x2 0 ref2 2 30 21M * 0 0 ggttttataaaacaaataatt ?????????????????????
x3 0 ref2 6 30 9M4I13M * 0 0 ttataaaacAAATaattaagtctaca ??????????????????????????
x4 0 ref2 10 30 25M * 0 0 CaaaTaattaagtctacagagcaac ?????????????????????????
x5 0 ref2 12 30 24M * 0 0 aaTaattaagtctacagagcaact ????????????????????????
x6 0 ref2 14 30 23M * 0 0 Taattaagtctacagagcaacta ???????????????????????
toy.sam is then processed as follows to obtain the reads_sort.bam and target.bed files used by the reproducer:
cat read.sam | samtools view -Sb > reads.bam
samtools sort reads.bam > reads_sort.bam
samtools index reads_sort.bam reads_sort.bai
bedtools bamtobed -i reads_sort.bam > target.bed
The issue was initially described in detail in Debian bug #1095434. I'm unsure whether this problem stems from sambamba or from Debian's D compiler. In any case, I thought you might like to be aware of the issue.
For information,
Étienne.