Segmentation fault in sambamba depth base #522

@emollier

Greetings,

I think I hit a difficult issue with sambamba, and I wondered whether it would be of interest, as it is not a hard blocker. In case of doubt, here it is.

Describe the bug

I observe rare segmentation faults when running sambamba depth base with many threads:

$ sambamba depth base --min-coverage=0 reads_sort.bam -L target.bed --nthreads=12 2>/dev/null
REF     POS     COV     A       C       G       T       DEL     REFSKIP SAMPLE
ref     6       1       0       0       0       1       0       0       *
ref     7       1       0       0       0       1       0       0       *
ref     8       3       3       0       0       0       0       0       *
ref     9       3       0       0       3       0       0       0       *
[…]
ref2    32      3       3       0       0       0       0       0       *
ref2    33      3       0       3       0       0       0       0       *
ref2    34      2       0       0       0       2       0       0       *
ref2    35      1       1       0       0       0       0       0       *
Segmentation fault

The backtrace does not look very usable, as this is D code, but I am including it for reference, especially since I had difficulty capturing it:

#0  0x00007ffff7a9abe0 in object.ModuleInfo.tlsctor() const () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
#1  0x00007ffff7aa97d1 in ?? () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
#2  0x00007ffff7aaa759 in rt.sections_elf_shared.DSO.opApply(scope int(ref rt.sections_elf_shared.DSO) delegate) () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
#3  0x00007ffff7a93320 in thread_entryPoint () from /lib/x86_64-linux-gnu/libdruntime-ldc-shared.so.106
#4  0x00007ffff773d083 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:447
#5  0x00007ffff77bb7b8 in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

I'm a bit unsure how often the problem occurs. I initially saw about one crash per 5 or 6 runs, but further tests showed far fewer occurrences. I never encountered the problem in single-threaded runs, which suggests a race condition.

Software versions were:

  • sambamba 1.0.1
  • ldc 1.40.0

To Reproduce

Steps to reproduce the behavior:

  1. Run sambamba depth base at least a dozen times with a high thread count:
$ sambamba depth base --min-coverage=0 reads_sort.bam -L target.bed --nthreads=12 2>/dev/null
  2. See the Segmentation fault in some of the outputs.
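
Since the crash is intermittent, the steps above can be automated with a small loop that repeats a command and counts the runs that end in a segmentation fault. This is only a sketch: the helper name count_segfaults is hypothetical, and it assumes a POSIX-style shell such as bash or dash on Linux, where a process killed by SIGSEGV (signal 11) is reported with exit status 139 (128 + 11).

```shell
#!/bin/sh
# count_segfaults RUNS CMD [ARGS...]
# Hypothetical helper: runs CMD RUNS times, discards its output, and
# prints how many runs ended with exit status 139 (assumed here to
# mean "killed by SIGSEGV", which holds for bash and dash on Linux).
count_segfaults() {
    runs=$1
    shift
    crashes=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1
        status=$?
        if [ "$status" -eq 139 ]; then
            crashes=$((crashes + 1))
        fi
        i=$((i + 1))
    done
    echo "$crashes"
}
```

For example, `count_segfaults 50 sambamba depth base --min-coverage=0 reads_sort.bam -L target.bed --nthreads=12` would print the number of crashes seen over 50 runs. Running it again with --nthreads=1 gives a single-threaded baseline for comparison.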

Expected behavior

I would expect the program to always exit successfully after writing its results.

Additional context

The test data was preprocessed using the routines from the autopkgtest of the Debian package nanosv. The raw data is the file toy.sam from the samtools examples:

@SQ	SN:ref	LN:45
@SQ	SN:ref2	LN:40
r001	163	ref	7	30	8M4I4M1D3M	=	37	39	TTAGATAAAGAGGATACTG	*	XX:B:S,12561,2,20,112
r002	0	ref	9	30	1S2I6M1P1I1P1I4M2I	*	0	0	AAAAGATAAGGGATAAA	*
r003	0	ref	9	30	5H6M	*	0	0	AGCTAA	*
r004	0	ref	16	30	6M14N1I5M	*	0	0	ATAGCTCTCAGC	*
r003	16	ref	29	30	6H5M	*	0	0	TAGGC	*
r001	83	ref	37	30	9M	=	7	-39	CAGCGCCAT	*
x1	0	ref2	1	30	20M	*	0	0	aggttttataaaacaaataa	????????????????????
x2	0	ref2	2	30	21M	*	0	0	ggttttataaaacaaataatt	?????????????????????
x3	0	ref2	6	30	9M4I13M	*	0	0	ttataaaacAAATaattaagtctaca	??????????????????????????
x4	0	ref2	10	30	25M	*	0	0	CaaaTaattaagtctacagagcaac	?????????????????????????
x5	0	ref2	12	30	24M	*	0	0	aaTaattaagtctacagagcaact	????????????????????????
x6	0	ref2	14	30	23M	*	0	0	Taattaagtctacagagcaacta	???????????????????????

toy.sam is then processed this way to obtain the reads_sort.bam and target.bed files used by the reproducer:

cat read.sam | samtools view -Sb > reads.bam
samtools sort reads.bam > reads_sort.bam
samtools index reads_sort.bam reads_sort.bai
bedtools bamtobed -i reads_sort.bam > target.bed

The issue was initially described in detail in Debian bug #1095434. I'm a bit unsure whether this problem stems from sambamba or from Debian's D compiler. In any case, I thought you might like to be aware of the issue.

For information,
Étienne.
