PEKA jobs mismatching peak and crosslink inputs

**ISSUE**

Hi,

There seems to be a bug in how PEKA is executed in the pipeline: the jobs are mixing up input files, using peak files from one sample and crosslink files from another. For example:

```
[...]
TMPDIR=$(pwd)/tmp peka  \
   -i SAMPLE-1_R1_genome.clippy._rollmean10_minHeightAdjust1.0_minPromAdjust1.0_minGeneCount5_Peaks.bed  \
   -x SAMPLE-2_R1.genome.xl.bed \
   -g GRCh38.primary_assembly.genome.fa  \
   -gi GRCh38.primary_assembly.genome.fa.fai \
   -r gencode_filtered_regions_genicOtherfalse.resolved.gtf         
[...]
```


**SUGGESTED SOLUTION:**

I think is an issue with the way the `CLIPPY_GENOME.out.peaks` and `ch_genome_crosslink_bed` channels are passed separately into the PEKA module, without being joined on their metadata. So nextflow seems to combine whichever files are available first, instead of proper sample-matched pairs. I believe that Joining the channels beforehand should fix the issue. Suggested change:


Currently:
```
/*
* MODULE: Run peka on genome-level crosslinks
*/
PEKA (
   CLIPPY_GENOME.out.peaks,
   ch_genome_crosslink_bed,
   ch_fasta.collect{ it[1] },
   ch_fasta_fai.collect{ it[1] },
   ch_regions_resolved_gtf.collect{ it[1] }
)
ch_versions = ch_versions.mix(PEKA.out.versions)
```

Suggested fix:
```
/*
* MODULE: Run peka on genome-level crosslinks
*/
ch_peka_input = CLIPPY_GENOME.out.peaks
   .join(ch_genome_crosslink_bed, by: 0)
PEKA (
   ch_peka_input.map { meta, peaks, crosslinks -> [meta, peaks] },
   ch_peka_input.map { meta, peaks, crosslinks -> [meta, crosslinks] },
   ch_fasta.collect{ it[1] },
   ch_fasta_fai.collect{ it[1] },
   ch_regions_resolved_gtf.collect{ it[1] }
)
ch_versions = ch_versions.mix(PEKA.out.versions)
```


Best
Lucia


PS: I’ve submitted a pull request to address this issue: [#42](https://github.com/goodwright/clipseq/pull/34)




Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

PEKA jobs mismatching peak and crosslink inputs #33

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

PEKA jobs mismatching peak and crosslink inputs #33

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions