Skip to content

PEKA jobs mismatching peak and crosslink inputs #33

@lconde-ucl

Description

@lconde-ucl

ISSUE

Hi,

There seems to be a bug in how PEKA is executed in the pipeline: the jobs are mixing up input files, using peak files from one sample and crosslink files from another. For example:

[...]
TMPDIR=$(pwd)/tmp peka  \
   -i SAMPLE-1_R1_genome.clippy._rollmean10_minHeightAdjust1.0_minPromAdjust1.0_minGeneCount5_Peaks.bed  \
   -x SAMPLE-2_R1.genome.xl.bed \
   -g GRCh38.primary_assembly.genome.fa  \
   -gi GRCh38.primary_assembly.genome.fa.fai \
   -r gencode_filtered_regions_genicOtherfalse.resolved.gtf         
[...]

SUGGESTED SOLUTION:

I think is an issue with the way the CLIPPY_GENOME.out.peaks and ch_genome_crosslink_bed channels are passed separately into the PEKA module, without being joined on their metadata. So nextflow seems to combine whichever files are available first, instead of proper sample-matched pairs. I believe that Joining the channels beforehand should fix the issue. Suggested change:

Currently:

/*
* MODULE: Run peka on genome-level crosslinks
*/
PEKA (
   CLIPPY_GENOME.out.peaks,
   ch_genome_crosslink_bed,
   ch_fasta.collect{ it[1] },
   ch_fasta_fai.collect{ it[1] },
   ch_regions_resolved_gtf.collect{ it[1] }
)
ch_versions = ch_versions.mix(PEKA.out.versions)

Suggested fix:

/*
* MODULE: Run peka on genome-level crosslinks
*/
ch_peka_input = CLIPPY_GENOME.out.peaks
   .join(ch_genome_crosslink_bed, by: 0)
PEKA (
   ch_peka_input.map { meta, peaks, crosslinks -> [meta, peaks] },
   ch_peka_input.map { meta, peaks, crosslinks -> [meta, crosslinks] },
   ch_fasta.collect{ it[1] },
   ch_fasta_fai.collect{ it[1] },
   ch_regions_resolved_gtf.collect{ it[1] }
)
ch_versions = ch_versions.mix(PEKA.out.versions)

Best
Lucia

PS: I’ve submitted a pull request to address this issue: #42

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions