Authors: Katja Kozjek (katja.kozjek@ki.se), Fredrik Boulund (fredrik.boulund@ki.se)
Project: OncoDutch/FOCUS cohort
Folder: focus_metaphlan_2024
This folder contains the analysis related to the FOCUS cohort of the ONCOBIOME project.
- raw_data/: Raw data files for each flowcell. Not included in the GitHub repo due to its size.
- data/: Data used for downstream analyses, and data generated during downstream analyses. Not included in the GitHub repo due to its size.
- code/: Scripts used for data processing.
Instructions on how to prepare and organize data files on the CTMR Gandalf HPC system.
- Extract Metaphlan profiles from zipped file:
  /ceph/projects/232_OncoDutch/analysis/analysis2024/focus_metaphlan_2024/code/extract_metaphlan.bash
- Merge multiple tsv files into one file:
  /ceph/projects/157_Screesco/kkatja/analysis_repo/code/multiple_merge_tsv.py
- Count and taxa file generated:
  /ceph/projects/157_Screesco/kkatja/analysis_repo/code/MetaphlanToPhyloseq_overall.py
- Count and taxa file for specific taxonomic level (i.e. species) generated:
  /ceph/projects/157_Screesco/kkatja/analysis_repo/code/MetaphlanToPhyloseq.py
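
For orientation, the merge and taxonomic-level steps listed above could look roughly like the sketch below. It is not the content of multiple_merge_tsv.py or MetaphlanToPhyloseq.py; the function names (read_profile, merge_profiles, filter_level) are hypothetical. It assumes each profile is a tab-separated file whose comment and header lines start with "#", with the clade name in the first column and the relative abundance in the third column, and that clade names use "|"-separated ranks with prefixes such as "s__" for species.

```python
import csv
from pathlib import Path


def read_profile(path, abundance_col=2):
    """Return {clade_name: relative_abundance} for one profile file.

    Assumption: column 0 is the clade name and column `abundance_col`
    holds the relative abundance; adjust if your files differ.
    """
    abundances = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank lines and '#' comment/header lines.
            if not row or row[0].startswith("#"):
                continue
            abundances[row[0]] = float(row[abundance_col])
    return abundances


def merge_profiles(paths):
    """Merge several profiles into one table.

    Returns (sample_names, {clade: [abundance per sample]}).
    Sample names are taken from the file names (hypothetical convention).
    Clades missing from a sample are filled with 0.0.
    """
    samples = [Path(p).stem for p in paths]
    profiles = [read_profile(p) for p in paths]
    clades = sorted(set().union(*profiles))
    merged = {
        clade: [profile.get(clade, 0.0) for profile in profiles]
        for clade in clades
    }
    return samples, merged


def filter_level(merged, prefix="s__"):
    """Keep only clades whose deepest rank starts with `prefix`."""
    return {
        clade: values
        for clade, values in merged.items()
        if clade.split("|")[-1].startswith(prefix)
    }


def write_table(samples, merged, out_path):
    """Write the merged table as a TSV with one column per sample."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["clade_name", *samples])
        for clade, values in merged.items():
            writer.writerow([clade, *values])
```

A typical call would be merge_profiles on the list of extracted profile paths, then filter_level(merged, "s__") to keep species-level rows, then write_table to save the result. Filling absent clades with 0.0 keeps every sample column the same length, which is what downstream tools expecting a rectangular table generally need.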