Custom analysis software for ChIP-seq of DNA repair proteins after CRISPR/Cas9-mediated DSBs

Software requirements

python 2.7 (Anaconda's python distribution comes with the required numpy and scipy libraries)
pysam
bowtie2
samtools
Ensure that both samtools and bowtie2 are added to path and can be called directly from bash

Data requirements

Raw ChIP-seq FASTQ files and processed BAM files can be downloaded from SRA.

Generate processed BAM files (if starting from raw FASTQ files)

Download sequencing reads in FASTQ format from SRA
Download human prebuilt bowtie2 indices
- Human hg38
- move to the corresponding folder named hg38_bowtie2/
Download the human (hg38) genome assembly in FASTA format
- hg38.fa
- extract from gz files
- move to the corresponding folder named hg38_bowtie2/
Generate FASTA file indices
- samtools faidx hg38_bowtie2/hg38.fa
Modify the first lines of bash script process_reads.sh
- Fill the list variable filelist with paths to each sample name for processing. Note that the actual file paths include the sample name followed by "_1.fastq" or "_2.fastq", denoting read1 or read2 of paired-end reads, respectively.
- Fill in the path to the indexed genome denoted by variable genomepath.
Run the following code snippet, where -p denotes the number of samples to process in parallel; modify accordingly. This performs genome alignment, filtering, sorting, removal of PCR duplicates, indexing, and sample statistics output. This step takes less than one day to complete on our Intel i7-8700K, 32GB RAM desktop, though speed appears to be bottlenecked by read/writes to disk.
```
bash process_reads.sh -p 6
```
(optional) For fair comparison between different time points in a timeseries, we subset reads from all relevant samples to the sample with the fewest reads of the set. Run the following code snippet, where -s inputs the number of mapped reads to subset for each sample; modify accordingly. This step takes less than 10 minutes.
```
bash subset_reads.sh -p 6 -s 24400000
```
- for 53BP1 DNA-PKcs inhibitor experiments, subsetted to 9,045,179 for both replicates.
- for γH2AX DNA-PKcs inhibitor experiments, subsetted to 13,474,505 for replicate 1 and 7,626,728 for replicate 2.
- for MRE11 DNA-PKcs inhibitor experiments, subsetted to 5,507,748 for replicate 1 and 8,057,929 for replicate 2.
- for 53BP1 timeseries experiments, subsetted to 8,481,406 for replicate 1 and 6,504,200 for replicate 2.
- for MRE11 timeseries experiments, subsetted to 2,686,969 for replicate 1 and 6,207,727 for replicate 2.

Start from pre-processed BAM files

In addition to raw paired-end reads in FASTQ format, we have also uploaded pre-processed sequencing reads in BAM format to SRA. These are the output of the previous section. It is highly recommended to start from these BAM files.

Download pre-processed paired-end reads in BAM format from SRA.
Move the downloaded data for MRE11 and γH2AX ChIP-seq to the desired folder.
If not already done, index the downloaded BAM files:
```
samtools index /path/to/output.bam
```

ChIP-seq analysis scripts in Python

`chipseq_M_pki.py`: analyze MRE11 ChIP-seq, with/without DNA-PKcs inhibitor KU60648

`chipseq_BH_pki.py`: analyze 53BP1 and γh2AX ChIP-seq, with/without DNA-PKcs inhibitor KU60648

`chipseq_M_ts.py`: analyze MRE11 ChIP-seq, timeseries after Cas9/pcRNA deactivation

`chipseq_B_ts.py`: analyze 53BP1 ChIP-seq, timeseries after Cas9/pcRNA deactivation

Set base variable to be the path to the directory that holds the BAM files.
Create a new folder to hold the output of 53BP1 and γH2AX scripts, set its path to bs_a.
Create a new folder to hold the output of MRE11 scripts, set its path to bs_s.
Ensure that all file names are correct (if the BAM files were directly downloaded from SRA, they should be), then run script.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
2mL_tube_holder_for_sonication		2mL_tube_holder_for_sonication
Jaxman_LED_flashlight_holder_design		Jaxman_LED_flashlight_holder_design
lib		lib
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
__init__.py		__init__.py
chipseq_BH_PKi.py		chipseq_BH_PKi.py
chipseq_B_ts.py		chipseq_B_ts.py
chipseq_M_PKi.py		chipseq_M_PKi.py
chipseq_M_ts.py		chipseq_M_ts.py
process_reads.sh		process_reads.sh
subset_reads.sh		subset_reads.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Custom analysis software for ChIP-seq of DNA repair proteins after CRISPR/Cas9-mediated DSBs

Software requirements

Data requirements

Generate processed BAM files (if starting from raw FASTQ files)

Start from pre-processed BAM files

ChIP-seq analysis scripts in Python

`chipseq_M_pki.py`: analyze MRE11 ChIP-seq, with/without DNA-PKcs inhibitor KU60648

`chipseq_BH_pki.py`: analyze 53BP1 and γh2AX ChIP-seq, with/without DNA-PKcs inhibitor KU60648

`chipseq_M_ts.py`: analyze MRE11 ChIP-seq, timeseries after Cas9/pcRNA deactivation

`chipseq_B_ts.py`: analyze 53BP1 ChIP-seq, timeseries after Cas9/pcRNA deactivation

About

Uh oh!

Releases

Packages

Languages

License

rogerzou/chipseq_pcRNA

Folders and files

Latest commit

History

Repository files navigation

Custom analysis software for ChIP-seq of DNA repair proteins after CRISPR/Cas9-mediated DSBs

Software requirements

Data requirements

Generate processed BAM files (if starting from raw FASTQ files)

Start from pre-processed BAM files

ChIP-seq analysis scripts in Python

chipseq_M_pki.py: analyze MRE11 ChIP-seq, with/without DNA-PKcs inhibitor KU60648

chipseq_BH_pki.py: analyze 53BP1 and γh2AX ChIP-seq, with/without DNA-PKcs inhibitor KU60648

chipseq_M_ts.py: analyze MRE11 ChIP-seq, timeseries after Cas9/pcRNA deactivation

chipseq_B_ts.py: analyze 53BP1 ChIP-seq, timeseries after Cas9/pcRNA deactivation

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

`chipseq_M_pki.py`: analyze MRE11 ChIP-seq, with/without DNA-PKcs inhibitor KU60648

`chipseq_BH_pki.py`: analyze 53BP1 and γh2AX ChIP-seq, with/without DNA-PKcs inhibitor KU60648

`chipseq_M_ts.py`: analyze MRE11 ChIP-seq, timeseries after Cas9/pcRNA deactivation

`chipseq_B_ts.py`: analyze 53BP1 ChIP-seq, timeseries after Cas9/pcRNA deactivation

Packages