PROseq_etal Output
All output files will be placed in a directory named results/ that will be created the first time you run PROseq_etal.
Preprocessing output:
trimmed/: fastq files processed with fastp.
reports/: Reports about trimming generated by fastp.
fastqc/: fastqc output.
Alignment output:
align/: Bam files generated by chosen aligner.
sorted_bam/: Bam files sorted by coordinate with SAMtools.
alignment_stats: Alignment statistics generated by bamtools stats.
MACS2 output:
macs2_callpeak/: Peak calls from MACS2.
- This is the only MACS2 output if the method parameter in your config is set to PROseq.
macs2_differential/: Output of macs2 bdgcmp comparison of Input to enrichment using subtraction comparison.
macs2_enrichment/: Output of macs2 bdgcmp comparison of Input to enrichment using fold difference comparison.
macs2_sort/: Sorted output from macs2_differential/ and macs2_enrichment/; needed to run bedGraphToBigWig.
macs2_FE_bw/: bigWig versions of macs2_enrichment/ bedGraphs.
macs2_diff_bw/: bigWig versions of macs2_differential/ bedGraphs.
HOMER output:
homer_tagDir/: Tag directory created by HOMER, which HOMER requires as input for all of its tools.
homer_findPeaks/: Peaks called by HOMER.
homer_mergePeaks/: HOMER peaks merged with HOMER.
homer_annotatePeaks/: HOMER-called peaks annotated by HOMER. Both the separate and merged peaks are annotated.
annotate_narrowPeaks: MACS2-called peaks annotated by HOMER.
- If the config file parameter macs2_narrow is set to False, then this directory will be replaced by annotate_broadPeaks.
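
The config-dependent rules for the MACS2 and HOMER directories can be summarized in a small helper. This is only a sketch of the rules described above, not code from the pipeline: the function name expected_peak_dirs is hypothetical, and it assumes that the HOMER directories are produced for every value of method, which this page does not state explicitly.

```python
def expected_peak_dirs(method: str, macs2_narrow: bool = True) -> list[str]:
    """Return MACS2 and HOMER directory names expected under results/.

    Assumption: HOMER directories are created regardless of `method`.
    """
    # macs2_callpeak/ is always produced; it is the only MACS2 output for PROseq.
    dirs = ["macs2_callpeak"]
    if method != "PROseq":
        dirs += [
            "macs2_differential",
            "macs2_enrichment",
            "macs2_sort",
            "macs2_FE_bw",
            "macs2_diff_bw",
        ]
    dirs += [
        "homer_tagDir",
        "homer_findPeaks",
        "homer_mergePeaks",
        "homer_annotatePeaks",
    ]
    # The MACS2 annotation directory name depends on macs2_narrow.
    dirs.append("annotate_narrowPeaks" if macs2_narrow else "annotate_broadPeaks")
    return dirs


if __name__ == "__main__":
    for name in expected_peak_dirs("PROseq", macs2_narrow=False):
        print(f"results/{name}")
```

Comparing this list against the contents of results/ is one quick way to spot a step that did not produce output.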
Coverage files:
genomecov/: BedGraphs generated with bedtools genomecov.
- If method is PROseq, then separate plus and minus strand versions of all coverage files will be generated, with directories appended with _plus and _minus respectively.
chrom_sizes/: Chromosome sizes used for bedGraphToBigWig.
bg2bw/: BigWig versions of the coverage bedGraphs.
normalize/: Normalization scale factors used to scale the coverage graphs. Spike-ins can be used by providing a string unique to all gene names from the spike-in annotation to the spikename config parameter.
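
To illustrate what "a string unique to all gene names from the spike-in annotation" means in practice, the sketch below separates read counts into spike-in and non-spike-in totals by substring match. The function name, the example gene names, and the "dm6_" prefix are all hypothetical, and substring matching is an assumption. How the pipeline turns such totals into a scale factor is not described here, so that step is left out.

```python
def split_spikein_counts(gene_counts: dict[str, int], spikename: str) -> tuple[int, int]:
    """Sum read counts for spike-in genes and for all other genes.

    A gene is treated as spike-in if `spikename` occurs in its name
    (an assumed matching rule, for illustration only).
    """
    spike_total = 0
    main_total = 0
    for gene, count in gene_counts.items():
        if spikename in gene:
            spike_total += count
        else:
            main_total += count
    return spike_total, main_total


if __name__ == "__main__":
    # Made-up counts; "dm6_" stands in for a prefix shared by spike-in genes only.
    counts = {"GeneA": 100, "GeneB": 50, "dm6_GeneX": 30, "dm6_GeneY": 20}
    print(split_spikein_counts(counts, "dm6_"))  # (50, 150)
```

The string has to match every spike-in gene and no gene from the main annotation. Otherwise reads end up in the wrong total.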
HTSeq quantification (PRO-seq only):
quantify/: Calculation of gene body and pause site coverages using HTSeq, as well as pause indices calculated using a custom R script.
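
For readers unfamiliar with pause indices, the sketch below uses one common definition: read density at the pause site divided by read density over the gene body. The pipeline's R script may define its regions or normalization differently, so treat the function and the example numbers as illustrative assumptions only.

```python
import math


def pause_index(pause_reads: int, pause_length: int,
                body_reads: int, body_length: int) -> float:
    """Ratio of pause-site read density to gene-body read density.

    Returns NaN when the gene body has no coverage, since the ratio
    is undefined in that case.
    """
    if pause_length <= 0 or body_length <= 0:
        raise ValueError("region lengths must be positive")
    pause_density = pause_reads / pause_length
    body_density = body_reads / body_length
    if body_density == 0:
        return math.nan
    return pause_density / body_density


if __name__ == "__main__":
    # Made-up example: 200 reads over a 100 bp pause region,
    # 500 reads over a 1000 bp gene body.
    print(pause_index(200, 100, 500, 1000))  # 4.0
```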
There is also a logs/ directory created in the working directory of the pipeline. This includes directories named after each rule in the pipeline, and log files from the execution of each of these rules. This is where you should go to get information about what went wrong if you run into pipeline failures.
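
When a rule fails, the most recently written files in its log directory are usually the place to start. The helper below is a sketch that lists them, assuming the layout described above (logs/ containing one directory per rule). The function name latest_logs and the rule name "align" in the example are hypothetical.

```python
from pathlib import Path


def latest_logs(rule: str, logs_dir: str = "logs", n: int = 3) -> list[Path]:
    """Return up to `n` most recently modified log files for one rule."""
    rule_dir = Path(logs_dir) / rule
    if not rule_dir.is_dir():
        return []
    # Search recursively in case logs are nested below the rule directory.
    files = [p for p in rule_dir.rglob("*") if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:n]


if __name__ == "__main__":
    # "align" is a placeholder; substitute the name of the rule that failed.
    for path in latest_logs("align"):
        print(path)
```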