THE_Aligner's Output

All output files are placed in a directory named results/, which is created the first time you run THE_Aligner.

Preprocessing output:

trimmed/: FASTQ files trimmed with fastp.
reports/: Trimming reports generated by fastp.
fastqc/: FastQC output.

Alignment output:

align/: BAM files generated by the chosen aligner.
sorted_bam/: BAM files sorted by coordinate with SAMtools.
alignment_stats: Alignment statistics generated by bamtools stats.

Coverage files:

genomecov/: BedGraphs generated with bedtools genomecov.
- If method is PROseq, separate plus-strand and minus-strand versions of all coverage files are generated, in directories whose names end in _plus and _minus respectively.
chrom_sizes/: Chromosome sizes used for bedGraphToBigWig.
bg2bw/: BigWig versions of the coverage bedGraphs.
normalize/: Normalization scale factors used to scale the coverage graphs. Spike-ins can be used for normalization by setting the spikename config parameter to a string that is unique to all gene names from the spike-in annotation.
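
As an illustration of why the spikename string must be unique to the spike-in gene names, the sketch below partitions a table of counts by a simple substring match. The function name, the gene names, and the counts are all hypothetical, and substring matching is an assumption; how THE_Aligner actually matches names and derives its scale factors is not described here.

```python
def split_by_spikename(counts, spikename):
    """Partition {gene_name: count} into (spike_in, endogenous) dicts.

    Assumption: a gene is treated as spike-in when its name contains
    the spikename string. The real pipeline may match differently.
    """
    spike = {g: c for g, c in counts.items() if spikename in g}
    endo = {g: c for g, c in counts.items() if spikename not in g}
    return spike, endo


# Hypothetical counts: two spike-in genes and one endogenous gene.
counts = {"FBgn0000008": 120, "FBgn0000014": 30, "ENSG00000141510": 900}
spike, endo = split_by_spikename(counts, "FBgn")

print(sorted(spike))        # names classified as spike-in
print(sum(spike.values()))  # total spike-in signal
```

If the chosen string also occurred in an endogenous gene name, that gene would be counted as spike-in, which is why the string should appear only in the spike-in annotation.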

Quantification files:

kallisto_quant/ (kallisto only): Includes quantification files and a run info JSON file. See the kallisto manual for details on interpretation.
quant/ (Salmon only): Includes a directory for each sample containing a quant.sf file and a lib_format_counts.json file. See the Salmon documentation for details.
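
A minimal sketch of reading a Salmon quant.sf file with the Python standard library. It relies on the standard quant.sf layout (tab-separated, with the columns Name, Length, EffectiveLength, TPM and NumReads); the function name and the example rows are hypothetical, and the example uses an in-memory stand-in rather than a real pipeline output.

```python
import csv
import io


def read_quant_sf(handle):
    """Return {transcript_name: (TPM, NumReads)} from an open quant.sf file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["Name"]: (float(row["TPM"]), float(row["NumReads"]))
        for row in reader
    }


# Tiny in-memory stand-in for a real quant.sf (values are made up).
example = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1000\t850.0\t12.5\t40.0\n"
    "tx2\t500\t350.0\t0.0\t0.0\n"
)
quant = read_quant_sf(example)
print(quant["tx1"])
```

For a real run, pass an open file handle instead, for example the quant.sf inside one of the per-sample directories; the exact path depends on your sample names.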

A logs/ directory is also created in the working directory of the pipeline. It contains one directory per pipeline rule, each holding the log files from that rule's execution. If the pipeline fails, check these logs first to find out what went wrong.
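
When a run fails, the most recently written log in each rule directory is usually the place to start. The helper below is a sketch of one way to find those files; the function name is hypothetical, and it makes no assumption about log file names or extensions beyond the logs/<rule>/ layout described above.

```python
from pathlib import Path


def newest_logs(log_root="logs", per_rule=1):
    """Map each rule directory under log_root to its newest files.

    Returns {rule_name: [Path, ...]}, newest first, with at most
    per_rule entries per rule. Returns {} if log_root does not exist.
    """
    root = Path(log_root)
    result = {}
    if not root.is_dir():
        return result
    for rule_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        files = [f for f in rule_dir.rglob("*") if f.is_file()]
        # Sort by modification time, most recent first.
        files.sort(key=lambda f: f.stat().st_mtime, reverse=True)
        result[rule_dir.name] = files[:per_rule]
    return result


# Run from the pipeline's working directory; prints nothing if logs/ is absent.
for rule, files in newest_logs().items():
    for f in files:
        print(f"{rule}: {f}")
```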