STAR-methods | R Documentation |
These functions run the STAR aligner to build a STAR genome reference, calculate mappability exclusion regions using STAR, and align one or more FASTQ files (single or paired) to the generated genome. They only work on Linux-based systems with STAR installed, and STAR must be accessible via $PATH. See details and examples.
STAR_version()

STAR_buildRef(
  reference_path,
  STAR_ref_path = file.path(reference_path, "STAR"),
  also_generate_mappability = TRUE,
  map_depth_threshold = 4,
  sjdbOverhang = 149,
  n_threads = 4,
  additional_args = NULL,
  ...
)

STAR_Mappability(
  reference_path,
  STAR_ref_path = file.path(reference_path, "STAR"),
  map_depth_threshold = 4,
  n_threads = 4,
  ...
)

STAR_align_experiment(
  Experiment,
  STAR_ref_path,
  BAM_output_path,
  trim_adaptor = "AGATCGGAAG",
  two_pass = FALSE,
  n_threads = 4
)

STAR_align_fastq(
  fastq_1 = c("./sample_1.fastq"),
  fastq_2 = NULL,
  STAR_ref_path,
  BAM_output_path,
  two_pass = FALSE,
  trim_adaptor = "AGATCGGAAG",
  memory_mode = "NoSharedMemory",
  additional_args = NULL,
  n_threads = 4
)
reference_path |
The path to the reference.
GetReferenceResource must first be run using this path
as its |
STAR_ref_path |
(Default: the "STAR" subdirectory under reference_path) |
also_generate_mappability |
Whether |
map_depth_threshold |
(Default 4) The read-depth threshold at or below which
Mappability exclusion regions are defined. See
Mappability-methods.
Ignored if |
sjdbOverhang |
(Default 149) A STAR setting indicating the length of the donor / acceptor sequence on each side of the junctions. Ideally equal to (mate_length - 1). As the most common read length is 150, this function defaults to 149. See the STAR aligner manual for details. |
n_threads |
The number of threads used to run the STAR aligner. |
additional_args |
A character vector of additional arguments to be passed to STAR. See examples below. |
... |
Additional arguments to be parsed into
|
Experiment |
A two- or three-column data frame whose columns give the sample names, the forward FASTQ files and (for paired reads) the reverse FASTQ files. This can be conveniently generated using Find_FASTQ. |
BAM_output_path |
The path under which STAR outputs the aligned BAM
files. In |
trim_adaptor |
The sequence of the Illumina adaptor to trim via STAR's
|
two_pass |
Whether to use two-pass mapping. In
|
fastq_1, fastq_2 |
In STAR_align_fastq: character vectors giving the
path(s) of one or more FASTQ (or FASTA) files to be aligned.
If single reads are to be aligned, omit fastq_2. |
memory_mode |
The parameter to be parsed to |
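The sjdbOverhang argument above is described as ideally equal to (mate_length - 1). A minimal sketch of that arithmetic in base R follows; the helper name sjdb_overhang_for is hypothetical and is not part of NxtIRF or STAR.

```r
# Hypothetical helper (not part of NxtIRF): derive a value for sjdbOverhang
# from the read (mate) length, following the "mate_length - 1" rule above.
sjdb_overhang_for <- function(mate_length) {
  stopifnot(
    is.numeric(mate_length),
    length(mate_length) == 1,
    mate_length > 1
  )
  as.integer(mate_length) - 1L
}

# For 150-nt reads this reproduces the documented default of 149
sjdb_overhang_for(150)
```

The result could then be supplied as the sjdbOverhang argument of STAR_buildRef when the reads are not 150 nt long.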
Pre-requisites
STAR_buildRef requires GetReferenceResource to be run first, to fetch the required genome and gene annotation files.
STAR_Mappability, STAR_align_experiment and STAR_align_fastq require a STAR genome, which can be built using STAR_buildRef.
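Because these functions also require STAR to be accessible via $PATH, it can help to test for the executable before starting a long job. The sketch below uses only base R; the helper name star_on_path is hypothetical, and it only checks that an executable is found, not that its version is compatible (STAR_version() is the package's own check).

```r
# Hypothetical helper (not part of NxtIRF): TRUE if an executable with the
# given name can be located on the PATH, FALSE otherwise.
star_on_path <- function(exe = "STAR") {
  path <- unname(Sys.which(exe))
  !is.na(path) && nzchar(path)
}

if (!star_on_path()) {
  message("STAR was not found on PATH; the STAR_* functions will not work.")
}
```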
Function Description
For STAR_buildRef: this function creates a STAR genome reference in the STAR subdirectory of the path given by reference_path. Optionally, it will run STAR_Mappability if also_generate_mappability is set to TRUE.
For STAR_Mappability: this function first runs Mappability_GenReads, then uses the given STAR genome to align the synthetic reads using STAR. The aligned BAM file is then processed using Mappability_CalculateExclusions to calculate the lowly-mappable genomic regions, producing the MappabilityExclusion.bed.gz output file.
For STAR_align_fastq: aligns a single FASTQ file or a pair of FASTQ files to the given STAR genome using the STAR aligner.
For STAR_align_experiment: aligns a set of single or paired FASTQ files to the given STAR genome using the STAR aligner. A data.frame specifying sample names and the corresponding FASTQ files is required.
None. STAR will output files into the given output directories.
STAR_version: Checks whether STAR is installed, and its version
STAR_buildRef: Creates a STAR genome reference.
STAR_Mappability: Calculates lowly-mappable genomic regions using STAR
STAR_align_experiment: Aligns multiple sets of FASTQ files, belonging to multiple samples
STAR_align_fastq: Aligns a single sample (with single or paired FASTQ or FASTA files)
BuildReference Find_Samples Mappability-methods
The latest STAR documentation
# 0) Check that STAR is installed and compatible with NxtIRF

STAR_version()

## Not run: 

# The below workflow illustrates
# 1) Getting the reference resource
# 2) Building the STAR Reference, including Mappability Exclusion calculation
# 3) Building the NxtIRF Reference, using the Mappability Exclusion file
# 4) Aligning (a) one or (b) multiple raw sequencing samples.


# 1) Reference generation from Ensembl's FTP links

FTP <- "ftp://ftp.ensembl.org/pub/release-94/"

GetReferenceResource(
    reference_path = "Reference_FTP",
    fasta = paste0(FTP, "fasta/homo_sapiens/dna/",
        "Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz"),
    gtf = paste0(FTP, "gtf/homo_sapiens/",
        "Homo_sapiens.GRCh38.94.chr.gtf.gz")
)

# 2) Generates STAR genome within the NxtIRF reference. Also generates
# mappability exclusion gzipped BED file inside the "Mappability/" sub-folder

STAR_buildRef(
    reference_path = "Reference_FTP",
    n_threads = 8,
    also_generate_mappability = TRUE
)

# 2 alt) Generates STAR genome of the example NxtIRF genome.
# This demonstrates using custom STAR parameters, as the example NxtIRF
# genome is ~100k in length, so --genomeSAindexNbases needs to be
# adjusted to be min(14, log2(GenomeLength)/2 - 1)

GetReferenceResource(
    reference_path = "Reference_chrZ",
    fasta = chrZ_genome(),
    gtf = chrZ_gtf()
)

STAR_buildRef(
    reference_path = "Reference_chrZ",
    n_threads = 8,
    additional_args = c("--genomeSAindexNbases", "7"),
    also_generate_mappability = TRUE
)

# 3) Build NxtIRF reference using the newly-generated Mappability exclusions
# NB: also specifies to use the hg38 nonPolyA resource

BuildReference(reference_path = "Reference_FTP", genome_type = "hg38")

# 4a) Align a single sample using the STAR reference

STAR_align_fastq(
    STAR_ref_path = file.path("Reference_FTP", "STAR"),
    BAM_output_path = "./bams/sample1",
    fastq_1 = "sample1_1.fastq",
    fastq_2 = "sample1_2.fastq",
    n_threads = 8
)

# 4b) Align multiple samples, using two-pass alignment

Experiment <- data.frame(
    sample = c("sample_A", "sample_B"),
    forward = file.path("raw_data", c("sample_A", "sample_B"),
        c("sample_A_1.fastq", "sample_B_1.fastq")),
    reverse = file.path("raw_data", c("sample_A", "sample_B"),
        c("sample_A_2.fastq", "sample_B_2.fastq"))
)

STAR_align_experiment(
    Experiment = Experiment,
    STAR_ref_path = file.path("Reference_FTP", "STAR"),
    BAM_output_path = "./bams",
    two_pass = TRUE,
    n_threads = 8
)

## End(Not run)
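Example "2 alt" above sets --genomeSAindexNbases to 7 for a genome of roughly 100k bases, following the rule min(14, log2(GenomeLength)/2 - 1). The sketch below shows one way to compute that value and build the additional_args vector in base R. The helper name suggest_SAindexNbases is hypothetical, and rounding down to a whole number is an assumption made here because the option takes an integer.

```r
# Hypothetical helper (not part of NxtIRF or STAR): suggest a value for
# --genomeSAindexNbases from the genome length in bases, using
# min(14, log2(GenomeLength)/2 - 1). Flooring is an assumption.
suggest_SAindexNbases <- function(genome_length) {
  stopifnot(is.numeric(genome_length), length(genome_length) == 1,
            genome_length > 0)
  as.integer(min(14, floor(log2(genome_length) / 2 - 1)))
}

# A ~100k-base genome, as in the example reference above
n_bases <- suggest_SAindexNbases(1e5)

# Character vector in the form expected by additional_args
additional_args <- c("--genomeSAindexNbases", as.character(n_bases))
additional_args
```

For a 100,000-base genome this gives 7, matching the value hard-coded in the example; for genomes the size of the human genome the result is capped at 14.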