| STAR-methods | R Documentation |
These functions run the STAR aligner to build a STAR genome reference, calculate mappability exclusion regions using STAR, and align one or more FASTQ files (single or paired) to the generated genome. They only work on Linux-based systems on which STAR is installed and accessible via $PATH. See Details and Examples.
STAR_version()

STAR_buildRef(
    reference_path,
    STAR_ref_path = file.path(reference_path, "STAR"),
    also_generate_mappability = TRUE,
    map_depth_threshold = 4,
    sjdbOverhang = 149,
    n_threads = 4,
    additional_args = NULL,
    ...
)

STAR_Mappability(
    reference_path,
    STAR_ref_path = file.path(reference_path, "STAR"),
    map_depth_threshold = 4,
    n_threads = 4,
    ...
)

STAR_align_experiment(
    Experiment,
    STAR_ref_path,
    BAM_output_path,
    trim_adaptor = "AGATCGGAAG",
    two_pass = FALSE,
    n_threads = 4
)

STAR_align_fastq(
    fastq_1 = c("./sample_1.fastq"),
    fastq_2 = NULL,
    STAR_ref_path,
    BAM_output_path,
    two_pass = FALSE,
    trim_adaptor = "AGATCGGAAG",
    memory_mode = "NoSharedMemory",
    additional_args = NULL,
    n_threads = 4
)
reference_path |
The path to the reference.
GetReferenceResource must first be run using this path
as its |
STAR_ref_path |
(Default - the "STAR" subdirectory under
|
also_generate_mappability |
Whether |
map_depth_threshold |
(Default 4) The read-depth threshold: regions covered by this many
mapped reads or fewer are defined as Mappability exclusion regions. See
Mappability-methods.
Ignored if |
sjdbOverhang |
(Default = 149) A STAR setting indicating the length of the donor / acceptor sequence on each side of the junctions. Ideally equal to (mate_length - 1). As the most common read length is 150, the default of this function is 149. See the STAR aligner manual for details. |
n_threads |
The number of threads with which to run the STAR aligner. |
additional_args |
A character vector of additional arguments to be passed to STAR. See examples below. |
... |
Additional arguments to be passed to
|
Experiment |
A two- or three-column data frame with the columns denoting sample names, forward-FASTQ and reverse-FASTQ files. This can be conveniently generated using Find_FASTQ. |
BAM_output_path |
The path under which STAR outputs the aligned BAM
files. In |
trim_adaptor |
The sequence of the Illumina adaptor to trim via STAR's
|
two_pass |
Whether to use two-pass mapping. In
|
fastq_1, fastq_2 |
In STAR_align_fastq: character vectors giving the
path(s) of one or more FASTQ (or FASTA) files to be aligned.
If single reads are to be aligned, omit |
memory_mode |
The parameter to be passed to |
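As a rough illustration of the sjdbOverhang guidance above (mate length minus one), the sketch below derives a candidate value from the first record of a FASTQ file. The helper name guess_sjdbOverhang is hypothetical and not part of NxtIRF. It assumes a standard four-line FASTQ record and that every read has the same length as the first one.

```r
# Hypothetical helper (not part of NxtIRF): suggest sjdbOverhang from a FASTQ file.
# Assumes line 1 is the read header and line 2 is the full read sequence.
guess_sjdbOverhang <- function(fastq_path) {
    rec <- readLines(fastq_path, n = 2)
    if (length(rec) < 2) stop("File does not contain a complete FASTQ record")
    nchar(rec[2]) - 1L
}

# Toy input: a single 150-nt read written to a temporary file
tmp <- tempfile(fileext = ".fastq")
writeLines(c("@read1", strrep("A", 150), "+", strrep("I", 150)), tmp)
guess_sjdbOverhang(tmp)  # 149
```

The result could then be supplied as the sjdbOverhang argument of STAR_buildRef. If reads have been trimmed to varying lengths, the first read may not be representative.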
Pre-requisites
STAR_buildRef requires GetReferenceResource to be run first, to fetch the
required genome and gene annotation files.
STAR_Mappability, STAR_align_experiment and STAR_align_fastq require a
STAR genome, which can be built using STAR_buildRef.
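Because all of these functions depend on a STAR executable being found via $PATH, it can be useful to test for it before starting a long workflow. The following is a minimal sketch using base R only; the helper name star_on_path is hypothetical, and it checks only that an executable called STAR can be located, not that its version is compatible (STAR_version is the package's own check).

```r
# Hypothetical helper (not part of NxtIRF): TRUE if an executable named "STAR"
# can be located via $PATH. Sys.which() returns "" when nothing is found.
star_on_path <- function() {
    isTRUE(unname(nzchar(Sys.which("STAR"))))
}

if (!star_on_path()) {
    message("STAR was not found on $PATH; the STAR_* functions will not run.")
}
```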
Function Description
For STAR_buildRef: this function
creates a STAR genome reference in the STAR subdirectory of the
path given by reference_path. Optionally, it also runs STAR_Mappability
if also_generate_mappability is set to TRUE.
For STAR_Mappability: this function first
runs Mappability_GenReads, then aligns the resulting synthetic reads
to the given STAR genome using STAR. The aligned BAM file is then
processed using Mappability_CalculateExclusions to calculate the
lowly-mappable genomic regions,
producing the MappabilityExclusion.bed.gz output file.
For STAR_align_fastq: aligns a single FASTQ file, or a pair of FASTQ files,
to the given STAR genome using the STAR aligner.
For STAR_align_experiment: aligns a set of single or paired FASTQ files
to the given
STAR genome using the STAR aligner.
A data.frame specifying sample names and corresponding FASTQ files is
required.
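The data frame expected by STAR_align_experiment can be assembled by hand. As a minimal sketch, the hypothetical helper below (not part of NxtIRF) pairs files that follow a `<sample>_1.fastq` / `<sample>_2.fastq` naming scheme; that naming scheme and the column names sample, forward and reverse are assumptions taken from the examples on this page. The package's own Find_FASTQ is likely the more convenient route for real data.

```r
# Hypothetical helper (not part of NxtIRF): build a three-column data frame of
# sample name, forward FASTQ and reverse FASTQ from a vector of file paths.
# Assumes paired files are named <sample>_1.fastq and <sample>_2.fastq.
make_experiment <- function(fastq_files) {
    fwd <- sort(fastq_files[grepl("_1\\.fastq$", fastq_files)])
    rvs <- sub("_1\\.fastq$", "_2.fastq", fwd)
    unpaired <- setdiff(rvs, fastq_files)
    if (length(unpaired) > 0) {
        stop("Missing reverse FASTQ file(s): ", paste(unpaired, collapse = ", "))
    }
    data.frame(
        sample = sub("_1\\.fastq$", "", basename(fwd)),
        forward = fwd,
        reverse = rvs,
        stringsAsFactors = FALSE
    )
}

files <- file.path("raw_data", c(
    "sample_A_1.fastq", "sample_A_2.fastq",
    "sample_B_1.fastq", "sample_B_2.fastq"
))
Experiment <- make_experiment(files)
Experiment$sample  # "sample_A" "sample_B"
```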
None. STAR will output files into the given output directories.
STAR_version: Checks whether STAR is installed and, if so, its version.
STAR_buildRef: Creates a STAR genome reference.
STAR_Mappability: Calculates lowly-mappable genomic regions using STAR.
STAR_align_experiment: Aligns multiple sets of FASTQ files belonging to
multiple samples.
STAR_align_fastq: Aligns a single sample (with single or paired FASTQ
or FASTA files).
BuildReference, Find_Samples, Mappability-methods
The latest STAR documentation
# 0) Check that STAR is installed and compatible with NxtIRF
STAR_version()
## Not run:
# The below workflow illustrates
# 1) Getting the reference resource
# 2) Building the STAR Reference, including Mappability Exclusion calculation
# 3) Building the NxtIRF Reference, using the Mappability Exclusion file
# 4) Aligning (a) one or (b) multiple raw sequencing samples.
# 1) Reference generation from Ensembl's FTP links
FTP <- "ftp://ftp.ensembl.org/pub/release-94/"
GetReferenceResource(
    reference_path = "Reference_FTP",
    fasta = paste0(FTP, "fasta/homo_sapiens/dna/",
        "Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz"),
    gtf = paste0(FTP, "gtf/homo_sapiens/",
        "Homo_sapiens.GRCh38.94.chr.gtf.gz")
)
# 2) Generates STAR genome within the NxtIRF reference. Also generates
# mappability exclusion gzipped BED file inside the "Mappability/" sub-folder
STAR_buildRef(
    reference_path = "Reference_FTP",
    n_threads = 8,
    also_generate_mappability = TRUE
)
# 2 alt) Generates STAR genome of the example NxtIRF genome.
# This demonstrates using custom STAR parameters, as the example NxtIRF
# genome is ~100k in length, so --genomeSAindexNbases needs to be
# adjusted to be min(14, log2(GenomeLength)/2 - 1)
GetReferenceResource(
    reference_path = "Reference_chrZ",
    fasta = chrZ_genome(),
    gtf = chrZ_gtf()
)
STAR_buildRef(
    reference_path = "Reference_chrZ",
    n_threads = 8,
    additional_args = c("--genomeSAindexNbases", "7"),
    also_generate_mappability = TRUE
)
# 3) Build NxtIRF reference using the newly-generated Mappability exclusions
# NB: this also specifies use of the hg38 nonPolyA resource
BuildReference(reference_path = "Reference_FTP", genome_type = "hg38")
# 4a) Align a single sample using the STAR reference
STAR_align_fastq(
    STAR_ref_path = file.path("Reference_FTP", "STAR"),
    BAM_output_path = "./bams/sample1",
    fastq_1 = "sample1_1.fastq", fastq_2 = "sample1_2.fastq",
    n_threads = 8
)
# 4b) Align multiple samples, using two-pass alignment
Experiment <- data.frame(
    sample = c("sample_A", "sample_B"),
    forward = file.path("raw_data", c("sample_A", "sample_B"),
        c("sample_A_1.fastq", "sample_B_1.fastq")),
    reverse = file.path("raw_data", c("sample_A", "sample_B"),
        c("sample_A_2.fastq", "sample_B_2.fastq"))
)
STAR_align_experiment(
    Experiment = Experiment,
    STAR_ref_path = file.path("Reference_FTP", "STAR"),
    BAM_output_path = "./bams",
    two_pass = TRUE,
    n_threads = 8
)
## End(Not run)
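The value "7" passed to --genomeSAindexNbases in step 2 alt follows from the rule quoted in the comments, min(14, log2(GenomeLength)/2 - 1). The sketch below shows that arithmetic. The helper name is hypothetical and not part of NxtIRF, and rounding down to a whole number is an assumption made here because the option takes an integer.

```r
# Hypothetical helper (not part of NxtIRF): suggested --genomeSAindexNbases for
# a genome of the given length in bases, following min(14, log2(L) / 2 - 1).
# Rounding down with floor() is an assumption of this sketch.
suggest_genomeSAindexNbases <- function(genome_length) {
    min(14, floor(log2(genome_length) / 2 - 1))
}

n <- suggest_genomeSAindexNbases(1e5)  # ~100k example genome: 7
star_args <- c("--genomeSAindexNbases", as.character(n))
star_args  # "--genomeSAindexNbases" "7"
```

A character vector built this way has the same shape as the additional_args value used in step 2 alt. For a human-sized genome the formula is capped at 14.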