STAR.index: Create STAR genome index

Description Usage Arguments Details Value See Also Examples

View source: R/STAR.R

Description

Used as reference when aligning data
Get genome and gtf by running getGenomeAndFasta()

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
STAR.index(
  arguments,
  output.dir = paste0(dirname(arguments[1]), "/STAR_index/"),
  star.path = STAR.install(),
  max.cpus = min(90, detectCores() - 1),
  max.ram = 30,
  SAsparse = 1,
  wait = TRUE,
  remake = FALSE,
  script = system.file("STAR_Aligner", "STAR_MAKE_INDEX.sh", package = "ORFik")
)

Arguments

arguments

a named character vector containing paths wanted to use for index creation. They must be named correctly: names must be a subset of: c("gtf", "genome", "phix", "rRNA", "tRNA","ncRNA")

output.dir

directory to save indices, default: paste0(dirname(arguments[1]), "/STAR_index/"), where arguments is the arguments input for this function.

star.path

path to STAR, default: STAR.install(), if you don't have STAR installed at default location, it will install it there, set path to a runnable star if you already have it.

max.cpus

integer, default: min(90, detectCores() - 1), number of threads to use. Default is minimum of 90 and maximum cores - 1. So if you have 8 cores it will use 7.

max.ram

integer, default 30, in Giga Bytes (GB). Maximum amount of RAM allowed for STAR limitGenomeGenerateRAM argument. RULE: idealy 10x genome size, but do not set too close to machine limit. Default fits well for human genome size (3 GB * 10 = 30 GB)

SAsparse

int > 0, default 1. If you do not have at least 64GB RAM, you might need to set this to 2. suffux array sparsity, i.e. distance between indices: use bigger numbers to decrease needed RAM at the cost of mapping speed reduction. Only applies to genome, not conaminants.

wait

a logical (not NA) indicating whether the R interpreter should wait for the command to finish, or run it asynchronously. This will be ignored (and the interpreter will always wait) if intern = TRUE. When running the command asynchronously, no output will be displayed on the Rgui console in Windows (it will be dropped, instead).

remake

logical, default: FALSE, if TRUE remake everything specified

script

location of STAR index script, default internal ORFik file. You can change it and give your own if you need special alignments.

Details

Can only run on unix systems (Linux and Mac), and requires minimum 30GB memory on genomes like human, rat, zebrafish etc.
If for some reason the internal STAR index bash script will not work for you, like if you have a very small genome. You can copy the internal index script, edit it and give that as the Index script used for this function.

Value

output.dir, can be used as as input for STAR.align..

See Also

Other STAR: STAR.align.folder(), STAR.align.single(), STAR.allsteps.multiQC(), STAR.install(), STAR.multiQC(), STAR.remove.crashed.genome(), getGenomeAndAnnotation(), install.fastp()

Examples

1
2
3
4
5
6
7
8
9
## Manual way, specify all paths yourself.
#arguments <- c(path.GTF, path.genome, path.phix, path.rrna, path.trna, path.ncrna)
#names(arguments) <- c("gtf", "genome", "phix", "rRNA", "tRNA","ncRNA")
#STAR.index(arguments, "output.dir")

## Or use ORFik way:
output.dir <- "/Bio_data/references/Human"
# arguments <- getGenomeAndAnnotation("Homo sapiens", output.dir)
# STAR.index(arguments, output.dir)

ORFik documentation built on March 27, 2021, 6 p.m.