STAR.index | R Documentation |
Used as reference when aligning data.
Get the genome and GTF by running getGenomeAndAnnotation().
STAR.index(
arguments,
output.dir = paste0(dirname(arguments[1]), "/STAR_index/"),
star.path = STAR.install(),
max.cpus = min(90, BiocParallel::bpparam()$workers),
max.ram = 30,
SAsparse = 1,
tmpDirStar = "-",
wait = TRUE,
remake = FALSE,
script = system.file("STAR_Aligner", "STAR_MAKE_INDEX.sh", package = "ORFik"),
notify_load_existing = TRUE
)
arguments |
a named character vector containing the paths to use for index creation. The names must be a subset of: c("gtf", "genome", "contaminants", "phix", "rRNA", "tRNA", "ncRNA") |
output.dir |
directory to save indices, default: paste0(dirname(arguments[1]), "/STAR_index/"), where arguments is the arguments input for this function. |
star.path |
path to STAR, default: STAR.install(). If STAR is not installed at the default location, it will be installed there. Set this to the path of a runnable STAR if you already have one. |
max.cpus |
integer, default: |
max.ram |
integer, default 30, in gigabytes (GB). Maximum amount of RAM allowed for the STAR limitGenomeGenerateRAM argument. Rule: ideally 10x the genome size, but do not set it too close to the machine limit. The default fits well for the human genome size (3 GB * 10 = 30 GB). |
SAsparse |
int > 0, default 1. Suffix array sparsity, i.e. distance between indices: use bigger numbers to decrease needed RAM at the cost of mapping speed reduction. If you do not have at least 64 GB RAM, you might need to set this to 2. Only applies to the genome, not contaminants. |
tmpDirStar |
character, default "-". STAR automatic temp folder creation,
deleted when done. The directory must not already exist; as a safety measure, STAR must create it itself.
If you are on an NFS file share drive and have a non-NFS tmp dir,
set this to |
wait |
a logical (not |
remake |
logical, default: FALSE, if TRUE remake everything specified |
script |
location of STAR index script, default internal ORFik file. You can change it and give your own if you need special alignments. |
notify_load_existing |
logical, default TRUE. If the annotation already exists (defined as: a file called outputs.rds exists locally in output.dir), print a small message notifying the user that it is not redownloading. Set to FALSE if this is not wanted. |
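As a minimal sketch of the `arguments` and `output.dir` conventions described above, the following base R snippet builds a named path vector, checks its names against the allowed set, and derives the default output directory. The file paths and the helper `check_index_names()` are hypothetical and only for illustration; they are not part of ORFik.

```r
# Allowed names for `arguments`, as listed in the argument description
allowed <- c("gtf", "genome", "contaminants", "phix", "rRNA", "tRNA", "ncRNA")

# Hypothetical helper (not an ORFik function): stop on missing or invalid names
check_index_names <- function(arguments) {
  nms <- names(arguments)
  if (is.null(nms) || any(!nzchar(nms))) stop("all elements must be named")
  bad <- setdiff(nms, allowed)
  if (length(bad) > 0) stop("invalid names: ", paste(bad, collapse = ", "))
  invisible(TRUE)
}

# Hypothetical paths, for illustration only
arguments <- c(gtf = "/Bio_data/references/Human/annotation.gtf",
               genome = "/Bio_data/references/Human/genome.fasta")
check_index_names(arguments)

# Default output.dir, computed the same way as in the usage section
output.dir <- paste0(dirname(arguments[1]), "/STAR_index/")
print(output.dir)
```

Because the default is derived from the first element of `arguments`, the index ends up next to whichever file is listed first.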
Can only run on Unix systems (Linux and Mac), and requires
a minimum of 30 GB memory for genomes like human, rat, zebrafish etc.
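The rule of thumb for `max.ram` (about 10x the genome size, but not too close to the machine limit) can be sketched in base R as below. The helper `suggest_max_ram()` and its `headroom_gb` safety margin are assumptions made for this illustration, not part of ORFik or STAR.

```r
# Hypothetical helper (not part of ORFik): suggest max.ram in GB from the
# genome size in bytes, using the "about 10x genome size" rule.
suggest_max_ram <- function(genome_bytes, machine_ram_gb, headroom_gb = 4) {
  wanted <- ceiling(10 * genome_bytes / 1e9)
  # Stay below the machine limit by a safety margin (assumed value)
  min(wanted, machine_ram_gb - headroom_gb)
}

# A 3 GB genome on a machine with 64 GB RAM
ram_large <- suggest_max_ram(3e9, machine_ram_gb = 64)
# The same genome on a machine with 32 GB RAM: capped below the machine limit
ram_small <- suggest_max_ram(3e9, machine_ram_gb = 32)
print(c(ram_large, ram_small))
```

In practice, `genome_bytes` could be taken from `file.size()` on the genome FASTA file. If the suggested value falls below the 10x target, increasing `SAsparse` is the documented way to reduce the RAM needed.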
If for some reason the internal STAR index bash script does not work for you,
for example if you have a very small genome, you can copy the internal index script,
edit it and give that as the index script used by this function (the script argument).
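A minimal sketch of copying the internal script for editing is shown below. It assumes ORFik is installed (the block is skipped otherwise); the destination file name is hypothetical, and the final call is left commented out because it needs a working STAR installation.

```r
# Sketch: copy the internal index script so it can be edited.
# The destination path is hypothetical, for illustration only.
my_script <- file.path(tempdir(), "my_STAR_MAKE_INDEX.sh")
if (requireNamespace("ORFik", quietly = TRUE)) {
  internal <- system.file("STAR_Aligner", "STAR_MAKE_INDEX.sh", package = "ORFik")
  file.copy(internal, my_script, overwrite = TRUE)
  # Edit my_script, then pass it through the `script` argument:
  # STAR.index(arguments, output.dir, script = my_script)
}
```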
It is recommended to run through the RStudio local job tab, to give full info
about the run. The system console will not stall, as can happen in the
normal RStudio console.
output.dir, which can be used as input for the STAR.align functions.
Other STAR: STAR.align.folder(), STAR.align.single(), STAR.allsteps.multiQC(), STAR.install(), STAR.multiQC(), STAR.remove.crashed.genome(), getGenomeAndAnnotation(), install.fastp()
## Manual way, specify all paths yourself.
#arguments <- c(path.GTF, path.genome, path.phix, path.rrna, path.trna, path.ncrna)
#names(arguments) <- c("gtf", "genome", "phix", "rRNA", "tRNA","ncRNA")
#STAR.index(arguments, "output.dir")
## Or use ORFik way:
output.dir <- "/Bio_data/references/Human"
# arguments <- getGenomeAndAnnotation("Homo sapiens", output.dir)
# STAR.index(arguments, output.dir)