run_hisat2: Wrapper script to run HISAT2.


Description

Script to align reads to a reference genome using hisat2. This requires an existing index which may be created using hisat2 itself. Commonly used genome indices may also be downloaded from the HISAT2 homepage.

Usage

run_hisat2(hisat2 = "hisat2", idx = NULL, mate1 = NULL,
  mate2 = NULL, fastq = TRUE, fasta = FALSE,
  softClipPenalty = NULL, noSoftClip = FALSE, noSplice = FALSE,
  knownSplice = NULL, strand = NULL, tmo = FALSE, maxAlign = NULL,
  secondary = FALSE, minInsert = NULL, maxInsert = NULL,
  nomixed = FALSE, nodiscordant = FALSE, threads = 1, rgid = NULL,
  quiet = FALSE, non_deterministic = FALSE)

Arguments

hisat2

Path to the hisat2 executable (if using WSL, this should be the full path on the Linux subsystem).

idx

The basename of the index for the reference genome. The basename is the name of any of the index files up to but not including the final .1.ht2, etc.
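As an illustration of the basename rule, the sketch below strips the trailing index suffix from an index file name. The helper name `index_basename` and the example path are hypothetical and not part of this package; the pattern assumes index files end in `.<number>.ht2` (or `.ht2l`, which HISAT2 uses for large indices).

```r
# Hypothetical helper (not part of this package): derive the value to pass
# as `idx` from the name of any one index file by removing the trailing
# ".<number>.ht2" or ".<number>.ht2l" suffix.
index_basename <- function(index_file) {
  sub("\\.[0-9]+\\.ht2l?$", "", index_file)
}

# Hypothetical path, used only for illustration.
index_basename("index/UCSC.hg19.1.ht2")
```

The result can then be supplied as the `idx` argument.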

mate1

Comma-separated list of files containing mate 1s (filename usually includes _1)

mate2

Comma-separated list of files containing mate 2s (filename usually includes _2). Sequences specified with this option must correspond file-for-file and read-for-read with those specified in mate1.
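A quick sanity check before calling the wrapper is to confirm that both comma-separated lists name the same number of files. The sketch below uses hypothetical helper names (`split_files`, `mates_match`) and hypothetical file names; it only compares file counts and cannot verify read-for-read correspondence.

```r
# Hypothetical helpers (not part of this package).
# Split a comma-separated file list into a character vector.
split_files <- function(x) {
  trimws(strsplit(x, ",", fixed = TRUE)[[1]])
}

# TRUE when mate1 and mate2 list the same number of files.
mates_match <- function(mate1, mate2) {
  length(split_files(mate1)) == length(split_files(mate2))
}

# Hypothetical file names, used only for illustration.
mates_match("A_1.fastq.gz,B_1.fastq.gz", "A_2.fastq.gz,B_2.fastq.gz")
```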

fastq

Logical indicating if reads are FASTQ files.

fasta

Logical indicating if reads are FASTA files.

softClipPenalty

Sets the maximum (MX) and minimum (MN) penalties for soft-clipping per base, both integers. Must be given in the format "MX,MN".
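The "MX,MN" format can be checked with a short regular expression before the value is passed on. This is a minimal sketch; the helper name `is_valid_soft_clip_penalty` is hypothetical, and whether the wrapper performs a similar check itself is not stated here.

```r
# Hypothetical helper (not part of this package): TRUE when `x` is a single
# string of the form "MX,MN", where both parts are non-negative integers.
is_valid_soft_clip_penalty <- function(x) {
  is.character(x) && length(x) == 1 && grepl("^[0-9]+,[0-9]+$", x)
}

is_valid_soft_clip_penalty("2,1")   # well-formed
is_valid_soft_clip_penalty("2;1")   # wrong separator
```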

noSoftClip

Logical indicating whether to disallow soft-clipping.

noSplice

Logical indicating whether to switch off spliced alignment, e.g., for DNA-seq analysis.

knownSplice

Path to text file containing known splice sites.

strand

Specify strand-specific information. Default is unstranded.

tmo

Logical indicating whether to report only those reads aligning to known transcripts.

maxAlign

Integer indicating the maximum number of distinct primary alignments to search for each read.

secondary

Logical indicating whether to report secondary alignments.

minInsert

The minimum fragment length for valid paired-end alignments. This option is valid only with noSplice = TRUE.

maxInsert

The maximum fragment length for valid paired-end alignments. This option is valid only with noSplice = TRUE.

nomixed

By default, when hisat2 cannot find a concordant or discordant alignment for a pair, it then tries to find alignments for the individual mates. If TRUE, this option disables that behavior.

nodiscordant

By default, hisat2 looks for discordant alignments if it cannot find any concordant alignments. If TRUE, this option disables that behavior.

threads

An integer value indicating the number of workers to be used. If NULL, one less than the maximum number of cores is used. [DEFAULT = 1].
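The "one less than the maximum number of cores" rule can be sketched as follows. The helper name `default_threads` is hypothetical and this is not necessarily how the wrapper computes the value; the guard is there because `parallel::detectCores()` may return `NA` on some platforms.

```r
# Hypothetical helper (not part of this package): one less than the number
# of detected cores, never below 1, falling back to 1 if detection fails.
default_threads <- function() {
  n <- parallel::detectCores()
  if (is.na(n)) 1L else max(1L, n - 1L)
}

default_threads()
```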

rgid

Character string, to which the read group ID is set.

quiet

If TRUE, print nothing except alignments and serious errors.

non_deterministic

When set to TRUE, HISAT2 re-initializes its pseudo-random generator for each read using the current time.

Value

Alignment file in SAM format

Examples

## Not run: 
run_hisat2(hisat2 = "hisat2", idx = "../prana/data-raw/index/UCSC.hg19",
mate1 = "../prana/data-raw/seqFiles/HB1_sample_1.fastq.gz",
mate2 = "../prana/data-raw/seqFiles/HB1_sample_2.fastq.gz",
fastq = TRUE, fasta = FALSE, softClipPenalty = NULL, noSoftClip = FALSE,
noSplice = FALSE, knownSplice = NULL, strand = NULL, tmo = FALSE,
maxAlign = NULL, secondary = FALSE, minInsert = NULL, maxInsert = NULL,
nomixed = FALSE, nodiscordant = FALSE,
threads = (parallel::detectCores() - 1), rgid = NULL, quiet = FALSE,
non_deterministic = TRUE)

## End(Not run)

anilchalisey/chompR documentation built on May 9, 2019, 3:59 a.m.