rariant: Rariant calling functions

Description Usage Arguments Details Value Examples

Description

The 'rariant' function identifies variant shifts between a test and control sample. These highlevel functions offers a convenient interface for large-scale identification as well as for reexamination of existing variant calls.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
  
  ## S4 method for signature 'BamFile,BamFile,GRanges'
rariant(test, control, region,
      beta = 0.95, alpha = 1 - beta, select = TRUE, consensus,
      resultFile, strand = c("both", "plus", "minus"), nCycles = 10,
      minQual = 20, block = 1e4, value = TRUE, criteria = c("both",
      "any", "fet", "ci"))

  ## S4 method for signature 'character,character,GRanges'
rariant(test, control, region,
      beta = 0.95, alpha = 1 - beta, select = TRUE, consensus,
      resultFile, strand = c("both", "plus", "minus"), nCycles = 10,
      minQual = 20, block = 1e4, value = TRUE, criteria = c("both",
      "any", "fet", "ci"))

  ## S4 method for signature 'array,array,GRanges'
rariant(test, control, region, beta =
0.95, alpha = 1 - beta, select = TRUE, consensus, strand = c("both",
"plus", "minus"), criteria = c("both", "any", "fet", "ci"))
  
  rariantStandalone()
  
  readRariant(file, ...)

  writeRariant(x, file, ...)

Arguments

test, control

Test and control BAM files. Other input sources will be supported in the future.

region

Region(s) of interest to analyze in the calling [GRanges with one or more entries]. If missing, the entire genomic space, as defined by the BAM headers of the input files, will be covered.

beta

Confidence level [numeric in the range [0,1], default: 0.95].

alpha

Significance threshold for BH-adjusted p-values of the Fisher's exact test.

select

Should only likely variant positions be selected and returned, or the results for all sites be returned.

consensus

How to determine the consensus sequence. By default, the consensus is given by the most abundant allele in the control sample. Alternatively, an object with a reference sequence ('BSgenome', 'FaFile') can be passed to define the consensus sequence.

resultFile

If not missing, write the results to a tab-delimited file.

strand

Which strand should be extracted? By default, the counts of both strands are summed up.

nCycles

Number of sequencing cycles to remove from the beginning and end of each read when creating the base count table. This avoids low quality read positions [default: 10 is reasonable for current Illumina sequencing].

minQual

Minimum base call quality for reads to be considered for the nucleotide count table [default: 20 is reasonable for current Illumina sequencing]. Reads with a lower quality are dropped.

block

Number of the genomic sites to analyze in one chunk. The default is a good compromise between memory usage and speed, and normally does not require changing.

value

Should the results be returned by the function. For calls within R, this is generally set to TRUE and does not need to be changed.

criteria

The criteria to determine significant sites. Criteria are: Fisher's exact test, confidence intervals, any or both [default] of them.

file

Path to output file from a 'rariant' call.

x

Output of 'rariant' call.

...

Additional arguments passed to 'read.table' or 'write.table'.

Details

The 'rariant' function is the workhorse for the comparative variant calling and assessment. It starts with the aligned reads for the test (e.g. tumor) and the control (e.g. normal) sample in the BAM format; later versions will support additional inputs.

The 'select' parameter determines whether only significant variant sites or all sites are returned. While the first is suitable for detecting variants, the second becomes relevant assessing for example the abundance of variants at particular sites of interest - an example would be to determine the absence of a specific variant.

For analyses over large genomic regions and for use with infrastructure outside of R, initiating the calling from the command line may be a desirable alternative. The 'rariantStandalone' functions returns the full path to a script that can be directly called from the command line. For further details, see the help of the script by calling it with the '-h' option, for example 'rariant -h'.

The 'readRariant' and 'writeRariant' functions allow to import and export the results of a 'rariant' call from and to a file output, and will return the same object.

Value

A 'GRanges' object, with each row corresponding to a genomic site, and columns:

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
  library(GenomicRanges)
  
  control_bam = system.file("extdata", "NRAS.Control.bam", package = "h5vcData", mustWork = TRUE)
  test_bam = system.file("extdata", "NRAS.AML.bam", package = "h5vcData", mustWork = TRUE)
  
  roi = GRanges("1", IRanges(start = 115258439, end = 115259089))
  
  vars = rariant(test_bam, control_bam, roi)
  
  vars_all = rariant(test_bam, control_bam, roi, select =
  FALSE)

  ## Not run: 
    system2(rariantStandalone(), "-h")
  
## End(Not run)

Rariant documentation built on Nov. 8, 2020, 6:56 p.m.