makeVarseq: Introduce variations in a DNA sequence

View source: R/makeVarseq.R

Description

Introduces variations (substitutions, deletions and insertions) in a DNA sequence. The default percentages for each type of variation (insertions, deletions, mismatches) are based on the 5th and 95th percentiles of the values observed in a MinION run.

Usage

makeVarseq(
  dnaseq,
  lettrs = c("A", "T", "G", "C"),
  subst = c(0.014, 0.052),
  del = c(0.011, 0.023),
  ins = c(0.006, 0.014),
  returnString = TRUE
)

Arguments

dnaseq

The DNA sequence in which variations are introduced. Either a single character string, a DNAString, or a character vector of individual characters.

lettrs

Character vector giving the alphabet of letters. Default is c("A", "T", "G", "C").

subst

Numeric vector of length 2 with values in [0,1]. Fraction of substitutions (lower and upper bounds). Default is 1.4-5.2% substitutions.

del

Numeric vector of length 2 with values in [0,1]. Fraction of deletions (lower and upper bounds). Default is 1.1-2.3% deletions.

ins

Numeric vector of length 2 with values in [0,1]. Fraction of insertions (lower and upper bounds). Default is 0.6-1.4% insertions.

returnString

Logical. Should the function return a single character string? (Default is TRUE)
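
The two-element bounds given for subst, del and ins can be read as a range from which a rate is picked. This page does not state how makeVarseq samples within that range, so the sketch below simply assumes a uniform draw. The helper n_events is hypothetical and not part of NanoBAC; it only illustrates how bounds expressed as fractions translate into a number of affected positions.

```r
# Hypothetical helper, NOT part of NanoBAC.
# Assumption: one rate is drawn uniformly between the two bounds,
# then converted to a number of positions for a sequence of length seq_len.
n_events <- function(seq_len, bounds) {
  stopifnot(length(bounds) == 2,
            all(bounds >= 0), all(bounds <= 1),
            bounds[1] <= bounds[2])
  rate <- runif(1, min = bounds[1], max = bounds[2])
  round(rate * seq_len)
}

set.seed(1)
n_subst <- n_events(1000, c(0.014, 0.052))  # between 14 and 52 substitutions
n_del   <- n_events(1000, c(0.011, 0.023))  # between 11 and 23 deletions
n_ins   <- n_events(1000, c(0.006, 0.014))  # between 6 and 14 insertions
```

With the default bounds, a 1000-base sequence would therefore receive a few dozen edits in total under this reading.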

Value

Either a single character string (if returnString is TRUE) or a vector of individual characters (if returnString is FALSE).
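
The two return shapes carry the same information and can be converted into one another with base R. The sketch below uses only strsplit and paste on a made-up sequence; it does not call makeVarseq.

```r
# Made-up sequence used only for illustration
single <- "ATGCATGC"

# Single string -> vector of individual characters
chars <- strsplit(single, split = "")[[1]]

# Vector of individual characters -> single string
rebuilt <- paste(chars, collapse = "")
```

Here chars has one element per base, and rebuilt is identical to the original string.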

Examples

set.seed(12345) # for reproducibility
## Introduce ~10% variation in a sequence
makeVarseq("ATGCATGCATGCATGCATGCATGC",
           subst = c(0.08, 0.12),
           del = c(0.08, 0.12),
           ins = c(0.08, 0.12))

## The function will not verify if the string is a canonical DNA string
## thus, it can be used to modify any string:
makeVarseq("ABCDEFGHIJKLMNOPQRSTUVWXYZ+-!.?",
           lettrs=letters,
           subst=c(0.05,0.1),
           del=c(0.02,0.1),
           ins=c(0.02, 0.06))

pgpmartin/NanoBAC documentation built on Dec. 11, 2020, 9:51 a.m.