createFakeDNA: Create fake DNA or amino acid sequences

View source: R/plotDNA.R

createFakeDNA    R Documentation

Create fake DNA or amino acid sequences

Description

Creates a random reference sequence, then adds mutations and indels to hierarchical sets of the sequences, along with random noise.
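The process described above can be sketched in base R. This is an illustrative simplification, not the dnaplotr source: the helper `mutate`, the probabilities, and the fixed two levels of splitting are assumptions chosen for the example, and indels are left out entirely.

```r
# Illustrative sketch only; NOT the dnaplotr implementation.
set.seed(42)
alphabet <- c("A", "C", "T", "G")
nChar <- 30

# Hypothetical helper: redraw each position with probability p
# (a redraw may pick the same base again)
mutate <- function(chars, p) {
  hit <- runif(length(chars)) < p
  chars[hit] <- sample(alphabet, sum(hit), replace = TRUE)
  chars
}

# 1. Random reference sequence
ref <- sample(alphabet, nChar, replace = TRUE)

# 2. Two hierarchical splits: every parent yields two mutated children
groups <- list(ref)
for (level in 1:2) {
  groups <- unlist(lapply(groups, function(parent) {
    list(mutate(parent, 0.1), mutate(parent, 0.1))
  }), recursive = FALSE)
}

# 3. Independent noise applied on top of the group-level mutations
seqs <- vapply(groups, function(g) paste(mutate(g, 0.01), collapse = ""), character(1))
length(seqs)
```

Sequences that share a parent inherit the same group-level substitutions, which is what produces the nested block structure when such data are plotted.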

Usage

createFakeDNA(
  n = 500,
  nChar = 400,
  nSplit = 3,
  pGap = 0.3,
  pNoise = 0.01,
  pMutation = 0.005,
  bases = c("A", "C", "T", "G", "-"),
  excludeBases = c("-", "X")
)

createFakeAA(
  n = 100,
  nChar = 100,
  ...,
  pGap = 0.2,
  pNoise = 0,
  pMutation = 0.01,
  bases = c(names(dnaplotr::aminoCols), "-")
)

Arguments

n

number of fake sequences to generate

nChar

character length of the output fake sequences

nSplit

the number of hierarchical splits to make in the data, e.g. 3 splits produce 2^3 = 8 "species"

pGap

probability of a large insertion or deletion in each grouping

pNoise

probability of a random substitution at each base

pMutation

probability of a substitution at each base in each hierarchical grouping

bases

the bases used in generating the sequences (bases listed in excludeBases are excluded from the initial reference sequence generation)

excludeBases

bases excluded from the initial reference sequence generation (default: '-' and 'X')

...

additional arguments to createFakeDNA
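Two of the arguments above imply simple calculations that can be checked in base R. The snippet is a sketch under the documented defaults; the variable names `nGroups` and `refBases` are made up for illustration, and the filtering shown is only one plausible reading of how `excludeBases` is applied.

```r
# Defaults copied from the Usage section
nSplit <- 3
bases <- c("A", "C", "T", "G", "-")
excludeBases <- c("-", "X")

# Each split doubles the number of groups ("species")
nGroups <- 2^nSplit

# Alphabet left for the initial reference once excluded bases are dropped
# (assumed behaviour, inferred from the argument descriptions)
refBases <- bases[!bases %in% excludeBases]

nGroups
refBases
```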

Value

A character vector of length n+1 containing the fake sequences. The first sequence is the reference. Names of the remaining sequences indicate their hierarchical groupings followed by an arbitrary id.
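A short way to inspect the returned vector might look like the following. It assumes the dnaplotr package is installed and is skipped otherwise; the variable names are illustrative, and the exact format of the sequence names is not shown here because it is not specified beyond the description above.

```r
# Runs only if dnaplotr is available
if (requireNamespace("dnaplotr", quietly = TRUE)) {
  seqs <- dnaplotr::createFakeDNA(n = 10, nChar = 20)

  length(seqs)            # documented as n + 1
  reference <- seqs[1]    # first element is the reference sequence
  others <- seqs[-1]      # remaining elements are the generated sequences

  head(names(others))     # grouping labels followed by an arbitrary id
}
```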

Examples

createFakeDNA(10, 10)
createFakeAA(10, 10)

sherrillmix/dnaplotr documentation built on Oct. 29, 2022, 4:42 p.m.