random_length: Create a named object with random sequences and qualities


View source: R/simulate.R

Description

Create a ShortReadQ object with random sequences and qualities

Usage

random_length(n, widths, random_widths = TRUE, replace = TRUE,
  len_prob = NULL, seq_prob = c(0.25, 0.25, 0.25, 0.25),
  q_prob = NULL, nuc = c("DNA", "RNA"), qual = NULL,
  encod = c("Sanger", "Illumina1.8", "Illumina1.5", "Illumina1.3",
  "Solexa"), base_name = "s", sep = "_")

Arguments

n

number of sequences

widths

width of the sequences

random_widths

Logical. If TRUE (default), 'widths' is treated as an interval and each width is picked at random from any integer within it. If FALSE, widths are picked only from the values in the vector passed.

replace

sample widths with replacement? Default TRUE.

len_prob

vector with probabilities for each width value. Default NULL (equiprobability)

seq_prob

a vector of four probability values setting the frequency of the nucleotides 'A', 'C', 'G', 'T' for DNA, or 'A', 'C', 'G', 'U' for RNA. For example: c(0.25, 0.25, 0.5, 0). Default: c(0.25, 0.25, 0.25, 0.25) (equiprobability for the four bases). If the sum of the probabilities is > 1, the values will be normalized to the range [0, 1].

q_prob

a vector of probabilities spanning range(qual), setting the frequency of each quality value. Default: NULL (equiprobability). If the sum of the probabilities is > 1, the values will be normalized to the range [0, 1].

nuc

create sequences of DNA (nucleotides = c('A', 'C', 'G', 'T')) or RNA (nucleotides = c('A', 'C', 'G', 'U'))? Default: 'DNA'

qual

quality range for the sequences. It must be a range included in the selected encoding:

'Sanger' = [0, 40]

'Illumina1.8' = [0, 41]

'Illumina1.5' = [0, 40]

'Illumina1.3' = [3, 40]

'Solexa' = [-5, 40]

Example: for a range from 20 to 30 in Sanger encoding, pass qual = c(20, 30)

encod

sequence encoding

base_name

Base name for strings

sep

Character separating the base name and the read number. Default: '_'
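
Details

The behaviour described for 'random_widths', 'len_prob', and the probability vectors can be illustrated with a small base-R sketch. The helpers normalize_prob and pick_widths below are hypothetical names introduced only for illustration; they are not part of FastqCleaner and may differ from what random_length does internally. Dividing by the sum is just one plausible reading of "normalized", and in this sketch 'len_prob' is assumed to hold one entry per candidate width.

```r
# Illustrative sketch only: these helpers are hypothetical and are not
# part of FastqCleaner; they mirror the argument descriptions above.

# Rescale a probability vector so that its entries sum to 1
# (an assumed interpretation of the normalization step).
normalize_prob <- function(p) {
  stopifnot(is.numeric(p), all(p >= 0), sum(p) > 0)
  p / sum(p)
}

# Draw n widths. With random_widths = TRUE, 'widths' is treated as an
# interval and every integer inside it is a candidate; otherwise only the
# values actually present in 'widths' are candidates.
pick_widths <- function(n, widths, random_widths = TRUE,
                        replace = TRUE, len_prob = NULL) {
  pool <- if (random_widths) seq(min(widths), max(widths)) else widths
  # Sample indices rather than values: sample(x, ...) would expand a
  # length-one numeric x into 1:x. Assumption: len_prob, if given, has
  # one entry per candidate in 'pool'.
  idx <- sample.int(length(pool), size = n, replace = replace,
                    prob = len_prob)
  pool[idx]
}

set.seed(10)
w_interval <- pick_widths(10, widths = c(15, 20))   # any integer in 15..20
w_vector   <- pick_widths(10, widths = c(15, 20),
                          random_widths = FALSE)    # only 15 or 20
p <- normalize_prob(c(2, 2, 4, 0))                  # rescaled to sum to 1
```

With 'random_widths = FALSE' and 'replace = FALSE', n cannot exceed the number of values in 'widths', since each value can then be drawn only once.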

Value

ShortReadQ object

Author(s)

Leandro Roser learoser@gmail.com

Examples

# For reproducible examples, make a call to set.seed before 
# running each random function

set.seed(10)
s1 <- random_seq(slength = 10, swidth = 20)
s1

set.seed(10)
s2 <- random_seq(slength = 10, swidth = 20,
                 prob = c(0.6, 0.1, 0.3, 0))
s2
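
The examples above call random_seq. A call to random_length itself might look like the sketch below, which relies only on the signature shown under Usage. The argument values are chosen purely for illustration, and the snippet assumes the FastqCleaner package is installed.

```r
library(FastqCleaner)

# Ten DNA reads with widths drawn at random from the interval 15..25
# (random_widths = TRUE by default), a skewed base composition, and
# qualities restricted to 20..30 under Sanger encoding.
set.seed(10)
s3 <- random_length(n = 10, widths = c(15, 25),
                    seq_prob = c(0.6, 0.1, 0.3, 0),
                    qual = c(20, 30), encod = "Sanger")
s3
```

Passing 'random_widths = FALSE' with the same 'widths' would, according to the argument description, restrict the read widths to the two values 15 and 25.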

FastqCleaner documentation built on Nov. 8, 2020, 5:05 p.m.