In ArnaudDroitLab/nucleoSim: Generate synthetic nucleosome maps

syntheticNucReadsFromDist

R Documentation

Generate a synthetic nucleosome map containing forward and reverse reads (paired-end reads)

Description

Generate a synthetic nucleosome map, that is, a map with forward and reverse reads (paired-end reads) covering the nucleosome regions, using the distribution selected by the user. The distribution is used to assign the start positions of the forward reads associated with the nucleosomes. The user can choose among three distributions: Normal, Student and Uniform. The final map is composed of paired-end reads.

The synthetic nucleosome map is created in 3 steps:

1. Adding well-positioned nucleosomes following the specified parameters. The nucleosomes are all equidistant. The starting positions of the forward reads are assigned using the specified distribution and parameters. The distance between the starting positions of paired-end reads is assigned using a normal distribution with the specified variance.

2. Deleting some well-positioned nucleosomes following the specified parameters. Each nucleosome has an equal probability of being selected.

3. Adding fuzzy nucleosomes following a uniform distribution and the specified parameters. The starting positions of the forward reads are assigned using the specified distribution and parameters. The distance between the starting positions of paired-end reads is assigned using a normal distribution with the specified variance.
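The three steps above can be illustrated with a minimal base R sketch. This is not the nucleoSim implementation: the function name `sketch_nucleosome_map`, the `reads.per.nuc` parameter, the output column names and the use of a Normal distribution for the forward read starts are all assumptions made for illustration. For brevity, reads are drawn after the deletion step rather than before it.

```r
# Hypothetical sketch of the three steps; NOT the nucleoSim implementation.
sketch_nucleosome_map <- function(wp.num, wp.del, wp.var, fuz.num, fuz.var,
                                  nuc.len = 147, lin.len = 20, len.var = 10,
                                  reads.per.nuc = 5, offset = 1000,
                                  seed = NULL) {
    stopifnot(wp.del <= wp.num)
    if (!is.null(seed)) set.seed(seed)

    # Step 1: well-positioned nucleosomes, all equidistant
    spacing  <- nuc.len + lin.len
    wp.start <- offset + (seq_len(wp.num) - 1) * spacing

    # Step 2: delete wp.del nucleosomes, each with equal probability
    if (wp.del > 0) {
        wp.start <- wp.start[-sample.int(length(wp.start), wp.del)]
    }

    # Step 3: fuzzy nucleosomes placed uniformly over the same region
    region.end <- offset + wp.num * spacing
    fuz.start  <- round(runif(fuz.num, min = offset, max = region.end))

    # Reads: forward starts vary around the nucleosome start (Normal assumed);
    # the forward/reverse distance is Normal with variance len.var
    make_reads <- function(nuc.start, variance, type) {
        centers <- rep(nuc.start, each = reads.per.nuc)
        n       <- length(centers)
        fwd     <- round(rnorm(n, mean = centers, sd = sqrt(variance)))
        frag    <- round(rnorm(n, mean = nuc.len, sd = sqrt(len.var)))
        data.frame(type = rep(type, n), nucleosome = centers,
                   forward.start = fwd, reverse.end = fwd + frag,
                   stringsAsFactors = FALSE)
    }
    rbind(make_reads(wp.start, wp.var, "wp"),
          make_reads(fuz.start, fuz.var, "fuz"))
}

# 20 well-positioned nucleosomes, 5 deleted, 10 fuzzy ones
map <- sketch_nucleosome_map(wp.num = 20, wp.del = 5, wp.var = 30,
                             fuz.num = 10, fuz.var = 40, seed = 15)
table(map$type)
```

The sketch keeps a fixed number of reads per nucleosome for simplicity; the real function bounds coverage through `max.cover` instead.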

This function was largely inspired by the "Generating synthetic maps" section of the nucleR package (Flores and Orozco, 2011).

Usage

syntheticNucReadsFromDist(
  wp.num,
  wp.del,
  wp.var,
  fuz.num,
  fuz.var,
  max.cover = 100,
  nuc.len = 147,
  len.var = 10,
  lin.len = 20,
  read.len = 40,
  rnd.seed = NULL,
  distr = c("Uniform", "Normal", "Student"),
  offset
)

Arguments

`wp.num`	a non-negative `integer`, the number of well-positioned (non-overlapping) nucleosomes.
`wp.del`	a non-negative `integer`, the number of well-positioned nucleosomes to remove to create uncovered regions.
`wp.var`	a non-negative `integer`, the variance associated with the distribution used to assign the start position to the forward reads of the well-positioned nucleosomes. This parameter introduces some variation in the starting positions.
`fuz.num`	a non-negative `numeric`, the number of fuzzy nucleosomes. These nucleosomes are distributed according to a uniform distribution over the whole region and can overlap other well-positioned or fuzzy nucleosomes.
`fuz.var`	a non-negative `numeric`, the maximum variance of the fuzzy nucleosomes. This variance can be different from the one used for the well-positioned nucleosome reads.
`max.cover`	a positive `numeric`, the maximum coverage for one nucleosome. The final coverage can have a higher value than `max.cover` since reads from different nucleosomes can be overlapping. Default = 100.
`nuc.len`	a positive `integer`, the nucleosome length. Default = 147.
`len.var`	a positive `numeric`, the variance of the distance between a forward read and its paired reverse read. Default = 10.
`lin.len`	a non-negative `integer`, the length of the DNA linker. Default = 20.
`read.len`	a positive `integer`, the length of each of the paired-end reads. Default = 40.
`rnd.seed`	a single value, interpreted as an `integer`, or `NULL`. If an `integer` is given, the value is used to set the seed of the random number generator. By fixing the seed, the generated results can be reproduced. Default = `NULL`.
`distr`	the name of the distribution used to generate the nucleosome map. The choices are: `"Uniform"`, `"Normal"` and `"Student"`. Default = `"Uniform"`.
`offset`	a non-negative `integer`, the number of bases used to offset all nucleosomes and reads. This ensures that all nucleosome positions and read alignments have positive values.
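As a rough illustration of how `distr` and a variance argument could interact, the base R sketch below resolves the distribution name with `match.arg()` (which returns the first choice, `"Uniform"`, when the default vector is left untouched) and draws start positions with the requested variance. The helper `draw_starts` is hypothetical, and the way each distribution is scaled (in particular the 4 degrees of freedom for Student) is an assumption of this sketch, not a description of the package internals.

```r
# Hypothetical helper, not part of nucleoSim: draw n start positions around
# `center` with (approximately) the requested variance.
draw_starts <- function(n, center, variance,
                        distr = c("Uniform", "Normal", "Student")) {
    # With the default vector, match.arg() returns the first choice, "Uniform";
    # an unknown name raises an error.
    distr <- match.arg(distr)
    offsets <- switch(distr,
        # Uniform on [-h, h] has variance h^2 / 3, so h = sqrt(3 * variance)
        Uniform = runif(n, min = -sqrt(3 * variance), max = sqrt(3 * variance)),
        Normal  = rnorm(n, mean = 0, sd = sqrt(variance)),
        # A t distribution with 4 degrees of freedom has variance 2, so it is
        # rescaled to `variance` (df = 4 is an assumption of this sketch)
        Student = rt(n, df = 4) * sqrt(variance / 2)
    )
    round(center + offsets)
}

set.seed(15)
draw_starts(5, center = 1000, variance = 30)                   # Uniform
draw_starts(5, center = 1000, variance = 30, distr = "Normal")
```

A larger variance spreads the forward read starts further from the nucleosome position, which is the effect `wp.var` and `fuz.var` are described as controlling.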

Value

a list of class "syntheticNucReads" containing the following elements:

`call`	the matched call.
`dataIP`	a `data.frame` with the chromosome name, the starting and ending positions and the direction of all forward and reverse reads for all well-positioned and fuzzy nucleosomes. Paired-end reads are identified with a unique id.
`wp`	a `data.frame` with the positions of all the well-positioned nucleosomes, as well as the number of paired-end reads associated with each one.
`fuz`	a `data.frame` with the positions of all the fuzzy nucleosomes, as well as the number of paired-end reads associated with each one.
`paired`	a `data.frame` with the starting and ending positions of the reads used to generate the paired-end reads. Paired-end reads are identified with a unique id.

Author(s)

Pascal Belleau, Rawane Samb, Astrid Deschenes

Examples


## Generate a synthetic map with 20 well-positioned + 10 fuzzy nucleosomes
## using a Normal distribution with a variance of 30 for the well-positioned
## nucleosomes, a variance of 40 for the fuzzy nucleosomes and a seed of 15.
## Because of the fixed seed, each time the code is run, the results
## are going to be the same.
res <- syntheticNucReadsFromDist(wp.num = 20, wp.del = 0, wp.var = 30,
    fuz.num = 10, fuz.var = 40, rnd.seed = 15, distr = "Normal",
    offset = 1000)
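
The returned list can then be inspected through the elements listed in the Value section. The snippet below is a minimal sketch that only runs the package call when nucleoSim is installed; `count_nucleosomes` is a hypothetical helper defined here, not a function of the package, and no particular column names are assumed for the data frames.

```r
# Hypothetical helper: number of rows in the `wp` and `fuz` elements of a map
count_nucleosomes <- function(map) {
    c(wp = nrow(map$wp), fuz = nrow(map$fuz))
}

if (requireNamespace("nucleoSim", quietly = TRUE)) {
    run <- function() {
        nucleoSim::syntheticNucReadsFromDist(wp.num = 20, wp.del = 0,
            wp.var = 30, fuz.num = 10, fuz.var = 40, rnd.seed = 15,
            distr = "Normal", offset = 1000)
    }
    res  <- run()
    res2 <- run()

    print(class(res))               # class of the returned object
    print(names(res))               # elements of the list
    print(count_nucleosomes(res))   # well-positioned and fuzzy nucleosomes
    print(head(res$dataIP))         # first forward and reverse reads

    # With the same seed, both runs should produce the same reads
    print(identical(res$dataIP, res2$dataIP))
}
```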

ArnaudDroitLab/nucleoSim documentation built on March 17, 2022, 11 p.m.