binomRegMethModelSim: Simulate Bisulfite sequencing data from specified smooth...

View source: R/binomRegMethModelSim.R

binomRegMethModelSim    R Documentation

Simulate Bisulfite sequencing data from specified smooth covariate effects

Description

Simulate bisulfite sequencing data from a generalized additive model whose functional parameters vary with genomic position. Both the true and the observed methylated counts are generated, given the error/conversion rate parameters p0 and p1. The true methylated counts can be simulated from either a binomial distribution or a dispersed binomial (beta-binomial) distribution.
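Read together with the arguments below, the simulated model can be sketched as follows. The indices i (sample), j (CpG site) and k (covariate), the position t_j, and the helper variables U and V are notation introduced here for illustration. The random effect e_i is assumed to enter only when random.eff = TRUE, and the logit link is assumed (the default binom.link).

```latex
\begin{aligned}
\operatorname{logit}(\pi_{ij}) &= \theta_0(t_j) + \sum_{k} \beta_k(t_j)\, Z_{ik} + e_i,
  \qquad e_i \sim N(\mu_e, \sigma_{ee}), \\
S_{ij} \mid \pi_{ij} &\sim \operatorname{Binomial}(X_{ij}, \pi_{ij}), \\
Y_{ij} \mid S_{ij} &= U_{ij} + V_{ij}, \qquad
  U_{ij} \sim \operatorname{Binomial}(S_{ij},\, p_1), \quad
  V_{ij} \sim \operatorname{Binomial}(X_{ij} - S_{ij},\, p_0).
\end{aligned}
```

Here S is the true and Y the observed methylated count, so E(Y_ij | pi_ij) = X_ij {pi_ij p1 + (1 - pi_ij) p0}. In the dispersed case the binomial law for S is replaced by a beta-binomial with the same mean.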

Usage

binomRegMethModelSim(
  n,
  posit,
  theta.0,
  beta,
  phi,
  random.eff = FALSE,
  mu.e = 0,
  sigma.ee = 1,
  p0 = 0.003,
  p1 = 0.9,
  X,
  Z,
  binom.link = "logit",
  verbose = TRUE
)

Arguments

n

sample size, i.e. the number of samples (rows of X and Z).

posit

a numeric vector of size p (the number of CpG sites in the considered region) containing the genomic positions.

theta.0

numeric vector of size p which is a functional parameter for the intercept of the GAMM model.

beta

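numeric vector of size p which is a functional parameter for the slope of a covariate such as cell type composition. In the Examples, beta is passed as out$Beta.out[, -1], a matrix with one column per covariate in Z.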

phi

a vector of length p giving the multiplicative dispersion parameter for each locus in the region. The dispersed binomial counts are simulated from a beta-binomial distribution, so each element of phi has to be greater than 1.

random.eff

logical; indicates whether to add the subject-specific random effect term e. The default value is FALSE.

mu.e

number, the mean of the random effect.

sigma.ee

positive number, the variance (not the standard deviation) of the random effect.

p0

the probability of observing a methylated read when the underlying true status is unmethylated. p0 is the rate of false methylation calls, i.e. false positive rate.

p1

the probability of observing a methylated read when the underlying true status is methylated. 1-p1 is the rate of false non-methylation calls, i.e. false negative rate.

X

the matrix of the read coverage for each CpG in each sample; a matrix of n rows and p columns.

Z

numeric matrix with n rows (one per sample) storing the covariate information, one column per covariate. In the Examples, Z has the two columns T_cell and RA.

binom.link

the link function used for simulation. The default value is "logit".

verbose

logical; indicates whether the function should print progress information. The default value is TRUE.
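How the arguments above fit together can be illustrated with a minimal base-R sketch. This is not the package's implementation: simulate_sketch() is a hypothetical helper, the beta-binomial parameterisation (intra-class correlation rho = (phi - 1) / (X - 1)) is one common choice that may differ from what SOMNiBUS uses internally, and the logit link is assumed. posit is not needed here because theta.0 and beta are already evaluated at the p positions.

```r
# Hypothetical base-R sketch; simulate_sketch() is NOT a SOMNiBUS function.
simulate_sketch <- function(theta.0, beta, Z, X, phi = NULL,
                            random.eff = FALSE, mu.e = 0, sigma.ee = 1,
                            p0 = 0.003, p1 = 0.9) {
  n <- nrow(Z)
  p <- length(theta.0)
  beta <- as.matrix(beta)  # p rows, one column per covariate in Z
  stopifnot(nrow(beta) == p, ncol(beta) == ncol(Z), all(dim(X) == c(n, p)))

  # Linear predictor: theta[i, j] = theta.0[j] + sum_k Z[i, k] * beta[j, k]
  theta <- matrix(theta.0, n, p, byrow = TRUE) + Z %*% t(beta)
  if (random.eff) {
    # sigma.ee is a variance, so rnorm() needs its square root
    e <- rnorm(n, mean = mu.e, sd = sqrt(sigma.ee))
    theta <- theta + e  # recycling adds e[i] to every entry of row i
  }
  pi <- plogis(theta)  # inverse logit (assumed link)

  # Per-read methylation probability: fixed at pi, or drawn from a beta
  # distribution with mean pi at loci where phi > 1 (assumed parameterisation)
  prob <- pi
  if (!is.null(phi)) {
    phi_mat <- matrix(phi, n, p, byrow = TRUE)
    od <- phi_mat > 1
    stopifnot(all(phi_mat[od] < X[od]))
    rho <- (phi_mat[od] - 1) / (X[od] - 1)
    prob[od] <- rbeta(sum(od),
                      pi[od] * (1 - rho) / rho,
                      (1 - pi[od]) * (1 - rho) / rho)
  }

  # True methylated counts, then observed counts after read-level errors
  S <- matrix(rbinom(n * p, size = X, prob = prob), n, p)
  Y <- matrix(rbinom(n * p, size = S, prob = p1) +      # true calls retained
              rbinom(n * p, size = X - S, prob = p0),   # false methylation calls
              n, p)
  list(S = S, Y = Y, theta = theta, pi = pi)
}

# Toy inputs (all values made up for illustration)
set.seed(42)
n <- 20
p <- 50
theta.0 <- sin(seq(0, 3, length.out = p))
beta <- cbind(rep(0.5, p), seq(-1, 1, length.out = p))
Z <- cbind(T_cell = rbinom(n, 1, 0.5), RA = rbinom(n, 1, 0.5))
X <- matrix(sample(80, n * p, replace = TRUE) + 10, n, p)
sim <- simulate_sketch(theta.0, beta, Z, X, phi = rep(3, p))
str(sim)
```

Under this parameterisation the variance of S[i, j] is phi[j] times the binomial variance X[i, j] * pi[i, j] * (1 - pi[i, j]), which is what "multiplicative dispersion" refers to.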

Value

The function returns a list with the following elements:

  • S a numeric matrix of n rows and p columns containing the true methylation counts;

  • Y a numeric matrix of n rows and p columns containing the observed methylation counts;

  • theta a numeric matrix of n rows and p columns containing the methylation parameter (after the logit transformation);

  • pi a numeric matrix of n rows and p columns containing the true methylation proportions used to simulate the data.
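The difference between S and Y can be checked numerically. The sketch below builds stand-in S, Y and X matrices in base R (it does not call the package; true_pi and all other values are made up) and compares the true proportion, the naive observed proportion, and a moment-based correction that follows from E(Y/X) = pi * p1 + (1 - pi) * p0.

```r
set.seed(7)
n <- 500
p <- 40
p0 <- 0.003
p1 <- 0.9
true_pi <- 0.3  # made-up constant methylation proportion

# Stand-ins for the coverage matrix and for the S and Y elements of the output
X <- matrix(sample(80, n * p, replace = TRUE) + 10, n, p)
S <- matrix(rbinom(n * p, size = X, prob = true_pi), n, p)
Y <- matrix(rbinom(n * p, size = S, prob = p1) +
            rbinom(n * p, size = X - S, prob = p0), n, p)

# Per-site proportions pooled over samples
true_prop <- colSums(S) / colSums(X)
naive_prop <- colSums(Y) / colSums(X)             # biased by read errors
corrected_prop <- (naive_prop - p0) / (p1 - p0)   # moment-based correction

round(c(true = mean(true_prop),
        naive = mean(naive_prop),
        corrected = mean(corrected_prop)), 3)
```

With p1 well below 1, the naive proportion underestimates the true methylation level, while the corrected proportion is centred on it.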

Author(s)

Kaiqiong Zhao

Examples

#------------------------------------------------------------#
library(SOMNiBUS)

# Fit the model to the example data to obtain smooth covariate effects
data(RAdat)
RAdat.f <- na.omit(RAdat[RAdat$Total_Counts != 0, ])
out <- binomRegMethModel(
  data = RAdat.f, n.k = rep(5, 3), p0 = 0, p1 = 1,
  epsilon = 10^(-6), epsilon.lambda = 10^(-3), maxStep = 200, RanEff = FALSE
)
# Covariate matrix: one row per sample (first record of each ID)
Z <- as.matrix(
  RAdat.f[match(unique(RAdat.f$ID), RAdat.f$ID), c("T_cell", "RA")]
)
# Random read coverage between 11 and 90 for each sample and CpG site
set.seed(123)
X <- matrix(
  sample(80, nrow(Z) * length(out$uni.pos), replace = TRUE),
  nrow = nrow(Z), ncol = length(out$uni.pos)
) + 10
# Simulate true and observed methylated counts from the fitted effects
simdat <- binomRegMethModelSim(
  n = nrow(Z), posit = out$uni.pos,
  theta.0 = out$Beta.out[, 1], beta = out$Beta.out[, -1],
  random.eff = FALSE, mu.e = 0, sigma.ee = 1,
  p0 = 0.003, p1 = 0.9, X = X, Z = Z, binom.link = "logit",
  phi = rep(1, length(out$uni.pos))
)

kaiqiong/SOMNiBUS documentation built on Feb. 24, 2023, 5:38 a.m.