simuDataREM: Data Simulation Under the Random-Effects Mixture Model

In DIRECT: Bayesian Clustering of Multivariate Data Under the Dirichlet-Process Prior

Description

Function simuDataREM simulates data under a random-effects mixture (REM) model in which the mean vector of each mixture component is a realization of an Ornstein-Uhlenbeck (OU) process or of Brownian motion (BM).
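For intuition about the process behind the cluster means, the sketch below simulates one OU path on a regular grid using the exact Gaussian transition of the OU process. It is not the package's implementation: the helper name simulate_ou_path and its arguments are hypothetical, and treating the standard deviation as the diffusion coefficient of the SDE is an assumption.

```r
# Hypothetical helper: simulate one OU path on the grid 0, dt, ..., T.
# Assumed SDE: dX = rate * (mu - X) dt + sigma dW, with sigma the diffusion
# coefficient (an assumption; the package may parameterize it differently).
simulate_ou_path <- function(x0, mu, sigma, rate, dt, T) {
  stopifnot(rate > 0, sigma >= 0, dt > 0)
  grid <- seq(0, T, by = dt)
  x <- numeric(length(grid))
  x[1] <- x0
  decay <- exp(-rate * dt)
  # Standard deviation of the exact one-step transition
  step_sd <- sigma * sqrt((1 - decay^2) / (2 * rate))
  for (i in seq_along(grid)[-1]) {
    x[i] <- mu + (x[i - 1] - mu) * decay + rnorm(1, mean = 0, sd = step_sd)
  }
  list(times = grid, values = x)
}

set.seed(1)
path <- simulate_ou_path(x0 = 0, mu = 2.5, sigma = 0.4, rate = 0.15,
                         dt = 1, T = 150)

# Keep only the values at a few observation times
obs_times <- c(0, 5, 10, 50, 150)
obs_values <- path$values[match(obs_times, path$times)]
```

Simulating on a fine grid and then subsetting to the observation times mirrors the roles that dt, T and times play in the arguments of simuDataREM, although how the package combines them internally is not documented here.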

Usage

    simuDataREM(pars.mtx, dt, T, ntime, nrep, nsize, times,
                method = c("eigen", "svd", "chol"), model = c("OU", "BM"))
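The three method choices correspond to standard matrix decompositions. As a rough illustration in base R (not the package's code), each one yields the same log-determinant for a symmetric positive-definite covariance matrix; the matrix Sigma below is made up for the example.

```r
# Small symmetric positive-definite covariance matrix, for illustration only
Sigma <- matrix(c(2.0, 0.5, 0.2,
                  0.5, 1.0, 0.3,
                  0.2, 0.3, 1.5), nrow = 3, byrow = TRUE)

# Cholesky: Sigma = R'R, so log|Sigma| = 2 * sum(log(diag(R)))
logdet_chol <- 2 * sum(log(diag(chol(Sigma))))

# Eigenvalue decomposition: log|Sigma| = sum of log eigenvalues
logdet_eigen <- sum(log(eigen(Sigma, symmetric = TRUE,
                              only.values = TRUE)$values))

# Singular value decomposition: for a positive-definite matrix the
# singular values equal the eigenvalues
logdet_svd <- sum(log(svd(Sigma, nu = 0, nv = 0)$d))

logdets <- c(chol = logdet_chol, eigen = logdet_eigen, svd = logdet_svd)
```

The Cholesky route is usually the cheapest but fails if the matrix is not numerically positive definite; the eigenvalue and singular value routes are more tolerant of near-singular matrices.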

Arguments

pars.mtx: A K × 8 matrix, where K is the number of clusters. Each row contains 8 parameters, in this order: the standard deviations of within-cluster variability, of variability across time points, and of variability across replicates; the mean and standard deviation of the value at the first time point; and the overall mean, standard deviation and mean-reverting rate of the OU process.

dt: Increment in times.

T: Maximum time.

ntime: Number of time points to simulate data for. Must equal the length of the vector times.

nrep: Number of replicates.

nsize: An integer vector containing the sizes of the simulated clusters.

times: Vector of length ntime indicating the time points at which to simulate data.

method: Method to compute the determinant of the covariance matrix in the calculation of the multivariate normal density. Required. Choices are: "chol" for Choleski decomposition, "eigen" for eigenvalue decomposition, and "svd" for singular value decomposition.

model: Model to generate realizations of the mean vector of a mixture component. Required. Choices are: "OU" for an Ornstein-Uhlenbeck process (a.k.a. the mean-reverting process) and "BM" for a Brownian motion (without drift).
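The argument descriptions imply a few consistency constraints that are easy to check before calling the function. The sketch below builds a two-cluster parameter matrix and verifies them. The column names are hypothetical labels added for readability, and the checks that nsize has one entry per cluster and that times does not exceed the maximum time are assumptions read off the descriptions, not documented requirements.

```r
# Hypothetical labels for the eight parameters, in the documented order
par_names <- c("sd_within", "sd_time", "sd_rep",
               "mean_first", "sd_first",
               "ou_mean", "ou_sd", "ou_rate")

# One row per cluster
pars.mtx <- rbind(
  c(0.05, 0.1, 0.5, 0, 0.16, 0.1, 0.4, 0.05),
  c(0.05, 0.5, 0.2, 0, 0.16, 2.5, 0.4, 0.15)
)
colnames(pars.mtx) <- par_names

times <- c(0, 5, 10, 20, 40, 80, 150)
ntime <- length(times)
nsize <- c(40, 25)
max_time <- 150   # value intended for the argument T

stopifnot(
  ncol(pars.mtx) == 8,
  ntime == length(times),                 # stated in the ntime description
  length(nsize) == nrow(pars.mtx),        # assumed: one size per cluster
  max(times) <= max_time,                 # assumed: times lie within [0, T]
  all(pars.mtx[, c("sd_within", "sd_time", "sd_rep",
                   "sd_first", "ou_sd")] >= 0)  # standard deviations
)
```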

Value

means: A matrix with ntime columns and one row per cluster (the same number of rows as pars.mtx). Each row contains the true mean vector of the corresponding cluster.

data: A matrix with N rows and ntime*nrep+1 columns, where N is the sum of the cluster sizes in nsize. The first column contains the true cluster membership of the corresponding item. The remaining columns in each row are ordered as follows: values for replicates 1 through nrep at time 1; values for replicates 1 through nrep at time 2; and so on.
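The column layout of data can be made concrete with a small mock matrix. The sketch below does not call simuDataREM; it builds a matrix in the documented layout by hand and shows how to locate replicate r at time point t and how to reshape the values into an item × replicate × time array. The names col_index, sim_data and arr are hypothetical.

```r
ntime <- 3
nrep  <- 2
nsize <- c(2, 1)
N     <- sum(nsize)

# Mock matrix in the documented layout (values are just 1, 2, 3, ...)
membership <- rep(seq_along(nsize), times = nsize)
values     <- matrix(seq_len(N * ntime * nrep), nrow = N, byrow = TRUE)
sim_data   <- cbind(membership, values)

# Column holding replicate r at time point t (both 1-based);
# the leading 1 skips the membership column
col_index <- function(t, r, nrep) 1 + (t - 1) * nrep + r

time2_rep1 <- sim_data[, col_index(t = 2, r = 1, nrep = nrep)]

# Replicates vary fastest across columns, so the value columns map
# directly onto an N x nrep x ntime array
arr <- array(sim_data[, -1], dim = c(N, nrep, ntime))

# Per-item average over replicates at each time point (N x ntime)
rep_means <- apply(arr, c(1, 3), mean)
```

Because R stores matrices column by column, no transposition is needed for the reshape: element arr[i, r, t] is the value of item i for replicate r at time point t.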

Author(s)

Audrey Q. Fu

References

Fu, A. Q., Russell, S., Bray, S. and Tavare, S. (2013) Bayesian clustering of replicated time-course gene expression data with weak signals. The Annals of Applied Statistics. 7(3) 1334-1361.

See Also

plotSimulation for plotting simulated data.

outputData for writing simulated data and parameter values used in simulation into external files.

DIRECT for clustering the data.

Examples

    ## Not run: 
    # Simulate replicated time-course gene expression profiles
    # from OU processes

    # Simulation parameters
    times = c(0, 5, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 110, 120, 150)
    ntime = length(times)
    nrep = 4

    nclust = 6
    npars = 8
    pars.mtx = matrix(0, nrow = nclust, ncol = npars)
    # late weak upregulation or downregulation
    pars.mtx[1,] = c(0.05, 0.1, 0.5, 0, 0.16, 0.1, 0.4, 0.05)
    # repression
    pars.mtx[2,] = c(0.05, 0.1, 0.5, 1, 0.16, -1.0, 0.1, 0.05)
    # early strong upregulation
    pars.mtx[3,] = c(0.05, 0.5, 0.2, 0, 0.16, 2.5, 0.4, 0.15)
    # strong repression
    pars.mtx[4,] = c(0.05, 0.5, 0.2, 1, 0.16, -1.5, 0.4, 0.1)
    # low upregulation
    pars.mtx[5,] = c(0.05, 0.3, 0.3, -0.5, 0.16, 0.5, 0.2, 0.08)
    # late strong upregulation
    pars.mtx[6,] = c(0.05, 0.3, 0.3, -0.5, 0.16, 0.1, 1, 0.1)

    nsize = rep(40, nclust)

    # Generate data
    simudata = simuDataREM(pars.mtx = pars.mtx, dt = 1, T = 150,
        ntime = ntime, nrep = nrep, nsize = nsize, times = times,
        method = "svd", model = "OU")

    # Display simulated data
    plotSimulation(simudata, times = times,
        nsize = nsize, nrep = nrep, lty = 1, ylim = c(-4, 4),
        type = "l", col = "black")

    # Write simulation parameters and simulated data
    # to external files
    outputData(datafilename = "simu_test.dat", parfilename = "simu_test.par",
        meanfilename = "simu_test_mean.dat", simudata = simudata,
        pars = pars.mtx, nitem = sum(nsize), ntime = ntime, nrep = nrep)

    ## End(Not run)

DIRECT documentation built on May 1, 2019, 8:08 p.m.