data.generator.y.F: Generate Data using Skewed Pointwise Distributions and...
In cSFM: Covariate-adjusted Skewed Functional Model (cSFM)


View source: R/Package_Sim.GenerateData.r

Generate functional data with skewed pointwise distributions using a Gaussian copula. The Gaussian copula is built from an eigendecomposition whose eigenfunctions are specified by a basis system. This function can be used in simulation studies to check the performance of various approaches.

data.generator.y.F(n.subject, n.timepoints, s, D, csi, lambdas, basis.system = legendre.polynomials, var.noise = 0.10)

`n.subject`	number of subjects in the data denoted by n
`n.timepoints`	number of timepoints, which are equally spaced in [0,1] and denoted by m
`s`	vector with length m or matrix with dimension m by n for the mean parameter; See "Details"
`D`	vector with length m or matrix with dimension m by n for the log variance parameter; See "Details"
`csi`	vector with length m or matrix with dimension m by n for the shape parameter; See "Details"
`lambdas`	vector of eigenvalues for the latent Gaussian process
`basis.system`	basis system for the latent Gaussian process; `legendre.polynomials` uses the Legendre polynomials and `DFT.basis` uses the Fourier basis.
`var.noise`	variance of the white noise that corrupts the latent Gaussian process

The generated data are determined by both the marginal distributions and the dependence structure. The marginal distributions require three parameters (mean, log variance, shape), which correspond to (s, D, csi). Each of them can be a vector or a matrix: a vector means the parameter is covariate independent (a function of time only), while a matrix means it is covariate dependent (a function of both time and the covariate). When a parameter is a vector, it must have length n.timepoints.
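The vector and matrix forms can be compared with a small base-R sketch. It mirrors the layout used in the Examples (rows indexed by the covariate, columns by time) and uses made-up grid sizes; it does not call the package itself.

```r
# Illustrative grids (sizes are arbitrary, chosen only for this sketch)
set.seed(1)
point.cov <- sort(runif(5))                 # covariate values
point.tim <- seq(0, 1, length.out = 4)      # timepoints

# A shape function that depends on time (y) only
shape.fun <- function(x, y) 10 * sin(2 * pi * y)

# Matrix form: one row per covariate value, one column per timepoint
shape.mat <- outer(point.cov, point.tim, shape.fun)

# Vector form: because the function ignores x, every row is the same,
# so a single row of length n.timepoints carries all the information
shape.vec <- shape.mat[1, ]

rows.identical <- all(apply(shape.mat, 1, function(r) identical(r, shape.vec)))
```

A parameter that truly varies with the covariate, such as the mean surface in the Examples, has rows that differ and therefore needs the matrix form.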

The latent Gaussian process, with mean 0 and unit variance, is determined by a correlation matrix. The correlation is given by lambdas and basis.system, which specify the eigenvalues and eigenfunctions, respectively. We allow for measurement error: the Gaussian process is corrupted by independent Gaussian noise with mean 0 and variance var.noise.
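As a rough illustration of this construction, the sketch below builds a correlation matrix from eigenvalues and an orthonormal basis, then draws latent Gaussian curves and maps them to uniforms, which is the Gaussian copula step. It is not the package's implementation: the helper `legendre_sketch`, the choice of polynomial degrees 1 to 3, and the point at which the noise variance enters are all assumptions made for this example.

```r
# Hypothetical helper: shifted Legendre polynomials of degree 1 to 3,
# scaled to be orthonormal in L2[0, 1]
legendre_sketch <- function(t) {
  x <- 2 * t - 1
  cbind(sqrt(3) * x,
        sqrt(5) * (3 * x^2 - 1) / 2,
        sqrt(7) * (5 * x^3 - 3 * x) / 2)
}

set.seed(1)
m <- 20                                    # number of timepoints
n <- 50                                    # number of subjects
t.grid    <- seq(0, 1, length.out = m)
lambdas   <- c(1/2, 1/4, 1/8)              # eigenvalues
var.noise <- 0.10                          # white-noise variance

# Covariance = sum_k lambda_k phi_k(t) phi_k(t') + var.noise on the diagonal
# (assumption: noise is added before rescaling to a correlation matrix)
Phi         <- legendre_sketch(t.grid)     # m x 3 matrix of eigenfunctions
cov.latent  <- Phi %*% diag(lambdas) %*% t(Phi) + var.noise * diag(m)
corr.latent <- cov2cor(cov.latent)         # unit variance at every timepoint

# Draw n latent curves with this correlation via the Cholesky factor
R <- chol(corr.latent)                     # t(R) %*% R equals corr.latent
Z <- matrix(rnorm(n * m), n, m) %*% R      # each row ~ N(0, corr.latent)

# Gaussian copula step: standard normal margins -> uniform margins
U <- pnorm(Z)
```

Applying the quantile function of the desired skewed marginal to each column of `U` would then give data with those marginals and this dependence structure.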

A list with two components:

`data`	generated data as a matrix with dimension m by n
`corr.true`	correlation matrix of the latent Gaussian process (Gaussian copula is used)

When the shape parameter is equal to 0, the generated data are regular functional data with covariate-adjusted mean and variance and Gaussian errors.
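The reduction to the Gaussian case can be checked numerically. The sketch below assumes the skewed marginal is a skew-normal distribution with shape parameter alpha, whose standard density is 2 * dnorm(x) * pnorm(alpha * x); that choice, and the helper name `dsn_sketch`, are assumptions for illustration, not something this page states.

```r
# Hypothetical helper: standard skew-normal density with shape alpha
dsn_sketch <- function(x, alpha) 2 * dnorm(x) * pnorm(alpha * x)

x <- seq(-4, 4, length.out = 201)

# With alpha = 0, pnorm(0) = 1/2, so the density collapses to dnorm(x)
max.diff.zero <- max(abs(dsn_sketch(x, 0) - dnorm(x)))

# With a nonzero shape the density is still proper but no longer symmetric
total.mass <- integrate(dsn_sketch, -Inf, Inf, alpha = 5)$value
asymmetry  <- max(abs(dsn_sketch(x, 5) - dsn_sketch(-x, 5)))
```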

[1]. Meng Li, Ana-Maria Staicu and Howard D. Bondell (2013), Incorporating Covariates in Skewed Functional Data Models. http://www.stat.ncsu.edu/information/library/papers/mimeo2654_Li.pdf.

###### generate the data set ##################################### 
library(cSFM)
set.seed(2013)
n.sub <- 100    # number of subjects    
n.tim <- 80     # number of timepoints
# true population-level functions (mean, log variance and shape)
# x is the covariate, y is the timepoint
true.fun <- function(par){
  if (par == "mean") f <- function(x,y) sin(pi*x) * cos(pi*y)
  if (par == "logvar"){
    f <- function(x,y) 2*dnorm(x, mean=0.5, sd=0.2, log=TRUE) +
      2*dnorm(y, mean=0.5, sd=0.2, log=TRUE) - 8
  }
  if (par == "shape") f <- function(x,y) 10*sin(2*pi*y)
  return(f)
}
# covariate: sorted at the very beginning (persp() below needs increasing x values)
point.cov <- sort(runif(n.sub))
# timepoints
point.tim <- seq(from=0, to=1, length.out=n.tim)
# calculate and collect all the true population-level parameters
col.true <- list(mean = outer(point.cov, point.tim, true.fun("mean")), 
                 logvar = outer(point.cov, point.tim, true.fun("logvar")),
                 shape = outer(point.cov, point.tim, true.fun("shape")))

# generate data
my.data <- data.generator.y.F(n.subject = n.sub, n.timepoints = n.tim, 
                              s = col.true$mean, D = col.true$logvar,   
                              csi=col.true$shape,  
                              lambdas = c(1/2, 1/4, 1/8),
                              basis.system = legendre.polynomials, 
                              var.noise = 0.10)
# here the shape is covariate independent, so it is sufficient to pass
# only the 1st row of shape, i.e. csi = col.true$shape[1,]


######################Visualize the data #####################################
par(mfrow = c(2,2))
# plot the data surface 
persp(point.cov, point.tim, my.data$data, theta=60, phi=15,
      ticktype = "detailed", col="lightblue", 
      xlab = "covariate", ylab = "time",
      zlab="data", main="data surface")
# plot the mean surface
persp(point.cov, point.tim, col.true$mean, theta=45, phi=15,
      ticktype = "detailed", col="lightblue",
      xlab = "covariate", ylab = "time",
      zlab="mean", main="mean surface")
# the logvar surface
persp(point.cov, point.tim, col.true$logvar, theta=45, phi=15,
      ticktype = "detailed", col="lightblue",
      xlab = "covariate", ylab = "time",
      zlab="logvar", main="logvar surface")
# the shape surface
persp(point.cov, point.tim, col.true$shape, theta=45, phi=15,
      ticktype = "detailed", col="lightblue",
      xlab = "covariate", ylab = "time",
      zlab="shape", main="shape surface")