data.generator.y.F: Generate Data using Skewed Pointwise Distributions and...


View source: R/Package_Sim.GenerateData.r

Description

Generate functional data with skewed pointwise distributions using a Gaussian copula. The Gaussian copula is generated by an eigendecomposition with the eigenfunctions specified by a basis system. This function can be used in simulations to check the performance of various approaches.

Usage

data.generator.y.F(n.subject, n.timepoints, s, D, csi, lambdas,
                   basis.system = legendre.polynomials, var.noise = 0.10)

Arguments

n.subject

number of subjects in the data denoted by n

n.timepoints

number of timepoints, which are equally spaced in [0,1] and denoted by m

s

vector with length m or matrix with dimension m by n for the mean parameter; See "Details"

D

vector with length m or matrix with dimension m by n for the log variance parameter; See "Details"

csi

vector with length m or matrix with dimension m by n for the shape parameter; See "Details"

lambdas

vector of eigenvalues for the latent Gaussian process

basis.system

basis system for the latent Gaussian process; legendre.polynomials uses the Legendre polynomials and DFT.basis uses the Fourier basis.

var.noise

variance of the white noise that corrupts the latent Gaussian process

Details

The generated data are determined by both the marginal distributions and the dependence structure. The marginal distributions require three parameters (mean, log variance, shape), which correspond to (s, D, csi). Each of them can be a vector or a matrix, indicating that the parameter is covariate-independent (univariate, a function of time only) or covariate-dependent (bivariate, a function of both time and the covariate). When a parameter is a vector, it must have length n.timepoints.
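The marginal step can be illustrated with a minimal sketch in base R. It assumes a skew-normal marginal family, which this page does not name, and every function below (dsn.sketch, psn.sketch, qsn.sketch, copula.transform) is hypothetical and not part of the package. A latent standard normal value is mapped through its probability and the skewed quantile function, then shifted and scaled to the requested mean and log variance.

```r
# Sketch only: assumes skew-normal marginals; none of these helpers exist in cSFM.

# Density of the standard skew-normal with shape alpha
dsn.sketch <- function(x, alpha) 2 * dnorm(x) * pnorm(alpha * x)

# CDF by numerical integration of the density
psn.sketch <- function(q, alpha) {
  integrate(dsn.sketch, lower = -Inf, upper = q, alpha = alpha,
            rel.tol = 1e-8)$value
}

# Quantile function by root finding on the CDF
qsn.sketch <- function(p, alpha) {
  uniroot(function(q) psn.sketch(q, alpha) - p,
          interval = c(-10, 10), tol = 1e-9)$root
}

# Map one latent standard normal value z to a skewed value with the
# requested mean, log variance and shape (the Gaussian copula step)
copula.transform <- function(z, mean, logvar, shape) {
  delta <- shape / sqrt(1 + shape^2)
  mu0 <- delta * sqrt(2 / pi)         # mean of the standard skew-normal
  sd0 <- sqrt(1 - 2 * delta^2 / pi)   # its standard deviation
  x <- qsn.sketch(pnorm(z), shape)    # probability integral transform
  mean + exp(logvar / 2) * (x - mu0) / sd0
}

# With shape 0 the transform reduces to mean + sd * z
y0 <- copula.transform(1.5, mean = 2, logvar = log(4), shape = 0)
```

The standardization by mu0 and sd0 is a design choice of this sketch so that s and D keep their meaning as mean and log variance for any shape; whether the package standardizes the same way is not stated here.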

The latent Gaussian process, with mean 0 and unit variance, is determined by a correlation matrix. The correlation is given by lambdas and basis.system, which specify the eigenvalues and eigenfunctions, respectively. Measurement error is allowed: the Gaussian process is corrupted by independent Gaussian noise with mean 0 and variance var.noise.
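As a rough illustration of how eigenvalues and eigenfunctions determine the dependence structure, the sketch below builds a correlation matrix in base R. The three basis functions are hand-coded orthonormal shifted Legendre polynomials on [0,1] and may differ from what legendre.polynomials returns; whether the package folds the noise variance into corr.true is also an assumption of this sketch.

```r
# Sketch only: hand-coded basis; not the package's legendre.polynomials.
m <- 80
tt <- seq(0, 1, length.out = m)
phi <- cbind(sqrt(3) * (2 * tt - 1),
             sqrt(5) * (6 * tt^2 - 6 * tt + 1),
             sqrt(7) * (20 * tt^3 - 30 * tt^2 + 12 * tt - 1))
lambdas <- c(1/2, 1/4, 1/8)
var.noise <- 0.10

# Covariance of the smooth part: sum over k of lambda_k * phi_k(s) * phi_k(t)
cov.smooth <- phi %*% diag(lambdas) %*% t(phi)

# Add independent noise on the diagonal, then rescale to unit variance
corr.sketch <- cov2cor(cov.smooth + var.noise * diag(m))

# Draw n latent curves (one per row) with this correlation via a Cholesky factor
set.seed(1)
n <- 100
z <- matrix(rnorm(n * m), n, m) %*% chol(corr.sketch)
```

Each row of z is then a latent curve with mean 0 and unit pointwise variance, ready to be passed through the marginal transformation.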

Value

A list with two components:

data

generated data as a matrix with dimension m by n

corr.true

correlation matrix of the latent Gaussian process (Gaussian copula is used)

Note

When the shape parameter equals 0, the generated data are regular functional data with covariate-adjusted mean and variance and Gaussian errors.
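A quick numerical check of this special case, assuming the skewed family is the skew-normal (an assumption; the family is not named on this page): with shape 0 the skew-normal density coincides with the Gaussian density.

```r
# Sketch only: assumes skew-normal marginals.
alpha <- 0
x <- seq(-3, 3, by = 0.5)
dens.skew <- 2 * dnorm(x) * pnorm(alpha * x)  # standard skew-normal density
same.as.gaussian <- isTRUE(all.equal(dens.skew, dnorm(x)))
```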

References

[1]. Meng Li, Ana-Maria Staicu and Howard D. Bondell (2013), Incorporating Covariates in Skewed Functional Data Models. http://www.stat.ncsu.edu/information/library/papers/mimeo2654_Li.pdf.

Examples

###### generate the data set ##################################### 
set.seed(2013)
n.sub <- 100    # number of subjects    
n.tim <- 80     # number of timepoints
# true population-level functions (mean, log variance and shape)
true.fun <- function(par){
  if (par == "mean") f =function(x,y) sin(pi*x)* cos(pi*y)
  if (par == "logvar"){
    f=function(x,y)   2*dnorm(x, mean=0.5, sd = 0.2, log=TRUE) + 
      2*dnorm(y, mean=0.5, sd = 0.2, log =TRUE) - 8
  }
  if (par == "shape") f=function(x,y)  10*sin(2*pi*y)
  return(f)
}   
# covariate: sort at the very beginning
point.cov <- sort(runif(n.sub))
# timepoints
point.tim <- seq(from = 0, to = 1, length.out = n.tim)
# calculate and collect all the true population-level parameters
col.true <- list(mean = outer(point.cov, point.tim, true.fun("mean")), 
                 logvar = outer(point.cov, point.tim, true.fun("logvar")),
                 shape = outer(point.cov, point.tim, true.fun("shape")))

# generate data
my.data <- data.generator.y.F(n.subject = n.sub, n.timepoints = n.tim, 
                              s = col.true$mean, D = col.true$logvar,   
                              csi=col.true$shape,  
                              lambdas = c(1/2, 1/4, 1/8),
                              basis.system = legendre.polynomials, 
                              var.noise = 0.10)
# here shape is covariate independent
# so it's sufficient to keep 1st row of shape to use csi = col.true$shape[1,]


######################Visualize the data #####################################
par(mfrow = c(2,2))
# plot the data surface 
persp(point.cov, point.tim, my.data$data, theta=60, phi=15,
      ticktype = "detailed", col="lightblue", 
      xlab = "covariate", ylab = "time",
      zlab="data", main="data surface")
# plot the mean surface
persp(point.cov, point.tim, col.true$mean, theta=45, phi=15,
      ticktype = "detailed", col="lightblue",
      xlab = "covariate", ylab = "time",
      zlab="mean", main="mean surface")
# the logvar surface
persp(point.cov, point.tim, col.true$logvar, theta=45, phi=15,
      ticktype = "detailed", col="lightblue",
      xlab = "covariate", ylab = "time",
      zlab="logvar", main="logvar surface")
# the shape surface
persp(point.cov, point.tim, col.true$shape, theta=45, phi=15,
      ticktype = "detailed", col="lightblue",
      xlab = "covariate", ylab = "time",
      zlab="shape", main="shape surface")

cSFM documentation built on May 29, 2017, 6:10 p.m.