sample.data: Sample Data for AclustsCCA R Package

sample.data    R Documentation

Sample Data for AclustsCCA R Package

Description

Synthetic data with p=20 exposures and q=100 DNA methylation probes on chromosome 7. The exposure correlation matrix is block diagonal with blocks of size five, each block having a first-order autoregressive (AR(1)) structure with correlation 0.7. The outcome correlation matrix is block diagonal with one block per cluster defined by bumphunter, each block having the size of its cluster and an AR(1) structure with correlation 0.9. The data are generated from a multivariate normal distribution.

Five exposures (exposures 1, 2, 3, 9, and 10) are associated with five differentially methylated regions (DMRs) of sizes 2, 3, 4, 7, and 12.
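The exposure correlation structure described above can be sketched in base R as follows. This is a minimal illustration, not the package's generating code: the helper names `ar1_block` and `block_diag_ar1` are hypothetical and do not belong to AclustsCCA, and the multivariate normal draw uses a Cholesky factor as one common way to do it.

```r
# Hypothetical helpers for illustration; not part of the AclustsCCA package.

# AR(1) correlation block: entry (i, j) equals rho^|i - j|.
ar1_block <- function(size, rho) {
  rho^abs(outer(seq_len(size), seq_len(size), "-"))
}

# Block-diagonal correlation matrix built from a vector of block sizes.
block_diag_ar1 <- function(sizes, rho) {
  total <- sum(sizes)
  sigma <- matrix(0, total, total)
  start <- 1
  for (s in sizes) {
    idx <- start:(start + s - 1)
    sigma[idx, idx] <- ar1_block(s, rho)
    start <- start + s
  }
  sigma
}

# Exposure correlation: p = 20, four blocks of size five, correlation 0.7.
sigma_xx <- block_diag_ar1(rep(5, 4), rho = 0.7)

# Draw n multivariate normal rows with correlation sigma_xx.
# chol() returns upper-triangular R with t(R) %*% R == sigma_xx,
# so Z %*% R has covariance sigma_xx when Z has iid N(0, 1) entries.
set.seed(1)
n <- 1000
x <- matrix(rnorm(n * nrow(sigma_xx)), nrow = n) %*% chol(sigma_xx)
```

Under the same assumptions, an outcome matrix of the described form would come from calling `block_diag_ar1` with the cluster sizes and `rho = 0.9`.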

Usage

sample.data
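A hedged sketch of loading and inspecting the object, assuming the AclustsCCA package is installed and ships the dataset in the standard way so that `data()` can find it. The wrapper `load_sample_data` is a hypothetical name used only for this example; it returns NULL when the package or dataset is unavailable.

```r
# Hypothetical wrapper for illustration; not part of the AclustsCCA package.
load_sample_data <- function() {
  # Return NULL rather than failing when the package is not installed.
  if (!requireNamespace("AclustsCCA", quietly = TRUE)) return(NULL)
  env <- new.env()
  # Assumes the dataset is available to data(); a warning is raised if not.
  utils::data("sample.data", package = "AclustsCCA", envir = env)
  get0("sample.data", envir = env, inherits = FALSE)
}

dat <- load_sample_data()
if (!is.null(dat)) {
  str(dat, max.level = 1)   # top-level components of the list
  print(dim(dat$DATA.X))    # subjects by exposures
}
```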

Format

A list with the following components:

DATA.X

An n=1000 by p matrix, where n is the number of subjects and p is the number of exposures.

DATA.Y

An n=1000 by q matrix, where n is the number of subjects and q is the number of CpG sites.

DATA.Z

An n=1000 by 2 matrix, where n is the number of subjects and 2 is the number of confounders.

clusters.list

A list of 9 clusters of CpG sites obtained using A-clustering. Each item is a cluster containing a set of probes.

TRUE.Clusters

The indices of the clusters that are truly associated with exposures.

TRUE.Cancors

The true canonical correlation for each true cluster.

TRUE.CpGs

A list of the 28 true CpG sites that are associated with exposures.

TRUE.Exposures

A vector of the 5 true exposures that are associated with the outcome.

TRUE.ALPHA

A vector of length p representing the true loading vector alpha.

TRUE.BETA

A vector of length q representing the true loading vector beta.

SIGMA.XX

The p by p exposure correlation matrix used to generate the data.

SIGMA.YY

The q by q outcome correlation matrix used to generate the data.

SIGMA.XY

The p by q correlation matrix between exposures and outcomes used to generate the data.
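The dimensions listed above imply a few consistency conditions between components, which can be checked with a small helper. This is only a sketch: `check_shapes` is a hypothetical function, not part of AclustsCCA, and the `mock` list below is a tiny placeholder with the same component names, not the real dataset.

```r
# Hypothetical helper for illustration; checks only the shapes documented above.
check_shapes <- function(d) {
  n <- nrow(d$DATA.X)
  p <- ncol(d$DATA.X)
  q <- ncol(d$DATA.Y)
  c(
    same_n    = nrow(d$DATA.Y) == n && nrow(d$DATA.Z) == n,
    alpha_len = length(d$TRUE.ALPHA) == p,
    beta_len  = length(d$TRUE.BETA) == q,
    sigma_xx  = identical(dim(d$SIGMA.XX), c(p, p)),
    sigma_yy  = identical(dim(d$SIGMA.YY), c(q, q)),
    sigma_xy  = identical(dim(d$SIGMA.XY), c(p, q))
  )
}

# Placeholder list with the documented component names (not real data).
mock <- list(
  DATA.X = matrix(0, 6, 3), DATA.Y = matrix(0, 6, 4), DATA.Z = matrix(0, 6, 2),
  TRUE.ALPHA = numeric(3), TRUE.BETA = numeric(4),
  SIGMA.XX = diag(3), SIGMA.YY = diag(4), SIGMA.XY = matrix(0, 3, 4)
)
checks <- check_shapes(mock)
```

Calling `check_shapes(sample.data)` on the loaded object would apply the same checks to the actual data.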


jennyjyounglee/AclustsCCA documentation built on June 15, 2022, 7:45 p.m.