GenSimData: A function to generate an artificial simulation dataset.

In JoshuaTian/EpiCluster: Clustering of (epigenetic) DNA methylation data using a variational Bayes NMF algorithm


View source: R/GenSimData.R

This function generates simulated DNA methylation data. The number of phenotypes, the number of samples in each phenotype, and the number of significant CpGs can all be specified. The function randomly selects the significant, differentially methylated CpGs for each phenotype. The resulting dataset can be used to test the bgNMF function and EpiCluster.

GenSimData(Ncpg = 10000, Npheno = 3, Nsample = 10, alpha_p = 1, beta_p = 3, Nsig = 1000)

`Ncpg`	Total number of CpGs to generate, i.e. the number of rows of the generated beta matrix. The default is 10000.
`Npheno`	Number of phenotypes in the artificial dataset. The total number of samples is the number of phenotypes multiplied by the number of samples in each phenotype; every phenotype has the same number of samples. The default is 3.
`Nsample`	Number of samples in each phenotype of the artificial dataset. The default is 10, meaning each phenotype contains 10 samples.
`alpha_p`	First shape parameter of the beta distribution used for the simulated data, which are generated as rbeta(N, alpha_p, beta_p). The default is 1.
`beta_p`	Second shape parameter of the beta distribution used for the simulated data, which are generated as rbeta(N, alpha_p, beta_p). The default is 3.
`Nsig`	Number of significant CpGs in each phenotype. The default is 1000. Normally, the number of significant CpGs should be around 10% of the total number of CpGs.
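
To make the roles of these arguments concrete, the base-R sketch below draws a background matrix with rbeta and works through the sample arithmetic. It is only an illustration and not the GenSimData source: how the significant CpGs are made to differ (here, by redrawing them from the mirrored distribution rbeta(n, beta_p, alpha_p)) is an assumption, and the names `beta_mat`, `pheno_v` and `sig_idx` are hypothetical.

```r
# Illustrative sketch only; NOT the EpiCluster implementation.
set.seed(1)
Ncpg <- 1000; Npheno <- 3; Nsample <- 10
alpha_p <- 1; beta_p <- 3
Nsig <- 100                               # about 10% of Ncpg

# Total samples = number of phenotypes x samples per phenotype
n_total <- Npheno * Nsample

# Background methylation values: one beta draw per CpG per sample
beta_mat <- matrix(rbeta(Ncpg * n_total, alpha_p, beta_p),
                   nrow = Ncpg, ncol = n_total)

# Phenotype label for each column (sample)
pheno_v <- rep(seq_len(Npheno), each = Nsample)

# Assumed mechanism: for each phenotype, pick Nsig CpGs at random and
# redraw them from the mirrored beta distribution in that phenotype's samples.
sig_idx <- lapply(seq_len(Npheno), function(p) sample(Ncpg, Nsig))
for (p in seq_len(Npheno)) {
  cols <- which(pheno_v == p)
  beta_mat[sig_idx[[p]], cols] <- rbeta(Nsig * length(cols), beta_p, alpha_p)
}

dim(beta_mat)   # Ncpg rows, Npheno * Nsample columns
```

With alpha_p = 1 and beta_p = 3 the background values are skewed towards 0 (mostly unmethylated), so in this sketch the redrawn CpGs stand out as mostly methylated in their phenotype.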

A list with the following three elements is returned. This simulation data is intended only for testing the EpiCluster package.

`beta`	The artificial simulation beta matrix generated by this function, which can be used as input for EpiCluster clustering.
`SigCpG`	A list recording the significant CpGs selected in each phenotype. Users may use it to estimate an optimised ee value and to assess the effect of bgNMF.
`pheno.v`	Phenotype of each sample. This may be used in the EpiDraw or EpiAnalysis functions.
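
Assuming the returned list has the three elements described above, a quick sanity check might look like the following. The `Data` object here is built by hand only so that the snippet runs without EpiCluster installed; with the package loaded one would use the result of GenSimData instead. Whether `SigCpG` stores row indices or CpG names is not stated here, so row indices are assumed, and `check_sim` is a hypothetical helper rather than part of the package.

```r
# Hand-built stand-in with the documented element names (assumption: the
# real object has the same shape; SigCpG is assumed to hold row indices).
set.seed(42)
Data <- list(
  beta    = matrix(rbeta(200 * 6, 1, 3), nrow = 200),
  SigCpG  = list(sample(200, 20), sample(200, 20)),
  pheno.v = rep(1:2, each = 3)
)

# Hypothetical helper: basic consistency checks on a GenSimData-like list
check_sim <- function(x) {
  stopifnot(is.matrix(x$beta),
            ncol(x$beta) == length(x$pheno.v),                # one label per sample
            length(x$SigCpG) == length(unique(x$pheno.v)))    # one entry per phenotype
  invisible(TRUE)
}
check_sim(Data)

table(Data$pheno.v)      # samples per phenotype
lengths(Data$SigCpG)     # significant CpGs per phenotype
```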

Yuan Tian, Zhanyu Ma, Andrew Teschendorff

Yuan T, Ma Z, Beck S, Teschendorff AE. (2015). A fast variational Bayes dimensional reduction and clustering algorithm for Epigenome-Wide Association Studies (EWAS). Under Review.


    Data <- GenSimData(Ncpg=20000,Npheno=5,Nsig=1200)

    Data <- GenSimData(Ncpg=10000,Npheno=3,Nsample=30,Nsig=1000)

JoshuaTian/EpiCluster documentation built on May 20, 2019, 10:19 p.m.
