newCountDataSet: Generate a simulated sequencing data set using a negative...
In zhangli1109/CAEN: Category encoding method for selecting feature genes for the classification of single-cell RNA-seq


View source: R/newCountDataSet.R

Generates two n-by-p data sets, a training set and a test set, together with outcome vectors y and yte of length n that give the class labels of the training and test observations.

newCountDataSet(n, p, K, param, sdsignal, drate)

`n`	Number of observations desired.
`p`	Number of features desired. Note that a proportion drate of the features will differ between classes, though some of those differences may be small.
`K`	Number of classes desired. Note that the function requires n to be at least 4K, i.e., there must be at least 4 observations per class on average.
`param`	The dispersion parameter for the negative binomial distribution. The negative binomial distribution is parameterized using "mu" and "size", as in the R function "rnbinom". That is, Y ~ NB(mu, param) means that E(Y) = mu and Var(Y) = mu + mu^2/param. So when param is very large this is essentially a Poisson distribution, and when param is small there is substantial overdispersion relative to the Poisson distribution.
`sdsignal`	The extent to which the classes differ. If this equals zero there are no class differences; if it is large the classes are very different.
`drate`	The proportion of differentially expressed genes.
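The mean-variance relationship described for `param` can be checked directly with base R. The following is a minimal sketch that uses only `rnbinom` from the stats package; the values of `mu`, `n_draws`, and `params` are illustrative choices, not defaults of newCountDataSet.

```r
set.seed(1)

mu <- 10                   # illustrative mean
n_draws <- 1e5             # number of simulated counts per setting
params <- c(1, 10, 1e6)    # small, moderate, and very large dispersion parameter

# Theoretical variance under the (mu, size) parameterization: mu + mu^2 / size
theoretical_var <- mu + mu^2 / params

# Empirical variance of simulated counts, one value per dispersion setting
empirical_var <- sapply(params, function(size) {
  var(rnbinom(n_draws, mu = mu, size = size))
})

print(data.frame(param = params, theoretical_var, empirical_var))
```

For the largest value of `params` the variance is nearly equal to the mean, which is the Poisson limit; for the smallest value the variance is far larger than the mean.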

A list with the following elements. "sim_train_data" is the training data, a q*n matrix. "sim_test_data" is the test data, a q*n matrix. The column names of these two matrices are the class labels of the n observations. q may be smaller than p (q <= p) because features with 0 total counts are removed; the q retained features are those with >0 total counts in the data set. "truesf" gives the size factors for the training observations. "isDE" represents the differential expression label of each gene.

dat <- newCountDataSet(n = 40, p = 500, K = 4, param = 10, sdsignal = 0.1, drate = 0.4)
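To see how the returned list might be inspected, here is a hedged sketch. It assumes the CAEN package is installed, that it exports newCountDataSet(), and that the simulation draws from R's random number generator so that `set.seed` makes it reproducible. The element names come from the value description; the exact types of "truesf" and "isDE" are not documented here, so only generic summaries are used.

```r
# Assumption: CAEN is installed and exports newCountDataSet().
if (requireNamespace("CAEN", quietly = TRUE)) {
  set.seed(123)  # assumed to make the simulation reproducible
  dat <- CAEN::newCountDataSet(n = 40, p = 500, K = 4,
                               param = 10, sdsignal = 0.1, drate = 0.4)

  # Features with zero total counts are dropped, so the row count q may be < p
  print(dim(dat$sim_train_data))
  print(dim(dat$sim_test_data))

  # Column names carry the class labels of the observations
  print(table(colnames(dat$sim_train_data)))

  # Size factors for the training observations and the DE labels
  print(summary(dat$truesf))
  print(table(dat$isDE))
} else {
  message("CAEN is not installed; skipping the example.")
}
```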

zhangli1109/CAEN documentation built on Nov. 14, 2020, 11:41 a.m.
