View source: R/newCountDataSet.R
Generate two n x p data sets, a training set and a test set, together with outcome vectors y and yte of length n that give the class labels of the training and test observations.
newCountDataSet(n, p, K, param, sdsignal, drate)
| n | Number of observations desired. | 
| p | Number of features desired. Note that a proportion drate of the features will differ between classes, though some of those differences may be small. | 
| K | Number of classes desired. Note that the function requires n to be at least 4K, i.e. there must be at least 4 observations per class on average. | 
| param | The dispersion parameter for the negative binomial distribution. The negative binomial distribution is parameterized using "mu" and "size" as in the R function "rnbinom". That is, Y ~ NB(mu, param) means that E(Y) = mu and Var(Y) = mu + mu^2/param. So when param is very large this is essentially a Poisson distribution, and when param is small there is substantial overdispersion relative to the Poisson distribution. | 
| sdsignal | The extent to which the classes are different. If this equals zero then there are no class differences and if this is large then the classes are very different. | 
| drate | The proportion of differentially expressed genes (features). | 
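The mean and variance relationship described for param can be checked numerically with base R. The following is a minimal sketch, not part of the package: the values of mu and param and the sample size are arbitrary choices made for illustration.

```r
# Illustrative check of Var(Y) = mu + mu^2/param using base R only.
# mu, param and the sample size are assumed values, not package defaults.
set.seed(1)
mu <- 5
param <- 10  # passed as "size" to rnbinom

y <- rnbinom(1e5, mu = mu, size = param)
theoretical_var <- mu + mu^2 / param  # 5 + 25/10 = 7.5

# Sample mean and variance should be close to mu and theoretical_var.
c(mean = mean(y), var = var(y), theoretical = theoretical_var)

# With a very large param the variance approaches the mean (Poisson-like).
y_pois_like <- rnbinom(1e5, mu = mu, size = 1e6)
var(y_pois_like)
```

Lowering param in this sketch widens the gap between the variance and the mean, which is the overdispersion that the argument description refers to.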
A list with the following components:
| sim_train_data | Training data, a q x n matrix. | 
| sim_test_data | Test data, a q x n matrix. | 
| truesf | Size factors for the training observations. | 
| isDE | Labels marking which genes are differentially expressed. | 
The column names of the two matrices are the class labels of the n observations. Features with zero total counts are removed, so only the q features with total counts greater than zero are kept and q <= p.
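To see why q can be smaller than p, the sketch below drops all-zero features from a toy matrix with features in rows and observations in columns. It is an illustration in base R, not the package's own code: the matrix values are made up, and whether the package totals counts over the training data, the test data, or both is not stated here.

```r
# Toy count matrix: 4 features (rows) by 3 observations (columns).
# The values are invented for illustration only.
counts <- matrix(c(3, 0, 1,
                   0, 0, 0,
                   2, 5, 0,
                   0, 0, 0),
                 nrow = 4, byrow = TRUE)

# Keep only features whose total count across observations is positive.
keep <- rowSums(counts) > 0
filtered <- counts[keep, , drop = FALSE]

p <- nrow(counts)    # features requested
q <- nrow(filtered)  # features remaining after filtering
```

Here two of the four features have zero total counts, so the filtered matrix has fewer rows than the original.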
dat <- newCountDataSet(n=40, p=500, K=4, param=10, sdsignal=0.1, drate=0.4)