
View source: R/newCountDataSet.R

Generates two n x p data sets, a training set and a test set, along with outcome vectors y and yte of length n that give the class labels of the training and test observations.

```
newCountDataSet(n, p, K, param, sdsignal, drate)
```

`n`
Number of observations desired.

`p`
Number of features desired. Note that a proportion `drate` of the features will differ between classes, though some of those differences may be small.

`K`
Number of classes desired. Note that the function requires n to be at least 4K, i.e. there must be at least 4 observations per class on average.

`param`
The dispersion parameter for the negative binomial distribution. The negative binomial distribution is parameterized using "mu" and "size", as in the R function "rnbinom". That is, Y ~ NB(mu, param) means that E(Y) = mu and Var(Y) = mu + mu^2/param. So when param is very large the distribution is essentially Poisson, and when param is small there is substantial overdispersion relative to the Poisson distribution.

`sdsignal`
The extent to which the classes differ. If this equals zero there are no class differences; if it is large, the classes are very different.

`drate`
The proportion of differentially expressed genes.
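The role of `param` can be checked directly with base R's `rnbinom`, independently of this package. The following is a minimal sketch; the helper `nb_var` and the chosen values of `mu` and `size` are illustrative assumptions, not part of `newCountDataSet`.

```r
set.seed(1)
mu <- 10

# Theoretical variance under the mu/size parameterization (hypothetical helper)
nb_var <- function(mu, size) mu + mu^2 / size

# Small size: strong overdispersion. Very large size: close to Poisson.
y_over <- rnbinom(1e5, mu = mu, size = 2)
y_pois <- rnbinom(1e5, mu = mu, size = 1e6)

# Compare theoretical and empirical variances
c(theory = nb_var(mu, 2),   empirical = var(y_over))
c(theory = nb_var(mu, 1e6), empirical = var(y_pois))
```

With `size = 2` the variance is several times the mean, while with `size = 1e6` it is nearly equal to the mean, as expected for a Poisson distribution.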

A list of output with the following components.

"sim_train_data" is the training data, a q x n matrix. "sim_test_data" is the test data, also a q x n matrix. The column names of these two matrices are the class labels of the n observations.

The number of rows q may be smaller than p because features with zero total counts are removed. The q retained features are those with total count > 0 in the data set, so q <= p.

"truesf" contains the size factors for the training observations. "isDE" represents the differential expression label of each gene.
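The removal of zero-count features, which explains why q can be smaller than p, can be illustrated on a toy matrix. This is a sketch only, assuming features in rows and observations in columns as in the returned matrices; the toy data and the names `x`, `keep`, and `x_kept` are hypothetical, and this is not the package's actual implementation.

```r
set.seed(42)
p <- 6
n <- 8

# Toy count matrix: features in rows, observations in columns
x <- matrix(rpois(p * n, lambda = 3), nrow = p, ncol = n)
x[c(2, 5), ] <- 0                      # force two features to have zero total count
colnames(x) <- rep(1:2, each = n / 2)  # class labels stored as column names

# Keep only features with a positive total count
keep <- rowSums(x) > 0
x_kept <- x[keep, , drop = FALSE]

dim(x_kept)       # q rows, n columns, with q <= p
colnames(x_kept)  # class labels are preserved
```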

```
dat <- newCountDataSet(n = 40, p = 500, K = 4, param = 10, sdsignal = 0.1, drate = 0.4)
```
