Description Usage Arguments Details Value References Examples
Given training set and training labels, classifies the test samples into existing categories.
1 2 3 4 5 6 7 | qtQDA(training, training.labels, test, prior = NULL,
dispersion = c("common", "trended", "tagwise"), num.genes = NULL)
qtQDA.resampling(training, training.labels, prior=NULL,
dispersion=c("common", "trended", "tagwise"), num.genes=NULL,
resampling=c("bootstrap","cross.validation"), nfold=7, nbs=10)
discriminant(training, training.labels, test, prior = NULL,
dispersion = c("common", "trended", "tagwise"))
|
training |
a |
training.labels |
a character vector of class labels for each sample. The labels show which category each sample comes from. |
test |
a |
prior |
a numeric vector of prior probabilities for each class.
Default value is |
dispersion |
a string to specify which dispersion to be used. The values can be "common", "trended" or "tagwise". Default value is "tagwise". |
num.genes |
a numeric vector containing number of features to be selected. If more than one value is provided, values will be sorted in descending order and the output will be printed out accordingly. If |
resampling |
a character value indicating the resampling method to be used. Values can be "bootstrap", "cross.validation". |
nfold |
a numeric value indicating number of folds to be constructed. Default value is 7. |
nbs |
a numeric value indicating number of bootstrap to be run. Default value is 10. |
This functions are used to classify the test
samples into existing categories using training data.
The data here are assumed to be marginally negative binomial, but dependent.
discriminant
function first estimates the parameters of underlying model from training
data using the statistical methodology implemented in R
package edgeR
, then quantile transforms the data values as proposed in Kochan et al (2019) using those parameter estimations.
To incorporate the dependence, discriminant
then computes the covariance matrices for each class separately and regularises them with cov.shrink
function implemented in R
package corpcor
.
Covariance matrix skrinkage approach was developed by Schafer and Strimmer (2005).
Lastly, discriminant
performs the quadratic discriminant analysis performed on the transformed data and estimates the test sample labels.
qtQDA
and qtQDA.resampling
utilise glmLRTqtQDA
to obtain
a data frame where the rows are same as for training
but sorted in descending order by likelihood ratio statistics (LR). Then applying quantile transformation and Gaussian quadratic discriminant analysis using discriminant
function, qtQDA
and qtQDA.resampling
produce the estimated test set labels.
Dispersion parameters are selected from one of the three sophisticated approaches produced from estimateDisp
function in edgeR
.
discriminant
produces a character vector with the test set labels.
qtQDA
produces a list object containing one component:
classes |
a data frame with the test set labels. |
qtQDA.resampling
produces a list object -when resampling
is cross.validation
- containing two components:
class estimations for last fold |
a data frame with the test set labels for the samples selected by last fold.
If |
mean estimated error rates |
a numeric vector of average error rates estimated for each fold (or bootstrap). |
For both qtQDA
and qtQDA.resampling
, if num.genes
is NULL
test set labels (for qtQDA.resampling
test sets are reselected for each cross validation or bootstrap) are computed by training the algorithm with the entire list of genes; and if num.genes
is of length 2 or more, then a data frame with the column names corresponding to the number of genes provided.
Schafer, J and Strimmer, K (2005). A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Statistical Applications in Genetics and Molecular Biology, 4(1), pp. 32. https://doi.org/10.2202/1544-6115.1175
Chen, Y, Lun, ATL, and Smyth, GK (2014). Differential expression analysis of complex RNA-seq experiments using edgeR. Statistical Analysis of Next Generation Sequence Data, Somnath Datta and Daniel S. Nettleton (eds), Springer, New York, pages 51-74. http://www.statsci.org/smyth/pubs/edgeRChapterPreprint.pdf
1 2 3 4 5 6 7 8 9 10 11 12 | data("cervical")
### In this example test sets are included in the training sets to control the algorithm. We expect to see minor or no error in classification.
training <- cervical
training.labels <- c(rep("N", 29), rep("T", 29))
test <- cervical[, c(1:5,54:58)]
test.labels <-c(rep("N", 5), rep("T", 5))
qtQDA(training=training, test=test, training.labels=training.labels,
num.genes=NULL, prior=NULL, dispersion="tagwise")
qtQDA(training=training, test=test, training.labels=training.labels,
num.genes=c(20, 50, 100, 200, 300, 500, 714), prior=NULL, dispersion="tagwise")
discriminant(training, training.labels, test, prior=NULL, dispersion="tagwise")
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.