
This function fits a Negative Binomial classifier over a range of model parameter values and selects the best value using resampling-based performance measures.

```
trainNBLDA(x, y, type = c("mle", "deseq", "quantile", "tmm"),
           tuneLength = 10, metric = c("accuracy", "error"),
           train.control = nbldaControl(), ...)
```

| Argument | Description |
|---|---|
| `x` | an n-by-p data frame or matrix. Samples should be in the rows and variables in the columns. Used to train the classifier. |
| `y` | a vector of length n. Each element is the class label of the corresponding sample. Integer and/or factor types are allowed. |
| `type` | a character string indicating the normalization method used within the NBLDA model. See details. |
| `tuneLength` | a positive integer. The total number of levels to be used while tuning the model parameter(s). |
| `metric` | the criterion used to determine the best parameter: overall accuracy ("accuracy") or average number of misclassified samples ("error"). |
| `train.control` | a list of control parameters to be used in the NBLDA model. See nbldaControl for details. |
| `...` | further arguments. Deprecated. |

NBLDA is proposed to classify count data from any field, e.g. economics, social sciences, genomics, etc.
In RNA-Seq studies, for example, normalization is used to adjust for between-sample differences before downstream analysis.
`type` defines the normalization method. Available options are "mle", "deseq", "quantile" and "tmm".
Because "deseq", "quantile" and "tmm" were originally proposed as robust methods for RNA-Sequencing studies, one should
choose the normalization type carefully. In more detail, "deseq" estimates the size factors by dividing each sample by the geometric means
of the transcript counts and taking the median of the resulting ratios (Anders and Huber, 2010). "tmm" trims the lower and upper tails of the data by log-fold change
and by absolute intensity in order to minimize the log-fold changes between the samples (Robinson and Oshlack, 2010). "quantile" is
the quantile normalization approach of Bullard et al. (2010). "mle" (less robust) divides the total count of each sample by the grand total
count (Witten, 2010). See the related papers for the mathematical background.
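To make the size factor descriptions above concrete, the following base R sketch computes "mle"-style and "deseq"-style size factors by hand on a toy matrix. It is an illustration only, not the package's internal code: the names `sf_mle`, `sf_deseq` and `geo_means` are hypothetical, and the package may rescale or otherwise post-process its size factors differently.

```r
# Toy count matrix: 4 samples (rows) by 3 features (columns),
# i.e. the same orientation that trainNBLDA() expects for `x`.
x <- matrix(c(10, 20, 30,
              20, 40, 60,
               5, 10, 15,
              15, 30, 45),
            nrow = 4, byrow = TRUE)

# "mle"-style size factors (assumed form): total count of each sample
# divided by the grand total count.
sf_mle <- rowSums(x) / sum(x)

# "deseq"-style size factors (assumed form): for each sample, the median
# over features of the ratio between its counts and the per-feature
# geometric mean across samples. Requires strictly positive counts.
geo_means <- exp(colMeans(log(x)))
sf_deseq  <- apply(x, 1, function(s) median(s / geo_means))

sf_mle
sf_deseq
```

In this toy matrix every sample is an exact multiple of the first one, so both approaches rank the samples identically and differ only by a constant scale. On real data with a few highly expressed features the median-of-ratios estimate is less affected by those features than the total-count estimate, which is why the text calls "mle" less robust.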

an `nblda` object with the following slots:

`input` |
an |

`result` |
an |

`call` |
a call expression. |

Dincer Goksuluk

Witten, DM (2011). Classification and clustering of sequencing data using a Poisson model. Ann. Appl. Stat. 5(4), 2493–2518. doi:10.1214/11-AOAS493.

Dong, K., Zhao, H., Tong, T., & Wan, X. (2016). NBLDA: negative binomial linear discriminant analysis for RNA-Seq data. BMC Bioinformatics, 17(1), 369. http://doi.org/10.1186/s12859-016-1208-1.

Anders, S., & Huber, W. (2010). Differential expression analysis for sequence count data. Genome Biology, 11:R106.

Witten, D., et al. (2010). Ultra-high throughput sequencing-based small RNA discovery and discrete statistical biomarker analysis in a collection of cervical tumours and matched controls. BMC Biology, 8:58.

Robinson, MD, & Oshlack, A. (2010). A scaling normalization method for differential expression analysis of RNA-Seq data. Genome Biology, 11:R25. doi:10.1186/gb-2010-11-3-r25.

```
set.seed(2128)

# Simulate count data with 2 classes.
counts <- generateCountData(n = 20, p = 10, K = 2, param = 1, sdsignal = 0.5, DE = 0.8,
                            allZero.rm = FALSE, tag.samples = TRUE)

# Transpose so that samples are in the rows; add 1 to avoid zero counts.
x <- t(counts$x + 1)
y <- counts$y
xte <- t(counts$xte + 1)

# 2-fold cross-validation, repeated twice.
ctrl <- nbldaControl(folds = 2, repeats = 2)

fit <- trainNBLDA(x = x, y = y, type = "mle", tuneLength = 10,
                  metric = "accuracy", train.control = ctrl)

fit
nbldaTrained(fit)  # Cross-validated model summary.
```
