precision.simulate.flex: Classification analysis of simulation study (with more...
In LXQin/precision: PaiREd miCrorna sImulation on Study desIgn for mOlecular classificatioN


Performs a simulation study similar to the one in Qin et al., but allows different combinations of study designs and normalization methods on the training set and the test set. It supports one internal validation and two external validations: one using a uniformly-handled test set and the other using a nonuniformly-handled test set.

precision.simulate.flex(seed, N, biological.effect.tr, biological.effect.te,
  handling.effect.tr, handling.effect.te, group.id.tr, group.id.te,
  design.tr.list, design.te.list = NULL, norm.tr.list = c("NN", "QN"),
  norm.te.list = NULL, class.list = c("PAM", "LASSO"),
  valid.list = c("int", "ext.uh", "ext.sim.nuh"), batch.id.tr = NULL,
  batch.id.te = NULL, icombat = FALSE, isva = FALSE, iruv = FALSE,
  biological.effect.tr.ctrl = NULL, handling.effect.tr.ctrl = NULL,
  norm.tr.funcs = NULL, norm.te.funcs = NULL, class.funcs = NULL,
  pred.funcs = NULL)

`seed`	an integer used to initialize a pseudorandom number generator.
`N`	number of simulation runs.
`biological.effect.tr`	the training set of the estimated biological effects. This dataset must have rows as probes and columns as samples.
`biological.effect.te`	the test set of the estimated biological effects. This dataset must have rows as probes and columns as samples. It must have the same number of probes and the same probe names as the training set of the estimated biological effects.
`handling.effect.tr`	the training set of the estimated handling effects. This dataset must have rows as probes and columns as samples. It must have the same dimensions and the same probe names as the training set of the estimated biological effects.
`handling.effect.te`	the test set of the estimated handling effects. This dataset must have rows as probes and columns as samples. It must have the same dimensions and the same probe names as the training set of the estimated handling effects.
`group.id.tr`	a vector of sample-group labels for each sample of the training set of the estimated biological effects. It must be a 2-level non-numeric factor vector.
`group.id.te`	a vector of sample-group labels for each sample of the test set of the estimated biological effects. It must be a 2-level non-numeric factor vector.
`design.tr.list`	a list of strings for study designs on the training set to be compared in the simulation study. The built-in designs are "CC+", "CC-", "PC+", "PC-", "BLK", and "STR" for "Complete Confounding 1", "Complete Confounding 2", "Partial Confounding 1", "Partial Confounding 2", "Blocking", and "Stratification" in Qin et al.
`design.te.list`	a list of strings for study designs on the test set to be compared in the simulation study. It must have the same length as `design.tr.list`. See `design.tr.list` for the built-in designs.
`norm.tr.list`	a list of strings for normalization methods on the training set to be compared in the simulation study. It must have the same length as `design.tr.list` and `design.te.list`. The built-in normalization methods are "NN", "QN", "MN", and "VSN" for "No Normalization", "Quantile Normalization", "Median Normalization", and "Variance Stabilizing Normalization". Users can provide their own list of normalization methods as long as the corresponding functions are supplied (also see `norm.tr.funcs`).
`norm.te.list`	a list of strings for normalization methods on the test set to be compared in the simulation study. It must have the same length as `norm.tr.list`. See `norm.tr.list` for the built-in normalization methods. Users can provide their own list of normalization methods as long as the corresponding functions are supplied (also see `norm.te.funcs`).
`class.list`	a list of strings for classification methods to be compared in the simulation study. The built-in classification methods are "PAM" and "LASSO" for "prediction analysis for microarrays" and "least absolute shrinkage and selection operator". Users can provide their own list of classification methods as long as the corresponding model-building and predicting functions are supplied (also see `class.funcs` and `pred.funcs`).
`valid.list`	a list of strings for validation methods to be compared in the simulation study. The built-in validation methods are: `int`, `ext.uh`, and `ext.sim.nuh` which respectively represent internal validation, external validation using uniformly-handled test set (i.e., with biological effects only), and external validation using nonuniformly-handled test set. By default, `valid.list = c("int", "ext.uh", "ext.sim.nuh")`.
`batch.id.tr`	a list of array indices grouped by batches when training data were profiled. The length of the list must be equal to the number of batches in the training data; the number of array indices must be the same as the number of samples. This is required if stratification study design is specified in `design.tr.list`; otherwise `batch.id.tr = NULL`.
`batch.id.te`	a list of array indices grouped by batches when test data were profiled. The length of the list must be equal to the number of batches in the test data; the number of array indices must be the same as the number of samples. This is required if stratification study design is specified in `design.te.list`; otherwise `batch.id.te = NULL`.
`icombat`	an indicator for ComBat adjustment. By default, `icombat = FALSE` for no ComBat adjustment.
`isva`	an indicator for sva adjustment. By default, `isva = FALSE` for no sva adjustment.
`iruv`	an indicator for RUV-4 adjustment. By default, `iruv = FALSE` for no RUV-4 adjustment.
`biological.effect.tr.ctrl`	the training set of the negative-control probe biological effect data if `iruv = TRUE`. This dataset must have rows as probes and columns as samples. It also must have the same number of samples and the same sample names as `biological.effect.tr`.
`handling.effect.tr.ctrl`	the training set of the negative-control probe handling effect data if `iruv = TRUE`. This dataset must have rows as probes and columns as samples. It also must have the same dimensions and the same probe names as `biological.effect.tr.ctrl`.
`norm.tr.funcs`	a list of strings for names of user-defined normalization method functions for the training set, in the order of `norm.tr.list`, excluding any built-in normalization methods.
`norm.te.funcs`	a list of strings for names of user-defined normalization method functions for the test set, in the order of `norm.te.list`, excluding any built-in normalization methods.
`class.funcs`	a list of strings for names of user-defined classification model-building functions, in the order of `class.list`, excluding any built-in classification methods.
`pred.funcs`	a list of strings for names of user-defined classification predicting functions, in the order of `class.list`, excluding any built-in classification methods.
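
The constraints listed above can be checked before starting a long simulation. The following is a minimal base-R sketch, not part of the precision package: `check.flex.inputs` is a hypothetical helper, and it assumes that "the same probe names" also means the same probe order, which is stricter than the documentation states.

```r
# Hypothetical helper (not provided by the precision package): verifies a
# subset of the documented input constraints and stops on the first failure.
check.flex.inputs <- function(biological.effect.tr, biological.effect.te,
                              handling.effect.tr,
                              group.id.tr, group.id.te,
                              design.tr.list, design.te.list = NULL,
                              norm.tr.list = c("NN", "QN"),
                              norm.te.list = NULL) {
  stopifnot(
    # training and test biological effects share the same probes
    # (assumption: identical order as well as identical names)
    identical(rownames(biological.effect.tr), rownames(biological.effect.te)),
    # training handling effects match the training biological effects
    identical(dim(handling.effect.tr), dim(biological.effect.tr)),
    identical(rownames(handling.effect.tr), rownames(biological.effect.tr)),
    # one group label per sample, two non-numeric levels
    length(group.id.tr) == ncol(biological.effect.tr),
    length(group.id.te) == ncol(biological.effect.te),
    !is.numeric(group.id.tr), length(unique(group.id.tr)) == 2,
    !is.numeric(group.id.te), length(unique(group.id.te)) == 2,
    # design and normalization lists must have matching lengths
    length(norm.tr.list) == length(design.tr.list)
  )
  if (!is.null(design.te.list)) {
    stopifnot(length(design.te.list) == length(design.tr.list))
  }
  if (!is.null(norm.te.list)) {
    stopifnot(length(norm.te.list) == length(norm.tr.list))
  }
  invisible(TRUE)
}

# Toy data: 5 probes, 4 training samples, 2 test samples.
set.seed(1)
probes <- paste0("p", 1:5)
be.tr <- matrix(rnorm(20), 5, 4, dimnames = list(probes, paste0("s", 1:4)))
be.te <- matrix(rnorm(10), 5, 2, dimnames = list(probes, paste0("t", 1:2)))
he.tr <- be.tr * 0  # same dimensions and probe names as be.tr

ok <- check.flex.inputs(be.tr, be.te, he.tr,
                        group.id.tr = c("E", "E", "V", "V"),
                        group.id.te = c("E", "V"),
                        design.tr.list = c("PC-", "STR"),
                        design.te.list = c("PC-", "STR"),
                        norm.tr.list = c("NN", "QN"),
                        norm.te.list = c("NN", "QN"))
```

The helper returns `TRUE` invisibly when every check passes, so it can sit directly in front of the call to `precision.simulate.flex`.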

The main steps of the classification analysis of the simulation study are explained in detail in precision.simulate. This function adds flexibility, such as allowing different combinations of study designs and normalization methods on the training and test sets. For instance, the user can now assess the effect of frozen quantile normalization by running two simulations: 1) quantile normalization on both the training set and the test set, and 2) quantile normalization on the training set but frozen quantile normalization on the test set. Alternatively, the user can compare the effect of frozen normalization on the test set by varying the frozen normalization method on the test set while fixing the normalization method on the training set across two simulations. The user can also include a third validation method: external validation that uses the nonuniformly-handled test set as the independent set. The nonuniformly-handled test set is simulated the same way as the nonuniformly-handled training set, using the user-specified study design(s).
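
To make the distinction between ordinary and frozen quantile normalization concrete, here is a small base-R sketch. It is an illustration of the general technique only, not the package's implementation: `qn.reference` and `qn.apply` are hypothetical names, and the toy matrices stand in for real expression data. The idea is that the reference distribution is estimated from the training set once and then reused unchanged ("frozen") for the test set.

```r
# Reference distribution: mean of the sorted values across samples (columns).
qn.reference <- function(x) {
  sorted <- apply(unname(x), 2, sort)
  rowMeans(sorted)
}

# Map each sample onto a given reference distribution by rank.
qn.apply <- function(x, ref) {
  stopifnot(nrow(x) == length(ref))
  out <- apply(unname(x), 2, function(v) ref[rank(v, ties.method = "first")])
  dimnames(out) <- dimnames(x)
  out
}

set.seed(42)
train <- matrix(rnorm(50), nrow = 10, ncol = 5)            # 10 probes, 5 samples
test  <- matrix(rnorm(30, mean = 2), nrow = 10, ncol = 3)  # shifted test set

ref <- qn.reference(train)

# Quantile normalization of the training set against its own reference.
train.qn <- qn.apply(train, ref)

# Frozen quantile normalization: the test set is mapped onto the
# training-set reference rather than onto a reference of its own.
test.fqn <- qn.apply(test, ref)

# Ordinary quantile normalization of the test set, for comparison.
test.qn <- qn.apply(test, qn.reference(test))
```

After frozen normalization every test sample has exactly the training reference distribution, whereas ordinary quantile normalization of the test set keeps the test set's own shifted distribution.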

simulation study results – a list of array-to-sample assignments, fitted models, and misclassification error rates across simulation runs:

`assign_store`	array-to-sample assignments for each study design
`model_store`	models for each combination of study designs, normalization methods, and classification methods
`error_store`	misclassification error rates of the specified validation method(s) for each combination of study designs, normalization methods, and classification methods

Qin LX, Huang HC, Begg CB. Cautionary note on cross validation in molecular classification. Journal of Clinical Oncology. 2016

## Not run: 
set.seed(101)
biological.effect <- estimate.biological.effect(uhdata = uhdata.pl)
handling.effect <- estimate.handling.effect(uhdata = uhdata.pl,
                             nuhdata = nuhdata.pl)

ctrl.genes <- grep("NC", unique(rownames(uhdata.pl)), value = TRUE)

biological.effect.nc <- biological.effect[!rownames(biological.effect) %in%
  ctrl.genes, ]
handling.effect.nc <- handling.effect[!rownames(handling.effect) %in% ctrl.genes, ]

group.id <- substr(colnames(biological.effect.nc), 7, 7)

# randomly split biological effect data into training and test set with
# equal number of endometrial and ovarian samples
biological.effect.train.ind <- colnames(biological.effect.nc)[c(sample(which(
  group.id == "E"), size = 64), sample(which(group.id == "V"), size = 64))]
biological.effect.test.ind <- colnames(biological.effect.nc)[!colnames(
  biological.effect.nc) %in% biological.effect.train.ind]
biological.effect.train.test.split <-
  list("tr" = biological.effect.train.ind,
       "te" = biological.effect.test.ind)

# non-randomly split handling effect data into training and test set
handling.effect.train.test.split <-
  list("tr" = c(1:64, 129:192),
       "te" = 65:128)

biological.effect.nc.tr <- biological.effect.nc[, biological.effect.train.ind]
biological.effect.nc.te <- biological.effect.nc[, biological.effect.test.ind]
handling.effect.nc.tr <- handling.effect.nc[, c(1:64, 129:192)]
handling.effect.nc.te <- handling.effect.nc[, 65:128]

# Simulation without batch adjustment
precision.results.flex <- precision.simulate.flex(seed = 1, N = 3,
  biological.effect.tr = biological.effect.nc.tr,
  biological.effect.te = biological.effect.nc.te,
  handling.effect.tr = handling.effect.nc.tr,
  handling.effect.te = handling.effect.nc.te,
  group.id.tr = substr(colnames(biological.effect.nc.tr), 7, 7),
  group.id.te = substr(colnames(biological.effect.nc.te), 7, 7),
  design.tr.list = c("PC-", "PC-", "STR", "STR"),
  design.te.list = c("PC-", "STR", "PC-", "STR"),
  norm.tr.list = c("NN", "QN", "NN", "QN"),
  norm.te.list = c("NN", "NN", "QN", "QN"),
  class.list = c("PAM", "LASSO"),
  batch.id.tr = list(1:40,
    41:64,
    (129:160) - 64,
    (161:192) - 64),
  batch.id.te = list(1:32, 33:64))

## End(Not run)
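
The `batch.id.tr` and `batch.id.te` arguments must list every sample index, so that the number of array indices equals the number of samples. A quick sanity check for the values used in the example above could look like the sketch below; `is.batch.partition` is a hypothetical helper, not a function of the precision package, and it additionally assumes that each index appears in exactly one batch.

```r
# Batch assignments copied from the example above.
batch.id.tr <- list(1:40, 41:64, (129:160) - 64, (161:192) - 64)
batch.id.te <- list(1:32, 33:64)

# Hypothetical helper: TRUE when the batches cover 1..n.samples exactly once.
is.batch.partition <- function(batch.id, n.samples) {
  idx <- unlist(batch.id)
  length(idx) == n.samples &&
    !anyDuplicated(idx) &&
    setequal(idx, seq_len(n.samples))
}

tr.ok <- is.batch.partition(batch.id.tr, n.samples = 128)
te.ok <- is.batch.partition(batch.id.te, n.samples = 64)
```

Subtracting 64 in the training batches re-indexes the handling-effect columns 129 to 192 to positions 65 to 128 of the 128-sample training set.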

LXQin/precision documentation built on May 11, 2019, 6:24 p.m.
