Biomarkers can be identified in several ways. The classical way is to look at those variables with large model coefficients or large t statistics. A second way is based on the higher criticism (HC) approach, and a third assesses the stability of these coefficients under subsampling of the data set.
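As a rough illustration of the subsampling idea, the sketch below repeatedly draws a stratified subsample from a simulated two-class data set, ranks the variables by absolute t statistic, and records how often each variable ends up among the ten largest. It is not the code used inside get.biom: the helper name stab_fraction, the subsample fraction of 0.7, the top-ten cut-off and the simulated data are all assumptions made for this example.

```r
## Illustrative sketch only, not the implementation used by get.biom.
## stab_fraction() is a hypothetical helper: it subsamples the rows of X,
## ranks the variables by |t| in each subsample, and returns the fraction
## of subsamples in which each variable is among the `top` largest.
stab_fraction <- function(X, Y, n.iter = 50, frac = 0.7, top = 10) {
  stopifnot(is.factor(Y), nlevels(Y) == 2, nrow(X) == length(Y))
  counts <- numeric(ncol(X))
  idx1 <- which(Y == levels(Y)[1])
  idx2 <- which(Y == levels(Y)[2])
  for (i in seq_len(n.iter)) {
    ## stratified subsample, so that both classes stay represented
    s1 <- sample(idx1, ceiling(frac * length(idx1)))
    s2 <- sample(idx2, ceiling(frac * length(idx2)))
    tstat <- apply(X, 2, function(x) t.test(x[s1], x[s2])$statistic)
    sel <- order(abs(tstat), decreasing = TRUE)[seq_len(top)]
    counts[sel] <- counts[sel] + 1
  }
  counts / n.iter
}

## Simulated data (assumption): 20 samples, 100 variables,
## with the first 5 variables shifted in the second class
set.seed(1)
X <- matrix(rnorm(20 * 100), nrow = 20)
Y <- factor(rep(1:2, each = 10))
X[Y == 2, 1:5] <- X[Y == 2, 1:5] + 3

frac.sel <- stab_fraction(X, Y)
## variables selected most often across the subsamples
head(order(frac.sel, decreasing = TRUE), 5)
```

Variables that are selected in nearly every subsample are the stable candidates; variables that appear only occasionally are more likely to reflect sampling noise.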
X
Data matrix. Usually the number of columns (variables) is (much) larger than the number of rows (samples).
Y
Class indication. For classification with two or more factors a factor; a numeric vector will be interpreted as a regression situation, which can only be tackled by
fmethod
Modelling method(s) employed. The default is to use
type
Whether to use coefficient size as a criterion (
ncomp
Number of latent variables to use in PCR and PLS (VIP) modelling. In function
biom.opt
Options for the biomarker selection - a list with several named elements. See
scale.p
Scaling. This is performed individually in every crossvalidation iteration, and can have a profound effect on the results. Default: "auto" (autoscaling). Other possible choices: "none" for no scaling, "pareto" for pareto scaling, "log" and "sqrt" for log and square root scaling, respectively.
object, x
A BMark object.
...
Further arguments for modelling functions. Often used to catch unused arguments.
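The scaling choices listed under scale.p correspond to standard column-wise transformations. The sketch below shows one common definition of each. The helper scale_sketch is hypothetical and written only for this illustration; the package's own scaling function (see scalefun) may differ in details, and the mean-centring applied here after the log and square root transformations is an assumption.

```r
## Hypothetical helper, for illustration only (not the package's scalefun).
scale_sketch <- function(X,
                         method = c("auto", "pareto", "log", "sqrt", "none")) {
  method <- match.arg(method)
  X <- as.matrix(X)
  if (method == "none") return(X)
  if (method == "log")  X <- log(X)    # requires strictly positive data
  if (method == "sqrt") X <- sqrt(X)   # requires non-negative data
  Xc <- sweep(X, 2, colMeans(X))       # column mean-centring (assumed throughout)
  sds <- apply(X, 2, sd)
  switch(method,
         auto   = sweep(Xc, 2, sds, "/"),        # unit variance per column
         pareto = sweep(Xc, 2, sqrt(sds), "/"),  # divide by square root of sd
         Xc)                                     # "log", "sqrt": centred only
}

## Simulated positive data (assumption): 10 samples, 5 variables
set.seed(2)
X <- matrix(rexp(50, rate = 0.5), nrow = 10)
Xa <- scale_sketch(X, "auto")
Xp <- scale_sketch(X, "pareto")
apply(Xa, 2, sd)   # column standard deviations after autoscaling
apply(Xp, 2, sd)   # pareto scaling shrinks large variances less aggressively
```

Because the column means and standard deviations are estimated from the data at hand, recomputing them inside each crossvalidation iteration keeps information from the left-out samples out of the scaling step.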
Function get.biom returns an object of class "BMark", a list containing an element for every fmethod that is selected, as well as an element info. The individual elements contain information depending on the type chosen: for type == "coef", the only element returned is a matrix containing coefficient sizes. For type == "HC" and type == "stab", a list is returned containing elements biom.indices, and either pvals (for type == "HC") or fraction.selected (for type == "stab").
Element biom.indices contains the indices of the selected variables, and can be extracted using function selection. Element pvals contains the p values used to perform HC thresholding; these are presented in the original order of the variables, and can be obtained directly from e.g. t statistics, or from permutation sampling. Element fraction.selected indicates in what fraction of the stability selection iterations a particular variable has been selected. The more often it has been selected, the more stable it is as a biomarker. Generic function coef.biom extracts model coefficients, p values or stability fractions for types "coef", "HC" and "stab", respectively.
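To make the role of the p values in HC thresholding more concrete, the sketch below implements one generic variant of the higher criticism threshold: the sorted p values are compared with their expected positions under the null, and the cut-off is placed where the standardized difference is largest. This is an illustration under stated assumptions, not the package's own routine: the helper name hc_select, the choice of standardization and the default alpha = 0.1 (the fraction of smallest p values considered) are all assumptions.

```r
## Generic higher criticism threshold, for illustration only.
## hc_select() is a hypothetical helper, not a BioMark function.
hc_select <- function(pvals, alpha = 0.1) {
  n <- length(pvals)
  ord <- order(pvals)
  p.sorted <- pvals[ord]
  i <- seq_len(n)
  ## standardized difference between expected and observed sorted p values
  hc <- sqrt(n) * (i / n - p.sorted) / sqrt(i / n * (1 - i / n))
  ## only the smallest alpha * n p values are candidates for the threshold
  cand <- seq_len(max(1, floor(alpha * n)))
  i.max <- cand[which.max(hc[cand])]
  list(biom.indices = sort(ord[seq_len(i.max)]),  # original variable order
       threshold = p.sorted[i.max])
}

## Simulated p values (assumption): 5 strong signals among 200 variables
set.seed(3)
pvals <- c(runif(5, 0, 1e-4), runif(195))
res <- hc_select(pvals)
res$biom.indices
res$threshold
```

The returned indices refer to the original order of the variables, mirroring the way biom.indices and pvals are described above.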
Ron Wehrens
biom.options, get.segments, selection, scalefun
## Real apple data (small set)
data(spikedApples)
apple.coef <- get.biom(X = spikedApples$dataMatrix,
                       Y = factor(rep(1:2, each = 10)),
                       ncomp = 2:3, type = "coef")
coef.sizes <- coef(apple.coef)
sapply(coef.sizes, range)

## stability-based selection
set.seed(17)
apple.stab <- get.biom(X = spikedApples$dataMatrix,
                       Y = factor(rep(1:2, each = 10)),
                       ncomp = 2:3, type = "stab")
selected.variables <- selection(apple.stab)
unlist(sapply(selected.variables, function(x) sapply(x, length)))
## Ranging from more than 70 for pcr, approx 40 for pls and student t,
## to 0-29 for the lasso
unlist(sapply(selected.variables,
              function(x) lapply(x, function(xx, y) sum(xx %in% y),
                                 spikedApples$biom)))
## TPs (stab): all find 5/5, except pcr.2 and the lasso with values for lambda
## larger than 0.0484
unlist(sapply(selected.variables,
              function(x) lapply(x, function(xx, y) sum(!(xx %in% y)),
                                 spikedApples$biom)))
## FPs (stab): PCR finds most FPs (approx. 60), other latent-variable
## methods approx 40, lasso allows for the optimal selection around
## lambda = 0.0702

## regression example
data(gasoline) ## from the pls package
gasoline.stab <- get.biom(gasoline$NIR, gasoline$octane,
                          fmethod = c("pcr", "pls", "lasso"), type = "stab")

## Not run:
## Same for HC-based selection
## Warning: takes a long time!
apple.HC <- get.biom(X = spikedApples$dataMatrix,
                     Y = factor(rep(1:2, each = 10)),
                     ncomp = 2:3, type = "HC")
sapply(apple.HC[names(apple.HC) != "info"],
       function(x, y) sum(x$biom.indices %in% y),
       spikedApples$biom)
sapply(apple.HC[names(apple.HC) != "info"],
       function(x, y) sum(!(x$biom.indices %in% y)),
       spikedApples$biom)
## End(Not run)