Prior to computing the proportion of overlap between ranked vectors of features,
it is necessary to remove the redundant features.
This can be accomplished using a number of methods implemented
in the filterRedundant function, as explained below.
a data.frame from which redundant features (rows) must be removed.
character. The method used for removing redundancy.
Currently available methods are: maxORmin, mean, median, geoMean, and random.
character or numeric. Name or index of the column containing redundant identifiers (e.g. ENTREZID, SYMBOLS, ...).
character or numeric. Name or index of the column
containing the ranking statistics (used only with
logical. Indicates whether the absolute statistics,
as defined by
logical. Indicates whether reordering should be
decreasing or not (used only with
numeric. Indicates whether a trimmed mean should
be computed (used only with
further arguments to be passed (not currently implemented).
The maxORmin method removes
redundant features by selecting the rows
that correspond to the maximum or minimum
value of a selected statistic.
With this approach
redundant features are first
ranked in increasing or decreasing order,
using the ranking statistics defined by the idCol-independent byCol argument,
either in their original or absolute scale.
The ordering direction and the scale are each
controlled by a dedicated logical argument.
Subsequently, data.frame rows corresponding to redundant
identifiers are removed, after these have been identified in
the column defined by the idCol argument.
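
The rank-then-drop-duplicates idea described above can be illustrated with a minimal base R sketch. This is not the code of filterRedundant itself: the toy data.frame, its values, and the variable names are hypothetical, and the sketch assumes ranking on the absolute scale in decreasing order.

```r
# Toy data (hypothetical): identifier "A" appears three times, "B" twice
toy <- data.frame(
  SYMBOL = c("A", "B", "A", "C", "B", "A"),
  t      = c(1.2, -3.5, -4.0, 0.7, 2.1, 3.9),
  stringsAsFactors = FALSE
)

# Rank rows by the absolute value of the statistic, largest first
ranked <- toy[order(abs(toy$t), decreasing = TRUE), ]

# Keep only the first (top-ranked) row for each identifier
dedup <- ranked[!duplicated(ranked$SYMBOL), ]
print(dedup)
```

Each identifier now appears once, and the retained row is the one with the most extreme statistic for that identifier.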
The mean, median, geoMean, and random methods provide alternative ways
of summarizing numerical values corresponding to
redundant features (rows sharing the same identifier):
mean takes the average,
median the median,
geoMean the geometric mean,
random selects a random value.
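
As a rough illustration of these summaries, the base R sketch below collapses a toy vector of values by identifier. The data, the helper names geo_mean and pick_random, and the use of tapply are assumptions made for this example; how filterRedundant computes these summaries internally is not shown here.

```r
# Toy data (hypothetical): strictly positive values with repeated identifiers
toy <- data.frame(
  SYMBOL = c("A", "B", "A", "C", "B", "A"),
  value  = c(2, 4, 8, 5, 16, 4),
  stringsAsFactors = FALSE
)

# Geometric mean; only meaningful for strictly positive values
geo_mean <- function(x) exp(mean(log(x)))

# Pick one value at random; indexing avoids the sample() pitfall
# that occurs when x has length one
pick_random <- function(x) x[sample.int(length(x), 1)]

by_mean   <- tapply(toy$value, toy$SYMBOL, mean)
by_median <- tapply(toy$value, toy$SYMBOL, median)
by_geo    <- tapply(toy$value, toy$SYMBOL, geo_mean)
by_random <- tapply(toy$value, toy$SYMBOL, pick_random)

print(rbind(mean = by_mean, median = by_median, geoMean = by_geo))
```

The three deterministic summaries can differ noticeably when an identifier has only a few values, which is worth checking before choosing one.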
A data.frame with fewer rows than the input one,
unique by the identifier specified by the idCol argument.
filterRedundant is a utility function providing various
methods to remove redundant rows from a data.frame.
The choice of the method depends on the nature of the values,
and the final goal.
Therefore caution should be used when taking the mean
or the median across few values, or when passing arguments
to the maxORmin method (for instance, it would
make no sense at all to use a decreasing ordering if the ranking
statistic is a p-value).
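
The p-value caveat can be seen in a small base R sketch. The toy data and the helper keep_first are hypothetical and only mimic the rank-then-drop-duplicates logic; they are not part of the package.

```r
# Toy data (hypothetical): two identifiers, each with two p-values
toy_p <- data.frame(
  SYMBOL = c("A", "A", "B", "B"),
  p      = c(0.20, 0.001, 0.04, 0.60),
  stringsAsFactors = FALSE
)

# Rank by p-value in the requested direction, then keep the first row per identifier
keep_first <- function(df, decreasing) {
  ranked <- df[order(df$p, decreasing = decreasing), ]
  ranked[!duplicated(ranked$SYMBOL), ]
}

wrong <- keep_first(toy_p, decreasing = TRUE)   # keeps the least significant rows
right <- keep_first(toy_p, decreasing = FALSE)  # keeps the most significant rows
print(wrong)
print(right)
```

With a decreasing ordering the largest p-value per identifier is retained, which is the opposite of what is usually intended.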
Luigi Marchionni <email@example.com>
###load the package and the data
library(matchBox)
data(matchBoxExpression)

###check whether there are redundant identifiers
sapply(matchBoxExpression, nrow)

###the column name for the identifiers
idCol <- "SYMBOL"

###the column name for the ranking statistics
byCol <- "t"

###use lapply to remove redundancy from all data.frames
###default method is "maxORmin"
newMatchBoxExpression <- lapply(matchBoxExpression, filterRedundant,
                                idCol=idCol, byCol=byCol)

###recheck number of rows
sapply(newMatchBoxExpression, nrow)