
Before computing the proportion of overlap between ranked vectors of features,
redundant features must be removed.
This can be accomplished using any of the methods implemented
in the `filterRedundant` function, as explained below.


`object` |
a data.frame from which redundant features (rows) must be removed. |

`method` |
character. The method used for removing redundancy.
Currently available methods are: |

`idCol` |
character or numeric. Name or index of the column containing redundant identifiers (e.g. ENTREZID, SYMBOLS, ...). |

`byCol` |
character or numeric. Name or index of the column
containing the ranking statistics (used only with |

`absolute` |
logical. Indicates whether the absolute statistics,
as defined by |

`decreasing` |
logical. Indicates whether reordering should be
decreasing or not (used only with |

`trim` |
numeric. Indicates whether a trimmed mean should
be computed (used only with |

`...` |
further arguments to be passed (not currently implemented). |

The `maxORmin` method removes redundant features by selecting the rows
that correspond to the maximum or minimum value of a selected statistic.
With this approach, redundant features are first ranked in increasing or
decreasing order, as defined by the `decreasing` argument,
using the ranking statistic defined by `byCol`,
either on its original or absolute scale, as defined by the `absolute` argument.
Subsequently, data.frame rows corresponding to redundant identifiers are removed,
after these have been identified in the column defined by `idCol`
using the `duplicated` function.
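The rank-then-deduplicate procedure described above can be sketched in base R. This is an illustration only, not the matchBox source code: the helper `keep_extreme`, its default argument values, and the toy data are all hypothetical.

```r
# Toy data (hypothetical values): "A" appears three times, "B" twice.
toy <- data.frame(
  SYMBOL = c("A", "B", "A", "C", "B", "A"),
  t      = c(1.2, -3.5, -4.1, 0.7, 2.0, 3.9),
  stringsAsFactors = FALSE
)

# Hypothetical helper mimicking the described steps; defaults are assumptions.
keep_extreme <- function(df, idCol, byCol, absolute = TRUE, decreasing = TRUE) {
  stat <- df[[byCol]]
  # Optionally rank on the absolute scale.
  if (absolute) stat <- abs(stat)
  # Step 1: order rows by the ranking statistic.
  ranked <- df[order(stat, decreasing = decreasing), , drop = FALSE]
  # Step 2: keep only the first row found for each identifier.
  ranked[!duplicated(ranked[[idCol]]), , drop = FALSE]
}

# Keep, for each symbol, the row with the largest absolute statistic.
dedup <- keep_extreme(toy, idCol = "SYMBOL", byCol = "t")

# Keep, for each symbol, the row with the largest signed statistic.
dedup_signed <- keep_extreme(toy, idCol = "SYMBOL", byCol = "t",
                             absolute = FALSE)
```

Because `duplicated` flags every occurrence of an identifier after the first, the ordering step alone decides which row survives.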

The `mean`, `median`, `geoMean`, and `random` methods provide alternative ways
of summarizing the numerical values corresponding to
redundant features, as defined by the `idCol` argument:
`mean` takes the average,
`median` the median,
`geoMean` the geometric mean,
and `random` selects a random value.
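As a rough illustration of these summaries, the sketch below collapses redundant identifiers with base R's `aggregate`. It does not reproduce the internals of `filterRedundant`; the toy data and the `geo_mean` and `pick_one` helpers are assumptions made for the example.

```r
# Toy data (hypothetical values): "A" appears three times, "B" twice.
toy <- data.frame(
  SYMBOL = c("A", "B", "A", "C", "B", "A"),
  value  = c(2, 4, 8, 5, 16, 4),
  stringsAsFactors = FALSE
)

# Geometric mean; only meaningful for strictly positive values.
geo_mean <- function(x) exp(mean(log(x)))

# Pick one element at random (indexing avoids the sample(x, 1) pitfall
# when x has length one).
pick_one <- function(x) x[sample.int(length(x), 1)]

# One summarized row per identifier, for each summary type.
by_mean   <- aggregate(value ~ SYMBOL, data = toy, FUN = mean)
by_median <- aggregate(value ~ SYMBOL, data = toy, FUN = median)
by_geo    <- aggregate(value ~ SYMBOL, data = toy, FUN = geo_mean)

set.seed(1)  # make the random choice reproducible
by_random <- aggregate(value ~ SYMBOL, data = toy, FUN = pick_one)
```

The geometric mean is often preferred for ratio-like quantities, while the random choice leaves the original values untouched at the cost of reproducibility unless a seed is set.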

A data.frame with fewer rows than the input one,
unique by the identifier specified by the `idCol` argument.

`filterRedundant` is a utility function providing various
methods to remove redundant rows from a data.frame.
The choice of method depends on the nature of the values
and on the final goal.
Caution should therefore be used when taking the mean
or the median across few values, or when passing arguments
to the `maxORmin` method (for instance, it would
make no sense to use a decreasing ordering if the ranking
statistic is a p-value).
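The p-value caveat can be made concrete with a small base R sketch. The data are invented, and the code mirrors the order-then-`duplicated` logic described for `maxORmin` rather than calling the package itself.

```r
# Toy p-values (hypothetical): each symbol is measured twice.
pvals <- data.frame(
  SYMBOL = c("A", "A", "B", "B"),
  p      = c(0.20, 0.001, 0.03, 0.5),
  stringsAsFactors = FALSE
)

# Increasing order: the most significant row per symbol comes first.
smallest_first <- pvals[order(pvals$p, decreasing = FALSE), ]
best <- smallest_first[!duplicated(smallest_first$SYMBOL), ]

# Decreasing order: the least significant row per symbol is kept instead,
# which is almost never what is wanted for p-values.
largest_first <- pvals[order(pvals$p, decreasing = TRUE), ]
worst <- largest_first[!duplicated(largest_first$SYMBOL), ]
```

Here `best` retains the smallest p-value for each symbol, whereas `worst` retains the largest, so the ordering direction should always match the meaning of the ranking statistic.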

Luigi Marchionni <marchion@jhu.edu>

See `duplicated`.

```
### load the package and the example data
library(matchBox)
data(matchBoxExpression)
### count the rows of each data.frame before filtering
### (identifiers may be redundant at this stage)
sapply(matchBoxExpression, nrow)
### the column name for the identifiers
idCol <- "SYMBOL"
### the column name for the ranking statistics
byCol <- "t"
### use lapply to remove redundancy from all data.frames
### default method is "maxORmin"
newMatchBoxExpression <- lapply(matchBoxExpression, filterRedundant,
                                idCol = idCol, byCol = byCol)
### recheck the number of rows
sapply(newMatchBoxExpression, nrow)
```
