filterRedundant: This function removes redundant features from a data.frame

In marchion/matchBox: Utilities to compute, compare, and plot the agreement between ordered vectors of features (i.e. distinct genomic experiments). The package includes Correspondence-At-the-TOP (CAT) analysis.

Description Usage Arguments Details Value Note Author(s) See Also Examples

Before computing the proportion of overlap between ranked vectors of features, it is necessary to remove redundant features. This can be accomplished using one of the methods implemented in the filterRedundant function, as explained below.

filterRedundant(object,
    method=c("maxORmin", "geoMean", "mean", "median","random"),
    idCol=1, byCol=2, absolute=TRUE, decreasing=TRUE, trim=0, ...)

`object`	a data.frame from which redundant features (rows) must be removed.
`method`	character. The method used for removing redundancy. Currently available methods are: `maxORmin`, `geoMean`, `random`, `mean`, and `median` (see Details below).
`idCol`	character or numeric. Name or index of the column containing redundant identifiers (e.g. ENTREZID, SYMBOLS, ...).
`byCol`	character or numeric. Name or index of the column containing the ranking statistics (used only with `maxORmin` method).
`absolute`	logical. Indicates whether the absolute statistics, as defined by `byCol`, should be used when reordering (used only with `maxORmin` method).
`decreasing`	logical. Indicates whether reordering should be decreasing or not (used only with `maxORmin` method).
`trim`	numeric. Indicates whether a trimmed mean should be computed (used only with `mean` method).
`...`	further arguments to be passed (not currently implemented).
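
As a rough illustration of what a trimmed mean does, the sketch below uses base R's `mean()` and its `trim` argument. That filterRedundant forwards its own `trim` value to `mean()` in this way is an assumption, not something stated above; the vector `x` is made-up data.

```r
### made-up values with one extreme observation
x <- c(1, 2, 3, 100)

### ordinary mean: the extreme value dominates
plainMean <- mean(x)

### trimmed mean: drops 25% of the observations from each end first
### (assumption: filterRedundant's trim behaves like base R's mean(trim=))
trimmedMean <- mean(x, trim = 0.25)

plainMean
trimmedMean
```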

The maxORmin method removes redundant features by selecting the rows that correspond to the maximum or minimum value of a selected statistic. With this approach, rows are first ranked in increasing or decreasing order, as defined by the decreasing argument, using the ranking statistic defined by byCol, either on its original or absolute scale, as defined by the absolute argument. Rows whose identifier, in the column defined by idCol, has already appeared are then flagged with the duplicated function and removed from the data.frame.
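
The order-then-deduplicate idea described above can be sketched in base R as follows. This is a minimal illustration, not the package's actual source; the data.frame `toy` and its column names are hypothetical.

```r
### hypothetical toy data: two identifiers appear more than once
toy <- data.frame(
    SYMBOL = c("A", "B", "A", "C", "B"),
    t      = c(1.5, -3.0, -2.5, 0.7, 2.0),
    stringsAsFactors = FALSE
)

### rank rows by the absolute statistic, largest first
### (mirrors absolute=TRUE and decreasing=TRUE)
ranked <- toy[order(abs(toy$t), decreasing = TRUE), ]

### keep only the first, top-ranked row for each identifier
deduped <- ranked[!duplicated(ranked$SYMBOL), ]
deduped
```

Because duplicated marks every occurrence after the first, the ordering step alone decides which row survives for each identifier.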

The mean, median, geoMean, and random methods provide alternative ways of summarizing the numerical values corresponding to redundant features, as defined by the idCol argument: mean takes the average, median takes the median, geoMean takes the geometric mean, and random selects a random value.
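
The four summaries can be compared on a single numeric column with base R's `tapply`. This is only a sketch under assumptions: the data, the column names, and the helper `geoMeanSketch` are hypothetical, and how the package itself handles multiple columns or non-positive values is not covered here.

```r
### hypothetical toy data with repeated identifiers
toy <- data.frame(
    SYMBOL = c("A", "B", "A", "C", "B"),
    expr   = c(2, 4, 8, 5, 16),
    stringsAsFactors = FALSE
)

### hypothetical helper: geometric mean, valid for positive values only
geoMeanSketch <- function(x) exp(mean(log(x)))

### one summary value per identifier
byMean   <- tapply(toy$expr, toy$SYMBOL, mean)
byMedian <- tapply(toy$expr, toy$SYMBOL, median)
byGeo    <- tapply(toy$expr, toy$SYMBOL, geoMeanSketch)

### pick one value at random per identifier; indexing through sample.int
### avoids the sample(x, 1) pitfall when a group holds a single number
set.seed(1)
byRandom <- tapply(toy$expr, toy$SYMBOL,
                   function(x) x[sample.int(length(x), 1)])
```

With skewed values the geometric mean sits below the arithmetic mean, which is one reason the choice of summary depends on the scale of the data.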

A data.frame with fewer rows than the input one, unique by the identifier specified by the idCol argument.

filterRedundant is a utility function providing various methods to remove redundant rows from a data.frame. The choice of method depends on the nature of the values and on the final goal. Caution should therefore be used when taking the mean or the median across few values, or when setting the arguments for the maxORmin method (for instance, it would make no sense to use a decreasing ordering if the ranking statistic is a p-value).
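
The p-value caveat can be seen in a small base R sketch. It reuses the order-then-deduplicate idea rather than calling the package, and the data.frame `toyP` is hypothetical.

```r
### hypothetical p-values for two identifiers, each measured twice
toyP <- data.frame(
    SYMBOL = c("A", "A", "B", "B"),
    p      = c(0.20, 0.01, 0.03, 0.50),
    stringsAsFactors = FALSE
)

### increasing order keeps the smallest p-value per identifier
incr    <- toyP[order(toyP$p, decreasing = FALSE), ]
keptMin <- incr[!duplicated(incr$SYMBOL), ]

### decreasing order keeps the largest p-value per identifier,
### that is, the least significant row, which is rarely what is wanted
decr    <- toyP[order(toyP$p, decreasing = TRUE), ]
keptMax <- decr[!duplicated(decr$SYMBOL), ]

keptMin
keptMax
```

For a p-value column, this corresponds to calling filterRedundant with decreasing=FALSE, and absolute has no effect because p-values are never negative.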

Luigi Marchionni <marchion@jhu.edu>

See duplicated.

###load the package and the example data
library(matchBox)
data(matchBoxExpression)

###check the number of rows in each data.frame before filtering
sapply(matchBoxExpression, nrow)

###the column name for the identifiers
idCol <- "SYMBOL"

###the column name for the ranking statistics
byCol <- "t"

###use lapply to remove redundancy from all data.frames
###default method is "maxORmin"
newMatchBoxExpression <- lapply(matchBoxExpression, filterRedundant, idCol=idCol, byCol=byCol)

###recheck number of rows
sapply(newMatchBoxExpression, nrow)

marchion/matchBox documentation built on May 9, 2019, 4:07 p.m.
