removeDuplicates | R Documentation |
The removeDuplicates function removes duplicate columns from a binaryMatrix object in the Mercator package.
removeDuplicates(object)
removeDuplicateFeatures(object)
object | An object of class binaryMatrix. |
In some analyses, it may be desirable to remove duplicate features, collapsing a group of identical, related events into a single feature so that they are not overweighted during clustering. Historically, this function was called removeDuplicateFeatures. That name is retained for backwards compatibility, but it may be deprecated in future versions in favor of removeDuplicates.
In the same way, for some clustering applications, it may be useful to remove duplicate samples, that is, samples that have an identical feature set.
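The collapsing of identical features described above can be illustrated with base R on a plain 0/1 matrix. This is only a sketch of the idea and is not the Mercator implementation; the matrix m and the helper name drop_duplicate_columns are made up for illustration.

```r
# Hypothetical helper (not part of Mercator): keep the first copy of each
# distinct column and drop every later copy.
drop_duplicate_columns <- function(m) {
  m[, !duplicated(m, MARGIN = 2), drop = FALSE]
}

# Three features measured on four samples; F3 is an exact copy of F1.
m <- cbind(F1 = c(1, 0, 1, 0),
           F2 = c(0, 0, 1, 1),
           F3 = c(1, 0, 1, 0))

reduced <- drop_duplicate_columns(m)
colnames(reduced)  # "F1" "F2"
```

Only the first column of each identical group survives, so the group contributes once, rather than several times, to any distance computed across features.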
Removal of duplicates is not required for performance of the binaryMatrix or Mercator objects and associated functions.
The history slot of the binaryMatrix object documents the removal of duplicates.
Future versions of this package may include functionality to store the identities of any duplicates that were removed.
Returns an object of class binaryMatrix with duplicate columns removed.
Transposing the binaryMatrix can allow the removeDuplicates function to be applied to both features and observations, if desired.
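The transposition trick can be sketched with base R on a plain matrix: a routine that only handles columns is applied once, then applied again to the transpose, and the result is transposed back. The helper drop_duplicate_columns and the toy matrix are assumptions for illustration and do not use the Mercator classes.

```r
# Hypothetical column-only helper (not part of Mercator).
drop_duplicate_columns <- function(m) {
  m[, !duplicated(m, MARGIN = 2), drop = FALSE]
}

# Rows are samples, columns are features.
# F3 repeats F1 (duplicate feature); S3 repeats S1 (duplicate sample).
m <- matrix(c(1, 0, 1,
              0, 1, 0,
              1, 0, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("S1", "S2", "S3"), c("F1", "F2", "F3")))

# Pass 1: remove duplicate columns directly.
features_done <- drop_duplicate_columns(m)

# Pass 2: transpose so that rows become columns, remove duplicates,
# then transpose back to the original orientation.
both_done <- t(drop_duplicate_columns(t(features_done)))

dim(both_done)       # 2 2
rownames(both_done)  # "S1" "S2"
colnames(both_done)  # "F1" "F2"
```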
Features containing exclusively 0s or 1s may interfere with performance of removeDuplicates.
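One possible way to spot such constant features before building a binaryMatrix is a base R check on the raw 0/1 matrix. This is a sketch under the assumption that the data are still available as a plain numeric matrix; the variable names are illustrative, and whether constant features should actually be dropped depends on the analysis.

```r
# Toy 0/1 matrix with two constant features.
m <- cbind(A = c(0, 0, 0, 0),   # all zeros
           B = c(1, 0, 1, 0),
           C = c(1, 1, 1, 1),   # all ones
           D = c(0, 1, 1, 0))

# A 0/1 column is constant when its sum is 0 or equals the number of rows.
col_totals  <- colSums(m)
is_constant <- col_totals == 0 | col_totals == nrow(m)

names(which(is_constant))  # "A" "C"

# Keep only the informative (non-constant) features.
filtered <- m[, !is_constant, drop = FALSE]
```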
Kevin R. Coombes <krc@silicovore.com>, Caitlin E. Coombes
library(Mercator)

# Simulate a 100 x 50 binary matrix, then append copies of the first 5 columns
my.matrix <- matrix(rbinom(50*100, 1, 0.15), ncol=50)
my.matrix <- cbind(my.matrix, my.matrix[, 1:5]) # add duplicates
dimnames(my.matrix) <- list(paste("R", 1:100, sep=''),
                            paste("C", 1:55, sep=''))
my.binmat <- BinaryMatrix(my.matrix)
dim(my.binmat) # dimensions before removal
my.binmat <- removeDuplicates(my.binmat)
dim(my.binmat) # dimensions after removal