remove.dup.rows: Remove duplicate rows
In cwhmisc: Miscellaneous Functions for Math, Plotting, Printing, Statistics, Strings, and Tools

Description Usage Arguments Details Value Note Author(s) Examples

Removes duplicate rows from a dataframe.

remove.dup.rows(dfr)

dfr	A dataframe.

Uses the function eql.

The dataframe with only one copy of each set of identical rows.

Method: Sort the dataframe, then figure out which rows have all values identical to their successor. This gives a logical vector in the order of the sorted rows, so reorder it to match the original row order. Finally, select the non-duplicates. As a "bonus feature", I think this will also remove any row containing all NAs...

A major stumbling block is that you'll want two NAs to compare equal, hence the eql() function.
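The method described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package source: na_equal and dedup_by_sort are hypothetical names, and na_equal only stands in for eql by treating two NAs as equal.

```r
# Hypothetical helper (an assumption; not necessarily identical to cwhmisc's eql):
# elementwise equality where NA equals NA and NA never equals a non-NA value.
na_equal <- function(x, y) {
  both_na <- is.na(x) & is.na(y)
  same_value <- !is.na(x) & !is.na(y) & x == y
  both_na | same_value
}

# Hypothetical sketch of the sort / compare-with-successor / reorder method.
dedup_by_sort <- function(dfr) {
  # Row order that sorts the dataframe by all of its columns (NAs sort last).
  o <- do.call("order", unname(as.list(dfr)))
  sorted <- dfr[o, , drop = FALSE]
  # One logical column per variable: does each value equal its successor?
  # The last row has no successor, so it is compared against an NA pad.
  isdup <- do.call("cbind", lapply(sorted, function(x) na_equal(x, c(x[-1], NA))))
  # A sorted row duplicates its successor when every column matches.
  all_dup_sorted <- rowSums(!isdup) == 0
  # Map the flags from sorted order back to the original row order.
  all_dup <- logical(nrow(dfr))
  all_dup[o] <- all_dup_sorted
  dfr[!all_dup, , drop = FALSE]
}

dfr <- data.frame(matrix(c(1:3, 2:4, 1:3, 1:3, 2:4, 3:5), 6, byrow = TRUE))
res <- dedup_by_sort(dfr)  # one copy of each distinct row remains
```

In this sketch a row consisting only of NAs sorts last and compares equal to the NA pad, which would account for the "bonus feature" mentioned above. Of several identical rows, the sketch keeps the one that comes last in the sorted order.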

Actually, I think you can do away with the isdup array and do

all.dup <- do.call("pmin", lapply(dfr[o,], function(x) eql(x,c(x[-1],NA))))

and there may be further cleanups possible.

One dirty trick which is much quicker but not quite as reliable is

dfr[!duplicated(do.call("paste",dfr)), ]

(watch out for character strings with embedded spaces, and for numeric values whose small differences vanish when they are converted to text!)
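A small illustration of both pitfalls of the paste trick. The data and variable names here are made up for the example; the behaviour relies on paste joining columns with a single space and converting numbers to text with limited precision.

```r
# Two distinct rows whose pasted keys collide because of an embedded space.
chr <- data.frame(a = c("x y", "x"), b = c("z", "y z"), stringsAsFactors = FALSE)
chr_keys <- do.call("paste", chr)          # both keys become "x y z"
chr_quick <- chr[!duplicated(chr_keys), ]  # the second, distinct row is dropped

# Two numbers that are not equal but convert to the same text.
num <- data.frame(x = c(0.1 + 0.2, 0.3))
num_keys <- do.call("paste", num)
num_quick <- num[!duplicated(num_keys), , drop = FALSE]  # only one row survives
```

Pasting with a separator that cannot occur in the data would avoid the first collision, but it does not help with the numeric case.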

Peter Dalgaard, p.dalgaard@biostat.ku.dk

dfr <- data.frame(matrix(c(1:3,2:4,1:3,1:3,2:4,3:5),6,byrow=TRUE))
remove.dup.rows(dfr)

cwhmisc documentation built on May 1, 2019, 7:55 p.m.
