cleaner: cleaner

View source: R/cleaner.R

cleaner    R Documentation

cleaner

Description

cleaner removes duplicate entries from a data.frame.

is_clean checks whether a data.frame has duplicate entries.

cleaned shows which rows were removed from a data.frame by cleaner.

Usage

cleaner(d, ufn, orderAlsoBy = character(), decr = FALSE)

is_clean(d, ufn = names(d))

cleaned(before, after)

Arguments

d

a data.frame

ufn

character vector of column names that together identify a unique entry (ufn is an abbreviation of "unique field names")

orderAlsoBy

optional character, name of a column by which d is ordered prior to cleaning; as the examples show, this ordering decides which of several duplicate entries is kept

decr

logical, specifying whether orderAlsoBy is sorted in decreasing order (defaults to FALSE)

before

a data.frame

after

a data.frame that has been cleaned with cleaner
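
The interplay of ufn, orderAlsoBy and decr can be illustrated with a minimal base R sketch. This is not the kungfu implementation: dedupe_sketch is a hypothetical helper, the heartrate data below is invented toy data, and the sketch assumes that orderAlsoBy names a single column and that the first row per key (after ordering) is the one kept.

```r
# Hypothetical toy data; the real 'heartrate' data set is not shown on this page.
heartrate <- data.frame(
  person    = c("a", "a", "b", "b", "b"),
  condition = c("rest", "rest", "rest", "run", "run"),
  timestamp = c(2, 1, 1, 5, 3),
  bpm       = c(62, 60, 70, 150, 140),
  stringsAsFactors = FALSE
)

# Hypothetical helper, NOT the kungfu implementation: order by the key
# columns plus one optional extra column, then keep the first row per key.
dedupe_sketch <- function(d, ufn, orderAlsoBy = character(), decr = FALSE) {
  if (length(orderAlsoBy) > 0) {
    # Assumption: orderAlsoBy names exactly one column.
    extra <- xtfrm(d[[orderAlsoBy]])
    # Negate the sort key to get decreasing order within each key group.
    if (decr) extra <- -extra
    ord <- do.call(order, c(unname(as.list(d[ufn])), list(extra)))
    d <- d[ord, , drop = FALSE]
  }
  # duplicated() flags every repeat of a key after its first occurrence.
  d[!duplicated(d[ufn]), , drop = FALSE]
}

onlyfirst <- dedupe_sketch(heartrate, c("person", "condition"), "timestamp")
onlylast  <- dedupe_sketch(heartrate, c("person", "condition"), "timestamp", TRUE)
```

In this sketch, decr only flips the order of the extra column, so the same "keep the first row per key" rule yields the earliest or the latest measurement.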

Value

cleaner returns a cleaned (i.e. deduplicated) data.frame.

is_clean returns a logical: TRUE if no duplicate entries were found, FALSE otherwise.

cleaned returns a data.frame containing the entries in before that have been cleaned.
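
The return values of is_clean and cleaned can be approximated in base R as follows. This is a rough sketch rather than the package code: the data, the key vector and the row_id helper are hypothetical, and comparing whole rows as pasted strings is an assumption made for illustration.

```r
# Hypothetical toy data standing in for 'before'.
before <- data.frame(
  person    = c("a", "a", "b"),
  condition = c("rest", "rest", "rest"),
  timestamp = c(1, 2, 1),
  stringsAsFactors = FALSE
)
key <- c("person", "condition")

# Duplicate check: anyDuplicated() returns 0 when no key combination repeats.
no_dups <- anyDuplicated(before[key]) == 0

# Deduplicated version, keeping the first row per key.
after <- before[!duplicated(before[key]), , drop = FALSE]

# Hypothetical helper: one string per row, used to compare rows across frames.
row_id <- function(d) do.call(paste, c(unname(as.list(d)), sep = "\r"))

# Rows present in 'before' but absent from 'after', i.e. the removed entries.
removed <- before[!(row_id(before) %in% row_id(after)), , drop = FALSE]
```

Note that this string comparison cannot detect a removed row that is identical in every column to a row that was kept.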

Examples

# clean heart rate data, keeping only the first measurement per person and condition
onlyfirst <- cleaner(heartrate, c("person", "condition"), "timestamp")
# alternatively, keep only the last measurement
onlylast <- cleaner(heartrate, c("person", "condition"), "timestamp", TRUE)
# check whether 'heartrate' is free of duplicates
is_clean(heartrate, c("person", "condition")) # returns FALSE
# show entries cleaned from 'heartrate' compared to 'onlyfirst'
cleaned(heartrate, onlyfirst) # contains the rows that were removed

joheli/kungfu documentation built on March 25, 2024, 10:10 a.m.