df.duplicated: Extract Duplicated or Unique Rows


df.duplicated    R Documentation

Extract Duplicated or Unique Rows

Description

The function df.duplicated extracts duplicated rows and the function df.unique extracts unique rows from a matrix or data frame.

Usage

df.duplicated(..., data, first = TRUE, keep.all = TRUE, from.last = FALSE,
              keep.row.names = TRUE, check = TRUE)

df.unique(..., data, keep.all = TRUE, from.last = FALSE,
          keep.row.names = TRUE, check = TRUE)

Arguments

...

an expression indicating the variable names in data used to determine duplicated or unique rows, e.g., df.duplicated(x1, x2, data = dat). Note that the operators ., +, -, ~, :, ::, and ! can also be used to select variables; see Details in the df.subset function.

data

a data frame.

first

logical: if TRUE (default), the df.duplicated() function will return duplicated rows including the first of identical rows.

keep.all

logical: if TRUE (default), the function will return all variables in data after extracting duplicated or unique rows based on the variables specified in the argument ....

from.last

logical: if TRUE, duplication will be considered from the reversed side, i.e., the last of identical rows would correspond to duplicated = FALSE. Note that in df.duplicated() this argument is only used when first = FALSE.

keep.row.names

logical: if TRUE (default), the row names from data are kept, otherwise they are set to NULL.

check

logical: if TRUE (default), argument specification is checked.
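
The interplay of the first and from.last arguments can be illustrated with the base R function duplicated(). The following is a minimal sketch in base R only, not the implementation used by misty; the object names key, dup.forward, dup.reverse, and dup.any are hypothetical and chosen for illustration.

```r
# Base R sketch (an assumption about the semantics, not misty's own code)
dat <- data.frame(x1 = c(1, 1, 2, 1, 4),
                  x2 = c(1, 1, 2, 1, 6),
                  x3 = c(2, 2, 3, 2, 6),
                  x4 = c(1, 1, 2, 2, 4),
                  x5 = c(1, 1, 4, 4, 3))

# Variable(s) used to determine duplication, here x4 only
key <- dat[, "x4", drop = FALSE]

# Later copies only: the first of identical rows is not flagged
# (comparable in spirit to first = FALSE, from.last = FALSE)
dup.forward <- duplicated(key)

# Earlier copies only: the last of identical rows is not flagged
# (comparable in spirit to first = FALSE, from.last = TRUE)
dup.reverse <- duplicated(key, fromLast = TRUE)

# Every row that has at least one identical partner, first included
# (comparable in spirit to first = TRUE)
dup.any <- dup.forward | dup.reverse

# Rows of dat flagged under each rule
dat[dup.forward, ]
dat[dup.reverse, ]
dat[dup.any, ]
```

Combining the forward and reversed scans with a logical OR is one common way to keep the first of identical rows in the result; whether df.duplicated() proceeds this way internally is not stated here.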

Details

Note that df.unique(x) is equivalent to unique(x). The main difference between df.unique() and unique() is that df.unique() provides the ... argument for specifying one or more variables that are used to determine unique rows.
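
For comparison, determining unique rows from a subset of variables can be sketched in base R as follows. This is an illustrative sketch under the assumption that the first of identical rows is the one retained; it is not taken from the misty source, and the object names vars, uniq.all, and uniq.sel are hypothetical.

```r
# Base R sketch (illustrative only, not misty's own code)
dat <- data.frame(x1 = c(1, 1, 2, 1, 4),
                  x2 = c(1, 1, 2, 1, 6),
                  x3 = c(2, 2, 3, 2, 6),
                  x4 = c(1, 1, 2, 2, 4),
                  x5 = c(1, 1, 4, 4, 3))

# Variables used to determine unique rows
vars <- c("x2", "x3")

# Unique rows based on x2 and x3, all variables returned
# (comparable in spirit to keep.all = TRUE)
uniq.all <- dat[!duplicated(dat[, vars]), ]

# Unique rows based on x2 and x3, only these variables returned
# (comparable in spirit to keep.all = FALSE)
uniq.sel <- unique(dat[, vars])

# Discard the original row names so that rows are renumbered
# (comparable in spirit to keep.row.names = FALSE)
rownames(uniq.sel) <- NULL

uniq.all
uniq.sel
```

With unique() alone, the selected columns are both the basis for comparison and the only columns returned, which is why the first variant falls back on duplicated() to keep the remaining variables.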

Value

Returns duplicated or unique rows of the data frame in ... or data.

Author(s)

Takuya Yanagida takuya.yanagida@univie.ac.at

References

Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.

See Also

df.merge, df.move, df.rbind, df.rename, df.sort, df.subset

Examples

dat <- data.frame(x1 = c(1, 1, 2, 1, 4),
                  x2 = c(1, 1, 2, 1, 6),
                  x3 = c(2, 2, 3, 2, 6),
                  x4 = c(1, 1, 2, 2, 4),
                  x5 = c(1, 1, 4, 4, 3))

#-------------------------------------------------------------------------------
# df.duplicated() function

# Example 1: Extract duplicated rows based on all variables
df.duplicated(., data = dat)

# Example 2: Extract duplicated rows based on x4
df.duplicated(x4, data = dat)

# Example 3: Extract duplicated rows based on x2 and x3
df.duplicated(x2, x3, data = dat)

# Example 4: Extract duplicated rows based on all variables
# exclude first of identical rows
df.duplicated(., data = dat, first = FALSE)

# Example 5: Extract duplicated rows based on x2 and x3
# do not return all variables
df.duplicated(x2, x3, data = dat, keep.all = FALSE)

# Example 6: Extract duplicated rows based on x4
# consider duplication from the reversed side
df.duplicated(x4, data = dat, first = FALSE, from.last = TRUE)

# Example 7: Extract duplicated rows based on x2 and x3
# set row names to NULL
df.duplicated(x2, x3, data = dat, keep.row.names = FALSE)

#-------------------------------------------------------------------------------
# df.unique() function

# Example 8: Extract unique rows based on all variables
df.unique(., data = dat)

# Example 9: Extract unique rows based on x4
df.unique(x4, data = dat)

# Example 10: Extract unique rows based on x1, x2, and x3
df.unique(x1, x2, x3, data = dat)

# Example 11: Extract unique rows based on x2 and x3
# do not return all variables
df.unique(x2, x3, data = dat, keep.all = FALSE)

# Example 12: Extract unique rows based on x4
# consider duplication from the reversed side
df.unique(x4, data = dat, from.last = TRUE)

# Example 13: Extract unique rows based on x2 and x3
# set row names to NULL
df.unique(x2, x3, data = dat, keep.row.names = FALSE)

misty documentation built on Oct. 24, 2024, 5:10 p.m.
