checkdupl: Find and remove duplicated row observations between two data...


checkdupl R Documentation

Find and remove duplicated row observations between two data sets

Description

Function checkdupl finds the duplicated row observations between two matrices or data frames.

Function rmdupl removes the duplicated row observations within a matrix or data frame (it takes a single data set X).

Usage


checkdupl(X, Y, nam = NULL, digits = NULL, check.all = FALSE)

rmdupl(X, nam = NULL, digits = NULL, check.all = FALSE)

Arguments

X

A matrix or data frame, compared to Y.

Y

A matrix or data frame, compared to X.

nam

The names of the variables to consider in X and Y: the test of duplication is undertaken only over the variables in nam. If NULL (default), nam is set to all the column names of X. The variables set in nam must be common between X and Y.

digits

The number of digits used when rounding the variables (set in nam) before the test. Defaults to NULL (no rounding).

check.all

Logical (default = FALSE). If TRUE, an additional test of duplication is undertaken considering all the columns of X (even if nam is defined as a subset of these columns).
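To make the roles of nam and digits concrete, here is a minimal base-R sketch of a duplicate test between two data sets. It is not the rnirs implementation: the helper name find_dupl_sketch is hypothetical, the variables in nam are assumed to be numeric, and the use of merge() is an assumption chosen for illustration. Only the output column names rownum.X and rownum.Y follow the package examples.

```r
# Hypothetical helper (NOT part of rnirs): illustrates testing duplication
# on a subset of columns, with optional rounding beforehand.
find_dupl_sketch <- function(X, Y, nam = NULL, digits = NULL) {
  X <- as.data.frame(X)
  Y <- as.data.frame(Y)
  # If nam is NULL, use all the column names of X
  if (is.null(nam)) nam <- colnames(X)
  # Keep only the variables on which duplication is tested
  kx <- X[, nam, drop = FALSE]
  ky <- Y[, nam, drop = FALSE]
  # Optional rounding before the comparison (assumes numeric columns)
  if (!is.null(digits)) {
    kx <- round(kx, digits)
    ky <- round(ky, digits)
  }
  # Record the original row positions in each data set
  kx$rownum.X <- seq_len(nrow(kx))
  ky$rownum.Y <- seq_len(nrow(ky))
  # An inner join on nam returns every pair of matching rows
  merge(kx, ky, by = nam)
}

A <- data.frame(v1 = c(1.001, 2), v2 = c(5, 6))
B <- data.frame(v1 = c(1.002, 9), v2 = c(5, 6))
find_dupl_sketch(A, B)               # no match without rounding
find_dupl_sketch(A, B, digits = 2)   # row 1 of A matches row 1 of B
```

Restricting nam to fewer columns loosens the test, so more row pairs can be reported as duplicates.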

Value

A data frame reporting the duplicated rows.

Examples


dat1 <- matrix(c(1:5, 1:5, c(1, 2, 7, 4, 8)), nrow = 3, byrow = TRUE)
dimnames(dat1) <- list(1:3, c("v1", "v2", "v3", "v4", "v5"))

dat2 <- matrix(c(6:10, 1:5, c(1, 2, 7, 6, 12)), nrow = 3, byrow = TRUE)
dimnames(dat2) <- list(1:3, c("v1", "v2", "v3", "v4", "v5"))

dat1
dat2

checkdupl(dat1, dat2)

checkdupl(dat1, dat2, nam = c("v1", "v2"))

checkdupl(dat1, dat2, nam = c("v1", "v2"), check.all = TRUE)

z <- checkdupl(X = dat1, Y = dat1)
z[z$rownum.X != z$rownum.Y, ]

z <- checkdupl(dat1, dat1, nam = c("v1", "v2"))
z[z$rownum.X != z$rownum.Y, ]

rmdupl(dat1)

rmdupl(dat1, nam = c("v1", "v2"))
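For comparison, removing duplicated rows within a single data set can be sketched in base R with duplicated(). This is an illustration only: the helper name rm_dupl_sketch is hypothetical, the variables in nam are assumed to be numeric, and keeping the first occurrence of each duplicated row is an assumption that may differ from what rmdupl actually returns.

```r
# Hypothetical helper (NOT part of rnirs): drops rows that repeat an
# earlier row on the variables in nam, keeping the first occurrence.
rm_dupl_sketch <- function(X, nam = NULL, digits = NULL) {
  X <- as.data.frame(X)
  if (is.null(nam)) nam <- colnames(X)
  key <- X[, nam, drop = FALSE]
  # Optional rounding before the comparison (assumes numeric columns)
  if (!is.null(digits)) key <- round(key, digits)
  # duplicated() flags every row identical to an earlier one
  X[!duplicated(key), , drop = FALSE]
}

d <- matrix(c(1:5, 1:5, c(1, 2, 7, 4, 8)), nrow = 3, byrow = TRUE)
dimnames(d) <- list(1:3, c("v1", "v2", "v3", "v4", "v5"))

rm_dupl_sketch(d)                        # rows 1 and 3 are kept
rm_dupl_sketch(d, nam = c("v1", "v2"))   # only row 1 is kept
```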


mlesnoff/rnirs documentation built on April 24, 2023, 4:17 a.m.