# checkdupl: Find and remove duplicated row observations between two data... In mlesnoff/rnirs: Dimension reduction, Regression and Discrimination for Chemometrics

 checkdupl R Documentation

## Find and remove duplicated row observations between two data sets

### Description

Function `checkdupl` finds the duplicated row observations between two matrices or data frames.

Function `rmdupl` removes the duplicated row observations between two matrices or data frames.

### Usage

``````
checkdupl(X, Y, nam = NULL, digits = NULL, check.all = FALSE)

rmdupl(X, nam = NULL, digits = NULL, check.all = FALSE)

``````

### Arguments

| Argument | Description |
|---|---|
| `X` | A matrix or data frame, compared to `Y`. |
| `Y` | A matrix or data frame, compared to `X`. |
| `nam` | The names of the variables to consider in `X` and `Y`: the duplication test is run only over the variables in `nam`. If `NULL` (default), `nam` is set to all the column names of `X`. The variables listed in `nam` must be present in both `X` and `Y`. |
| `digits` | The number of digits used to round the variables (those listed in `nam`) before the test. Defaults to `NULL` (no rounding). |
| `check.all` | Logical (default = `FALSE`). If `TRUE`, an additional duplication test is run over all the columns of `X` (even if `nam` selects only a subset of these columns). |
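To make the roles of `nam` and `digits` concrete, here is a minimal base-R sketch of a row-duplication test between two data sets. It is not the `rnirs` implementation: the helper name `find_dupl_sketch` and its output columns `ix` and `iy` are hypothetical, and it only illustrates restricting the comparison to selected columns and rounding before comparing.

```r
# Illustrative sketch only; NOT the rnirs code.
# find_dupl_sketch, ix and iy are hypothetical names.
find_dupl_sketch <- function(X, Y, nam = NULL, digits = NULL) {
  X <- as.data.frame(X)
  Y <- as.data.frame(Y)
  if (is.null(nam)) nam <- colnames(X)      # default: all columns of X
  kx <- X[, nam, drop = FALSE]
  ky <- Y[, nam, drop = FALSE]
  if (!is.null(digits)) {                   # optional rounding before the test
    kx <- round(kx, digits)
    ky <- round(ky, digits)
  }
  # One text key per row, built from the selected columns
  keyx <- do.call(paste, c(unname(as.list(kx)), sep = "\r"))
  keyy <- do.call(paste, c(unname(as.list(ky)), sep = "\r"))
  # All (row of X, row of Y) pairs whose keys are identical
  pairs <- expand.grid(ix = seq_along(keyx), iy = seq_along(keyy))
  pairs <- pairs[keyx[pairs$ix] == keyy[pairs$iy], ]
  rownames(pairs) <- NULL
  pairs
}

dat1 <- matrix(c(1:5, 1:5, c(1, 2, 7, 4, 8)), nrow = 3, byrow = TRUE)
dat2 <- matrix(c(6:10, 1:5, c(1, 2, 7, 6, 12)), nrow = 3, byrow = TRUE)
colnames(dat1) <- colnames(dat2) <- paste0("v", 1:5)

res_all <- find_dupl_sketch(dat1, dat2)                      # all columns
res_sub <- find_dupl_sketch(dat1, dat2, nam = c("v1", "v2")) # two columns only

# Rounding can turn near-equal rows into duplicates
a <- data.frame(v = 1.234)
b <- data.frame(v = 1.231)
res_raw <- find_dupl_sketch(a, b)
res_rnd <- find_dupl_sketch(a, b, digits = 2)
```

In this sketch, narrowing `nam` makes more row pairs count as duplicates because fewer columns have to agree, and setting `digits` makes the test tolerant to small numerical differences.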

### Value

A data frame reporting the duplicated rows.

### Examples

``````
dat1 <- matrix(c(1:5, 1:5, c(1, 2, 7, 4, 8)), nrow = 3, byrow = TRUE)
dimnames(dat1) <- list(1:3, c("v1", "v2", "v3", "v4", "v5"))

dat2 <- matrix(c(6:10, 1:5, c(1, 2, 7, 6, 12)), nrow = 3, byrow = TRUE)
dimnames(dat2) <- list(1:3, c("v1", "v2", "v3", "v4", "v5"))

dat1
dat2

checkdupl(dat1, dat2)

checkdupl(dat1, dat2, nam = c("v1", "v2"))

checkdupl(dat1, dat2, nam = c("v1", "v2"), check.all = TRUE)

z <- checkdupl(X = dat1, Y = dat1)
z[z$rownum.X != z$rownum.Y, ]

z <- checkdupl(dat1, dat1, nam = c("v1", "v2"))
z[z$rownum.X != z$rownum.Y, ]

rmdupl(dat1)

rmdupl(dat1, nam = c("v1", "v2"))

``````
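For comparison, removing duplicated rows within a single data set can be sketched with base R's `duplicated()`. This is only an analogue: it keeps the first occurrence of each duplicated row, and the page above does not state whether `rmdupl` behaves the same way, so treat that as an assumption. The object names `dat`, `uniq_all` and `uniq_sub` are illustrative.

```r
# Base-R analogue only; rmdupl() may handle duplicates differently.
dat <- matrix(c(1:5, 1:5, c(1, 2, 7, 4, 8)), nrow = 3, byrow = TRUE)
colnames(dat) <- paste0("v", 1:5)

# Drop rows that repeat an earlier row over all columns
uniq_all <- dat[!duplicated(dat), , drop = FALSE]

# Drop rows that repeat an earlier row over v1 and v2 only
uniq_sub <- dat[!duplicated(dat[, c("v1", "v2")]), , drop = FALSE]
```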

mlesnoff/rnirs documentation built on April 24, 2023, 4:17 a.m.