duplicate_count_colpair: Count duplicate values by column
In scrutiny: Error Detection in Science

View source: R/duplicate-count-colpair.R

Description

duplicate_count_colpair() takes a data frame and, for each pair of its columns, counts the values that appear in both. Results are returned as a tibble, ordered by the number of duplicates.
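To make "each pair of columns" concrete, here is a minimal base R sketch of how such pairs can be enumerated. It is an illustration only, not the package's implementation: it assumes that pairs are unordered, and `col_pairs` is a name made up for this example.

```r
# Enumerate every unordered pair of column names, using base R only.
# Assumption: pairs are unordered, so (mpg, cyl) and (cyl, mpg) appear once.
col_pairs <- t(combn(names(mtcars), 2))
colnames(col_pairs) <- c("x", "y")

# One row per pair; mtcars has 11 columns, so there are choose(11, 2) pairs.
head(col_pairs, 3)
nrow(col_pairs)
```

Under this assumption, the number of rows grows quadratically with the number of columns, which is worth keeping in mind for wide data frames.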

Usage

duplicate_count_colpair(data, ignore = NULL, show_rates = TRUE)

Arguments

`data`	Data frame.
`ignore`	Optionally, a vector of values that should not be checked for duplicates.
`show_rates`	Logical. If `TRUE` (the default), adds columns `rate_x` and `rate_y`. See the Value section. Set `show_rates` to `FALSE` for higher performance.

Value

A tibble (data frame) with these columns:

x and y: Each row contains a unique pair of data's columns, named in the x and y output columns.
count: Number of "duplicates", i.e., values that are present in both x and y.
total_x, total_y, rate_x, and rate_y (added by default): total_x is the number of non-missing values in the column named under x, and rate_x is the proportion of x values that are duplicated in y, i.e., count / total_x. Likewise for total_y and rate_y. The two rate_* columns will be equal unless NA values are present.
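The relationship between count, the totals, and the rates can be sketched for a single pair of columns in base R. This is a hedged illustration, not scrutiny's source code: `count_pair_sketch()` is a hypothetical helper, and it assumes that "count" means the number of non-missing x values that also occur in y.

```r
# Hypothetical helper, not part of scrutiny: compares one pair of columns.
count_pair_sketch <- function(x, y) {
  x_present <- x[!is.na(x)]
  y_present <- y[!is.na(y)]
  # Assumed reading of "count": non-missing x values that also occur in y.
  count <- sum(x_present %in% y_present)
  total_x <- length(x_present)
  total_y <- length(y_present)
  list(
    count = count,
    total_x = total_x,
    total_y = total_y,
    rate_x = count / total_x,
    rate_y = count / total_y
  )
}

# Toy data: column b has one missing value, so the two totals differ.
toy <- data.frame(a = c(1, 2, 3, 4), b = c(2, 3, NA, 9))
res <- count_pair_sketch(toy$a, toy$b)
res
```

In this toy case the values 2 and 3 occur in both columns, so count is 2. Because b has a missing value, total_x is 4 and total_y is 3, which is why rate_x (2 / 4) and rate_y (2 / 3) differ.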

Summaries with `audit()`

There is an S3 method for audit(), so you can call audit() following duplicate_count_colpair(). It returns a tibble with summary statistics.

Examples

# Basic usage:
library(scrutiny)
mtcars %>%
  duplicate_count_colpair()

# Summaries with `audit()`:
mtcars %>%
  duplicate_count_colpair() %>%
  audit()

scrutiny documentation built on Sept. 22, 2024, 9:06 a.m.
