duplicate_count_colpair: Count duplicate values by column
In lhdjung/scrutiny: Error Detection in Science

View source: R/duplicate-count-colpair.R

Description

duplicate_count_colpair() takes a data frame and checks each pair of columns for duplicates, i.e., values that occur in both columns. Results are returned as a tibble, ordered by the number of duplicates.
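To make the idea of column pairs concrete, here is a minimal base R sketch that lists them for `mtcars`. It assumes that "each combination of columns" means each unordered pair of distinct columns; the object name `col_pairs` is hypothetical, and this is not necessarily how the package builds its output.

```r
# Enumerate every unordered pair of distinct columns in `mtcars`.
# Assumption: this is what "each combination of columns" refers to.
col_pairs <- t(combn(names(mtcars), 2))
colnames(col_pairs) <- c("x", "y")

# First few pairs, one pair per row:
print(head(col_pairs, 3))

# Number of pairs: choose(11, 2) for the 11 columns of `mtcars`
print(nrow(col_pairs))  # 55
```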

Usage

duplicate_count_colpair(data, ignore = NULL, show_rates = TRUE)

Arguments

`data`	Data frame.
`ignore`	Optionally, a vector of values that should not be checked for duplicates.
`show_rates`	Logical. If `TRUE` (the default), adds columns `rate_x` and `rate_y`. See the Value section. Set `show_rates` to `FALSE` for higher performance.
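The two optional arguments might be used as follows. This is a sketch based only on the signature and argument descriptions above; the choice of `c(0, 1)` as values to ignore is purely illustrative (binary columns such as `vs` and `am` in `mtcars` share these values trivially), and the object names are hypothetical.

```r
library(scrutiny)

# Leave the values 0 and 1 out of the comparison (an illustrative choice):
out_ignore <- duplicate_count_colpair(mtcars, ignore = c(0, 1))

# Skip the rate columns when only the counts are needed:
out_counts <- duplicate_count_colpair(mtcars, show_rates = FALSE)

print(out_ignore)
print(out_counts)
```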

Value

A tibble (data frame) with these columns:

`x` and `y`: Each row contains a unique combination of two of `data`'s columns, with the column names stored in `x` and `y`.
`count`: Number of "duplicates", i.e., values that are present in both `x` and `y`.
`total_x`, `total_y`, `rate_x`, and `rate_y` (added by default): `total_x` is the number of non-missing values in the column named under `x`. `rate_x` is the proportion of `x` values that are duplicated in `y`, i.e., `count / total_x`. Likewise with `total_y` and `rate_y`. The two `rate_*` columns will be equal unless `NA` values are present.
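The relationship between the count, the totals, and the rates can be worked through by hand for a single pair of columns. The base R sketch below uses a toy data frame (hypothetical, not from the package) and assumes that the count is the number of non-missing `x` values that also occur in `y`. The toy data avoid repeated values within a column, since their handling is not described here.

```r
# Toy data with one missing value (hypothetical, for illustration only)
df <- data.frame(
  a = c(1, 2, 3, NA),
  b = c(2, 3, 4, 5)
)

# Non-missing values of each column
a <- df$a[!is.na(df$a)]
b <- df$b[!is.na(df$b)]

# Assumed reading of "count": values of `a` that also occur in `b`
count <- sum(a %in% b)  # 2 (the values 2 and 3)

# Totals are the numbers of non-missing values
total_x <- length(a)  # 3
total_y <- length(b)  # 4

# Rates divide the same count by different totals,
# so they differ here only because `a` contains an NA
rate_x <- count / total_x  # 2/3
rate_y <- count / total_y  # 0.5

print(c(count = count, total_x = total_x, total_y = total_y))
print(c(rate_x = rate_x, rate_y = rate_y))
```

Without the `NA` in `a`, both totals would equal the number of rows and the two rates would coincide.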

Summaries with `audit()`

There is an S3 method for audit(), so you can call audit() on the output of duplicate_count_colpair(). It returns a tibble with summary statistics.

Examples

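# Attach scrutiny; magrittr supplies the `%>%` pipe
# in case it is not already available:
library(scrutiny)
library(magrittr)

# Basic usage:
mtcars %>%
  duplicate_count_colpair()

# Summaries with `audit()`:
mtcars %>%
  duplicate_count_colpair() %>%
  audit()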

lhdjung/scrutiny documentation built on Sept. 28, 2024, 12:14 a.m.
