detect_dupl_cols: Detect if any column of a data.frame is a duplicate of...

View source: R/detect_dupl_cols.R

detect_dupl_cols    R Documentation

Detect if any column of a data.frame is a duplicate of another


Occasionally two (or more) columns in a data frame are exactly identical. This can add redundant computational cost and cause unexpected behavior in machine learning methods. This function scans through all pairwise column combinations of the data frame and checks whether any two columns are exactly identical.
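The pairwise scan described above can be sketched in base R. This is only an illustration of the idea, not the package's source: the helper name find_identical_pairs and the toy data frame are assumptions made for this example.

```r
# Hypothetical helper, not part of dataframeexplorer: report every pair of
# columns whose contents are exactly identical.
find_identical_pairs <- function(df) {
  stopifnot(is.data.frame(df))
  if (ncol(df) < 2) {
    return(data.frame(left = character(0), right = character(0)))
  }
  # All unordered pairs of column positions, one pair per matrix column.
  pairs <- utils::combn(ncol(df), 2)
  # identical() is strict: type, values and attributes must all match.
  same <- apply(pairs, 2, function(p) identical(df[[p[1]]], df[[p[2]]]))
  data.frame(
    left  = names(df)[pairs[1, same]],
    right = names(df)[pairs[2, same]],
    stringsAsFactors = FALSE
  )
}

# Toy data: columns "a" and "c" hold the same integer values.
toy <- data.frame(a = 1:3, b = c("x", "y", "z"), c = 1:3)
find_identical_pairs(toy)
```

Comparing every pair costs choose(n, 2) comparisons for n columns, which is why duplicate detection on very wide data frames can be slow.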


detect_dupl_cols(dataset, return_type = "col_names", duplicate_col = "right")



dataset: A data.frame.


return_type: How to return the detected duplicate columns. Use "col_names" for column names, "col_positions" for column positions, or "dataset" to return the dataset with the duplicate columns deleted.


duplicate_col: If 2 columns are identical, which of the 2 should be treated as the duplicate? Use "right" for the right column, "left" for the left.


A vector of duplicate column names, a vector of duplicate column positions, or the dataset with duplicate columns deleted, depending on the return_type parameter.
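To make the three return forms and the "right"/"left" choice concrete, here is a base-R sketch. It assumes that "right" flags the later copy of an identical column and "left" flags the earlier one; the function dupl_cols_sketch is hypothetical and the package's actual behavior may differ in details.

```r
# Hypothetical base-R sketch; not the package's implementation.
dupl_cols_sketch <- function(df, return_type = "col_names",
                             duplicate_col = "right") {
  cols <- as.list(df)
  # Assumed meaning: "right" flags the later copy, "left" the earlier copy.
  is_dup <- duplicated(cols, fromLast = (duplicate_col == "left"))
  switch(return_type,
    col_names     = names(df)[is_dup],
    col_positions = which(is_dup),
    dataset       = df[, !is_dup, drop = FALSE],
    stop("unknown return_type: ", return_type)
  )
}

# Toy data: columns "a" and "c" hold the same integer values.
toy <- data.frame(a = 1:3, b = c("x", "y", "z"), c = 1:3)
dupl_cols_sketch(toy)                                    # flags "c"
dupl_cols_sketch(toy, return_type = "col_positions",
                 duplicate_col = "left")                 # flags position 1
names(dupl_cols_sketch(toy, return_type = "dataset"))    # keeps "a" and "b"
```

Converting to a list first matters: calling duplicated() directly on a data frame checks for duplicated rows, not columns.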


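## Not run: 
library(dplyr)              # provides mutate()
library(dataframeexplorer)
# mpg_2 is an exact copy of mpg, so it should be reported as the duplicate
detect_dupl_cols(dataset = head(mutate(mtcars, mpg_2 = mpg)), duplicate_col = "right")

## End(Not run)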

dataframeexplorer documentation built on April 4, 2022, 9:05 a.m.