remove_dup: Remove duplicates
In TalhoukLab/biostatUtil: Utility Functions Used in Biostatistics Projects

Description

Removes duplicated entries in the specified columns of a data frame.

Usage

remove_dup(x, cols)

Arguments

`x`	data frame
`cols`	character vector of column names in `x` from which duplicates are removed

Details

Mass spectrometry data occasionally contain duplicated entries that need to be removed before further analysis. Duplication is indicated by the `Quan.Info` and `PSM.Ambiguity` columns. `remove_dup` removes duplicates in the columns given by `cols`, then collapses the repeated information into a single row.

This function is intended to be used after a call to `dplyr::group_by()`, so that duplicates are removed within each group of unique protein IDs (e.g. `Reporter.Quan.Result.ID`).
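The idea of collapsing repeated information within each group can be sketched in base R. This is only an illustration and not the `biostatUtil` implementation: the helper `collapse_group()`, the toy data, the `Accession` and `Intensity` columns, and the `"; "` separator are all assumptions made for the example, and the filtering on `Quan.Info` and `PSM.Ambiguity` is left out because the values those columns take are not described here.

```r
# Toy data: the ID column name comes from the documentation above;
# the other columns and all values are invented for illustration.
psm <- data.frame(
  Reporter.Quan.Result.ID = c(1, 1, 2),
  Accession = c("P1", "P2", "P3"),
  Intensity = c(10, 10, 20),
  stringsAsFactors = FALSE
)

# Hypothetical helper (NOT the package's remove_dup): for one group,
# keep the first row and paste the unique values of each column in
# `cols` into a single string. Collapsed columns become character.
collapse_group <- function(x, cols) {
  if (nrow(x) <= 1) {
    return(x)
  }
  out <- x[1, , drop = FALSE]
  for (col in cols) {
    out[[col]] <- paste(unique(x[[col]]), collapse = "; ")
  }
  out
}

# Apply the helper within each group of IDs, then recombine the groups.
groups <- split(psm, psm$Reporter.Quan.Result.ID)
result <- do.call(rbind, lapply(groups, collapse_group, cols = "Accession"))
rownames(result) <- NULL
```

Here the two rows sharing ID 1 are reduced to one row whose `Accession` holds both values, while the single row for ID 2 is returned unchanged. With `dplyr`, the same per-group application is what a preceding `dplyr::group_by()` call sets up.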