duplicates: Tag, report, delete or keep duplicate observations

View source: R/duplicates.R

Description

duplicates tags and reports duplicate observations in a data frame; keepUnique and keepDup delete or keep them.

Usage

Arguments

data

a data frame

...

optional variables within the data frame that together serve as the unique ID; if omitted, all variables are used

print.table

logical value; whether to display formatted output

Details

duplicates

tags duplicate observations within the data frame with a new variable called dupID_ and reports statistics. Duplicates are observations with identical values on all variables if none is specified in the optional argument ..., or otherwise on the specified variables only.
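
The definition above can be illustrated with base R alone. This is a minimal sketch, not the package's implementation: tag_dups and the column is_dup are hypothetical names, and the way mStats actually codes dupID_ may differ.

```r
# Hypothetical helper (not part of mStats): flag rows whose values on
# `vars` occur more than once. With the default, all variables are compared.
tag_dups <- function(data, vars = names(data)) {
  key <- data[, vars, drop = FALSE]
  # duplicated() marks 2nd and later occurrences; scanning from both
  # ends marks every member of a duplicated group, including the first.
  data$is_dup <- duplicated(key) | duplicated(key, fromLast = TRUE)
  data
}

df <- data.frame(
  id    = c(1, 1, 2, 3, 3),
  score = c(10, 10, 20, 30, 31)
)

tagged_all <- tag_dups(df)        # identical on all variables
tagged_id  <- tag_dups(df, "id")  # identical on id only
```

Comparing on all variables flags only the first two rows, while comparing on id alone also flags the last two, because their score values differ.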

ANNOTATIONS:

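Copies - Number of times an identical observation occurs

Observations - Number of records that occur with that number of copies

Surplus - Number of surplus copies, i.e. records beyond the first in each group of duplicates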
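
One way to read these three columns is sketched below in base R. The helper dup_report is hypothetical, and the arithmetic assumes the convention used by Stata's duplicates report (observations = copies x number of groups, surplus = observations minus one record per group); the package's own output may be computed or laid out differently.

```r
# Hypothetical helper (not part of mStats): summarise duplicates by group size.
dup_report <- function(data, vars = names(data)) {
  # Build one key string per row from the chosen variables.
  key <- do.call(paste, c(data[, vars, drop = FALSE], sep = "\r"))
  sizes <- as.integer(table(key))   # number of rows in each group
  copies <- sort(unique(sizes))     # distinct group sizes
  n_groups <- vapply(copies, function(k) sum(sizes == k), numeric(1))
  data.frame(
    copies       = copies,
    observations = copies * n_groups,        # records with that many copies
    surplus      = (copies - 1) * n_groups   # records beyond the first per group
  )
}

df <- data.frame(
  x = c(1, 1, 2, 3, 3, 3),
  y = c("a", "a", "b", "c", "c", "c")
)
report <- dup_report(df)
print(report)
```

Under this reading, the surplus column sums to the number of rows that would be dropped when keeping one record per group.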

keepUnique

deletes all but the first occurrence of each group of duplicated observations, leaving one record per group.

keepDup

keeps all but the first occurrence of each group of duplicated observations. This function returns the complement of the dataset returned by keepUnique.
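
The complementary relationship between the two results can be sketched with base R's duplicated(). This is an illustration under stated assumptions, not the package's code: the data frame and the names first_only and the_rest are made up for the example.

```r
df <- data.frame(
  id    = c(1, 1, 2, 3, 3, 3),
  value = c("a", "b", "c", "d", "e", "f")
)

# TRUE for the 2nd and later occurrences of each id
later <- duplicated(df["id"])

first_only <- df[!later, ]  # analogous to the keepUnique description
the_rest   <- df[later, ]   # analogous to the keepDup description

# Together the two subsets partition the original rows.
stopifnot(nrow(first_only) + nrow(the_rest) == nrow(df))
```

Every row lands in exactly one of the two subsets, which is what "the opposite dataset" amounts to in this sketch.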

Author(s)

Myo Minn Oo (Email: dr.myominnoo@gmail.com | Website: https://myominnoo.github.io/)

See Also

keep, lose

Examples

## Not run: 
# finding duplicates across all variables
duplicates(iris)

# finding duplicates on variables of interest
duplicates(iris, Sepal.Length, Sepal.Width)
duplicates(iris, Species)
duplicates(iris, Sepal.Length, Sepal.Width, print.table = FALSE)

# Keep Unique records
keepUnique(iris, Sepal.Length)
keepUnique(iris, Species)
keepUnique(infert, case)

# Keep duplicated records (opposite of keep unique records)
keepDup(iris, Sepal.Length)
keepDup(iris, Species)
keepDup(infert, case)

## End(Not run)

myominnoo/mStats_beta documentation built on Feb. 29, 2020, 8:17 a.m.