deduplicate: Deduplicate records

View source: R/deduplicate.R

Description

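Deduplicate records in a data.frame. Records that share an ID and fall within the same episode are treated as duplicates, and differing values in the selected columns are collapsed into a single entry.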

Usage

deduplicate(data, id.col, date.col, vars.collapse = "all",
  episode.length = Inf, break.value = ", ")

Arguments

data

a data.frame

id.col

a character specifying the column containing IDs

date.col

a character specifying the column containing dates

vars.collapse

a character vector specifying the columns to collapse together if there is differing information between records

episode.length

a numeric specifying the number of days a single episode lasts

break.value

a character specifying the value to insert as a break between collapsed variables
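
To make the roles of episode.length and break.value more concrete, here is a minimal base R sketch. It is not the EpiFunc implementation: the helper names assign_episodes and collapse_values are hypothetical, and the rule that a new episode starts once a record falls more than episode.length days after the first record of the current episode is an assumption made for illustration only.

```r
# Hypothetical illustration only; NOT the code used by deduplicate().
# Assumption: a new episode begins when a record is more than
# `episode.length` days after the first record of the current episode.
# Assumption: `dates` holds the records of one ID and has no missing values.
assign_episodes <- function(dates, episode.length = Inf) {
  ord <- order(dates)                 # visit records in date order
  episode <- integer(length(dates))
  current <- 0L
  start <- as.Date(NA)                # first date of the current episode
  for (i in ord) {
    if (is.na(start) || as.numeric(dates[i] - start) > episode.length) {
      current <- current + 1L
      start <- dates[i]
    }
    episode[i] <- current
  }
  episode
}

# Join the distinct values of a column into one string,
# separated by `break.value`.
collapse_values <- function(x, break.value = ", ") {
  paste(unique(as.character(x)), collapse = break.value)
}

dates <- as.Date(c("2019-01-01", "2019-01-03", "2019-01-10"))
assign_episodes(dates, episode.length = 4)
collapse_values(c("Lab A", "Lab D", "Lab A"), break.value = " | ")
```

Under these assumptions, the first two dates fall into one episode and the third starts a second one, while the lab values collapse to "Lab A | Lab D". The actual function may define episodes differently, so check R/deduplicate.R before relying on this behaviour.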

Value

a data.frame containing the deduplicated records

Author(s)

Daniel Gardiner (daniel.gardiner@phe.gov.uk)

Examples

# define dummy data

set.seed(7)

data <- data.frame(id = c(1, 1, 2, 2, 3, 4, 5, 5, NA, NA),
                   date = sample(seq(as.Date("2019-01-01"),
                                     as.Date("2019-03-01"),
                                     by = 1), size = 10),
                   spec = c("arm", "leg", "hand", "arm", "head",
                            "hand", "blood", "blood", "arm", "arm"),
                   lab = c("Lab A", "Lab D", "Lab B", "Lab C", "Lab A",
                           "Lab D", "Lab F", "Lab D", "Lab B", "Lab A"))


# apply deduplicate function

deduplicate(data,
            id.col = "id",
            date.col = "date")

deduplicate(data,
            id.col = "id",
            date.col = "date",
            vars.collapse = "lab")

deduplicate(data,
            id.col = "id",
            date.col = "date",
            vars.collapse = "lab",
            episode.length = 4)

deduplicate(data,
            id.col = "id",
            date.col = "date",
            vars.collapse = "lab",
            episode.length = 4,
            break.value = " | ")

DanielGardiner/EpiFunc documentation built on July 25, 2019, 10:53 p.m.