findDuplicates: Find patient record duplicates


View source: R/removeFns.R

Description

Finds (and removes) duplicate entries for the same patient on the same day of admission that have a missing event type and observed status (dead, alive or administrative censoring). These entries share the same admission and discharge data as one another, so they are collapsed while making sure all relevant information is retained.
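To make "same patient, same day of admission" concrete, here is a minimal base R sketch that flags such groups in a toy data frame. The column names hes_ID and hes_admdte are taken from the function source; the data values are invented for illustration, and this is not the package's own implementation.

```r
# Toy data (hypothetical values); column names follow the function source.
toy <- data.frame(
  hes_ID     = c(1, 1, 2, 2, 3),
  hes_admdte = as.Date(c("2010-01-05", "2010-01-05",
                         "2010-02-01", "2010-03-01", "2010-04-10"))
)

# A row belongs to a duplicate group if its (patient, admission date) pair
# occurs more than once, counting from either end of the data frame.
key <- toy[, c("hes_ID", "hes_admdte")]
is_dup_group <- duplicated(key) | duplicated(key, fromLast = TRUE)

# Rows 1 and 2 share patient 1 and the same admission date.
toy[is_dup_group, ]
```

Patient 2 appears twice but on different admission dates, so those rows are not flagged.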

Usage

findDuplicates(total.data, nCodesInd)

Arguments

total.data

Full combined data set of infected and non-infected patients

nCodesInd

The names of the risk factor or comorbidity columns in total.data
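As a rough illustration of how the two arguments might be prepared, the sketch below builds a toy total.data and derives nCodesInd from it. The columns cancer and diabetes are hypothetical, and picking every logical column is only one possible convention; the function source pools these columns with any(), so logical indicators are assumed here.

```r
# Toy input (hypothetical values and risk factor columns).
total.data <- data.frame(
  hes_ID     = c(1, 1, 2),
  hes_admdte = as.Date(c("2010-01-05", "2010-01-05", "2010-02-01")),
  cancer     = c(TRUE, FALSE, FALSE),
  diabetes   = c(FALSE, FALSE, TRUE)
)

# Assumption: every logical column is a risk factor or comorbidity indicator.
nCodesInd <- names(total.data)[vapply(total.data, is.logical, logical(1))]

# Check that every requested column really exists before calling the function.
stopifnot(all(nCodesInd %in% names(total.data)))

# With the package loaded, the call would then look like:
# result <- findDuplicates(total.data, nCodesInd)
```

The call itself is left commented out because it needs the package and the remaining HES and laboratory columns used in the function body.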

Author(s)

N Green

Examples


## The function is currently defined as
function (total.data, nCodesInd) 
{
    # rmDuplicates() is not base R; it is another package function (not shown here)
    uniqueid <- unique(total.data$hes_ID)
    progressbar <- txtProgressBar(min = 0, max = length(uniqueid), 
        style = 3)
    counter <- 0
    # Loop over patients
    for (i in uniqueid) {
        counter <- counter + 1
        setTxtProgressBar(progressbar, counter)
        ind <- which(total.data$hes_ID == i)
        uniquetime <- unique(total.data$hes_admdte[ind])
        # Loop over this patient's distinct admission dates
        for (j in uniquetime) {
            subrows <- ind[total.data$hes_admdte[ind] == j]
            if (length(subrows) > 1) {
                # Pool risk factors: a flag set in any row is copied to every row.
                # drop = FALSE keeps a data frame when nCodesInd names one column
                total.data[subrows, nCodesInd] <- matrix(apply(total.data[subrows, 
                  nCodesInd, drop = FALSE], MARGIN = 2, FUN = any), 
                  nrow = length(subrows), ncol = length(nCodesInd), 
                  byrow = TRUE)
                # Order the rows by specimen date when at least one is recorded
                if (any(!is.na(total.data$lab_Specimendate[subrows]))) {
                  total.data[subrows, ] <- total.data[subrows, 
                    ][order(total.data$lab_Specimendate[subrows]), 
                    ]
                }
                # Remove duplicates when a lab ID is missing or repeated
                if (any(is.na(total.data[subrows, "lab_opieid"])) || 
                  anyDuplicated(total.data[subrows, "lab_opieid"]) > 0) {
                  total.data <- rmDuplicates(total.data, subrows)
                }
            }
        }
    }
    close(progressbar)
    total.data
}
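
The risk factor pooling step in the function body can be tried in isolation. The following is a minimal sketch on invented data, with hypothetical columns cancer and diabetes and drop = FALSE added so that a single column still works; it shows only this one step, not the whole function.

```r
# Two rows for the same patient and admission (hypothetical values).
toy <- data.frame(
  hes_ID   = c(1, 1),
  cancer   = c(TRUE, FALSE),
  diabetes = c(FALSE, FALSE)
)
nCodesInd <- c("cancer", "diabetes")
subrows <- 1:2

# Column-wise any(): TRUE if the flag is set in at least one duplicate row.
pooled <- apply(toy[subrows, nCodesInd, drop = FALSE], MARGIN = 2, FUN = any)

# Copy the pooled flags back to every duplicate row (one row of flags, repeated).
toy[subrows, nCodesInd] <- matrix(pooled, nrow = length(subrows),
                                  ncol = length(nCodesInd), byrow = TRUE)
toy
```

After this step both rows carry cancer = TRUE and diabetes = FALSE, so no risk factor information is lost whichever row is later removed.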

n8thangreen/HESmanip documentation built on March 21, 2020, 12:20 a.m.