findCondition | R Documentation |
This function is useful for the very common task of selecting cases based on a code that completely or partially matches the values of a set of character variables.
The function is designed to search a group of character variables for multiple conditions defined in a named list of character vectors. It produces a data.table with the selected variables for the cases where a match is found. In addition, a second named list of character vectors can define exclusions from the search. This is useful if, for example, all cancers except non-melanoma skin cancer are sought: the inclusion list can then hold all cancers and the exclusion list just non-melanoma skin cancer.
See the examples for common uses of the output.
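The inclusion/exclusion idea described above can be illustrated without the package. The following is a minimal sketch in base R, not the implementation used by findCondition; the diagnosis codes and the helper starts_with_any are hypothetical and chosen only for illustration.

```r
# Hypothetical diagnosis codes (illustration only, not package data)
codes   <- c("DC18", "DC44", "DC449", "DC50", "DI21")
include <- c("DC")    # assumed prefix for all cancer codes
exclude <- c("DC44")  # assumed prefix for non-melanoma skin cancer

# Hypothetical helper: TRUE where a code starts with any of the prefixes
starts_with_any <- function(x, prefixes) {
  Reduce(`|`, lapply(prefixes, function(p) startsWith(x, p)),
         rep(FALSE, length(x)))
}

# Keep codes that match an inclusion prefix and no exclusion prefix
selected <- codes[starts_with_any(codes, include) &
                  !starts_with_any(codes, exclude)]
selected
```

In findCondition the same intent is expressed by passing the inclusion list as conditions and the exclusion list as exclusions, using the same names in both lists.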
findCondition(data, vars, keep, conditions, exclusions=NULL,
match="contain",condition.name="X")
data | Data in which to search for conditions. |
vars | Name(s) of the variable(s) in which to search. |
keep | A character vector naming the columns of data to keep in the output. |
conditions | A named list of (vectors of) search strings. See examples. |
exclusions | A named list of (vectors of) search strings to exclude from the output. |
match | How the search strings are matched: "exact" = exactly matches the search string, "contain" = contains the search string, "start" = starts with the search string, "end" = ends with the search string. |
condition.name | Name of the variable whose values define the conditions. The values of this variable are the names from the "conditions" argument. |
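The four matching modes can be pictured with plain base R string functions. This is a sketch under assumptions, not the code inside findCondition: the helper match_code is hypothetical, the mode names follow the default shown in the usage line, and the package may treat case, missing values, or special characters differently.

```r
# Hypothetical helper mimicking the four matching modes on one pattern
match_code <- function(x, pattern,
                       match = c("contain", "exact", "start", "end")) {
  match <- match.arg(match)
  switch(match,
         exact   = x == pattern,                     # whole string equals pattern
         contain = grepl(pattern, x, fixed = TRUE),  # pattern occurs anywhere
         start   = startsWith(x, pattern),           # pattern is a prefix
         end     = endsWith(x, pattern))             # pattern is a suffix
}

x <- c("DT10", "ADT1", "XDT", "DT")
res_exact   <- match_code(x, "DT", "exact")
res_contain <- match_code(x, "DT", "contain")
res_start   <- match_code(x, "DT", "start")
res_end     <- match_code(x, "DT", "end")
data.frame(x, res_exact, res_contain, res_start, res_end)
```

The fixed = TRUE argument makes grepl treat the pattern as a literal string rather than a regular expression, which is the safer choice for diagnosis or procedure codes that may contain dots.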
A data.table that includes the "keep" variables and a variable, named by condition.name, that identifies the condition searched for.
Christian Torp-Pedersen <ctp@heart.dk>, Thomas A. Gerds <tag@biostat.ku.dk>
library(heaven)
library(data.table)
# find all diagnoses that start with "DT"
set.seed(800); adm <- simAdmissionData(800)
x <- findCondition(adm,vars=c("diag"),
keep=c("pnr","inddto","uddto"),
conditions=list(THIS=c("DT")),
match="start",condition.name="THAT")
x
# restrict to first by pnr
x[x[,.I[1],by=list(pnr)]$V1]
# restrict to last by pnr
x[x[,.I[.N],by=list(pnr)]$V1]
opr <- data.table(
pnr=1:100,opr=paste0(rep(c('A','B'),50),seq(0,100,10)),
oprtil=paste0(rep(c('A','C'),50),seq(0,100,10)),
odto=101:200
)
search <- list(Cond1=c('A1','A2'),Cond2=c('B10','B40','B5'),
Cond3=c('A1','C20','B2'))
excl <- list(Cond2='B100')
out <- findCondition(opr,vars=c("opr","oprtil"),
keep=c("pnr","odto"),
conditions=search, exclusions=excl,
match="start",condition.name="cond")
### And how to use the result:
# Find first occurrence of each condition and then use "dcast" to create
# a data.table with vectors corresponding to each condition.
test <- out[,list(min=min(odto)),by=c("pnr","cond")]
# provide a list of variables with one value each
test2 <- dcast(pnr~cond,data=test,value.var="min")
test2 # A data.table with first dates of each condition for each pnr, but only
# for pnr with at least one condition
# Define a condition as present when before a certain index date
dates <- data.table (pnr=1:100,basedate=sample(0:200,size=100,replace=TRUE))
test3 <- merge(out,dates,by="pnr")
test3[,before:=as.numeric(odto<=basedate)] # 1 when condition fulfilled
test3 <- test3[,list(before=max(before)),by=c("pnr","cond")]
test4 <- dcast(pnr~cond,value.var="before",data=test3)
test4[is.na(test4)] <- 0 # Converts NAs to zero
test4[]