dlabel | R Documentation |
This function adds a label to a data frame based on the distance between records, as defined by the distance column and calculated by dfilter. The label is a boolean stating whether the distance between records is within the specified range.
bc_contamination calls dlabel with default parameters for the labelling of possibly contaminated blood cultures.
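The idea of a distance label can be sketched in base R. The helper label_within_range() below is hypothetical and not part of the package. It assumes one plausible reading of the description: a record is labelled TRUE when at least one other record with the same id and category lies within the given range. The actual calculation is delegated to dfilter and may differ.

```r
# Hypothetical helper (an assumption, not the package's implementation):
# TRUE when another record of the same id/category group lies within
# [min_dist, max_dist] of the record, measured in `unit`.
label_within_range <- function(df, id, category, distance,
                               min_dist, max_dist, unit = "days") {
  secs_per_unit <- c(secs = 1, mins = 60, hours = 3600, days = 86400)
  groups <- interaction(df[[id]], df[[category]], drop = TRUE)
  label <- logical(nrow(df))
  for (rows in split(seq_len(nrow(df)), groups)) {
    # timestamps of this group, converted to the requested unit
    t <- as.numeric(df[[distance]][rows]) / secs_per_unit[[unit]]
    d <- abs(outer(t, t, "-"))  # pairwise distances within the group
    diag(d) <- NA               # ignore the distance of a record to itself
    label[rows] <- rowSums(d >= min_dist & d <= max_dist, na.rm = TRUE) > 0
  }
  label
}

toy <- data.frame(
  id = c(1, 1, 1, 2),
  category = c("a", "a", "a", "a"),
  timestamp = as.POSIXct(c("2024-01-01", "2024-01-04",
                           "2024-03-01", "2024-01-02"), tz = "UTC")
)
toy$within_range <- label_within_range(toy, "id", "category", "timestamp",
                                       min_dist = 2, max_dist = 40)
toy
```

In this toy data only the first two records of id 1 are 3 days apart, so only they fall inside the 2 to 40 day range.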
dlabel(
  df,
  id,
  category,
  distance,
  min.dist,
  max.dist,
  temporal.unit,
  label_name = "dlabel",
  invert = FALSE,
  df_filter = NULL
)
bc_contamination(
  ...,
  min.dist = 5,
  max.dist = 7 * 60 * 24,
  temporal.unit = "minutes",
  label_name = "contamination",
  invert = TRUE
)
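The defaults of bc_contamination can be read as a window from 5 minutes to 7 days, expressed in minutes. The short sketch below spells out that arithmetic and illustrates, under the assumption that invert = TRUE simply negates the in-range label, how a contamination flag would relate to it. The within_range vector is made-up illustration data.

```r
# Default window of bc_contamination(), expressed in minutes
min_dist <- 5
max_dist <- 7 * 60 * 24        # 7 days * 60 minutes * 24 hours
max_dist / (60 * 24)           # back-conversion to days

# Hypothetical in-range flags for four cultures (illustration only)
within_range <- c(TRUE, FALSE, TRUE, FALSE)

# Assumption: invert = TRUE negates the label, so TRUE marks records
# whose distance falls outside the window
contamination <- !within_range
contamination
```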
df | A data frame. |
id | A character string specifying the column name of the id. |
category | A character string specifying the column name of the category. |
distance | A character string specifying the column name of the distance column; see dfilter for details. |
min.dist | A numeric value specifying the minimum distance between records; see dfilter for details. |
max.dist | A numeric value specifying the maximum distance between records; see dfilter for details. |
temporal.unit | A character string specifying the temporal unit of the distance; see dfilter for details. |
label_name | A character string specifying the name of the label column. |
invert | A logical value specifying whether to invert the label. |
df_filter | A character string specifying the filter expression to be applied to the data frame before the distance label is calculated. |
... | Arguments to be passed to |
A data frame with the distance label added.
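The df_filter argument takes a filter expression as a character string. As a rough illustration of how such a string can select rows, the base R sketch below evaluates the expression against the columns of a small made-up data frame. This is an assumption about the general mechanism, not a description of how dlabel applies the filter internally.

```r
# Made-up data for illustration
records <- data.frame(id = 1:4, category = c("a", "b", "a", "c"))

# A filter expression given as a character string, as in df_filter
filter_expr <- "category == 'a'"

# Evaluate the expression with the data frame columns in scope
keep <- eval(parse(text = filter_expr), envir = records)
records[keep, ]
```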
# create test data
set.seed(123)
dl.test <- data.frame(id = sample(1:10, 30, replace = TRUE),
                      category = sample(letters[1:4], 30, replace = TRUE),
                      timestamp = as.POSIXct(runif(30, 1704063600, 1711922400),
                                             origin = "1970-01-01"))
# test: dlabel will reveal three id-category combinations with temporal
# distances within the range of 2 to 40 days pertaining to category 'a'
test <- dlabel(dl.test, id = "id", category = "category",
               distance = "timestamp",
               min.dist = 2, max.dist = 40,
               temporal.unit = "days",
               label_name = "within_range",
               df_filter = "category == 'a'")
# dplyr provides %>%, left_join() and filter() used below
library(dplyr)
set.seed(123)
# lookup table assigning each species to a category
bugs <- data.frame(species = c("S. epidermidis", "C. acnes", "S. aureus", "E. coli"),
                   category = c("skin flora", "skin flora", "pathogen", "pathogen"))
# simulated blood cultures of 10 patients
blood_cultures <- data.frame(lab_no = 1:50,
                             patient = sample(1:10, 50, replace = TRUE),
                             species = sample(bugs$species, 50, replace = TRUE),
                             timestamp = as.POSIXct(runif(50, 1704063600, 1711000000),
                                                    origin = "1970-01-01"))
blood_cultures <- blood_cultures %>% left_join(bugs, by = "species")
bc_conta <- bc_contamination(blood_cultures,
                             id = "patient",
                             category = "species",
                             distance = "timestamp",
                             df_filter = "category == 'skin flora'")
# Patient 9 has 5 cultures yielding skin flora that satisfy the temporal
# distance criterion given by min.dist and max.dist. Despite growing
# skin flora, these cultures could therefore correspond to infection
# (the field contamination is FALSE).
# check:
# bc_conta %>% filter(category == "skin flora" & !contamination)
# The remaining samples yielding skin flora likely represent
# contamination, because their temporal distance falls outside the range
# given by min.dist and max.dist.