View source: R/epi_clean_add_rep_num.R
epi_clean_add_rep_num | R Documentation |
Add a column with the count of duplicates/replicates (i.e. for repeated screening, replicate counts, etc.). Assumes the dataframe passed is sorted logically, with repeating IDs next to each other; if, for example, a date is used as a second ordering criterion, earlier dates must come first. Useful for data with repeated measurements but without a column/variable clearly identifying them as such.
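The sorting assumption above can be illustrated with a minimal base R sketch. The data frame `toy` and its columns `id` and `visit_date` are hypothetical, and the `ave()` counter only mimics the idea of a replicate number; it is not the episcout implementation.

```r
# Hypothetical toy data: repeated screenings, deliberately out of order
toy <- data.frame(
  id = c(2, 1, 1, 2, 3),
  visit_date = as.Date(c("2020-03-01", "2020-01-15", "2020-02-15",
                         "2020-01-10", "2020-01-20"))
)

# Sort so that repeated IDs are adjacent and earlier dates come first
toy <- toy[order(toy$id, toy$visit_date), ]

# Base R stand-in for a replicate counter: 1, 2, ... within each id
# (an assumption for illustration, not the package's own code)
toy$rep_num <- ave(seq_along(toy$id), toy$id, FUN = seq_along)

toy
```

Sorting first matters because a counter that relies on adjacent duplicates would otherwise number the rows of one ID out of date order.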
epi_clean_add_rep_num(df = NULL, var_id = NULL, var_to_rep = "")
df |
a dataframe object as input |
var_id |
Column to use as ID, read as a dataframe vector; can be a column index or a string. It is used to test whether a row is a duplicate; if it is, a replicate number is added. |
var_to_rep |
Column variable that can distinguish replicates (e.g. date, 'baseline' vs 'treated', etc.) |
Returns a dataframe with one column, which can be merged with the existing dataframe.
Facilitates, for example, spreading a dataframe and extracting baseline vs repeated measurement rows.
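As a rough sketch of that use case, the base R snippet below splits baseline from repeated rows and spreads a long dataframe to wide format once a replicate number is available. The data frame `long` and its columns `id`, `rep_num` and `x` are hypothetical, and `reshape()` is used here only as one possible way to spread the data.

```r
# Hypothetical long-format data that already carries a replicate number
long <- data.frame(
  id      = c(1, 1, 2, 2, 3),
  rep_num = c(1, 2, 1, 2, 1),
  x       = c(0.5, 0.7, 1.2, 1.1, 0.9)
)

# Baseline rows are the first occurrence of each id
baseline <- long[long$rep_num == 1, ]

# Repeated measurements are everything after the first occurrence
repeated <- long[long$rep_num > 1, ]

# Spread to one row per id, one column of x per replicate number
wide <- reshape(long, idvar = "id", timevar = "rep_num", direction = "wide")

wide
```

IDs with fewer replicates than the maximum get `NA` in the missing columns of the wide table.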
Antonio Berlanga-Taylor <https://github.com/AntonioJBT/episcout>
## Not run:
n <- 20
df <- data.frame(
var_id = rep(1:(n / 2), each = 2),
var_to_rep = rep(c('Pre', 'Post'), n / 2),
x = rnorm(n),
y = rbinom(n, 1, 0.50),
z = rpois(n, 2)
)
var_id <- 'var_id'
var_to_rep <- 'var_to_rep'
reps <- epi_clean_add_rep_num(df, 'var_id', 'var_to_rep')
reps
# Sanity check:
identical(as.character(reps[[var_id]]),
as.character(df[[var_id]])) # should be TRUE
# Bind:
# merge() adds all rows from both data frames as there are duplicates,
# so use cbind() after making sure the row order is identical.
# as.tibble() is deprecated; use tibble::as_tibble() instead
df2 <- tibble::as_tibble(cbind(df, 'rep_num' = reps$rep_num))
epi_head_and_tail(df2, rows = 3)
epi_head_and_tail(df2, rows = 3, last_cols = TRUE)
df2
## End(Not run)