sum_up
prints detailed summary statistics (corresponds to Stata summarize
)
N <- 100 df <- tibble( id = 1:N, v1 = sample(5, N, TRUE), v2 = sample(1e6, N, TRUE) ) sum_up(df) df %>% sum_up(starts_with("v"), d = TRUE) df %>% group_by(v1) %>% sum_up()
tab
prints distinct rows with their count. Compared to the dplyr function count
, this command adds frequency, percent, and cumulative percent.
N <- 1e2 ; K = 10 df <- tibble( id = sample(c(NA,1:5), N/K, TRUE), v1 = sample(1:5, N/K, TRUE) ) tab(df, id) tab(df, id, na.rm = TRUE) tab(df, id, v1)
join
is a wrapper for dplyr merge functionalities, with two added functions
check
checks there are no duplicates in the master or using data.tables (as in Stata).r
# merge m:1 v1
join(x, y, kind = "full", check = m~1)
- The option gen
specifies the name of a new variable that identifies non matched and matched rows (as in Stata).
r
# merge m:1 v1, gen(_merge)
join(x, y, kind = "full", gen = "_merge")
update
allows to update missing values of the master dataset by the value in the using datasetAdd the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.