collapse_dt: Collapse redundant rows of a df using the Simplify function


Description

This function works similarly to aggregate.data.frame, but with several conveniences. It also improves on the earlier CollapseDF by temporarily coercing the input to a data.table, which lets it handle large data sets much better. For simplicity, it currently only allows grouping by columns that exist in df, given by explicit column name. The collapse columns (those named in column.names) are moved to the front of the df.
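The approach described above can be sketched roughly as follows. This is not the package's implementation: join_unique is a hypothetical stand-in for the Simplify function, whose exact behavior is not documented here, and the sketch assumes the data.table package is installed.

```r
library(data.table)

df <- data.frame(
  Patient = c(1, 1, 2, 2, 3, 4),
  Age     = c(31, 31, 32, NA, 33, NA)
)

# Hypothetical stand-in for Simplify (an assumption, not the toolboxR code):
# keep the unique non-missing values and join them into one string.
join_unique <- function(x) {
  x <- unique(x[!is.na(x)])
  if (length(x) == 0) NA_character_ else paste(x, collapse = "; ")
}

# Temporarily coerce to a data.table, then collapse every remaining
# column within each group defined by the named column.
dt  <- as.data.table(df)
out <- dt[, lapply(.SD, join_unique), by = "Patient"]

# In a grouped data.table result the grouping columns come first.
names(out)
```

Grouping by a character vector of column names mirrors the restriction stated above that only existing columns, referred to by name, can be used for grouping.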

Usage

collapse_dt(df, column.names, unique = F)

Arguments

df

data.frame containing the columns named in column.names

column.names

Character vector of column names used for grouping rows. Plays a role similar to the "by=" argument of aggregate().
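For comparison, the "by=" mechanism of aggregate() can be used to get a similar result in base R. The sketch below is only an approximation under stated assumptions: collapse_values is a hypothetical helper written for this illustration, not a function from toolboxR, and it may differ from what Simplify does.

```r
# Hypothetical helper (not part of toolboxR): reduce a vector to its
# unique, non-missing, non-empty values joined by "; ".
collapse_values <- function(x) {
  x <- as.character(x)
  x <- sort(unique(x[!is.na(x) & x != ""]))
  if (length(x) == 0) NA_character_ else paste(x, collapse = "; ")
}

df <- data.frame(
  Patient = c(1, 1, 2, 2, 3, 4),
  Age     = c(31, 31, 32, NA, 33, NA),
  Score   = c("9", "10", "8", "8", "", "4"),
  stringsAsFactors = FALSE
)

# by = takes a list (here a one-column data.frame) of grouping variables,
# much like column.names selects the grouping columns by name.
collapsed <- aggregate(
  df[c("Age", "Score")],
  by  = df["Patient"],
  FUN = collapse_values
)

collapsed$Score[collapsed$Patient == 1]
# "10; 9"
```

Here aggregate() needs the value columns and the grouping columns passed separately, whereas collapse_dt only asks for the names of the grouping columns.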

Value

The collapsed data as a data.table. It can be used like a data.frame or kept as a data.table.

Examples

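# Patients 1 and 2 each span two rows that should collapse into one.
# The "" entry makes c() coerce the Score values to character strings.
df <- data.frame(
  Patient = c(1,   1,  2,  2,  3,  4),
  Age     = c(31, 31, 32, NA, 33, NA),
  Score   = c( 9, 10,  8,  8, "",  4))
# Collapse rows that share the same Patient value.
collapse_dt(df, "Patient")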
 #   Patient   Age   Score
 # 1       1    31   10; 9
 # 2       2    32       8
 # 3       3    33
 # 4       4    NA       4

dkrozelle/toolboxR documentation built on May 15, 2019, 9:13 a.m.