Aggregate: Aggregate numeric and categorical variables by an ID

View source: R/Aggregate.R

Description

The Aggregate function (not to be confused with base R's aggregate) prepares a data frame for merging by computing the sum, mean and variance of all continuous (integer and numeric) variables by a given ID variable. It also creates dummy variables for all categorical variables (character and factor) and then computes their sums by the same ID variable. The function aims to extract as much information as possible with a minimal amount of code.
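The aggregation described above can be sketched in base R. This is only an illustration of the idea (sum, mean and variance per ID for a continuous variable; dummy columns summed per ID for a categorical one), not the package's implementation. The names `x`, `id`, `amount` and `answer` are made up for this sketch, and the resulting column names differ from those produced by Aggregate.

```r
# Toy data (hypothetical): one continuous and one categorical variable,
# plus an ID vector with one entry per row
x <- data.frame(amount = c(1, 2, 3, 4, 5, 6, 7),
                answer = c("yes", "no", "no", "yes", "yes", "yes", "yes"),
                stringsAsFactors = FALSE)
id <- c("1", "1", "1", "1", "2", "2", "2")

# Continuous variable: sum, mean and variance per ID
amount_sum  <- tapply(x$amount, id, sum)
amount_mean <- tapply(x$amount, id, mean)
amount_var  <- tapply(x$amount, id, var)

# Categorical variable: one 0/1 dummy column per level, then sum per ID
dummies    <- model.matrix(~ answer - 1, data = x)  # columns answerno, answeryes
dummy_sums <- rowsum(dummies, group = id)           # one row per ID

# Combine everything into one row per ID
result <- data.frame(ID          = rownames(dummy_sums),
                     amount_sum  = as.vector(amount_sum),
                     amount_mean = as.vector(amount_mean),
                     amount_var  = as.vector(amount_var),
                     dummy_sums,
                     row.names = NULL)
print(result)
```

Both `tapply` and `rowsum` order their results by the sorted ID values, so the pieces line up row by row when combined.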

Usage

Aggregate(x, by)

Arguments

x

A data frame without the ID. Categorical variables have to be of type character or factor and continuous variables have to be of type integer or numeric.

by

A vector containing the IDs, one for each row of x.

Value

A data frame with the aforementioned variables aggregated by the given ID variables.
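Because the result has one row per ID, it can be joined onto another table keyed by the same ID. The following is a minimal sketch, assuming the ID column is named `ID` as in the example output on this page. The `customers` table is hypothetical, and `agg` is a hand-built stand-in for a result of Aggregate, using a few values from the Examples section.

```r
# Stand-in for an aggregated result: one row per ID
agg <- data.frame(ID      = c("1", "2"),
                  V3_sum  = c(10, 18),
                  V3_mean = c(2.5, 6.0),
                  stringsAsFactors = FALSE)

# Hypothetical base table with one row per customer
customers <- data.frame(ID     = c("1", "2", "3"),
                        region = c("north", "south", "east"),
                        stringsAsFactors = FALSE)

# Left join: keep every customer, even those without aggregated rows
merged <- merge(customers, agg, by = "ID", all.x = TRUE)
print(merged)
```

With `all.x = TRUE`, customers that have no matching ID in the aggregated table are kept and receive NA in the aggregated columns.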

Author(s)

Dirk Van den Poel, Michel Ballings, Andrey Volkov, Jeroen D'haen, Michiel Van Herwegen

Maintainer: Michel Ballings <Michel.Ballings@GMail.com>

References

Van den Poel, D., Ballings, M., Volkov, A., D'haen, J., Van Herwegen, M., Predictive Analytics for analytical Customer Relationship Management using SAS, Oracle and R, Springer, Forthcoming.

See Also

Other functions in this package: imputeMissings, Aggregate, cocktailEnsemble, predict.cocktailEnsemble

Examples

# Load the package that provides Aggregate
library(aCRM)
# Create some data
data <- data.frame(V1 = as.factor(c('yes', 'no', 'no', 'yes', 'yes', 'yes', 'yes')),
                   V2 = as.character(c(1, 2, 3, 4, 4, 4, 4)),
                   V3 = 1:7,
                   V4 = as.numeric(7:1))
ID <- as.character(c(1, 1, 1, 1, 2, 2, 2))
# Demonstrate function
Aggregate(x = data, by = ID)

Example output

aCRM 0.1.1
Type aCRMNews() to see the change log
  ID V1_no_sum V1_yes_sum V2_1_sum V2_2_sum V2_3_sum V2_4_sum V1_no_last
1  1         2          2        1        1        1        1          0
2  2         0          3        0        0        0        3          0
  V1_yes_last V2_1_last V2_2_last V2_3_last V2_4_last V3_sum V4_sum V3_mean
1           1         0         0         0         1     10     22     2.5
2           1         0         0         0         1     18      6     6.0
  V4_mean   V3_var   V4_var
1     5.5 1.666667 1.666667
2     2.0 1.000000 1.000000

aCRM documentation built on May 1, 2019, 8:29 p.m.