reshapeData: Reshape data - create User Journey


View source: R/data_transformation.R

Description

This function creates a rectangular (wide) version of the User Journey from specific columns that must be provided in the input. Each row of the result represents the observations of the different variables at a specific point in time for a particular user. The time window between observations is determined by the dates provided by the user.
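To make the long-to-wide idea concrete, the sketch below reshapes a small made-up data frame with base R's reshape(). It is only an illustration of the general technique, not the package's implementation; the toy values and the resulting column names (value.Var1, value.Var2) come from base R, and the output of reshapeData may be laid out differently.

```r
# Toy long-format input with the four mandatory columns (values are made up).
long <- data.frame(
  id    = c(1, 1, 1, 2),
  type  = c("Var1", "Var2", "Var1", "Var1"),
  value = c(10, 20, 30, 40),
  date  = as.Date(c("2000-01-01", "2000-01-01", "2000-01-02", "2000-01-01"))
)

# One row per (id, date) pair, one column per level of 'type'.
# Combinations that were never observed are filled with NA.
wide <- reshape(long,
                idvar     = c("id", "date"),
                timevar   = "type",
                direction = "wide")

print(wide)
```

Each (id, date) pair becomes one row, and each level of 'type' becomes its own column, which is the shape the description above refers to.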

Usage

reshapeData(input, additional, extraCol, handling, handlingExtra, na.rm)

Arguments

input

an object of type data.frame containing the mandatory columns 'id', 'type', 'value', and 'date'.

additional

an optional character string defining one or more additional columns of input to be considered. Each unique level of the corresponding column is treated as an additional feature in the output. If multiple observations occur at the same point in time for a specific user, the features are aggregated as defined in the parameter handling.

extraCol

an optional character string defining one or more additional columns of input to be considered. Each of these columns becomes a feature in the output, carrying its corresponding values. These extra features are handled as specified in handlingExtra.

handling

an optional character string that defines how multiple observations of the same type at the same point in time are handled. Can be "first", "sum", or "mean". By default, only the most recent value is kept. For categorical variables, the most recent value is used even if "sum" or "mean" is specified.
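The handling options can be pictured with a small base R sketch. The helper collapseDuplicates below is hypothetical and not part of the package; it also assumes that "most recent" means the last matching row in input order, which the documentation does not state explicitly.

```r
# Three observations of the same type for the same user on the same date.
dup <- data.frame(
  id    = c(1, 1, 1),
  type  = c("Var1", "Var1", "Var1"),
  value = c(2, 4, 9),
  date  = as.Date(rep("2000-01-01", 3))
)

# Hypothetical helper: collapse duplicates per (id, type, date).
# "last" stands in for the documented default (most recent value),
# assuming that rows are ordered from oldest to newest.
collapseDuplicates <- function(df, how = c("last", "first", "sum", "mean")) {
  how <- match.arg(how)
  fun <- switch(how,
    last  = function(x) x[length(x)],
    first = function(x) x[1],
    sum   = sum,
    mean  = mean
  )
  aggregate(value ~ id + type + date, data = df, FUN = fun)
}

collapseDuplicates(dup, "first")$value  # 2
collapseDuplicates(dup, "sum")$value    # 15
collapseDuplicates(dup, "mean")$value   # 5
collapseDuplicates(dup, "last")$value   # 9
```

In each case the three duplicate rows collapse to a single row, and only the rule for choosing or combining the values differs.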

handlingExtra

an optional character string that defines how multiple observations of the same type at the same point in time are handled for the extra features. Can be "first", "sum", or "mean". By default, only the most recent value is kept. For categorical variables, the most recent value is used even if "sum" or "mean" is specified.

na.rm

an optional logical value (TRUE or FALSE). If TRUE, all rows with missing values are omitted. Default is FALSE.
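As a rough picture of what omitting rows with missing values means, the base R sketch below filters a made-up wide table with complete.cases(). It assumes the removal applies to rows of the wide result; whether reshapeData filters at exactly this stage is not stated in the documentation.

```r
# Made-up wide table in which two rows have a missing Var2.
wide <- data.frame(
  id   = c(1, 1, 2),
  date = as.Date(c("2000-01-01", "2000-01-02", "2000-01-01")),
  Var1 = c(10, 30, 40),
  Var2 = c(20, NA, NA)
)

# Keep only the rows without any missing value.
complete <- wide[complete.cases(wide), ]

nrow(complete)  # 1
```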

Value

An object of type data.frame.

Examples

# create a data frame with the mandatory columns (30 rows, 5 users)
data <- data.frame(
  id    = rep(1:5, each = 6),
  type  = rep(c("Var1", "Var2", "Var3"), times = 10),
  value = rep(1:5, times = 6),
  date  = rep(seq(as.Date("2000/1/1"), by = "day", length.out = 15), each = 2)
)

# create the rectangular (wide) version of the user journey;
# FALSE is spelled out because the shorthand F can be reassigned
dat <- reshapeData(data, na.rm = FALSE)

LoneWolf6/UJ-Analysis documentation built on Sept. 16, 2020, 4:59 a.m.