df2SPMFBasket: Basket creating function


Description

df2SPMFBasket creates SPMF-compliant baskets from a data frame.

Usage

df2SPMFBasket(df, ID, itemset = "", event = "", time = "",
  timeFormat = "", timestep = 1, parallel = F)

Arguments

df

a data frame from which to create baskets

ID

the name of the column of IDs. The IDs are used to group rows by 'customer'.

itemset

the name of the column of itemsets, that is, of the products bought together. You need to provide at least one of the itemset or time parameters.

time

the name of the column where the time of an event is stored. You need to provide at least one of the itemset or time parameters.

timeFormat

the format in which the time column is encoded (for example, "%d-%m-%Y"). If provided, df2SPMFBasket assumes you want time to be taken into account.

parallel

if TRUE, the function uses all the cores of your system and parallelizes the creation of your baskets. The default is F because the gain depends on the number of cores and the length of the data frame.
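
As an illustration of what a timeFormat string such as "%d-%m-%Y" describes, the following base R sketch parses a few day-month-year strings and computes the days elapsed since the first event. The vector jour_example and its values are hypothetical, and how df2SPMFBasket uses the parsed times internally is not documented here; this only shows the format convention.

```r
# Hypothetical day-month-year strings, as a time column might store them
jour_example <- c("03-01-2019", "15-01-2019", "02-02-2019")

# Parse with the same kind of format string you would pass as timeFormat
dates <- as.Date(jour_example, format = "%d-%m-%Y")

# Days elapsed since the earliest event
elapsed <- as.numeric(dates - min(dates))
print(elapsed)
```

If the format string does not match the stored values, as.Date returns NA, so checking for NA after parsing is a quick way to validate a timeFormat before building baskets.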

Details

df2SPMFBasket calls the function ToBasket or ToTimedBasket. It outputs a list with two elements: toSendSPMF contains a variable basket that respects the SPMF input format for frequent itemset mining, with or without time. The second element, evLev, is the matching table back to the original item names.
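
To make the two elements more concrete, here is a minimal base R sketch of the idea, not the package's implementation. It assumes the untimed SPMF transaction format (one line per basket, positive integer item codes in ascending order, separated by single spaces). The data frame purchases and the names item_levels and basket are hypothetical.

```r
# Toy purchases: hypothetical data, not part of r2spmf
purchases <- data.frame(
  ID   = c(1, 1, 2, 2, 2),
  item = c("bread", "milk", "milk", "eggs", "bread"),
  stringsAsFactors = FALSE
)

# Matching table: position in item_levels is the integer code of the item
item_levels <- sort(unique(purchases$item))
purchases$code <- match(purchases$item, item_levels)

# One line per ID: unique item codes, sorted, separated by single spaces
basket <- tapply(
  purchases$code, purchases$ID,
  function(codes) paste(sort(unique(codes)), collapse = " ")
)
print(basket)
```

In this sketch item_levels plays the role of a matching table: item_levels[3] recovers the original name of item code 3.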

Value

df2SPMFBasket returns a list. Its element toSendSPMF contains a data frame whose slot basket holds all the baskets in the proper format for export to a txt file readable by the SPMF Java library.
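
Exporting such baskets to a txt file can be sketched with base R alone. The vector basket_lines below is a hypothetical stand-in for the basket slot, and the exact way to extract that slot from the returned list is an assumption to check against your own output.

```r
# Hypothetical basket lines; in practice these would come from the
# basket slot of toSendSPMF
basket_lines <- c("1 3", "1 2 3")

# Write one basket per line to a temporary txt file
out_file <- tempfile(fileext = ".txt")
writeLines(basket_lines, out_file)

# Read the file back to confirm its contents
read_back <- readLines(out_file)
print(read_back)
```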

Examples

# seqDF is a data frame to test the functions. It contains the variables
# ID, jour, ITEMSETS and PRODUITSnum to be used as an example.
test <- df2SPMFBasket(seqDF, ID = "ID", time = "jour", event = "PRODUITSnum",
                      itemset = "ITEMSETS", timeFormat = "%d", parallel = FALSE)

MGousseff/r2spmf documentation built on May 26, 2019, 11:58 p.m.