FrequentSequences: A function to find frequent sequences using the wonderful...

Description

This function sends a list of sequences to SPMF and returns the most frequent sequences with their support and relative frequency (the frequence column).

Usage

FrequentSequences(x, algo, minsup = 0.5, nbEventMax = "", showID = "",
  clean = T, minTime = "", maxTime = "", minWhole = "", maxWhole = "")

Arguments

x

is the output of the df2SPMFSequence function. It must contain a toSendSPMF element storing a tibble with a basket column. This basket is sent to the SPMF Java library.

algo

is the name of the algorithm to be used. For the possible values, please check the SPMF documentation at http://www.philippe-fournier-viger.com/spmf/index.php?link=documentation.php

minsup

is the minimum support for a sequence to be considered frequent. To get all the sequences present in more than 40 percent of the database sequences, type either "40%" or "0.4".
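The two accepted spellings of minsup can be read as the same fraction. The sketch below shows one way to normalise them; parse_minsup is a hypothetical helper written for illustration, not a function of r2spmf, and how the package itself handles the value is not shown in this page.

```r
# Hypothetical helper (not part of r2spmf): normalise a minsup value
# given as "40%", "0.4", or 0.4 to a fraction between 0 and 1.
parse_minsup <- function(minsup) {
  s <- trimws(as.character(minsup))
  if (grepl("%$", s)) {
    # Percentage form: drop the trailing "%" and rescale.
    value <- as.numeric(sub("%$", "", s)) / 100
  } else {
    # Fraction form, passed as a string or as a number.
    value <- as.numeric(s)
  }
  if (is.na(value) || value < 0 || value > 1) {
    stop("minsup must be a fraction in [0, 1] or a percentage such as \"40%\"")
  }
  value
}

parse_minsup("40%")
parse_minsup("0.4")
```

Both calls are expected to yield the same threshold, so either spelling selects the same set of frequent sequences under this reading.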

clean

if clean = F, the text files sent to and received from SPMF are kept in the working directory.

minTime

is the minimum time interval for 2 items to be considered as belonging to the same sequence

maxTime

is the maximum time interval for 2 items to be considered as belonging to the same sequence

minWhole

is the minimum length of time for a whole sequence to be counted

maxWhole

is the maximum length of time for a whole sequence to be counted

These time parameters are only considered when the input is a timed sequence. The only algorithms that can then be used are "Fournier08-Closed+time" and "HirateYamana".
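As an illustration of what the four time parameters constrain, the sketch below checks one occurrence of a pattern given the timestamps of its items. It assumes that minTime and maxTime bound the gap between consecutive items and that minWhole and maxWhole bound the span from the first to the last item; the function satisfies_time_constraints is hypothetical and the actual filtering is done inside SPMF.

```r
# Hypothetical check (not r2spmf code). `times` holds the timestamps of the
# items of one pattern occurrence, in increasing order.
satisfies_time_constraints <- function(times, minTime = 0, maxTime = Inf,
                                       minWhole = 0, maxWhole = Inf) {
  gaps <- diff(times)                        # time between consecutive items
  whole <- times[length(times)] - times[1]   # duration of the whole occurrence
  all(gaps >= minTime) && all(gaps <= maxTime) &&
    whole >= minWhole && whole <= maxWhole
}

# Items observed at times 0, 2 and 5: gaps of 2 and 3, whole duration 5.
satisfies_time_constraints(c(0, 2, 5), minTime = 1, maxTime = 3, maxWhole = 5)
satisfies_time_constraints(c(0, 2, 5), maxTime = 2)
```

Under these assumptions the first call accepts the occurrence and the second rejects it, because one gap exceeds maxTime.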

Value

A data frame with three columns: sequence contains the frequent sequences, support is the number of times each sequence occurs, and frequence is support divided by the total number of sequences.
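To make the relation between support and frequence concrete, here is a minimal base R sketch on a toy database. It does not call SPMF or r2spmf. It assumes sequences of single items, counts a sequence as supporting a pattern when the pattern's items appear in order (gaps allowed), and uses an arbitrary " -> " separator in the sequence column; the names db, contains_pattern and patterns are made up for the example.

```r
# Toy database: each element is one sequence of single items (an assumption;
# SPMF sequences can also contain itemsets).
db <- list(
  c("a", "b", "c"),
  c("a", "c"),
  c("b", "c"),
  c("a", "b")
)

# TRUE if the items of `pattern` occur in `sequence` in order, gaps allowed.
contains_pattern <- function(sequence, pattern) {
  pos <- 0
  for (item in pattern) {
    hits <- which(sequence == item)
    hits <- hits[hits > pos]
    if (length(hits) == 0) return(FALSE)
    pos <- hits[1]
  }
  TRUE
}

# Candidate patterns, chosen by hand for the illustration.
patterns <- list(c("a"), c("a", "c"), c("b", "c"))

# support: number of database sequences that contain the pattern.
support <- vapply(patterns, function(p) {
  sum(vapply(db, contains_pattern, logical(1), pattern = p))
}, numeric(1))

result <- data.frame(
  sequence = vapply(patterns, paste, character(1), collapse = " -> "),
  support = support,
  frequence = support / length(db)   # support divided by number of sequences
)

# Keep the patterns that reach a minimum support of 0.5.
result[result$frequence >= 0.5, ]
```

The data frame has the same three column names as the value described above, which makes it easy to compare against real output on a small input.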


MGousseff/r2spmf documentation built on May 26, 2019, 11:58 p.m.