A parallel PrefixSpan algorithm to mine frequent sequential patterns.
spark.findFrequentSequentialPatterns returns a complete set of frequent sequential patterns.
For more details, see PrefixSpan.
spark.findFrequentSequentialPatterns(data, ...)

## S4 method for signature 'SparkDataFrame'
spark.findFrequentSequentialPatterns(
  data,
  minSupport = 0.1,
  maxPatternLength = 10L,
  maxLocalProjDBSize = 32000000L,
  sequenceCol = "sequence"
)
data | A SparkDataFrame.
... | Additional argument(s) passed to the method.
minSupport | Minimum support level.
maxPatternLength | Maximum pattern length.
maxLocalProjDBSize | Maximum number of items (including delimiters used in the internal storage format) allowed in a projected database before local processing.
sequenceCol | Name of the sequence column in the dataset.
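To make the minSupport argument more concrete, the following base R sketch computes the support of a pattern by hand, without Spark. It assumes that support means the fraction of input sequences that contain the pattern, and that a pattern is contained when each of its itemsets is a subset of some itemset of the sequence, in the same order. The helpers contains_pattern and support are hypothetical names used only for illustration; they are not part of SparkR, and this is not how PrefixSpan itself searches for patterns.

```r
# Each sequence is an ordered list of itemsets; each itemset is an integer vector.
sequences <- list(
  list(c(1L, 2L), c(3L)),
  list(c(1L), c(3L, 2L), c(1L, 2L)),
  list(c(1L, 2L), c(5L)),
  list(c(6L))
)

# Hypothetical helper: TRUE if the itemsets of `pattern` can be matched, in order,
# by supersets among the itemsets of `sequence` (greedy left-to-right scan).
contains_pattern <- function(sequence, pattern) {
  j <- 1L
  for (itemset in sequence) {
    if (j > length(pattern)) break
    if (all(pattern[[j]] %in% itemset)) j <- j + 1L
  }
  j > length(pattern)
}

# Hypothetical helper: fraction of sequences that contain the pattern.
support <- function(sequences, pattern) {
  mean(vapply(sequences, contains_pattern, logical(1), pattern = pattern))
}

support(sequences, list(c(1L)))         # 0.75
support(sequences, list(c(1L), c(3L)))  # 0.5
support(sequences, list(c(5L)))         # 0.25
```

Under these assumptions, a call with minSupport = 0.5 on the same four sequences would be expected to keep the first two patterns and drop the third.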
A complete set of frequent sequential patterns in the input sequences of itemsets.
The returned SparkDataFrame contains a column of sequences and a column of their corresponding frequencies. Its schema is:
  sequence: ArrayType(ArrayType(T)), freq: integer
where T is the item type.
spark.findFrequentSequentialPatterns(SparkDataFrame) since 3.0.0
## Not run:
library(SparkR)
sparkR.session()

# Each row holds one sequence: an ordered list of itemsets.
df <- createDataFrame(list(list(list(list(1L, 2L), list(3L))),
                           list(list(list(1L), list(3L, 2L), list(1L, 2L))),
                           list(list(list(1L, 2L), list(5L))),
                           list(list(list(6L)))),
                      schema = c("sequence"))
# Keep patterns that meet the 0.5 support level and have at most 5 itemsets.
frequency <- spark.findFrequentSequentialPatterns(df, minSupport = 0.5, maxPatternLength = 5L,
                                                  maxLocalProjDBSize = 32000000L)
showDF(frequency)
## End(Not run)