generate_edges: Generate edges from event logs

View source: R/generate_edges.R

Description

eventlog should be a data.frame or data.table that contains at least the following columns:

Usage

generate_edges(eventlog, distinct_case = FALSE, target_categories = NULL)

Arguments

eventlog

The event log, as a data.frame or data.table.

distinct_case

Whether to count only unique cases.

target_categories

A vector containing the target activity categories. The default is NULL, which means every path counts. If it contains target activity categories, only paths that reach an activity in one of those categories are counted.
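To make the counting concrete, the following base R sketch builds a toy event log and counts consecutive activity pairs within each case. It is not pmap's implementation: the helper `count_edges()` and the column names `case_id`, `timestamp`, `activity` and `category` are assumptions made for illustration, and the sketch reads `distinct_case = TRUE` as "each case contributes at most once to a given edge". It does not cover `target_categories`.

```r
# Toy event log. The column names used here are assumptions for this
# sketch, not a statement of what pmap requires.
eventlog <- data.frame(
  case_id   = c("c1", "c1", "c1", "c1", "c2", "c2"),
  timestamp = as.POSIXct("2021-01-01", tz = "UTC") + 3600 * c(0, 1, 2, 3, 0, 1),
  activity  = c("A", "B", "A", "B", "A", "C"),
  category  = c("normal", "target", "normal", "target", "normal", "normal"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count transitions between consecutive events
# of the same case.
count_edges <- function(log, distinct_case = FALSE) {
  # Order events by case, then by time, so neighbours are consecutive steps.
  log <- log[order(log$case_id, log$timestamp), ]
  n <- nrow(log)
  # Keep only neighbouring rows that belong to the same case.
  same_case <- log$case_id[-n] == log$case_id[-1]
  steps <- data.frame(
    case_id = log$case_id[-n][same_case],
    from    = log$activity[-n][same_case],
    to      = log$activity[-1][same_case],
    stringsAsFactors = FALSE
  )
  # With distinct_case, a case contributes at most once per (from, to) pair.
  if (distinct_case) steps <- unique(steps)
  steps$amount <- 1L
  aggregate(amount ~ from + to, data = steps, FUN = sum)
}

count_edges(eventlog)                        # every transition counts
count_edges(eventlog, distinct_case = TRUE)  # one count per case and edge
```

In this toy log, case c1 walks A, B, A, B, so the edge from A to B is seen twice when every transition counts and once when each case is counted only once.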

Value

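A data.frame of edges with from, to and amount columns. The output in the Examples section also shows mean_duration, median_duration, max_duration and min_duration columns.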

Examples

# -----------------------------------------------------
# Generate edges and count every path, regardless of whether
# it comes from the same case.
# -----------------------------------------------------
eventlog <- generate_eventlog()
edges <- generate_edges(eventlog)
# Have a look at the generated edges
head(edges)
# # A tibble: 6 x 7
#   from    to     amount mean_duration median_duration max_duration min_duration
#   <chr>   <chr>   <int> <chr>         <chr>           <chr>        <chr>
# 1 Activi… Activ…     12 1.02 weeks    1.12 weeks      1.91 weeks   2 hours
# 2 Activi… Activ…      5 3.59 days     4.41 days       1.03 weeks   17.33 hours
# 3 Activi… Activ…     10 1.13 weeks    7 days          2.93 weeks   1.51 days
# 4 Activi… Activ…      1 2.46 weeks    2.46 weeks      2.46 weeks   2.46 weeks
# 5 Activi… Activ…      3 1.72 weeks    1.2 weeks       2.79 weeks   1.16 weeks
# 6 Activi… Activ…      8 3.38 days     2.2 days        1.85 weeks   15.1 hours

str(edges)
# Classes ‘tbl_df’, ‘tbl’ and 'data.frame':       161 obs. of  7 variables:
#  $ from           : chr  "Activity 1 (normal)" "Activity 1 (normal)" "Activity 1 (normal)"
# "Activity 1 (normal)" ...
#  $ to             : chr  "Activity 1 (normal)" "Activity 10 (phone)" "Activity 11 (phone)"
# "Activity 12 (phone)" ...
#  $ amount         : int  12 5 10 1 3 8 2 15 9 15 ...
#  $ mean_duration  : chr  "1.02 weeks" "3.59 days" "1.13 weeks" "2.46 weeks" ...
#  $ median_duration: chr  "1.12 weeks" "4.41 days" "7 days" "2.46 weeks" ...
#  $ max_duration   : chr  "1.91 weeks" "1.03 weeks" "2.93 weeks" "2.46 weeks" ...
#  $ min_duration   : chr  "2 hours" "17.33 hours" "1.51 days" "2.46 weeks" ...
#
# -----------------------------------------------------
# Generate edges by specifying the target categories; paths that
# do not reach an activity in a target category are ignored.
# -----------------------------------------------------
edges <- generate_edges(eventlog, target_categories = c("target"))
str(edges)
# Classes ‘tbl_df’, ‘tbl’ and 'data.frame':       115 obs. of  7 variables:
#  $ from           : chr  "Activity 1 (normal)" "Activity 1 (normal)" "Activity 1 (normal)"
# "Activity 1 (normal)" ...
#  $ to             : chr  "Activity 1 (normal)" "Activity 11 (phone)" "Activity 13 (target)"
# "Activity 14 (target)" ...
#  $ amount         : int  1 3 1 4 2 6 1 1 3 1 ...
#  $ mean_duration  : chr  "4.4 days" "2.12 weeks" "2.89 hours" "1.47 weeks" ...
#  $ median_duration: chr  "4.4 days" "2.15 weeks" "2.89 hours" "1.04 weeks" ...
#  $ max_duration   : chr  "4.4 days" "2.16 weeks" "2.89 hours" "3.53 weeks" ...
#  $ min_duration   : chr  "4.4 days" "2.03 weeks" "2.89 hours" "1.76 days" ...
#
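The `distinct_case` argument can be switched on in the same way. The call below is a minimal sketch that uses only the signature shown under Usage and assumes the pmap package is installed; the variable name `edges_distinct` is illustrative. No output is shown because it depends on the generated event log.

```r
# Run only if pmap is available, so the snippet is safe to source anywhere.
if (requireNamespace("pmap", quietly = TRUE)) {
  eventlog <- pmap::generate_eventlog()
  # Count only unique cases instead of every path.
  edges_distinct <- pmap::generate_edges(eventlog, distinct_case = TRUE)
  str(edges_distinct)
}
```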

twang2218/pmap documentation built on Nov. 3, 2021, 11:25 p.m.