makepath: Create a pathway variable


View source: R/makepath.R

Description

Takes a data frame column you want to group by and a column you want to make a pathway out of, and returns a pathway vector the same length as your original data. Use it when you want to find the unique combinations of steps so you can count them or group by them. Medical pathways and business process steps are good use cases.

Feature Requests:

  1. a time variable and business rules based on date-times;

  2. step grouping (e.g., step1, step2, step3 = phase1; step4, step5 = phase2; etc.).
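The grouping-and-collapsing idea described here can be illustrated in base R without the package. This is only a sketch of the concept, not the package's implementation; the `steps` data frame and its values are made up for illustration.

```r
# Toy data: one row per step taken by each person, in order of occurrence.
steps <- data.frame(
    id      = c("a", "a", "a", "b", "b", "c")
  , service = c("ps1", "install1", "ps2", "ps1", "ps2", "other")
  , stringsAsFactors = FALSE
  )

# Collapse each group's steps into a single string and repeat that string on
# every row of the group, so the result is as long as the original data.
steps$path <- ave(
    steps$service
  , steps$id
  , FUN = function(x) paste(x, collapse = "-")
  )

steps$path
```

Each row of group `a` carries the same value, `ps1-install1-ps2`, which is what makes the vector convenient to count or group by afterwards.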

Usage

makepath(groupcol, pathcol, sep = "-", subset = FALSE, keepvalues,
  ordered = TRUE, keepconsec = TRUE,
  n.cores = parallel::detectCores() - 1)
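The default for `n.cores` is computed from `parallel::detectCores()`, which can return `NA` on some systems and returns 1 on a single-core machine (making the default 0). A cautious caller may prefer to compute the value explicitly. The helper below, `choose_cores`, is hypothetical and not part of the package; it also applies the Windows restriction noted in the Arguments section.

```r
# Hypothetical helper (not part of the package): pick a safe worker count.
choose_cores <- function() {
  # On Windows the documentation says 1 is the only supported value.
  if (.Platform$OS.type == "windows") {
    return(1L)
  }
  detected <- parallel::detectCores()
  # detectCores() may return NA, and a single core would give 0 after
  # subtracting 1, so fall back to 1 in both cases.
  if (is.na(detected) || detected < 2L) {
    return(1L)
  }
  as.integer(detected - 1L)
}

n_cores <- choose_cores()
```

The result can then be passed as `n.cores = n_cores` instead of relying on the default.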

Arguments

groupcol

The column you want to group by, typically a person or employee identifier.

pathcol

The column you want to create a path from, for example service_type, location, or step.

sep

The separator that goes between the parts of the pathway. The default is a hyphen (-).

subset

A boolean flag indicating whether to use every part/step in the pathway or to track only certain steps. Default is FALSE (use all values). If the flag is TRUE (use certain values), you must also supply the keepvalues parameter.

keepvalues

A character vector of the pathway parts/steps you want to use. Only use when the subset flag is TRUE.

ordered

A boolean flag indicating whether the path should respect occurrence order (the order in which the steps occurred). Default is TRUE. If the flag is set to FALSE, the pathway vector is sorted alphabetically.

keepconsec

A boolean flag to indicate if you want to keep or remove duplicated steps in the pathway. Default is TRUE.

n.cores

An integer value giving the number of cores to run the process on. On Unix-like operating systems the default is one less than the total number of available CPU cores; on Windows the only option (currently) is 1.
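To make the effect of the path-shaping arguments concrete, here is a minimal base R sketch for a single group's steps. The function `sketch_path` is hypothetical and is not the package's code. It treats a non-NULL `keepvalues` as standing in for `subset = TRUE`, and it assumes that `keepconsec = FALSE` collapses consecutive repeats of the same step, which the documentation does not spell out.

```r
# Hypothetical helper (not the package's code): build one group's pathway.
sketch_path <- function(steps, sep = "-", keepvalues = NULL,
                        ordered = TRUE, keepconsec = TRUE) {
  # subset/keepvalues: drop every step that is not being tracked.
  if (!is.null(keepvalues)) {
    steps <- steps[steps %in% keepvalues]
  }
  # ordered = FALSE: ignore occurrence order and sort alphabetically.
  if (!ordered) {
    steps <- sort(steps)
  }
  # keepconsec = FALSE (assumed meaning): collapse runs of a repeated step.
  if (!keepconsec) {
    steps <- rle(steps)$values
  }
  paste(steps, collapse = sep)
}

one_person <- c("ps2", "ps2", "install1", "ps1", "other", "ps1")
tracked    <- c("ps1", "ps2", "ps3")

p_all      <- sketch_path(one_person)
p_subset   <- sketch_path(one_person, keepvalues = tracked)
p_sorted   <- sketch_path(one_person, keepvalues = tracked, ordered = FALSE)
p_noconsec <- sketch_path(one_person, keepconsec = FALSE)
```

Under these assumptions, `p_subset` is `ps2-ps2-ps1-ps1`, `p_sorted` is `ps1-ps1-ps2-ps2`, and `p_noconsec` is `ps2-install1-ps1-other-ps1`.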

Examples

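# Fix the seed so the sampled services are reproducible.
set.seed(1)

# Example data: 26 ids (a-z), each with 4 randomly sampled service steps.
asd <- data.frame(
    id               = rep(letters, times = 4)
  , service          = sample(
      c('ps1', 'ps2', 'ps3', 'ps4', 'ps5', 'ps6', 'ps7'
        , 'install1', 'install2', 'install3', 'other'
        )
    , size    = 26 * 4
    , replace = TRUE
    )
  , stringsAsFactors = FALSE
  )

# Full pathway per id, using every step in order of occurrence.
asd$path1 <- makepath(
    groupcol = asd$id
  , pathcol  = asd$service
  , n.cores  = 1
  )
# Track only the ps1, ps2, and ps3 steps.
asd$path2 <- makepath(
    groupcol   = asd$id
  , pathcol    = asd$service
  , subset     = TRUE
  , keepvalues = c('ps1', 'ps2', 'ps3')
  , n.cores    = 1
  )
# Same subset, sorted alphabetically instead of by occurrence.
asd$path3 <- makepath(
    groupcol   = asd$id
  , pathcol    = asd$service
  , subset     = TRUE
  , keepvalues = c('ps1', 'ps2', 'ps3')
  , ordered    = FALSE
  , n.cores    = 1
  )
# Same as path3, with keepconsec set explicitly (TRUE is the default).
asd$path4 <- makepath(
    groupcol   = asd$id
  , pathcol    = asd$service
  , subset     = TRUE
  , keepvalues = c('ps1', 'ps2', 'ps3')
  , ordered    = FALSE
  , keepconsec = TRUE
  , n.cores    = 1
  )

# Inspect the result: one pathway value per row of the original data.
asd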
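Once a pathway column exists, the typical next step is counting how many groups follow each distinct pathway. The sketch below uses a small hard-coded data frame, `res`, so that it runs without the package; the values are invented for illustration and are not output from makepath.

```r
# Toy result: one row per step, with the pathway already attached per id.
res <- data.frame(
    id   = c("a", "a", "b", "b", "c", "c", "d")
  , path = c("ps1-ps2", "ps1-ps2", "ps1-ps2", "ps1-ps2"
             , "ps2-ps1", "ps2-ps1", "other"
             )
  , stringsAsFactors = FALSE
  )

# Every row of a group carries the same path, so keep one row per id first.
one_per_id <- res[!duplicated(res$id), ]

# Count groups per distinct pathway, most common first.
path_counts <- sort(table(one_per_id$path), decreasing = TRUE)
path_counts
```

Deduplicating by id before counting matters: tabulating the raw column would weight each pathway by its number of steps rather than by the number of groups that follow it.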

Paul-James/pjames documentation built on Aug. 9, 2019, 12:18 p.m.