workflowImportanceHP: Computes the contribution to dissimilarity of each variable...

View source: R/workflowImportanceHP.R

Description

This workflow executes the following steps:

Usage

workflowImportanceHP(
  sequences = NULL,
  grouping.column = NULL,
  time.column = NULL,
  exclude.columns = NULL,
  parallel.execution = TRUE
  )

Arguments

sequences

dataframe generated by prepareSequences, containing multiple sequences identified by a grouping column.

grouping.column

character string, name of the column in sequences used to identify the separate sequences within the dataframe.

time.column

character string, name of the column with time/depth/rank data.

exclude.columns

character string or character vector with column names in sequences to be excluded from the analysis.

parallel.execution

boolean, if TRUE (default), execution is parallelized; if FALSE, it runs serially.

Details

Consider the question "what variable contributes the most to the dissimilarity between two sequences?" A reasonable answer is "the one whose exclusion from the analysis reduces dissimilarity the most". This workflow attempts to reach that answer by computing psi on all variables, and then recomputing it while removing one variable at a time.
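The leave-one-variable-out idea can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the package's implementation: toy_dissimilarity is a hypothetical stand-in (mean Euclidean distance between same-position rows) used in place of psi, and leave_one_out_importance and the two small matrices are made up for the example.

```r
# Stand-in dissimilarity (NOT psi): mean Euclidean distance between
# same-position rows of two numeric matrices with identical dimensions.
toy_dissimilarity <- function(a, b) {
  mean(sqrt(rowSums((a - b)^2)))
}

# Leave-one-variable-out importance for two sequences (hypothetical helper).
leave_one_out_importance <- function(a, b) {
  variables <- colnames(a)

  # Dissimilarity computed on all variables.
  d_all <- toy_dissimilarity(a, b)

  # Dissimilarity recomputed with each variable removed in turn.
  d_without <- vapply(
    variables,
    function(v) {
      keep <- setdiff(variables, v)
      toy_dissimilarity(a[, keep, drop = FALSE], b[, keep, drop = FALSE])
    },
    numeric(1)
  )

  list(
    all = d_all,
    without = d_without,
    # Drop expressed as a percentage of the all-variables value.
    drop = 100 * (d_all - d_without) / d_all
  )
}

# Two made-up sequences with three variables; they differ only in "x".
a <- cbind(x = c(1, 2, 3), y = c(5, 5, 5), z = c(0, 1, 0))
b <- cbind(x = c(4, 6, 8), y = c(5, 5, 5), z = c(0, 1, 0))

result <- leave_one_out_importance(a, b)
print(result$drop)
```

In this toy case the two sequences differ only in x, so removing x removes all of the stand-in dissimilarity (a drop of 100), while removing y or z changes nothing (a drop of 0).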

Value

A list with two slots named psi and psi.drop. The former contains the dissimilarity values obtained when removing each variable, while the latter contains the drop in dissimilarity (as a percentage of psi computed on all variables) that happens when each variable is removed. Positive values indicate that dissimilarity drops when the variable is removed, while negative values indicate that dissimilarity increases (that is, similarity drops) when the variable is removed.
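Read literally, the description of psi.drop corresponds to the expression below. This is one reading of the text, not a formula taken from the package source. The symbols are notation introduced here: \psi_{\mathrm{all}} is psi computed on all variables and \psi_{-v} is psi computed with variable v removed.

```latex
\[
\mathrm{psi.drop}_v = 100 \times \frac{\psi_{\mathrm{all}} - \psi_{-v}}{\psi_{\mathrm{all}}}
\]
```

With made-up numbers, if \psi_{\mathrm{all}} = 2.0 and \psi_{-v} = 1.5, the drop is 100 × (2.0 − 1.5) / 2.0 = 25, a positive value. If instead \psi_{-v} = 2.2, the drop is 100 × (2.0 − 2.2) / 2.0 = −10, a negative value, meaning dissimilarity increased when the variable was removed.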

Author(s)

Blas Benito <blasbenito@gmail.com>


distantia documentation built on Oct. 30, 2019, 10:05 a.m.