plot: Data visualizations of anomaly score locally around a...


Description

Data visualizations of anomaly score locally around a specific data point

Usage

## S3 method for class 'stranger'
plot(x, type = "cluster", id = ".id", score = NULL,
  anomaly_id = NULL, ...)

## S3 method for class 'fortifiedanomaly'
plot(x, type = "feature_importance", id = ".id",
  anomaly_id = NULL, score = NULL, ...)

## S3 method for class 'anomalies'
plot(x, type = "feature_importance", id = ".id",
  anomaly_id = NULL, ...)

## S3 method for class 'singular'
plot(x, type = "cluster", id = ".id", score = NULL,
  anomaly_id = NULL, ...)

Arguments

x

An object of class dataframe, stranger or anomaly. It contains the observations: each row represents one observation and each variable is stored in one column. It must have at least one column with IDs and one column with the anomaly score for each ID.
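As an illustration of the layout described above, here is a minimal sketch in base R. The score is a stand-in (mean distance to the 5 nearest neighbours on scaled features), chosen only so the example runs on its own; it is an assumption and not the scoring used by the package. The column name `score` is hypothetical as well, while `.id` matches the default of the `id` argument.

```r
# Hypothetical input: one row per observation, an ID column,
# the feature columns, and one anomaly-score column.
features <- iris[, 1:4]

# Stand-in anomaly score (assumption, not the package's method):
# mean distance to the 5 nearest neighbours on scaled features.
d <- as.matrix(dist(scale(features)))
knn_score <- apply(d, 1, function(row) mean(sort(row)[2:6]))

x <- data.frame(.id = seq_len(nrow(features)), features, score = knn_score)
head(x)
```

With such a table, `id` would be `".id"` and `score` would be `"score"`.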

type

The name of the visualization:
(1) A hierarchical clustering, named "cluster", showing which of the top n anomalies belong to the same cluster as a specific record. Finding the common pattern within the cluster may reveal the origin of the specific record's score.
(2) A dot plot, named "neighbours", showing the relationship between the anomaly score and each feature for the k nearest neighbours of a specific record.
(3) A bar chart, named "feature_importance", showing how sensitive the anomaly score of a specific record is to each feature. This may help to identify the features behind the score.
(4) A dot plot, named "score_decline", showing the decrease in anomaly score among the k nearest neighbours of a specific record. The shape indicates how extreme and how frequent the anomaly score of a specific record is among its neighbours.
(5) A regression tree, named "regression_tree", showing the paths leading to a high score around a specific record.
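The idea behind the "cluster" view can be sketched in base R as follows. This is not the package's implementation: the score, the value of `top_n`, the Ward linkage and the number of groups are all assumptions made for the sake of a runnable example.

```r
# Stand-in data and score (assumptions, see above).
features <- scale(iris[, 1:4])
d <- as.matrix(dist(features))
score <- apply(d, 1, function(row) mean(sort(row)[2:6]))

# Keep the top n anomalies and cluster them hierarchically.
top_n <- 20
top_ids <- order(score, decreasing = TRUE)[seq_len(top_n)]
hc <- hclust(dist(features[top_ids, ]), method = "ward.D2")
groups <- cutree(hc, k = 3)

# Records that share a cluster with the record under investigation
# (here the highest-scoring one, which is first in top_ids).
anomaly_id <- top_ids[1]
same_cluster <- top_ids[groups == groups[1]]

plot(hc, labels = as.character(top_ids), main = "Top anomalies")
```

Comparing the feature values of `same_cluster` is then one way to look for the common pattern mentioned above.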

id

The name of the column containing the record IDs.

score

The name of the column containing the anomaly score.

anomaly_id

The ID of the record you want to investigate.

...

Additional parameters to pass

Details

Function that produces visualizations to understand the anomaly score locally around a specific data point. We believe this should help people to trust scores produced by models even if they don’t fully understand them. Currently, 5 visualizations are implemented:
(1) A hierarchical clustering, named "cluster", showing which of the top n anomalies belong to the same cluster as a specific record. Finding the common pattern within the cluster may reveal the origin of the specific record's score.
(2) A dot plot, named "neighbours", showing the relationship between the anomaly score and each feature for the k nearest neighbours of a specific record.
(3) A bar chart, named "feature_importance", showing how sensitive the anomaly score of a specific record is to each feature. This may help to identify the features behind the score.
(4) A dot plot, named "score_decline", showing the decrease in anomaly score among the k nearest neighbours of a specific record. The shape indicates how extreme and how frequent the anomaly score of a specific record is among its neighbours.
(5) A regression tree, named "regression_tree", showing the paths leading to a high score around a specific record.
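To make the neighbour-based views more concrete, the sketch below reproduces the idea of "score_decline" and "neighbours" in base R. It is an approximation under stated assumptions (stand-in score, Euclidean distance on scaled features, `k = 15`), not the code the package runs.

```r
# Stand-in data and score (assumptions).
raw <- iris[, 1:4]
d <- as.matrix(dist(scale(raw)))
score <- apply(d, 1, function(row) mean(sort(row)[2:6]))

# Record under investigation and its k nearest neighbours.
anomaly_id <- unname(which.max(score))
k <- 15
neighbours <- setdiff(order(d[anomaly_id, ]), anomaly_id)[seq_len(k)]
idx <- c(anomaly_id, neighbours)

# "score_decline" idea: scores of the record and its neighbours, sorted.
decline <- sort(unname(score[idx]), decreasing = TRUE)
plot(decline, type = "b", xlab = "Rank among neighbours",
     ylab = "Anomaly score", main = "Score decline")

# "neighbours" idea: score against each feature for the same records.
op <- par(mfrow = c(2, 2))
for (f in names(raw)) {
  plot(raw[idx, f], score[idx], xlab = f, ylab = "Anomaly score")
}
par(op)
```

A steep drop after the first point suggests the record is an isolated extreme; a flat curve suggests its neighbours score similarly.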

Extra parameters that can be used in ...:

Value

A plot


stranger documentation built on March 18, 2018, 2:01 p.m.