partial_dep.feature: Partial Dependency, output analyzer


Description

This function is a helper for analyzing partial dependency output. It uses information-theoretic measures to score each predictor against the label.

Usage

partial_dep.feature(grid_data, method = "emp", in_depth = FALSE)

Arguments

grid_data

Type: data.table. A partial_dep grid_exp output.

method

Type: character. The method used to compute information-theory-related values. Defaults to "emp"; see Details for more information.

in_depth

Type: logical. Whether to perform the Partial Mann-Kendall and Partial Spearman trend tests. Defaults to FALSE.

Details

The function produces several outputs. All information-theory-related values (entropy, mutual information, synergy, total correlation) are computed from the empirical probability distribution by default. The estimator can be changed with the method parameter, but leaving the default is recommended unless you know what you are doing. The available methods are:

"emp"

(Default) Entropy of the empirical probability distribution.

"mm"

Miller-Madow asymptotic bias corrected empirical estimator.

"shrink"

Shrinkage estimate of the entropy of a Dirichlet probability distribution.

"sg"

Schurmann-Grassberger estimate of the entropy of a Dirichlet probability distribution.

Use the function infotheo::natstobits(value) to convert from nats (base e) to bits (base 2).
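As a rough illustration of how the four estimators differ, the sketch below calls infotheo::entropy directly on a small discrete sample and converts the results to bits. The simulated data and the variable names (x, h_nats, h_bits) are assumptions made for this example; this is not necessarily how partial_dep.feature calls infotheo internally.

```r
library(infotheo)

set.seed(1)
# Hypothetical discrete feature with 4 levels (illustrative data only).
x <- data.frame(feature = sample(1:4, size = 200, replace = TRUE))

# Entropy in nats under each estimator accepted by the method argument.
methods <- c("emp", "mm", "shrink", "sg")
h_nats <- sapply(methods, function(m) entropy(x, method = m))

# Convert from nats (base e) to bits (base 2).
h_bits <- natstobits(h_nats)

print(rbind(nats = h_nats, bits = h_bits))
```

On samples this large the estimators usually agree closely; the bias corrections matter more when there are few observations per level.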

Table "Features":

"Feature"

Column name assessed.

"Unique_Values"

Count of unique values of the feature.

"Entropy"

Entropy of the feature.

"Label_Mutual_Info"

Mutual Information between Feature and Label.

"Evolution_Mutual_Info"

Mutual Information between Feature and Evolution.

"p.Mann_Kendall"

Only computed when in_depth = TRUE. p-value of the Partial Mann-Kendall (multivariate) test, which detects non-parametric monotonic trends in potentially seasonal data. A low value indicates confidence in a trend.

"p.Partial_Spearman"

Only computed when in_depth = TRUE. p-value of the Partial Spearman Correlation trend test. A low value indicates confidence in a correlation greater than 0.
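To give a feel for what a mutual information column such as "Label_Mutual_Info" measures, the following sketch compares an informative feature with a pure-noise feature using infotheo::mutinformation. The simulated vectors (feature, label, noise) are hypothetical and stand in for real partial dependency output; the function's internal computation may differ.

```r
library(infotheo)

set.seed(42)
n <- 500

# Hypothetical discrete feature and a label that copies it about 90% of the time.
feature <- sample(1:3, n, replace = TRUE)
random_label <- sample(1:3, n, replace = TRUE)
label <- ifelse(runif(n) < 0.9, feature, random_label)

# A second feature that is unrelated to the label.
noise <- sample(1:3, n, replace = TRUE)

# Mutual information (in nats) with the empirical estimator.
mi_informative <- mutinformation(feature, label, method = "emp")
mi_noise <- mutinformation(noise, label, method = "emp")

print(c(informative = mi_informative, noise = mi_noise))
```

The informative feature should score well above the noise feature, whose mutual information with the label stays close to zero.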

Table "Global":

"Without_Feature"

Column name NOT assessed.

"Synergy"

Synergy/Complementarity (inter information) provided by the data without the feature.

"Total_Correlation"

Total Correlation (multi information) provided by the data without the feature.
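The two global quantities can be explored with infotheo::interinformation and infotheo::multiinformation. The sketch below uses a hypothetical XOR data set (columns a, b, y), where no single pair of columns is informative but all three together are. The leave-one-column-out loop is an assumption about what "without the feature" means here, not a confirmed description of the implementation, and the sign convention of the interaction information should be checked against the infotheo documentation.

```r
library(infotheo)

set.seed(7)
n <- 1000

# Hypothetical XOR data: y is fully determined by a and b jointly,
# yet is independent of each of them taken alone.
a <- sample(0:1, n, replace = TRUE)
b <- sample(0:1, n, replace = TRUE)
y <- as.integer(xor(a, b))
d <- data.frame(a = a, b = b, y = y)

# Interaction information and total correlation over all columns (in nats).
synergy <- interinformation(d, method = "emp")
total_corr <- multiinformation(d, method = "emp")

# Total correlation of the data with each column removed in turn
# (assumed reading of the "Without_Feature" rows).
without <- sapply(names(d), function(f) {
  multiinformation(d[, names(d) != f, drop = FALSE], method = "emp")
})

print(c(synergy = synergy, total_correlation = total_corr))
print(without)
```

Removing any one column leaves two nearly independent columns, so the leave-one-out total correlation drops to roughly zero, while the full data set retains about log(2) nats.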

Value

Statistics describing the variability of the input, per column.


Laurae2/Laurae documentation built on May 8, 2019, 7:59 p.m.