df_get_feature_distribution: Calculate the bin-level PSI index for the specified numeric...

View source: R/df_get_feature_distribution.R

df_get_feature_distribution    R Documentation

Calculate the bin-level PSI index for the specified numeric features

Description

This function takes two matrices as input. One contains the features with their expected values; the other contains the same features with their actual values. For example, when comparing October 2018 features to November 2018 features, October 2018 is the expected data and November 2018 is the actual data.

Usage

df_get_feature_distribution(expected_, actual_, features_)

Arguments

expected_

Required: A matrix containing features with the expected (old) data.

actual_

Required: A matrix containing features with the actual (new) data.

features_

Optional: A vector of the feature names to validate. Each feature name must exist in both expected_ and actual_ and have the same data type in both. If no features are provided, all features in expected_ are used.
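
The rules for features_ can be illustrated with a small check. This is a sketch only, not the package's code: the helper name resolve_features is hypothetical, and it assumes the inputs can be coerced to data frames with named columns.

```r
# Hypothetical helper (not part of the package): resolve and validate features_.
resolve_features <- function(expected_, actual_, features_ = NULL) {
  expected_ <- as.data.frame(expected_)
  actual_ <- as.data.frame(actual_)

  # Default: use every column of expected_ when no features are given.
  if (is.null(features_)) {
    features_ <- names(expected_)
  }

  # Every requested feature must exist in both inputs.
  shared <- intersect(names(expected_), names(actual_))
  missing_cols <- setdiff(features_, shared)
  if (length(missing_cols) > 0) {
    stop("Features missing from expected_ or actual_: ",
         paste(missing_cols, collapse = ", "))
  }

  # Each feature must have the same data type in both inputs.
  for (f in features_) {
    if (!identical(class(expected_[[f]]), class(actual_[[f]]))) {
      stop("Feature '", f, "' has different types in expected_ and actual_")
    }
  }
  features_
}

# Made-up example data.
oct <- data.frame(age = c(21, 35, 47), region = c("a", "b", "a"))
nov <- data.frame(age = c(23, 31, 52), region = c("b", "b", "a"))
all_features <- resolve_features(oct, nov)         # all columns of oct
one_feature <- resolve_features(oct, nov, "age")   # a single feature
```

A name that is absent from either input, such as "height" here, stops with an error rather than being silently dropped.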

Details

NOTE: This function currently supports only NUMERIC and CHARACTER data types.
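
For readers unfamiliar with the index, the sketch below shows one common way to compute a bin-level PSI (population stability index) for a single numeric feature, where each bin contributes (actual share - expected share) * log(actual share / expected share). It is not the package's implementation. The function name bin_level_psi, the quantile-based bin edges, the default of 10 bins, and the eps floor for empty bins are all assumptions made for illustration.

```r
# Sketch of bin-level PSI for one numeric feature (not the package's code).
bin_level_psi <- function(expected, actual, n_bins = 10, eps = 1e-6) {
  # Assumption: bin edges come from quantiles of the expected data.
  edges <- unique(quantile(expected,
                           probs = seq(0, 1, length.out = n_bins + 1),
                           na.rm = TRUE, names = FALSE))
  if (length(edges) < 2) {
    stop("expected has a single distinct value; cannot form bins")
  }
  # Widen the outer edges so values outside the expected range are still counted.
  edges[1] <- -Inf
  edges[length(edges)] <- Inf

  expected_count <- as.vector(table(cut(expected, breaks = edges)))
  actual_count <- as.vector(table(cut(actual, breaks = edges)))

  # Assumption: floor the shares at eps so empty bins do not produce log(0).
  expected_pct <- pmax(expected_count / sum(expected_count), eps)
  actual_pct <- pmax(actual_count / sum(actual_count), eps)

  data.frame(
    bin = seq_along(expected_count),
    min_value = head(edges, -1),   # lower edge (first bin is -Inf)
    max_value = tail(edges, -1),   # upper edge (last bin is Inf)
    expected_count = expected_count,
    actual_count = actual_count,
    psi = (actual_pct - expected_pct) * log(actual_pct / expected_pct)
  )
}

# Made-up data: the "actual" sample is shifted slightly to the right.
set.seed(1)
oct <- rnorm(1000)
nov <- rnorm(1000, mean = 0.2)
bins <- bin_level_psi(oct, nov)
total_psi <- sum(bins$psi)   # feature-level PSI is the sum over bins
```

Each bin's contribution is non-negative, and comparing a sample with itself gives zero in every bin, so larger values indicate a larger shift between the expected and actual distributions.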

Value

A matrix containing the feature name, bin, min value, max value, expected count, expected


BrandonRCopeland/DataScience documentation built on Oct. 14, 2023, 9:45 a.m.