fasi: Create a ranking score model to implement the fasi...

View source: R/fasi.R

fasi R Documentation

Create a ranking score model to implement the fasi classification algorithm.

Description

This function implements the Fair Adjusted Selective Inference (FASI) method. It assumes that you have an observed data set containing the class labels and every variable needed for your ranking score model. The user can pick from a set of popular ML algorithms to estimate the ranking scores and can supply their own model to the chosen algorithm. If desired, the user can instead provide their own ranking scores directly, without using the function's preset algorithms. User-provided scores are used as-is in the predict step when estimating the r-scores.

Usage

fasi(
  observed_data,
  model_formula,
  split_p = 0.5,
  alg = "gam",
  class_label = "y",
  niter_adaboost = 10
)

Arguments

observed_data

The observed data set. It is split for you into training and calibration data sets according to the user-specified proportion split_p. If you are providing your own ranking scores, the observed data should contain only the calibration data.
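
To make the role of split_p concrete, here is a minimal base R sketch of a proportional random split. It is an illustration only and not the package's internal code; the column names x1 and x2, the class coding 1/2, and the object names train_data and calibrate_data are assumptions made for this example.

```r
# Hypothetical illustration of a proportional split; not taken from fasi itself.
set.seed(1)

# Simulated data: two made-up predictors and a class label column "y".
observed_data <- data.frame(
  x1 = rnorm(100),
  x2 = rnorm(100),
  y  = sample(c(1, 2), 100, replace = TRUE)
)
split_p <- 0.5

# Draw floor(split_p * n) row indices at random for the training part.
n <- nrow(observed_data)
n_train <- floor(split_p * n)
train_idx <- sample(seq_len(n), size = n_train)

# The remaining rows form the calibration part.
train_data <- observed_data[train_idx, ]
calibrate_data <- observed_data[-train_idx, ]
```

With split_p = 0.5 and 100 rows, each part receives 50 rows and no row appears in both.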

model_formula

A formula that will be provided to a specified ML model used to produce ranking scores. Please be sure to follow the exact notation of each package and wrap your formula in the as.formula function.
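
As a short sketch of what wrapping a formula in as.formula can look like, the following uses only base R. The variable names y, x1, and x2 are hypothetical, and the s() smooth-term notation in the last formula is an assumption about what a GAM backend might expect; check the notation of the package behind your chosen algorithm.

```r
# Build a formula object from a character string (column names are made up).
model_formula <- as.formula("y ~ x1 + x2")

# Equivalent construction from a vector of predictor names, using the
# base R helper reformulate().
predictors <- c("x1", "x2")
model_formula2 <- reformulate(predictors, response = "y")

# Assumed GAM-style notation with smooth terms; as.formula() only parses the
# expression, so no GAM package needs to be loaded at this point.
gam_formula <- as.formula("y ~ s(x1) + s(x2)")

# Inspect which variables each formula refers to.
all.vars(model_formula)
all.vars(gam_formula)
```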

split_p

The proportion of your observed data that should be used for the training data set.

alg

A specified algorithm used to produce ranking scores. The options are "gam", "logit", "adaboost", "nonparametric_nb", and "user-provided".
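
For intuition about what a ranking score is, the sketch below fits a plain logistic regression on training data and scores a separate calibration set with base R's glm. It is only an analogy for the "logit" option under stated assumptions: the simulated data, the 0/1 label coding, and the helper make_data are hypothetical, and the package's own fitting code may differ.

```r
# Hypothetical illustration of logistic-regression ranking scores.
set.seed(42)

# Helper (made up for this example) that simulates n labelled observations.
make_data <- function(n) {
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  p <- plogis(0.8 * x1 - 0.5 * x2)  # true class probability
  data.frame(x1 = x1, x2 = x2, y = rbinom(n, size = 1, prob = p))
}
train_data <- make_data(200)
calibrate_data <- make_data(200)

# Fit the score model on the training data only.
fit <- glm(y ~ x1 + x2, data = train_data, family = binomial())

# Estimated class probabilities on the calibration data act as ranking scores.
scores <- predict(fit, newdata = calibrate_data, type = "response")
```

Fitting on one part of the data and scoring the other mirrors the training/calibration split described for observed_data.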

class_label

The name of the class label variable in your data set. Defaults to "y".

niter_adaboost

The number of weak learners to use for the adaboost algorithm. Defaults to 10. This parameter has no effect unless alg = "adaboost".

Value

A list with five elements, in this order: the observed data with an extra variable indicating whether each observation was assigned to the training or the calibration data set, the model fit, the training data, the calibration data, and the chosen ranking score algorithm.

Author(s)

Bradley Rava. PhD Candidate at the University of Southern California's Marshall School of Business. Department of Data Sciences and Operations.

Examples


# observed_data and model_formula must already exist in your session.
fasi(observed_data, model_formula, split_p = 0.5, alg = "gam", class_label = "y")


bradleyrava/fasi documentation built on May 12, 2024, 6:23 a.m.