tsp_model_builder: Build and cross-validate a TSP-based model

Description

This function takes training and test data, along with pairs generated via empirical control feature selection, and builds a decision tree model. It also cross-validates the procedure to obtain an out-of-sample accuracy estimate.

Usage

tsp_model_builder(train, train_outcome, train_covar, pairs, test, test_covar,
  npair, predtype)

Arguments

train

p x n training data matrix

train_outcome

Outcome data of length n

train_covar

n x q additional covariates for training data (optional)

pairs

r x n matrix of top scoring pairs (TSPs) generated via empirical controls

test

p x m test data matrix; its p features and their names must match those of train

test_covar

m x s additional covariates for test data (necessary if train_covar specified; column names must match)

npair

Number of pairs desired in the model

predtype

Type of predictions to make - "class" if initial outcome is factor, "vector" if initial outcome is non-factor
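
The argument shapes and the two predtype values can be illustrated with a small simulated example. This is only a sketch under stated assumptions: it assumes each pair is encoded as a 0/1 indicator (1 where the first feature is below the second) and that the tree behaves like an rpart tree, which the "class" and "vector" values suggest but this page does not confirm. All object names below are hypothetical and not part of tdsm.

```r
library(rpart)  # assumption: an rpart-style tree, suggested by the predtype values

set.seed(1)
p <- 6; n <- 60
# p x n data matrix: features in rows, samples in columns
train <- matrix(rnorm(p * n), nrow = p,
                dimnames = list(paste0("g", 1:p), paste0("s", 1:n)))

# Assumed pair encoding: 1 where the first feature is below the second, else 0
idx <- t(combn(p, 2))[1:4, , drop = FALSE]   # four illustrative candidate pairs
pairs <- (train[idx[, 1], , drop = FALSE] < train[idx[, 2], , drop = FALSE]) * 1
rownames(pairs) <- paste(rownames(train)[idx[, 1]],
                         rownames(train)[idx[, 2]], sep = "_lt_")

x <- data.frame(t(pairs))                    # n x r, one column per pair
y_fac <- factor(ifelse(x[[1]] == 1, "case", "control"))  # factor outcome
y_num <- x[[1]] + rnorm(n, sd = 0.1)                     # non-factor outcome

fit_fac <- rpart(y ~ ., data = data.frame(y = y_fac, x), method = "class")
fit_num <- rpart(y ~ ., data = data.frame(y = y_num, x), method = "anova")

pred_class  <- predict(fit_fac, x, type = "class")   # factor of predicted labels
pred_vector <- predict(fit_num, x, type = "vector")  # numeric predicted values
```

Here pairs is r x n with r = 4, matching the layout described for the pairs argument, and the two predict calls show the kind of output each predtype value would correspond to under the rpart assumption.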

Details

This is a wrapper for a series of model-building steps. The main output of this function is the TSP decision tree model. We incorporate a second feature selection step (after empirical controls, done separately) that chooses from the candidate pairs. Pairs are chosen based on how much additional predictive value they provide on top of pairs already selected (and non-pair covariates, if specified). We also cross-validate this entire procedure five times to get an estimate of out-of-sample accuracy of our model.
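
The selection and cross-validation steps described above might look roughly like the following. This is a minimal sketch, not the package's implementation: it uses training accuracy of an rpart tree as a stand-in for "additional predictive value", reads "five times" as five folds, and ignores non-pair covariates. The helpers select_pairs and cv_accuracy are hypothetical names introduced only for illustration.

```r
library(rpart)  # assumption: rpart stands in for the package's tree builder

# Simulated stand-in data: r candidate pair indicators (rows) for n samples
set.seed(42)
r <- 8; n <- 120
pairs <- matrix(rbinom(r * n, 1, 0.5), nrow = r,
                dimnames = list(paste0("pair", 1:r), NULL))
# Outcome follows pair1, with roughly 10% of labels flipped
flip <- runif(n) < 0.1
outcome <- factor(ifelse(xor(pairs["pair1", ] == 1, flip), "case", "control"))

# Greedy forward selection (hypothetical helper): at each step, add the pair
# that gives the highest training accuracy on top of the pairs chosen so far
select_pairs <- function(pairs, outcome, npair) {
  chosen <- character(0)
  for (step in seq_len(npair)) {
    candidates <- setdiff(rownames(pairs), chosen)
    acc <- sapply(candidates, function(cand) {
      dat <- data.frame(outcome = outcome,
                        t(pairs[c(chosen, cand), , drop = FALSE]))
      fit <- rpart(outcome ~ ., data = dat, method = "class")
      mean(predict(fit, dat, type = "class") == outcome)
    })
    chosen <- c(chosen, names(acc)[which.max(acc)])
  }
  chosen
}

# Cross-validate the whole procedure (selection + tree), assuming k = 5 folds
cv_accuracy <- function(pairs, outcome, npair, k = 5) {
  fold <- sample(rep(seq_len(k), length.out = ncol(pairs)))
  fold_acc <- sapply(seq_len(k), function(f) {
    tr <- fold != f
    keep <- select_pairs(pairs[, tr, drop = FALSE], outcome[tr], npair)
    dat_tr <- data.frame(outcome = outcome[tr],
                         t(pairs[keep, tr, drop = FALSE]))
    dat_te <- data.frame(t(pairs[keep, !tr, drop = FALSE]))
    fit <- rpart(outcome ~ ., data = dat_tr, method = "class")
    mean(predict(fit, dat_te, type = "class") == outcome[!tr])
  })
  mean(fold_acc)
}

final_names <- select_pairs(pairs, outcome, npair = 2)
acc <- cv_accuracy(pairs, outcome, npair = 2)
```

Repeating the selection inside each fold, rather than selecting once on all the data, keeps the held-out samples out of the feature selection step, which is what makes the resulting accuracy an out-of-sample estimate.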

Value

A list containing the following elements:

tree

The final decision tree built on training data

p_train

Model predictions on training data

p_test

Model predictions on test data

final_names

Pair names in final model

pair_names

Pair name aliases for pretty tree printing

acc

Out-of-sample accuracy calculated via cross-validation


prpatil/tdsm documentation built on May 26, 2019, 10:32 a.m.