optimSplit_dichotom: Optimal Dichotomizing Predictors via Repeated Sample Splits
In Qindex: Continuous and Dichotomized Index Predictors Based on Distribution Quantiles

optimSplit_dichotom

R Documentation

Optimal Dichotomizing Predictors via Repeated Sample Splits

Description

Identifies the optimal dichotomizing predictors using repeated sample splits.

Usage

optimSplit_dichotom(
  formula,
  data,
  include = quote(p1 > 0.15 & p1 < 0.85),
  top = 1L,
  nsplit,
  ...
)

split_dichotom(y, x, id, ...)

splits_dichotom(y, x, ids = rSplit(y, ...), ...)

## S3 method for class 'splits_dichotom'
quantile(x, probs = 0.5, ...)

Arguments

`formula`, `y`, `x`	formula, e.g., `y~X` or `y~x1+x2`. Response `y` may be double, logical, or Surv. Candidate numeric predictors `x`'s may be specified as the columns of one matrix column, e.g., `y~X`; or as several vector columns, e.g., `y~x1+x2`. In the helper functions, `x` is a numeric vector.
`data`	data.frame
`include`	(optional) language, inclusion criteria. Default `(p1>.15 & p1<.85)` specifies a user-desired range of `p_1` for the candidate dichotomizing predictors. See explanation of `p_1` in section Returns of Helper Functions.
`top`	positive integer scalar, number of optimal dichotomizing predictors, default `1L`
`nsplit`, `...`	additional parameters for function rSplit
`id`	logical vector for helper function split_dichotom, indices of training (`TRUE`) and test (`FALSE`) subjects
`ids`	(optional) list of logical vectors for helper function splits_dichotom, multiple copies of indices of repeated training-test sample splits.
`probs`	double scalar for helper function quantile.splits_dichotom, see quantile

Details

Function optimSplit_dichotom identifies the optimal dichotomizing predictors via repeated sample splits. Specifically,

1. Generate multiple, i.e., repeated, training-test sample splits (via rSplit).
2. For each candidate predictor x_i, find the median-split-dichotomized regression model based on the repeated sample splits; see section Details on Helper Functions.
3. Limit the selection of the candidate predictors x's to a user-desired range of p_1 of the split-dichotomized regression models; see the explanation of p_1 in section Returns of Helper Functions.
4. Rank the candidate predictors x's in decreasing order of the absolute value of the regression coefficient estimate of their median-split-dichotomized regression models. The predictors at the top of this ranking are the optimal dichotomizing predictors.
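
The four steps above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the package's implementation: the list `ids` is a hypothetical stand-in for the output of rSplit, the helpers `one_split` and `median_model` are made up for this sketch, and a depth-one rpart tree is assumed as the dichotomizing rule (the package itself uses rpartD, whose internals are not shown here).

```r
library(rpart)  # recommended package distributed with R

set.seed(42)
n <- 300
X <- cbind(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- 2 * (X[, "x1"] > 0) + 0.5 * (X[, "x2"] > 0) + rnorm(n)

# Step 1 (hypothetical stand-in for rSplit): 20 random 80/20 training-test splits
ids <- replicate(20, sample(n) <= 0.8 * n, simplify = FALSE)

# One split: cut point from the training set, linear model on the test set
one_split <- function(y, x, id) {
  tree <- rpart(y ~ x, data = data.frame(y = y[id], x = x[id]),
                control = rpart.control(maxdepth = 1, cp = 0))
  if (is.null(tree$splits)) return(c(coef = NA_real_, p1 = NA_real_))
  dx <- x[!id] >= tree$splits[1, "index"]
  if (all(dx) || !any(dx)) return(c(coef = NA_real_, p1 = mean(dx)))
  c(coef = unname(coef(lm(y[!id] ~ dx))[2]), p1 = mean(dx))
}

# Step 2: keep the split whose coefficient is the type-3 median over all splits
median_model <- function(y, x, ids) {
  res <- t(vapply(ids, function(id) one_split(y, x, id), c(coef = 0, p1 = 0)))
  res <- res[!is.na(res[, "coef"]), , drop = FALSE]
  q <- quantile(res[, "coef"], probs = 0.5, type = 3)
  res[which(res[, "coef"] == q)[1], ]
}
models <- t(apply(X, 2, function(x) median_model(y, x, ids)))

# Step 3: apply the inclusion criterion on p1
keep <- models[, "p1"] > 0.15 & models[, "p1"] < 0.85
ranked <- models[keep, , drop = FALSE]

# Step 4: rank by decreasing absolute coefficient; the first row is the "top" predictor
ranked <- ranked[order(abs(ranked[, "coef"]), decreasing = TRUE), , drop = FALSE]
top <- rownames(ranked)[1]
```

In this simulation `x1` carries the strongest dichotomous effect, so it is expected to rank first; `top = 1L` in optimSplit_dichotom corresponds to keeping only that first entry.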

Value

Function optimSplit_dichotom returns an object of class 'optimSplit_dichotom', which is a list of dichotomizing functions, with the input formula and data as additional attributes.

Details on Helper Functions

Split-Dichotomized Regression Model

Helper function split_dichotom fits a univariable regression model on the test set with a dichotomized predictor, using a dichotomizing rule determined by recursive partitioning of the training set. Specifically, given a training-test sample split,

1. find the dichotomizing rule \mathcal{D} of the predictor x_0 given the response y_0 in the training set (via rpartD);
2. fit a univariable regression model of the response y_1 with the dichotomized predictor \mathcal{D}(x_1) in the test set.

Currently the Cox proportional hazards (coxph) regression for Surv response, logistic (glm) regression for logical response and linear (lm) regression for gaussian response are supported.
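
To make the two steps concrete, here is a minimal sketch of one split-dichotomized fit. It assumes that a depth-one rpart tree is an acceptable stand-in for rpartD, and that the rule sends values at or above the cut point to 1; both are assumptions of this sketch, and the function `fit_test` is a hypothetical helper showing how the model family could follow from the response type.

```r
library(rpart)  # recommended package distributed with R

set.seed(1)
n  <- 200
x  <- rnorm(n)
y  <- 2 * (x > 0.3) + rnorm(n, sd = 0.5)   # gaussian response, true cut near 0.3
id <- rep(c(TRUE, FALSE), length.out = n)  # TRUE = training, FALSE = test

# Step 1: a depth-one tree on the training set yields a single cut point
train  <- data.frame(y = y[id], x = x[id])
tree   <- rpart(y ~ x, data = train, control = rpart.control(maxdepth = 1))
cutoff <- unname(tree$splits[1, "index"])  # assumes a split was found
rule   <- function(newx) newx >= cutoff    # direction chosen arbitrarily here

# Step 2: choose the model family from the response type, fit on the test set
fit_test <- function(y, dx) {
  if (inherits(y, "Surv")) return(survival::coxph(y ~ dx))
  if (is.logical(y)) return(glm(y ~ dx, family = binomial))
  lm(y ~ dx)
}
dx   <- rule(x[!id])
fit  <- fit_test(y[!id], dx)
p1   <- mean(dx)               # share of test subjects with D(x) = 1
beta <- unname(coef(fit)[2])   # coefficient of the dichotomized predictor
```

Because the rule is learned on the training set only, the coefficient estimated on the test set is not inflated by the search for the cut point.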

Split-Dichotomized Regression Models based on Repeated Training-Test Sample Splits

Helper function splits_dichotom fits one split-dichotomized regression model (split_dichotom) of the response y on the predictor x for each copy of the repeated training-test sample splits.

Quantile of Split-Dichotomized Regression Models

Helper function quantile.splits_dichotom is a method of the S3 generic function quantile for splits_dichotom objects. Specifically,

1. collect the univariable regression coefficient estimate from each of the split-dichotomized regression models;
2. find the nearest-even (i.e., type = 3) quantile of the coefficients from Step 1. By default, the median (i.e., probs = .5) is used;
3. return the split-dichotomized regression model corresponding to the coefficient quantile selected in Step 2.
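
The reason for a type-3 quantile is that it always equals one of the observed coefficients, so a specific fitted model can be picked out. A small base-R sketch, using made-up coefficient values rather than output from the package:

```r
# Hypothetical coefficient estimates from six split-dichotomized models
coefs <- c(0.8, -0.2, 1.4, 0.5, 1.1, 0.3)

# Type 3 returns an order statistic, never an interpolated value
q3 <- unname(quantile(coefs, probs = 0.5, type = 3))

# The default (type 7) interpolates between the two middle values
q7 <- unname(quantile(coefs, probs = 0.5))

# Index of the model whose coefficient equals the selected quantile
idx <- which(coefs == q3)[1]
```

Here `q3` is an element of `coefs`, so `idx` identifies the model to return, whereas the interpolated `q7` matches no fitted model.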

Returns of Helper Functions

Helper function split_dichotom returns a split-dichotomized regression model, which is either a Cox proportional hazards (coxph), a logistic (glm), or a linear (lm) regression model, with the following additional attributes:

attr(,'rule'): function, dichotomizing rule \mathcal{D} based on the training set
attr(,'text'): character scalar, human-friendly description of \mathcal{D}
attr(,'p1'): double scalar, p_1 = \text{Pr}(\mathcal{D}(x_1)=1)
attr(,'coef'): double scalar, univariable regression coefficient estimate of y_1\sim\mathcal{D}(x_1)

Helper function splits_dichotom returns a list of split-dichotomized regression models (split_dichotom).

Helper function quantile.splits_dichotom returns a split-dichotomized regression model (split_dichotom).
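
The attribute p_1 is what the `include` argument of optimSplit_dichotom refers to. The sketch below shows how an unevaluated criterion of that form can be applied to a set of p_1 values; the values are invented for illustration, and the use of `eval` here is an assumption about how such a criterion could be evaluated, not a description of the package internals.

```r
include <- quote(p1 > 0.15 & p1 < 0.85)   # same form as the default argument

# Hypothetical p1 values for three candidate predictors
p1 <- c(x1 = 0.52, x2 = 0.08, x3 = 0.31)

# Evaluate the unevaluated expression with p1 bound to the candidates' values
keep <- eval(include, envir = list(p1 = p1))
eligible <- names(p1)[keep]

# A stricter, user-supplied criterion works the same way
strict <- names(p1)[eval(quote(p1 > 0.4 & p1 < 0.6), envir = list(p1 = p1))]
```

A candidate such as `x2`, whose rule places only 8% of test subjects in one group, is excluded by the default criterion because such an unbalanced dichotomy is rarely useful.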

Examples

# see ?`Qindex-package`

Qindex documentation built on April 4, 2025, 2:14 a.m.
