RunCrossValidation: Runs the cross-validation end-to-end using the following...

View source: R/crossvalfunctions.R

RunCrossValidation    R Documentation

Description

Runs the cross-validation end-to-end using the following steps:
1. Create multiple cross-validation folds from the data.
2. Filter each fold using the filtering criteria applied to the entire dataset.
3. Run IntLIM for all folds.
4. Process the results for all folds.
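
The fold-creation step can be pictured with the base R sketch below. It is only an illustration of k-fold splitting under assumed names (sampleIds, foldId, cvFolds are hypothetical) and is not taken from the IntLIM source, which may assign samples to folds differently.

```r
# Illustrative k-fold split in base R (not IntLIM's own implementation).
set.seed(1)                                  # make the random assignment reproducible
sampleIds <- paste0("sample", 1:10)          # hypothetical sample names
folds <- 5

# Give every sample a fold label so that fold sizes are as equal as possible,
# then shuffle the labels.
foldId <- sample(rep(seq_len(folds), length.out = length(sampleIds)))

# For fold k, samples labelled k are held out for testing; the rest are training.
cvFolds <- lapply(seq_len(folds), function(k) {
  list(
    training = sampleIds[foldId != k],
    testing  = sampleIds[foldId == k]
  )
})
```

Each sample appears in exactly one testing set, so every sample is held out once across the folds.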

Usage

RunCrossValidation(
  inputData,
  folds,
  analyteType1perc = 0,
  analyteType2perc = 0,
  analyteMiss = 0,
  cov.cutoff = 0,
  stype = "",
  outcome = c(1),
  covar = c(),
  continuous = FALSE,
  save.covar.pvals = FALSE,
  independent.var.type = c(1),
  remove.duplicates = FALSE,
  pvalcutoff = 0.05,
  interactionCoeffPercentile = 0,
  rsquaredCutoff = 0,
  treecuts = 0,
  suppressWarnings = FALSE
)
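
A call might look like the sketch below. It assumes that inputData is an IntLimData object already returned by ReadData(), and "Group" is a hypothetical name for a two-category sample-type column. The cutoff values are illustrative only, not recommendations. The call is guarded so that the snippet does nothing when IntLIM or inputData is unavailable.

```r
# Illustrative settings; every argument name comes from the signature above.
cvArgs <- list(
  folds = 5,                 # number of cross-validation folds
  analyteType1perc = 0.10,   # illustrative percentile cutoff for analyte type 1
  analyteType2perc = 0.10,   # illustrative percentile cutoff for analyte type 2
  analyteMiss = 0.80,        # illustrative missing-value cutoff
  stype = "Group",           # hypothetical sample-type column name
  continuous = FALSE,        # the sample type is treated as discrete
  pvalcutoff = 0.05,         # FDR-adjusted p-value cutoff
  rsquaredCutoff = 0.2       # illustrative r-squared cutoff
)

# Run only if the package is installed and an IntLimData object named
# `inputData` (assumed to come from ReadData()) exists in the session.
if (requireNamespace("IntLIM", quietly = TRUE) && exists("inputData")) {
  cvResults <- do.call(
    IntLIM::RunCrossValidation,
    c(list(inputData = inputData), cvArgs)
  )
}
```

Arguments left out of cvArgs keep the defaults shown in the signature.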

Arguments

inputData

IntLimData object (output of ReadData()) with analyte levels and associated meta-data

folds

number of folds to create

analyteType1perc

percentile cutoff (0-1) for filtering analyte type 1 (e.g. remove analytes with mean values < 'analyteType1perc' percentile) (default: 0)

analyteType2perc

percentile cutoff (0-1) for filtering analyte type 2 (default: 0, i.e. no filtering of analytes)

analyteMiss

missing value percent cutoff (0-1) for filtering analytes (e.g. with a cutoff of 0.80, analytes with > 80% missing values will be removed) (default: 0)

cov.cutoff

percentile cutoff (0-1) for the covariances of the analytes (default: 0, as shown in Usage)

stype

column name that represents sample type (by default, it will be used in the interaction term). Only 2 categories are currently supported.

outcome

list of outcomes to run. '1' or '2' must be set as outcome/independent variable (default is '1')

covar

Additional variables from the phenotypic data to be integrated into the linear model

continuous

boolean to indicate whether the data is continuous (TRUE) or discrete (FALSE) (default: FALSE)

save.covar.pvals

boolean to indicate whether or not to save the p-values of all covariates, which can be analyzed later but will also lengthen computation time. The default is FALSE.

independent.var.type

list of independent variable types to run. '1' or '2' must be set as independent variable (default is '1')

remove.duplicates

boolean to indicate whether or not to remove the pair with the highest p-value across two duplicate models (e.g. m1~m2 and m2~m1)

pvalcutoff

cutoff of FDR-adjusted p-value for filtering (default 0.05)

interactionCoeffPercentile

percentile cutoff for interaction coefficient

rsquaredCutoff

cutoff for lowest r-squared value

treecuts

user-selected number of clusters (of pairs) to cut the tree into

suppressWarnings

whether to suppress warnings
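
The percentile and FDR cutoffs described above can be illustrated with base R. This is a sketch on made-up numbers, not IntLIM's internal code: the Benjamini-Hochberg method and the inclusive comparison against pvalcutoff are assumptions, and analyteMeans and pvals are hypothetical.

```r
# Percentile filter: drop analytes whose mean is below the chosen percentile.
analyteMeans <- c(a = 1, b = 5, c = 10, d = 20)    # hypothetical analyte means
analyteType1perc <- 0.25
cutoff <- quantile(analyteMeans, probs = analyteType1perc)
kept <- names(analyteMeans)[analyteMeans >= cutoff]

# FDR filter: adjust raw p-values, then compare against pvalcutoff.
pvals <- c(0.001, 0.01, 0.02, 0.5)                 # hypothetical raw p-values
pvalcutoff <- 0.05
adjusted <- p.adjust(pvals, method = "BH")         # BH is an assumption here
significant <- adjusted <= pvalcutoff
```

With these numbers the analyte a falls below the 25th percentile and is dropped, and the first three p-values remain significant after adjustment.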

Value

List of IntResults objects with model results (now includes correlations)


IntLIM documentation built on Aug. 22, 2022, 5:05 p.m.