RunCrossValidation: Runs the cross-validation end-to-end using the following...
In IntLIM: Integration of Omics Data Using Linear Modeling

RunCrossValidation

R Documentation

Description

Runs the cross-validation end-to-end using the following steps:
1. Create multiple cross-validation folds from the data.
2. Filter each fold using the filtering criteria applied to the entire dataset.
3. Run IntLIM for all folds.
4. Process the results for all folds.
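
To make step 1 concrete, the following base-R sketch shows one common way to split samples into cross-validation folds. It is an illustration only: `make_folds` is a hypothetical helper, not an IntLIM function, and the `training`/`testing` element names and the `seed` argument are assumptions made for this example rather than a description of how the package partitions an IntLimData object.

```r
# Hypothetical helper (not part of IntLIM): partition sample IDs into k folds.
# Assumes the sample IDs are unique.
make_folds <- function(sample_ids, k, seed = 1) {
  stopifnot(k >= 2, k <= length(sample_ids))
  set.seed(seed)
  # Shuffle the samples, then deal them into k groups of near-equal size.
  shuffled <- sample(sample_ids)
  fold_index <- rep_len(seq_len(k), length(shuffled))
  test_sets <- split(shuffled, fold_index)
  # Each fold pairs a held-out test set with the remaining training samples.
  lapply(test_sets, function(test) {
    list(training = setdiff(sample_ids, test), testing = test)
  })
}

folds <- make_folds(paste0("S", 1:10), k = 5)
str(folds[[1]])
```

Every sample is held out in exactly one fold, so each model is evaluated on samples it was not fitted to.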

Usage

RunCrossValidation(
  inputData,
  folds,
  analyteType1perc = 0,
  analyteType2perc = 0,
  analyteMiss = 0,
  cov.cutoff = 0,
  stype = "",
  outcome = c(1),
  covar = c(),
  continuous = FALSE,
  save.covar.pvals = FALSE,
  independent.var.type = c(1),
  remove.duplicates = FALSE,
  pvalcutoff = 0.05,
  interactionCoeffPercentile = 0,
  rsquaredCutoff = 0,
  treecuts = 0,
  suppressWarnings = FALSE
)

Arguments

`inputData`	IntLimData object (output of ReadData()) with analyte levels and associated meta-data
`folds`	number of folds to create
`analyteType1perc`	percentile cutoff (0-1) for filtering analyte type 1 (e.g. remove analytes with mean values < 'analyteType1perc' percentile) (default: 0)
`analyteType2perc`	percentile cutoff (0-1) for filtering analyte type 2 (e.g. remove analytes with mean values < 'analyteType2perc' percentile) (default: 0, i.e. no filtering of analytes)
`analyteMiss`	missing value percent cutoff (0-1) for filtering analytes (e.g. with a cutoff of 0.80, analytes with > 80% missing values will be removed) (default: 0)
`cov.cutoff`	percentile cutoff (0-1) for the covariances of the analytes (default: 0)
`stype`	column name that represents sample type (by default, it will be used in the interaction term). Only 2 categories are currently supported.
`outcome`	list of outcomes to run. '1' or '2' must be set as outcome/independent variable (default is '1')
`covar`	Additional variables from the phenotypic data to be integrated into the linear model
`continuous`	boolean to indicate whether the data is continuous or discrete
`save.covar.pvals`	boolean to indicate whether or not to save the p-values of all covariates, which can be analyzed later but will also lengthen computation time. The default is FALSE.
`independent.var.type`	list of independent variable types to run. '1' or '2' must be set as independent variable (default is '1')
`remove.duplicates`	boolean to indicate whether or not to remove the pair with the highest p-value across two duplicate models (e.g. m1~m2 and m2~m1)
`pvalcutoff`	cutoff of FDR-adjusted p-value for filtering (default 0.05)
`interactionCoeffPercentile`	percentile cutoff for interaction coefficient
`rsquaredCutoff`	cutoff for lowest r-squared value
`treecuts`	user-selected number of clusters (of pairs) to cut the tree into
`suppressWarnings`	whether to suppress warnings
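
The arguments above can be combined as in the following sketch. It assumes that `inputData` is an IntLimData object already created with ReadData(), and that "Status" is a two-category sample-type column in its meta-data; both are placeholders for illustration, as are the cutoff values. The call is skipped when IntLIM is not installed or `inputData` does not exist.

```r
# Illustrative settings only; the argument names come from the Usage section,
# the values are assumptions chosen for this example.
cvArgs <- list(
  folds = 5,                  # number of folds to create
  analyteType1perc = 0.10,    # percentile cutoff for analyte type 1
  analyteType2perc = 0.10,    # percentile cutoff for analyte type 2
  analyteMiss = 0.80,         # missing-value cutoff for filtering analytes
  stype = "Status",           # hypothetical sample-type column name
  outcome = c(1),
  independent.var.type = c(2),
  pvalcutoff = 0.05,          # cutoff on the FDR-adjusted p-value
  suppressWarnings = TRUE
)

# Run only if the package is available and the input object has been created.
if (requireNamespace("IntLIM", quietly = TRUE) && exists("inputData")) {
  cvResults <- do.call(
    IntLIM::RunCrossValidation,
    c(list(inputData = inputData), cvArgs)
  )
}
```

Keeping the settings in a named list makes it straightforward to reuse the same filtering criteria when comparing runs with different numbers of folds.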