calcPhenotype: Calculates phenotype from microarray data.
In xlucpu/MOVICS: Multi-Omics integration and VIsualization in Cancer Subtyping


This function uses ridge regression to predict a phenotype from gene expression data, given a training gene expression matrix for which the phenotype is already known. It also homogenizes the training and test datasets using a user-selected procedure, can power transform the known phenotype, and provides several other options for flexible prediction from a gene expression matrix.
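The core idea (fit a ridge model on training samples, then predict on test samples) can be sketched in base R as follows. This is only an illustration of the technique under stated assumptions, not the package's internal code: the helpers `ridge_fit` and `ridge_predict`, the penalty `lambda = 1`, and the simulated data are all hypothetical.

```r
# Illustrative only: closed-form ridge regression in base R.
# Not the package's implementation; `lambda` is a hypothetical penalty value.
set.seed(1)
n_genes <- 50
n_train <- 30
n_test <- 5
genes <- paste0("gene", seq_len(n_genes))

# Genes in rows, samples in columns, matching the layout calcPhenotype expects.
train_expr <- matrix(rnorm(n_genes * n_train), nrow = n_genes,
                     dimnames = list(genes, paste0("train", seq_len(n_train))))
test_expr <- matrix(rnorm(n_genes * n_test), nrow = n_genes,
                    dimnames = list(genes, paste0("test", seq_len(n_test))))

# Simulated phenotype driven by the first three genes plus a little noise.
train_ptype <- as.numeric(colSums(train_expr[1:3, ]) + rnorm(n_train, sd = 0.1))

ridge_fit <- function(x, y, lambda) {
  # x is samples x genes; centre predictors and response so that
  # the intercept is not penalised.
  x_means <- colMeans(x)
  y_mean <- mean(y)
  xc <- sweep(x, 2, x_means)
  beta <- solve(crossprod(xc) + lambda * diag(ncol(xc)),
                crossprod(xc, y - y_mean))
  list(beta = drop(beta), x_means = x_means, y_mean = y_mean)
}

ridge_predict <- function(fit, x_new) {
  # Apply the training centring to the new samples, then add the mean back.
  drop(sweep(x_new, 2, fit$x_means) %*% fit$beta) + fit$y_mean
}

fit <- ridge_fit(t(train_expr), train_ptype, lambda = 1)
pred <- ridge_predict(fit, t(test_expr))
```

The penalty term keeps the system solvable even though there are more genes than training samples, which is the usual situation with expression data.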

calcPhenotype(
  trainingExprData,
  trainingPtype,
  testExprData,
  batchCorrect = "eb",
  powerTransformPhenotype = TRUE,
  removeLowVaryingGenes = 0.2,
  minNumSamples = 10,
  selection = -1,
  printOutput = TRUE,
  removeLowVaringGenesFrom = "homogenizeData"
)
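A call might look like the sketch below. The simulated matrices and the object names (`train_expr`, `test_expr`, `train_ptype`) are made up for illustration, and the call itself is guarded so that the snippet still runs when `calcPhenotype` is not available in the session. Setting `selection = 1` is an assumption made here to avoid the interactive prompt that the default of -1 triggers.

```r
# Simulated inputs; all object names and values here are hypothetical.
set.seed(42)
n_genes <- 200
genes <- paste0("gene", seq_len(n_genes))

# Genes in rows, samples in columns, with gene ids as rownames.
train_expr <- matrix(rnorm(n_genes * 40, mean = 8), nrow = n_genes,
                     dimnames = list(genes, paste0("train", 1:40)))
test_expr <- matrix(rnorm(n_genes * 15, mean = 8), nrow = n_genes,
                    dimnames = list(genes, paste0("test", 1:15)))

# One known (strictly positive) phenotype value per training sample.
train_ptype <- exp(rnorm(40))

# Check the documented preconditions before calling.
stopifnot(
  is.numeric(train_ptype),
  length(train_ptype) == ncol(train_expr),
  !is.null(rownames(train_expr)),
  !is.null(rownames(test_expr)),
  length(intersect(rownames(train_expr), rownames(test_expr))) > 0,
  ncol(train_expr) >= 10,
  ncol(test_expr) >= 10
)

pred <- NULL
# Only run the call if the function is available in this session.
if (exists("calcPhenotype", mode = "function")) {
  pred <- calcPhenotype(
    trainingExprData = train_expr,
    trainingPtype = train_ptype,
    testExprData = test_expr,
    batchCorrect = "eb",
    selection = 1,        # non-interactive handling of duplicate gene ids
    printOutput = FALSE
  )
}
```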

`trainingExprData`	The training data. A matrix of expression levels in which rows contain genes and columns contain samples. "rownames()" must be specified and must contain the same type of gene ids as "testExprData".
`trainingPtype`	The known phenotype for "trainingExprData". A numeric vector which MUST be the same length as the number of columns of "trainingExprData".
`testExprData`	The test data for which the phenotype will be estimated. A matrix of expression levels in which rows contain genes and columns contain samples. "rownames()" must be specified and must contain the same type of gene ids as "trainingExprData".
`batchCorrect`	How the training and test data matrices should be homogenized. Choices are "eb" (default) for ComBat, "qn" for quantile normalization, or "none" for no homogenization.
`powerTransformPhenotype`	Whether the phenotype should be power transformed before the regression model is fit. Defaults to TRUE; set to FALSE if the phenotype is already known to be approximately normally distributed.
`removeLowVaryingGenes`	The proportion of low-varying genes to remove. Defaults to 0.2 (20 percent).
`minNumSamples`	The minimum number of training and test samples required. An error is printed if the number of samples is below this threshold.
`selection`	How duplicate gene ids should be handled. The default, -1, asks the user; use 1 to summarize duplicates by their mean or 2 to discard all duplicates.
`printOutput`	Set to FALSE to suppress output.
`removeLowVaringGenesFrom`	Which data the low-varying genes should be removed from. Defaults to "homogenizeData".
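The duplicate-handling and low-variance options can be pictured with the base R sketch below. It is only an approximation of what the arguments describe, assuming that "summarize" means averaging duplicate rows and that low-varying genes are ranked by variance; the helper names `summarise_by_mean`, `drop_duplicates` and `remove_low_varying` are hypothetical and are not part of the package.

```r
# Illustrative only; not the package's internal implementation.
set.seed(7)
expr <- matrix(rnorm(10 * 6), nrow = 10,
               dimnames = list(c(paste0("gene", 1:8), "gene1", "gene2"),
                               paste0("s", 1:6)))

# Analogue of selection = 1: collapse duplicate gene ids by averaging rows.
summarise_by_mean <- function(x) {
  sums <- rowsum(x, group = rownames(x), reorder = FALSE)
  counts <- as.vector(table(rownames(x))[rownames(sums)])
  sums / counts  # each row is divided by the number of rows that shared its id
}

# Analogue of selection = 2: drop every gene id that occurs more than once.
drop_duplicates <- function(x) {
  ids <- rownames(x)
  x[!(ids %in% ids[duplicated(ids)]), , drop = FALSE]
}

# Analogue of removeLowVaryingGenes: drop the lowest-variance proportion of genes.
remove_low_varying <- function(x, proportion = 0.2) {
  v <- apply(x, 1, var)
  keep <- v > quantile(v, probs = proportion)
  x[keep, , drop = FALSE]
}

collapsed <- summarise_by_mean(expr)     # 8 unique gene ids remain
unique_only <- drop_duplicates(expr)     # gene1 and gene2 are removed entirely
filtered <- remove_low_varying(collapsed, proportion = 0.2)
```

Averaging keeps every gene id at the cost of blending measurements, whereas discarding is stricter but loses any gene that appears more than once.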