calcPhenotype: Calculates phenotype from microarray data.


View source: R/pRRophetic.R

Description

This function uses ridge regression to predict a phenotype from gene expression data, given a training gene expression matrix for which the phenotype is already known. It also homogenizes the two datasets using a user-selected procedure, can power transform the known phenotype, and provides several other options for flexible prediction from a gene expression matrix.
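To make the core idea concrete, here is a minimal base R sketch of ridge regression on simulated data. It is not the package's implementation: the helper names `ridge_fit` and `ridge_predict`, the simulated matrices, and the penalty `lambda = 1` are all assumptions chosen for illustration.

```r
set.seed(1)

# Simulated data: rows are genes, columns are samples.
genes <- paste0("gene", 1:50)
train <- matrix(rnorm(50 * 40), nrow = 50,
                dimnames = list(genes, paste0("train", 1:40)))
test <- matrix(rnorm(50 * 5), nrow = 50,
               dimnames = list(genes, paste0("test", 1:5)))

# Known training phenotype: driven by three genes plus a little noise.
ptype <- as.numeric(2 * train["gene1", ] - train["gene2", ] +
                      0.5 * train["gene3", ] + rnorm(40, sd = 0.1))

# Hypothetical helper: closed-form ridge fit on centred data.
# Samples become rows of the design matrix, genes become predictors.
ridge_fit <- function(expr, y, lambda) {
  x <- t(expr)
  x_means <- colMeans(x)
  xc <- sweep(x, 2, x_means)
  y_mean <- mean(y)
  beta <- solve(crossprod(xc) + lambda * diag(ncol(xc)),
                crossprod(xc, y - y_mean))
  list(beta = drop(beta), x_means = x_means, y_mean = y_mean)
}

# Hypothetical helper: apply the fitted coefficients to new samples.
ridge_predict <- function(fit, expr) {
  x <- sweep(t(expr), 2, fit$x_means)
  drop(x %*% fit$beta) + fit$y_mean
}

fit <- ridge_fit(train, ptype, lambda = 1)  # lambda = 1 is an arbitrary choice
pred <- ridge_predict(fit, test)            # one estimate per test sample
```

The penalty term is what makes the fit possible here: with more genes than training samples, the unpenalized least-squares problem has no unique solution.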

Usage

calcPhenotype(
  trainingExprData,
  trainingPtype,
  testExprData,
  batchCorrect = "eb",
  powerTransformPhenotype = TRUE,
  removeLowVaryingGenes = 0.2,
  minNumSamples = 10,
  selection = -1,
  printOutput = TRUE,
  removeLowVaringGenesFrom = "homogenizeData"
)
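A call following the signature above might look like the sketch below. The simulated matrices and the variable names `trainExpr`, `testExpr`, `trainPtype` and `predicted` are assumptions for illustration. Whether `calcPhenotype` is exported is not stated here, so the sketch fetches it from the MOVICS namespace and only runs the call if that package is installed.

```r
set.seed(42)

# Simulated inputs: rows are genes, columns are samples, with shared gene ids.
genes <- paste0("gene", 1:200)
trainExpr <- matrix(rnorm(200 * 30, mean = 8), nrow = 200,
                    dimnames = list(genes, paste0("train", 1:30)))
testExpr <- matrix(rnorm(200 * 12, mean = 8), nrow = 200,
                   dimnames = list(genes, paste0("test", 1:12)))

# One known phenotype value per training sample (strictly positive here).
trainPtype <- exp(rnorm(30))

if (requireNamespace("MOVICS", quietly = TRUE)) {
  # Works whether or not the function is exported from the namespace.
  calcPhenotype <- utils::getFromNamespace("calcPhenotype", "MOVICS")
  predicted <- calcPhenotype(
    trainingExprData = trainExpr,
    trainingPtype = trainPtype,
    testExprData = testExpr,
    batchCorrect = "eb",
    powerTransformPhenotype = TRUE,
    removeLowVaryingGenes = 0.2,
    minNumSamples = 10,
    selection = 1,        # avoid the interactive prompt for duplicate ids
    printOutput = FALSE,
    removeLowVaringGenesFrom = "homogenizeData"
  )
}
```

Setting `selection` to 1 or 2 rather than leaving the default of -1 keeps the call non-interactive, which matters in scripts.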

Arguments

trainingExprData

The training data: a matrix of expression levels in which rows contain genes and columns contain samples. "rownames()" must be set and must contain the same type of gene ids as "testExprData".

trainingPtype

The known phenotype for "trainingExprData". A numeric vector which MUST be the same length as the number of columns of "trainingExprData".

testExprData

The test data for which the phenotype will be estimated: a matrix of expression levels in which rows contain genes and columns contain samples. "rownames()" must be set and must contain the same type of gene ids as "trainingExprData".
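The requirements on the three data arguments can be checked before calling the function. The sketch below is illustrative only: `check_inputs` is a hypothetical helper, not part of the package, and the package may perform different or additional checks.

```r
# Hypothetical helper: verify the documented requirements and
# return the gene ids shared by the two matrices.
check_inputs <- function(trainingExprData, trainingPtype, testExprData) {
  stopifnot(
    is.matrix(trainingExprData),
    is.matrix(testExprData),
    !is.null(rownames(trainingExprData)),
    !is.null(rownames(testExprData)),
    is.numeric(trainingPtype),
    length(trainingPtype) == ncol(trainingExprData)
  )
  shared <- intersect(rownames(trainingExprData), rownames(testExprData))
  if (length(shared) == 0) {
    stop("no gene ids are shared between the training and test data")
  }
  shared
}

# Small example with partially overlapping gene ids.
train <- matrix(1:12, nrow = 4,
                dimnames = list(c("TP53", "BRCA1", "EGFR", "MYC"),
                                c("s1", "s2", "s3")))
test <- matrix(1:6, nrow = 3,
               dimnames = list(c("EGFR", "TP53", "KRAS"), c("t1", "t2")))
shared_genes <- check_inputs(train, c(0.5, 1.2, 0.9), test)
```

A mismatch in id type (for example gene symbols in one matrix and Ensembl ids in the other) would show up here as an empty intersection.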

batchCorrect

How the training and test data matrices should be homogenized. Choices are "eb" (default) for ComBat, "qn" for quantile normalization, or "none" for no homogenization.
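As an illustration of the "qn" option, the following base R sketch applies quantile normalization to the combined training and test matrices. The function `quantile_normalize` is a hypothetical helper, and normalizing the two matrices jointly is an assumption; the package may rely on a library routine and handle details such as ties differently.

```r
# Hypothetical helper: force every column (sample) to share one distribution.
quantile_normalize <- function(mat) {
  ranks <- apply(mat, 2, rank, ties.method = "first")
  sorted <- apply(mat, 2, sort)
  reference <- rowMeans(sorted)  # mean of the k-th smallest values across samples
  out <- apply(ranks, 2, function(r) reference[r])
  dimnames(out) <- dimnames(mat)
  out
}

set.seed(2)
genes <- paste0("g", 1:100)
# The two datasets deliberately differ in location and scale.
train <- matrix(rnorm(100 * 6, mean = 8), nrow = 100,
                dimnames = list(genes, paste0("tr", 1:6)))
test <- matrix(rnorm(100 * 4, mean = 5, sd = 2), nrow = 100,
               dimnames = list(genes, paste0("te", 1:4)))

# Normalize jointly, then split back into training and test matrices.
combined <- quantile_normalize(cbind(train, test))
train_qn <- combined[, colnames(train)]
test_qn <- combined[, colnames(test)]
```

After this step every sample has the same set of values, only assigned to different genes, so systematic differences in scale between the datasets are removed.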

powerTransformPhenotype

Should the phenotype be power transformed before the regression model is fitted? Defaults to TRUE; set to FALSE if the phenotype is already known to be approximately normally distributed.
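The page does not say which power transform is used. As one common choice, the sketch below applies a Box-Cox transform with the exponent chosen by maximum likelihood; `box_cox`, `box_cox_loglik` and `skewness` are hypothetical helpers, and the simulated phenotype is an assumption.

```r
# Box-Cox transform for strictly positive values.
box_cox <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

# Profile log-likelihood of lambda for an intercept-only model
# (additive constants dropped).
box_cox_loglik <- function(lambda, y) {
  z <- box_cox(y, lambda)
  n <- length(y)
  -n / 2 * log(mean((z - mean(z))^2)) + (lambda - 1) * sum(log(y))
}

# Simple moment-based skewness, used only to compare before and after.
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

set.seed(3)
ptype <- rlnorm(200, meanlog = 1, sdlog = 0.6)  # right-skewed phenotype

lambda_hat <- optimize(box_cox_loglik, interval = c(-2, 2),
                       y = ptype, maximum = TRUE)$maximum
ptype_t <- box_cox(ptype, lambda_hat)

skew_before <- skewness(ptype)
skew_after <- skewness(ptype_t)
```

A phenotype that is already close to symmetric would yield an exponent near 1, in which case the transform changes little and can reasonably be switched off.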

removeLowVaryingGenes

What proportion of low-varying genes should be removed? The default is 0.2 (20 percent).
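A variance filter of this kind can be sketched in a few lines of base R. `remove_low_varying` is a hypothetical helper, and using the per-gene variance with a quantile cutoff is an assumption about how "low varying" is measured.

```r
# Hypothetical helper: drop the given proportion of genes with the lowest variance.
remove_low_varying <- function(expr, proportion = 0.2) {
  gene_var <- apply(expr, 1, var)              # variance of each gene across samples
  cutoff <- quantile(gene_var, probs = proportion)
  expr[gene_var > cutoff, , drop = FALSE]      # keep genes above the cutoff
}

set.seed(4)
expr <- matrix(rnorm(100 * 20), nrow = 100,
               dimnames = list(paste0("g", 1:100), paste0("s", 1:20)))
filtered <- remove_low_varying(expr, proportion = 0.2)
```

Genes that barely change across samples carry little information for prediction, so removing them shrinks the model without much loss.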

minNumSamples

How many training and test samples are required. An error is printed if the number of samples is below this threshold.

selection

How duplicate gene ids should be handled. The default, -1, asks the user. Use 1 to summarize duplicates or 2 to discard all duplicates.
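The two non-interactive options can be pictured as follows. Both functions are hypothetical helpers written for this sketch. The page does not state which summary statistic option 1 uses, so averaging the duplicated rows is an assumption.

```r
# Option 1 (assumed to be the mean): collapse rows sharing a gene id.
summarize_duplicates <- function(expr) {
  ids <- rownames(expr)
  sums <- rowsum(expr, group = ids, reorder = FALSE)
  counts <- as.vector(table(ids)[rownames(sums)])
  sums / counts  # each row is divided by the number of rows it combined
}

# Option 2: drop every gene id that occurs more than once.
discard_duplicates <- function(expr) {
  ids <- rownames(expr)
  dup_ids <- unique(ids[duplicated(ids)])
  expr[!ids %in% dup_ids, , drop = FALSE]
}

# Gene "A" appears twice.
expr <- matrix(c(1, 3, 5, 7,
                 2, 4, 6, 8), nrow = 4,
               dimnames = list(c("A", "B", "A", "C"), c("s1", "s2")))

means <- summarize_duplicates(expr)  # rows A, B, C
kept <- discard_duplicates(expr)     # rows B, C
```

Discarding is the more conservative choice when duplicated ids may correspond to probes that measure different transcripts.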

printOutput

Set to FALSE to suppress output.

removeLowVaringGenesFrom

Where the low-varying genes should be removed from. The default is "homogenizeData".

Value

A vector of the estimated phenotype, in the same order as the columns of "testExprData".

Author(s)

Paul Geeleher, Nancy Cox, R. Stephanie Huang


xlucpu/MOVICS documentation built on July 24, 2021, 9:23 p.m.