estimateDataCha: A function to estimate data characteristics
In SPreFuGED: Selecting a Predictive Function for a Given Gene Expression Data


This function fits limma models (or univariate Cox models) to identify DE (informative) genes. It then computes the proportion of DE (informative) genes, the log2FC (coefficients or betas), the pairwise correlation of DE (informative) and noisy genes, the genes' variances, the sample sizes and, for survival data, the proportion of events.
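To make the quantities listed above concrete, the following is a minimal sketch in base R on simulated data. It is not the package's implementation: per-gene Welch t-tests with a Benjamini-Hochberg adjustment stand in for the limma fit, and all object names (`x`, `y`, `meanPairCor`, and so on) and the 0.05 cutoff are assumptions made for illustration.

```r
# Illustrative only: base R t-tests stand in for the limma fit used by the package.
set.seed(42)
nGenes <- 200
nPerClass <- 15
y <- rep(c(0, 1), each = nPerClass)
x <- matrix(rnorm(nGenes * 2 * nPerClass), nrow = nGenes)  # genes x samples
# Shift the first 20 genes in class 1 so that they are truly DE
x[1:20, y == 1] <- x[1:20, y == 1] + 2

# Per-gene two-sample t-test, then Benjamini-Hochberg adjustment
pVals <- apply(x, 1, function(g) t.test(g[y == 1], g[y == 0])$p.value)
isDE <- p.adjust(pVals, method = "BH") < 0.05  # assumed cutoff

propDE <- mean(isDE)
# Difference in class means (a log2 fold change if the data are on the log2 scale)
log2FC <- rowMeans(x[, y == 1]) - rowMeans(x[, y == 0])
geneVar <- apply(x, 1, var)

# Remove the class means first, so that a shared shift between classes
# does not inflate the correlations among DE genes
centered <- x
centered[, y == 0] <- x[, y == 0] - rowMeans(x[, y == 0])
centered[, y == 1] <- x[, y == 1] - rowMeans(x[, y == 1])

# Mean pairwise correlation among the rows (genes) of a matrix
meanPairCor <- function(m) {
  if (nrow(m) < 2) return(NA_real_)
  r <- cor(t(m))
  mean(r[upper.tri(r)])
}
corDE <- meanPairCor(centered[isDE, , drop = FALSE])
corNoise <- meanPairCor(centered[!isDE, , drop = FALSE])

print(c(propDE = propDE, corDE = corDE, corNoise = corNoise,
        nSamples = ncol(x)))
```

Because the genes are simulated independently here, both correlation summaries should be close to zero; real expression data would typically show stronger correlation among DE genes.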

estimateDataCha(data, dataY, type = "Binary")

`data`	a matrix of expression values with rows corresponding to genes and columns to samples
`dataY`	a binary vector of class labels or a survival outcome as produced by Surv. Its length must be equal to the number of columns of data.
`type`	either "Binary" (the default) or "Survival", corresponding to binary classification or survival prediction
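The shape requirements in the arguments above can be checked before calling the function. The sketch below covers the binary case only; `checkInputs` is a hypothetical helper written for this illustration and is not part of SPreFuGED.

```r
# Hypothetical helper (not part of SPreFuGED): checks the input shapes
# described in the arguments before calling estimateDataCha().
checkInputs <- function(data, dataY) {
  # Expression values must be a numeric matrix (genes in rows)
  stopifnot(is.matrix(data), is.numeric(data))
  # One label per sample (samples are the columns of `data`)
  stopifnot(length(dataY) == ncol(data))
  # A binary outcome must have exactly two distinct classes
  stopifnot(length(unique(dataY)) == 2)
  invisible(TRUE)
}

set.seed(1)
toyData <- matrix(rnorm(20 * 6), nrow = 20, ncol = 6)  # 20 genes x 6 samples
toyY <- rep(c(0, 1), each = 3)
checkInputs(toyData, toyY)
```

A mismatch, such as passing a label vector with one element missing, stops with an error instead of failing later inside the model fit.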

At the moment, only binary classification has been implemented.

A 1x7 (for Binary) or 1x8 (for Survival) matrix containing the estimates (row) of the data characteristics (columns)

Victor Lih Jong

Jong VL, Novianti PW, Roes KCB & Eijkemans MJC. Selecting a classification function for class prediction with gene expression data. Bioinformatics (2016) 32(12): 1814-1822

fitLMEModel, SPreFu and plotSPreFu

# Treat a single simulated training set as our real-life dataset
library(SPreFuGED)
myCov <- covMat(pAll = 100, lambda = 2, corrDE = 0.75, sigma = 0.25)
myData <- generateGED(covAll = myCov, nTrain = 30, nTest = 10)
data <- myData[[1]]$trainData
dataY <- myData[[1]]$trainLabels
myDataCha <- estimateDataCha(data, dataY)