rankInverseNormalDataFrame: rank-based inverse normal transformation of the data
In FRESA.CAD: Feature Selection Algorithms for Computer Aided Diagnosis

View source: R/rankInverseNormalDataFrame.R

rankInverseNormalDataFrame

R Documentation

rank-based inverse normal transformation of the data

Description

This function takes a data frame and a reference control population to return a z-transformed data set conditioned to the reference population. Each sample data for each feature column in the data frame is conditionally z-transformed using a rank-based inverse normal transformation, based on the rank of the sample in the reference frame.

Usage

	rankInverseNormalDataFrame(variableList,
	                           data,
	                           referenceframe,
	                           strata=NA)

Arguments

`variableList`	A data frame with two columns. The first one must have the names of the candidate variables and the other one the description of such variables
`data`	A data frame where all variables are stored in different columns
`referenceframe`	A data frame similar to `data`, but with only the control population
`strata`	The name of the column in `data` that stores the variable that will be used to stratify the model

Value

A data frame where each observation has been conditionally z-transformed, given control data

Author(s)

Jose G. Tamez-Pena and Antonio Martinez-Torteya

Examples

	## Not run: 
	# Start the graphics device driver to save all plots in a pdf format
	pdf(file = "Example.pdf")
	# Get the stage C prostate cancer data from the rpart package
	library(rpart)
	data(stagec)
	# Split the stages into several columns
	dataCancer <- cbind(stagec[,c(1:3,5:6)],
	                    gleason4 = 1*(stagec[,7] == 4),
	                    gleason5 = 1*(stagec[,7] == 5),
	                    gleason6 = 1*(stagec[,7] == 6),
	                    gleason7 = 1*(stagec[,7] == 7),
	                    gleason8 = 1*(stagec[,7] == 8),
	                    gleason910 = 1*(stagec[,7] >= 9),
	                    eet = 1*(stagec[,4] == 2),
	                    diploid = 1*(stagec[,8] == "diploid"),
	                    tetraploid = 1*(stagec[,8] == "tetraploid"),
	                    notAneuploid = 1-1*(stagec[,8] == "aneuploid"))
	# Remove the incomplete cases
	dataCancer <- dataCancer[complete.cases(dataCancer),]
	# Load a pre-established data frame with the names and descriptions of all variables
	data(cancerVarNames)
	# Set the group of no progression
	noProgress <- subset(dataCancer,pgstat==0)
	# z-transform g2 values using the no-progression group as reference
	dataCancerZTransform <- rankInverseNormalDataFrame(variableList = cancerVarNames[2,],
	                                                   data = dataCancer,
	                                                   referenceframe = noProgress)
	# Shut down the graphics device driver
	dev.off()
## End(Not run)

FRESA.CAD documentation built on June 26, 2024, 1:07 a.m.