rankInverseNormalDataFrame: rank-based inverse normal transformation of the data

rankInverseNormalDataFrame    R Documentation

rank-based inverse normal transformation of the data

Description

This function takes a data frame and a reference (control) population and returns a z-transformed data set conditioned on that reference population. For each feature column, every sample value is conditionally z-transformed with a rank-based inverse normal transformation, using the rank of that value within the reference frame.
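
The idea of ranking a value within a reference sample and mapping that rank to a normal quantile can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the FRESA.CAD implementation: the helper name rankInverseNormalSketch, the mid-rank handling of ties, and the 0.5 offset are all choices made here for the example.

```r
# Hypothetical helper (not part of FRESA.CAD): z-score of each value in x,
# derived from its mid-rank within the reference sample ref.
rankInverseNormalSketch <- function(x, ref) {
  ref <- ref[!is.na(ref)]
  n <- length(ref)
  # Mid-rank of each value among the reference values (ties count one half)
  r <- vapply(x, function(v) sum(ref < v) + 0.5 * sum(ref == v), numeric(1))
  # Assumed offset: keeps the probability strictly inside (0, 1)
  p <- (r + 0.5) / (n + 1)
  # Map the probability to a standard normal quantile
  qnorm(p)
}

ref <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
z <- rankInverseNormalSketch(c(0, 5, 10), ref)
```

With this offset, a value at the median of the reference sample maps to zero, and values below or above every reference value map to finite, symmetric z-scores rather than to infinity.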

Usage

	rankInverseNormalDataFrame(variableList,
	                           data,
	                           referenceframe,
	                           strata=NA)

Arguments

variableList

A data frame with two columns: the first must contain the names of the candidate variables, and the second their descriptions

data

A data frame in which each variable is stored in its own column

referenceframe

A data frame similar to data, but containing only the control population

strata

The name of the column in data that stores the variable used for stratification (default: NA)
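
As a rough illustration of how these arguments fit together, the following sketch builds a toy variableList, data, and referenceframe in base R. All object names, column names, and values are made up for the example; in particular, the column names Name and Description are assumptions, since only the column order is described above.

```r
# Toy data: column names and values are invented for illustration
toyData <- data.frame(
  status = c(0, 0, 0, 1, 1, 0),
  age    = c(61, 58, 70, 66, 73, 64),
  marker = c(1.2, 0.8, 1.9, 3.4, 2.7, 1.1)
)

# variableList: first column holds variable names, second their descriptions
toyVarList <- data.frame(
  Name        = c("age", "marker"),
  Description = c("Age in years", "Hypothetical biomarker level"),
  stringsAsFactors = FALSE
)

# referenceframe: same columns as the data, control rows only
toyControls <- subset(toyData, status == 0)

# Every listed variable should exist in both data frames
stopifnot(all(toyVarList$Name %in% colnames(toyData)),
          all(toyVarList$Name %in% colnames(toyControls)))
```

These objects would then be passed as variableList = toyVarList, data = toyData, and referenceframe = toyControls.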

Value

A data frame in which each observation has been conditionally z-transformed, given the control data

Author(s)

Jose G. Tamez-Pena and Antonio Martinez-Torteya

Examples

	## Not run: 
	# Start the graphics device driver to save all plots in a pdf format
	pdf(file = "Example.pdf")
	# Get the stage C prostate cancer data from the rpart package
	library(rpart)
	data(stagec)
	# Split the stages into several columns
	dataCancer <- cbind(stagec[,c(1:3,5:6)],
	                    gleason4 = 1*(stagec[,7] == 4),
	                    gleason5 = 1*(stagec[,7] == 5),
	                    gleason6 = 1*(stagec[,7] == 6),
	                    gleason7 = 1*(stagec[,7] == 7),
	                    gleason8 = 1*(stagec[,7] == 8),
	                    gleason910 = 1*(stagec[,7] >= 9),
	                    eet = 1*(stagec[,4] == 2),
	                    diploid = 1*(stagec[,8] == "diploid"),
	                    tetraploid = 1*(stagec[,8] == "tetraploid"),
	                    notAneuploid = 1-1*(stagec[,8] == "aneuploid"))
	# Remove the incomplete cases
	dataCancer <- dataCancer[complete.cases(dataCancer),]
	# Load FRESA.CAD and its pre-established data frame with the names and descriptions of all variables
	library(FRESA.CAD)
	data(cancerVarNames)
	# Set the group of no progression
	noProgress <- subset(dataCancer,pgstat==0)
	# z-transform g2 values using the no-progression group as reference
	dataCancerZTransform <- rankInverseNormalDataFrame(variableList = cancerVarNames[2,],
	                                                   data = dataCancer,
	                                                   referenceframe = noProgress)
	# Shut down the graphics device driver
	dev.off()
## End(Not run)

FRESA.CAD documentation built on Nov. 25, 2023, 1:07 a.m.