PreProcess: Preprocessing the design matrix, preparing it for variable...


View source: R/PreProcess.R

Description

This function preprocesses the design matrix by removing columns that contain NA's or are entirely zero. It also standardizes the non-binary columns to have mean zero and variance one.
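The filtering and standardization described above can be sketched in base R. This is only an illustration under stated assumptions, not the BVSNLP source: `sketch_preprocess` is a hypothetical name, a column is treated as binary when all of its values are 0 or 1, and `scale()` is used for standardization, so "variance one" here means the sample variance with an n - 1 denominator.

```r
# Hypothetical helper for illustration; not the package's implementation.
sketch_preprocess <- function(X) {
  # Drop every column that contains at least one NA
  has_na <- colSums(is.na(X)) > 0
  X <- X[, !has_na, drop = FALSE]

  # Drop every column whose entries are all zero
  all_zero <- colSums(X != 0) == 0
  X <- X[, !all_zero, drop = FALSE]

  # Assumption: a column counts as binary if it only holds 0s and 1s
  is_binary <- apply(X, 2, function(col) all(col %in% c(0, 1)))

  # Center and scale the non-binary columns; binary columns are left as-is
  if (any(!is_binary)) {
    X[, !is_binary] <- scale(X[, !is_binary, drop = FALSE])
  }
  X
}

# Small deterministic example
X <- cbind(a = seq(2, 20, by = 2),   # continuous, kept and standardized
           b = rep(0, 10),           # all zero, removed
           c = rep(c(0, 1), 5),      # binary, kept unchanged
           d = c(NA, 1:9))           # contains NA, removed
out <- sketch_preprocess(X)
colnames(out)   # "a" "c"
```

A constant non-zero column would produce NaN under `scale()`; the sketch does not handle that case.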

Usage

PreProcess(X)

Arguments

X

The n by p design matrix. Columns should represent genes and rows should represent observations. The column names are used as gene names, so they should not be left as NULL. Note that the input matrix X should NOT contain a vector of 1's representing the intercept.

Value

It returns a list containing the following objects:

X

The filtered design matrix, which can be used in the variable selection procedure. Binary columns are moved to the end of the design matrix.

gnames

Gene names read from the column names of the filtered design matrix.
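The shape of the returned list can be illustrated with a short base R sketch. It is an assumption-laden illustration rather than the package code: `move_binary_last` is a hypothetical helper, binary means "only 0s and 1s", and keeping the original relative order within the binary and non-binary groups is a choice made here, not something the documentation states.

```r
# Hypothetical helper for illustration; not the package's implementation.
move_binary_last <- function(X) {
  # Assumption: a column counts as binary if it only holds 0s and 1s
  is_binary <- apply(X, 2, function(col) all(col %in% c(0, 1)))

  # Non-binary columns first, binary columns last
  X <- X[, c(which(!is_binary), which(is_binary)), drop = FALSE]

  # Gene names are read from the column names of the reordered matrix
  list(X = X, gnames = colnames(X))
}

X <- cbind(gene_1 = c(0.3, -1.2, 0.9),
           gene_2 = c(1, 0, 1),          # binary column
           gene_3 = c(2.1, 0.4, -0.7))
res <- move_binary_last(X)
res$gnames   # "gene_1" "gene_3" "gene_2"
```

Because `gnames` is taken after reordering, its entries line up with the columns of the returned `X`, not with the columns of the input.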

Author(s)

Amir Nikooienejad

Examples

### Constructing a synthetic design matrix for the purpose of preprocessing
### imposing columns with different scales
n <- 40
p1 <- 50
p2 <- 150
p <- p1 + p2
X1 <- matrix(rnorm(n*p1, 1, 2), ncol = p1)
X2 <- matrix(rnorm(n*p2), ncol = p2)
X <- cbind(X1, X2)

### putting NA elements in the matrix
X[3,85] <- NA
X[25,85] <- NA
X[35,43] <- NA
X[15,128] <- NA
colnames(X) <- paste0("gene_", seq_len(p))

### Running the function. Note the intercept column that is added as the
### first column in the "logistic" family
Xout <- PreProcess(X)
dim(Xout$X)[2] == (p + 1) ## 1 is added because intercept column is included
## This is FALSE because of the removal of columns with NA elements

BVSNLP documentation built on May 17, 2018, 9:05 a.m.