The function `preprocess` preprocesses microarray data.


`Xtrain` |
a (ntrain x p) data matrix of predictors. |

`Xtest` |
a (ntest x p) matrix containing the predictors for the test data
set. |

`Threshold` |
a vector of length 2 containing the values (threshmin,threshmax) for
thresholding data in preprocess. Values below threshmin are set to threshmin and values above threshmax are set to
threshmax. If |

`Filtering` |
a vector of length 2 containing the values (FiltMin,FiltMax) for filtering genes
in preprocess. Genes with max/min $\le$ FiltMin and (max-min) $\le$ FiltMax are excluded.
If |

`log10.scale` |
a logical value equal to TRUE if a log10-transformation has to be done. |

`row.stand` |
a logical value equal to TRUE if a row-wise standardisation has to be done. |
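The `Threshold` and `Filtering` arguments can be illustrated on a toy matrix. The sketch below is base R only and is not the package's source: the helper names `threshold_sketch` and `filter_sketch` are hypothetical, and the filter follows a literal reading of the argument description above (a gene is dropped only when both conditions hold). The installed package may combine the two conditions differently, so check its source before relying on this.

```r
# Toy expression matrix: 3 samples (rows) x 4 genes (columns), filled by column.
X <- matrix(c( 50,  200, 20000,   # g1
              120,  130,   140,   # g2
              100, 1000,  5000,   # g3
              300,  900,   350),  # g4
            nrow = 3,
            dimnames = list(paste0("s", 1:3), paste0("g", 1:4)))

# Hypothetical helper: clamp every value into [threshmin, threshmax].
threshold_sketch <- function(X, threshold = c(100, 16000)) {
  X[X < threshold[1]] <- threshold[1]
  X[X > threshold[2]] <- threshold[2]
  X
}

# Hypothetical helper: drop genes whose max/min ratio AND max-min range
# are both at or below the cut-offs (literal reading of the description).
filter_sketch <- function(X, filtering = c(5, 500)) {
  gene_max <- apply(X, 2, max)
  gene_min <- apply(X, 2, min)
  exclude <- (gene_max / gene_min <= filtering[1]) &
             (gene_max - gene_min <= filtering[2])
  X[, !exclude, drop = FALSE]
}

X_thr  <- threshold_sketch(X)
X_filt <- filter_sketch(X_thr)
colnames(X_filt)
```

In this toy example only `g2` fails both conditions. Gene `g4` has a ratio of 3 but a range of 600, so it survives under the "and" reading and would be dropped under an "or" reading.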

The pre-processing steps recommended by Dudoit et al. (2002) are performed. The default values are those suited to the Colon data.
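The last two steps, controlled by `log10.scale` and `row.stand`, can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's implementation: the helper `row_standardise` is hypothetical, and it assumes that "standardisation in row" means centring each sample (row) to mean zero and scaling it to unit sample standard deviation.

```r
# Toy matrix after thresholding and filtering: 2 samples x 3 genes.
X <- matrix(c(100, 1000, 10000,
              100,  100,  1000),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("s1", "s2"), c("g1", "g2", "g3")))

# Base-10 logarithm, as requested by log10.scale = TRUE.
X_log <- log10(X)

# Hypothetical helper: centre and scale each row (assumed meaning of row.stand).
# A length-nrow vector is recycled down the columns, so entry [i, j]
# is adjusted by the statistics of row i.
row_standardise <- function(X) {
  centred <- X - rowMeans(X)
  centred / apply(X, 1, sd)
}

X_std <- row_standardise(X_log)
round(X_std, 3)
```

After this step every sample has mean zero and unit standard deviation across genes, which puts arrays with different overall intensity on a common scale.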

A list with the following components:

`pXtrain` |
the (ntrain x p') matrix containing the preprocessed train data. |

`pXtest` |
the (ntest x p') matrix containing the preprocessed test data. |

Sophie Lambert-Lacroix (http://membres-timc.imag.fr/Sophie.Lambert/) and Julie Peyre (http://www-lmc.imag.fr/lmc-sms/Julie.Peyre/).

Dudoit, S., Fridlyand, J. and Speed, T. (2002). Comparison of discrimination methods for the classification of tumors using gene expression data. Journal of the American Statistical Association, 97, 77–87.

```
# load plsgenomics library
library(plsgenomics)
# load Colon data
data(Colon)
IndexLearn <- c(sample(which(Colon$Y==2),27),sample(which(Colon$Y==1),14))
Xtrain <- Colon$X[IndexLearn,]
Ytrain <- Colon$Y[IndexLearn]
Xtest <- Colon$X[-IndexLearn,]
# preprocess data
resP <- preprocess(Xtrain = Xtrain, Xtest = Xtest, Threshold = c(100, 16000),
                   Filtering = c(5, 500), log10.scale = TRUE, row.stand = TRUE)
# how many genes remain after preprocessing?
dim(resP$pXtrain)[2]
```
