DataPre: Preprocess the input data

Description Usage Arguments Details Value Note Author(s) References See Also Examples

Description

Preprocess the input data. Variables with a lot of zeros and outliers may be removed. Missing values may be imputed and filled. Data may be transformed by logarithm transformation.

Usage

1
DataPre(tes,mv="min",rz=80,sv=TRUE,log=FALSE,filepath=getwd())

Arguments

tes

The data under pretreatment (data frame with required format). The first row should be column names. The first and the second column of the first row should be "Name" and "ID", and you can set 2 more tags at the third and the fourth column of the first row, such as "m.z" and "RT.min." or anything you like. From the fifth column till the end, sample indexes or names are expected. The first row of the data frame should be the gender information."1"means male, and "2" means female. The second row of the data frame should be the group information. The first column of the second row should be "group", and you can add group indexes of the data from the fifth column at the second row. The format of group number should be "0"(pre-dose). "1","2","3","4"...(post-dose). The third row of the data frame should be the information of timepoints. Please see the demo data for detailed format.

rz

The percentage of zeros for variable elimination (Default:80). Variables with zero numbers higher than rz

mv

The method of missing values imputation (Default: "min"). mv=c ( "min", "knn", "qrilc").

sv

A logical value indicating whether to remove the outliers (Default: TRUE). The data which distance to the mean is bigger than 1.5 times of the difference value between lower quartile and upper quartile, should be identified as an outlier. And it will be replaced by the mean value of corresponding row.

log

A logical value indicating whether to take the logarithm on the data (Default: FALSE).

filepath

A character string indicating the path where the results may be saved in.

Details

nothing

Value

A data frame of the prepocessed data

A folder named "preprocessed-data" containing a file of the prepocessed datasets will be created automatically. The file's name is "preprocessed-data.xlsx".

Note

nothing

Author(s)

Mengci Li, Shouli Wang, Guoxiang Xie, Tianlu Chen and Wei Jia

References

Hastie, Botstein, et al. Imputing Missing Data for Gene Expression Arrays, Stanford University Statistics Department Technical report (1999)

See Also

nothing

Examples

1
2
3
data("preData")
DataPre(preData)
##The result will be saved at your current working directory of the R process.

polyPK documentation built on May 2, 2019, 12:37 a.m.