champ.SVD: Singular Value Decomposition analysis for batch effects...

Description

New modification: We have added a scree plot (proposed by rasmus.rydbirk@regionh.dk) to help users judge the importance of the deconvoluted components. After SVD deconvolution, each component "explains" part of the variance in the original data matrix, in other words, your beta matrix. Ideally the top few components (normally 3-5) capture most of the variance in the original data. After running champ.SVD(), you can therefore check the PDF file to see how many components need to be considered in the following analysis. For example, if component 1 captures 80 percent of the variance and is highly correlated with the phenotype you want to study, you may ignore batch effects on the remaining components.

champ.SVD() runs Singular Value Decomposition on a dataset to estimate the impact of batch effects. The function runs SVD deconvolution on the beta matrix to obtain the components that explain most of the variance in the original data set, then uses Random Matrix Theory to estimate the number of latent variables. Each significant component is then correlated with each phenotype to see whether the phenotype shows a significant correlation with that component. All suitable factors in your pd (Sample_Sheet.csv) file are analysed. After champ.SVD(), the user gets a heatmap indicating the effect of each factor on the original data set, and can decide whether some batch effect should be corrected before further analysis.

Not all factors in your pd file are analysed: name information such as Sample_Name, Pool_ID... is discarded, and covariates with fewer than two distinct values are discarded as well. Note that numeric covariates such as age are tested with linear regression, while factor and character covariates such as Sample_Group are tested with the Kruskal-Wallis test. Please therefore check your input pd file carefully.

The plot generated by champ.SVD() includes a legend: color indicates the level of significance, and the darker the color, the more significantly the deconvoluted component is correlated with the phenotype. The number of components on the x axis is the dimension of latent variables detected by the EstDimRMT() function from the "isva" package; if this function estimates too many components, say more than 20, champ.SVD() automatically selects only the top 20.
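
The procedure described above can be illustrated with a minimal base R sketch. It is not the champ.SVD() implementation: the toy beta matrix and pd data frame are simulated, the number of components k is fixed by hand rather than estimated with EstDimRMT(), and centring each probe before the decomposition is an assumption made for this example.

```r
set.seed(1)

# Simulated stand-ins (assumptions): 200 probes x 12 samples, plus a toy pd.
n_probe <- 200
n_samp  <- 12
pd <- data.frame(
  Sample_Group = factor(rep(c("C", "T"), each = 6)),
  Age          = c(34, 51, 47, 62, 39, 58, 44, 66, 31, 55, 49, 60)
)
beta <- matrix(rnorm(n_probe * n_samp, mean = 0.5, sd = 0.05), n_probe, n_samp)
# Inject a group effect into the first 50 probes so one component tracks Sample_Group.
is_t <- pd$Sample_Group == "T"
beta[1:50, is_t] <- beta[1:50, is_t] + 0.2

# Centre each probe (row), then decompose.
centred <- beta - rowMeans(beta)
sv <- svd(centred)

# Fraction of variance explained per component (the quantity a scree plot shows).
var_explained <- sv$d^2 / sum(sv$d^2)

# Fixed by hand here; the text above says champ.SVD() takes this number from
# EstDimRMT() and keeps at most the top 20 components.
k <- 3

# P-value of each of the top k components against each covariate:
# linear regression for numeric columns, Kruskal-Wallis test otherwise.
pvals <- sapply(pd, function(covariate) {
  sapply(seq_len(k), function(i) {
    comp <- sv$v[, i]
    if (is.numeric(covariate)) {
      summary(lm(comp ~ covariate))$coefficients[2, 4]
    } else {
      kruskal.test(comp ~ as.factor(covariate))$p.value
    }
  })
})
rownames(pvals) <- paste0("PC", seq_len(k))

round(var_explained[1:k], 3)
pvals
```

In this toy setting the first component should be dominated by the injected group effect, so its p-value against Sample_Group is small. A heatmap of -log10(pvals) would resemble the kind of plot described above, although the exact color scale used by champ.SVD() is not reproduced here.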

Usage

    champ.SVD(beta = myNorm,
              rgSet=NULL,
              pd=myLoad$pd,
              RGEffect=FALSE,
              PDFplot=TRUE,
              Rplot=TRUE,
              resultsDir="./CHAMP_SVDimages/")

Arguments

beta

Beta matrix to be analysed, preferably one that has been Probe-Type normalized and imputed. (default = myNorm)

rgSet

An rgSet object created when the data was loaded from the .idat files. It contains the green and red color information of the original data set and may be used if RGEffect is set to TRUE. (default = NULL, as shown in Usage; supply myLoad$rgSet when RGEffect is TRUE)

pd

A data.frame containing the information from the sample sheet. (default = myLoad$pd)

RGEffect

Whether the green and red color control probes are included in the analysis. (default = FALSE)

PDFplot

Whether the plot is generated as a PDF and saved in resultsDir. (default = TRUE)

Rplot

Whether the plot is drawn in the R session. If you run the analysis remotely on a server, make sure the server can connect to your local graphics application (for example X11 on Linux). (default = TRUE)

Splot

If TRUE, a scree plot (elbow plot) is generated. If PDFplot is also TRUE, it is saved in resultsDir. (default = TRUE)

resultsDir

The directory where the PDF files are saved. (default = "./CHAMP_SVDimages/")
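
Because the Description states that the test applied to each covariate depends on its column type, it can help to inspect pd before calling champ.SVD(). The following base R sketch is an illustration only: describe_pd is a hypothetical helper that is not part of ChAMP, the toy pd is made up, and "fewer than two distinct values" is our reading of the discard rule. The removal of name columns such as Sample_Name is not modelled here.

```r
# Hypothetical helper (not part of ChAMP): report how each pd column would
# likely be treated, based on the rules stated in the Description.
describe_pd <- function(pd) {
  n_distinct <- vapply(pd, function(x) length(unique(x[!is.na(x)])), integer(1))
  data.frame(
    column     = names(pd),
    class      = vapply(pd, function(x) class(x)[1], character(1)),
    n_distinct = n_distinct,
    test       = ifelse(
      n_distinct < 2, "discarded (fewer than 2 distinct values)",
      ifelse(vapply(pd, is.numeric, logical(1)),
             "linear regression", "Kruskal-Wallis test")
    ),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy sample sheet (made-up values).
pd <- data.frame(
  Sample_Group = c("C", "C", "T", "T"),
  Age          = c(34, 51, 47, 62),
  Slide        = c("S1", "S1", "S1", "S1"),
  stringsAsFactors = FALSE
)
info <- describe_pd(pd)
info

# A batch identifier stored as a number would be treated as numeric; convert
# it explicitly if it should be tested as a factor instead:
# pd$Slide <- as.factor(pd$Slide)
```

Checking the table first makes it easier to spot covariates stored with an unintended type before they reach the analysis.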

Author(s)

Teschendorff, A
adapted by Yuan Tian

References

Teschendorff, A. E., Menon, U., Gentry-Maharaj, A., Ramus, S. J., Gayther, S. A., Apostolidou, S., Jones, A., Lechner, M., Beck, S., Jacobs, I. J., and Widschwendter, M. (2009). An epigenetic signature in peripheral blood predicts active ovarian cancer. PLoS One, 4(12), e8274

Examples

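    ## Not run: 
        # Load the example data shipped with ChAMPdata, then normalize it.
        myLoad <- champ.load(directory=system.file("extdata",package="ChAMPdata"))
        myNorm <- champ.norm()
        # With no arguments, champ.SVD() uses beta = myNorm and pd = myLoad$pd (see Usage).
        champ.SVD()
    ## End(Not run)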

ucl-medical-genomics/ChAMP documentation built on June 26, 2019, 12:11 a.m.