fullscan: A full scan of the input data 'm' using a collection of...

fullscan    R Documentation

Description

Performs a full scan of the input data m over a collection of windows given by the two-column matrix windows. For each window, the data is first processed with the function matrixFunction (e.g., the covMatrix function), the processed data is then summarized with the function summaryFunction (e.g., the largest eigenvector computed with the function powerMethod), and finally the summaries are compared with the function comparisonFunction (e.g., the vector correlation computed with R's function cor). The function returns a two-column matrix with one row per window: the first column holds the global summary statistic (e.g., the correlation between the global and local eigenvectors) and the second column holds the local summary statistic (e.g., the correlation between the local eigenvectors of the previous and current windows).
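The process, summarize, compare pipeline described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: fullscan_sketch and the toy data are hypothetical, windows are taken to select rows of m (the example further down builds them from nrow(testdata)), and the local statistic of the first window is left as NA because it has no predecessor.

```r
# Hypothetical illustration of the scan logic; NOT locStra's actual code.
fullscan_sketch <- function(m, windows, matrixFunction, summaryFunction,
                            comparisonFunction) {
  # Summary of the full data set, used as the global reference.
  global <- summaryFunction(matrixFunction(m))
  res <- matrix(NA_real_, nrow = nrow(windows), ncol = 2)
  previous <- NULL
  for (i in seq_len(nrow(windows))) {
    # Assumption: each window is a (start, end) pair of row indices of m.
    rows <- windows[i, 1]:windows[i, 2]
    current <- summaryFunction(matrixFunction(m[rows, , drop = FALSE]))
    # Column 1: global summary versus current window.
    res[i, 1] <- comparisonFunction(global, current)
    # Column 2: previous window versus current window.
    # Assumption: the first window has no predecessor and stays NA.
    if (!is.null(previous)) res[i, 2] <- comparisonFunction(previous, current)
    previous <- current
  }
  res
}

# Toy data: 200 variants (rows) for 6 individuals (columns).
set.seed(1)
m <- matrix(rbinom(200 * 6, 1, 0.3), nrow = 200, ncol = 6)
windows <- cbind(seq(1, 151, by = 50), seq(50, 200, by = 50))

res <- fullscan_sketch(
  m, windows,
  matrixFunction = function(x) cov(x),
  summaryFunction = function(a) eigen(a, symmetric = TRUE)$vectors[, 1],
  comparisonFunction = function(x, y) abs(cor(x, y))
)
print(res)
```

The absolute value in the comparison is used because the sign of an eigenvector is arbitrary, which mirrors the abs() applied before plotting in the example below.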

Usage

fullscan(m, windows, matrixFunction, summaryFunction, comparisonFunction)

Arguments

m

A (sparse) matrix for which the full scan is sought. The input matrix is assumed to be oriented to contain the data for one individual per column.

windows

A two-column matrix containing one window per row, given by its start and end position, on which the data is scanned. The windows can be overlapping. The windows can be computed using the function makeWindows.
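To make the shape of such a window matrix concrete, here is a small base R sketch. The helper make_windows_sketch is hypothetical and only mimics the idea of a length, a window size and an offset; how makeWindows itself treats the last, possibly shorter, window is not specified here, so the truncation below is an assumption.

```r
# Hypothetical helper for illustration; not the package's makeWindows.
make_windows_sketch <- function(len, size, offset) {
  starts <- seq(1, len, by = offset)
  # Assumption: a window running past the end of the data is truncated.
  ends <- pmin(starts + size - 1, len)
  unname(cbind(starts, ends))
}

# Overlapping windows: size 10, shifted by 5 positions each time.
w_overlap <- make_windows_sketch(25, 10, 5)
print(w_overlap)

# Non-overlapping windows: the offset equals the window size.
w_disjoint <- make_windows_sketch(30, 10, 10)
print(w_disjoint)
```

Each row is one window, so the number of rows of the matrix equals the number of rows in the result of the scan.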

matrixFunction

Function of one matrix argument that processes the data of each window (e.g., by computing the covariance matrix).

summaryFunction

Function of one argument that summarizes the output of the function matrixFunction (e.g., by computing the largest eigenvector).

comparisonFunction

Function of two arguments that computes a comparison measure for two outputs of the function summaryFunction (e.g., vector correlation, or matrix norm).
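User-defined functions can be passed for all three function arguments as long as they take the stated number of inputs. The following base R sketch shows one possible set; all function names and the toy data are hypothetical and are not part of locStra. For a sparse input, as.matrix requires the Matrix package to be loaded.

```r
# Hypothetical stand-ins for the three function arguments (base R only).

# matrixFunction: one matrix in, one similarity matrix out.
# Covariance between individuals (columns), with variants as observations.
cov_individuals <- function(x) cov(as.matrix(x))

# summaryFunction: one argument in, one summary out.
# Eigenvector belonging to the largest eigenvalue of a symmetric matrix.
leading_eigenvector <- function(a) eigen(a, symmetric = TRUE)$vectors[, 1]

# comparisonFunction: two summaries in, one number out.
# The absolute value removes the arbitrary sign of an eigenvector.
abs_cor <- function(x, y) abs(cor(x, y))

# Alternative comparison when the summaries are matrices: Frobenius norm
# of the difference.
frobenius_distance <- function(a, b) norm(a - b, type = "F")

# Toy data: 100 variants (rows) for 5 individuals (columns).
set.seed(42)
m <- matrix(rbinom(100 * 5, 1, 0.4), nrow = 100, ncol = 5)

a <- cov_individuals(m)
v <- leading_eigenvector(a)
print(abs_cor(v, -v))            # sign flip does not change the comparison
print(frobenius_distance(a, a))  # identical matrices have distance zero
```

When a matrix norm is used for the comparison, the summary step can simply return its input, e.g. function(a) a, so that the processed matrices themselves are compared.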

Value

A two-column matrix containing per row the global and local summary statistics for each window. Plotting the correlation data of the returned matrix gives a figure analogous to the one shown here, which was generated with the example code below.

[Figure: fig.pdf]

References

Dmitry Prokopenko, Julian Hecker, Edwin Silverman, Marcello Pagano, Markus Noethen, Christian Dina, Christoph Lange and Heide Fier (2016). Utilizing the Jaccard index to reveal population stratification in sequencing data: a simulation study and an application to the 1000 Genomes Project. Bioinformatics, 32(9):1366-1372.

Examples

library(locStra)
library(Matrix)
data(testdata)

# Correlation that returns 0 if either input vector sums to zero
cor2 <- function(x, y) ifelse(sum(x) == 0 | sum(y) == 0, 0, cor(x, y))

# Windows of size 10000 over the rows of testdata
windowSize <- 10000
w <- makeWindows(nrow(testdata), windowSize, windowSize)

# Scan with four different similarity matrices
resCov <- fullscan(testdata, w, covMatrix, powerMethod, cor2)
resJac <- fullscan(testdata, w, jaccardMatrix, powerMethod, cor2)
resSMx <- fullscan(testdata, w, sMatrix, powerMethod, cor2)
resGRM <- fullscan(testdata, w, grMatrix, powerMethod, cor2)

# Column 1 of each result holds the global versus local comparison
resAll <- cbind(resCov[, 1], resJac[, 1], resSMx[, 1], resGRM[, 1])

xlabel <- "SNP position"
ylabel <- "correlation between global and local eigenvectors"
mainlabel <- paste("window size", windowSize)
matplot(w[, 1], abs(resAll), type = "b", xlab = xlabel, ylab = ylabel,
        ylim = c(0, 1), main = mainlabel)
legend("topright", legend = c("Cov", "Jaccard", "s-Matrix", "GRM"),
       pch = paste(1:ncol(resAll)))


locStra documentation built on April 13, 2022, 1:07 a.m.
