For each number of principal components (PCs) in a vector, the Q-residuals of a symbolic DNA sequence are calculated using a PCA model of the numerical DNA motif. The sequences are first converted to numerical DNA sequences, and then the model is applied. For each number of PCs, a matrix is returned with the value of the Q-residuals at each position and a label indicating whether the sequence belongs to a binding site.
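The calculation described above can be sketched in Python with NumPy. This is only an illustration, not the package's R implementation: the one-hot encoding and the helper names `encode`, `fit_pca` and `q_residuals` are assumptions made for the example, and the package's own numerical encoding may differ. The Q-residual of a window is taken here as the squared norm of the part of the centered, encoded window that the first `n_pcs` loadings do not reconstruct.

```python
import numpy as np

# Assumption: one-hot encoding of bases; the package may use another encoding.
ONE_HOT = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0),
           "G": (0, 0, 1, 0), "T": (0, 0, 0, 1)}


def encode(seq):
    """Flatten a DNA string into a numeric vector (4 values per base)."""
    return np.array([v for base in seq.upper() for v in ONE_HOT[base]],
                    dtype=float)


def fit_pca(motif_seqs, n_pcs):
    """Return the mean and the first n_pcs loadings of the encoded motif."""
    X = np.vstack([encode(s) for s in motif_seqs])
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_pcs].T


def q_residuals(sequence, motif_seqs, n_pcs):
    """Q-residual of every motif-width window of `sequence`."""
    mean, loadings = fit_pca(motif_seqs, n_pcs)
    width = len(motif_seqs[0])
    q = []
    for start in range(len(sequence) - width + 1):
        x = encode(sequence[start:start + width]) - mean
        # Part of x that the retained PCs cannot reconstruct.
        residual = x - loadings @ (loadings.T @ x)
        q.append(float(residual @ residual))
    return np.array(q)
```

Windows that resemble the motif sequences should give small Q-residuals, and retaining more PCs can only lower the residual of a given window.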
Options described in the MEET function
DNA sequences used to construct the motif model
The alignment method must be installed on your computer.
A list with one element for each value of nPCs:
A matrix with two columns: the first contains the Q-residuals for each studied sequence, and the second indicates whether the sequence belongs to a TFBS.
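The shape of the returned value can be mimicked in Python as follows. This is a sketch under assumptions: the helper `pack_results` and its arguments are hypothetical, a dictionary keyed by the number of PCs stands in for the R list, and the binding-site label is stored as 1 or 0.

```python
import numpy as np


def pack_results(q_by_npcs, is_binding_site):
    """Build one two-column matrix per number of PCs.

    q_by_npcs: dict mapping a number of PCs to the Q-residuals of the
        studied sequences (hypothetical input for this illustration).
    is_binding_site: one flag per studied sequence.
    """
    labels = np.asarray(is_binding_site, dtype=float)
    results = {}
    for n_pcs, q in q_by_npcs.items():
        q = np.asarray(q, dtype=float)
        if q.shape != labels.shape:
            raise ValueError("one label is needed per Q-residual")
        # Column 0: Q-residuals; column 1: binding-site label.
        results[n_pcs] = np.column_stack([q, labels])
    return results
```

Keeping the label next to each Q-residual makes it straightforward to compare the residuals of binding sites and background sequences for every number of PCs.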
Erola Pairo <epeiroatibec.pcb.ub.es>
Jolliffe, I.T.: Principal Component Analysis, Springer Series in Statistics, 2nd ed., Springer, NY, 2002, XXIX, 487 p., 28 illus. ISBN 978-0-387-95442-4
Stacklies, Wolfram, Redestig, Henning, Scholz, Matthias, Walther, Dirk, and Selbig, Joachim: pcaMethods, a Bioconductor package providing PCA methods for incomplete data, Bioinformatics 23(9), 1164-1167, 2007
MEET, kfold.Entropy, kfold.MATCH, kfold.Divergence, PCanalysis, llegir_DNA, convertDNA, numericalDNA.