The function linc can be considered the main function of this package. It converts a given input object into a LINCmatrix. This process includes (I) statistical analysis and (II) correction of the input, (III) separation of coding and non-coding genes and (IV) computation of a correlation matrix. The input could be, for instance, a gene expression matrix. Rows correspond to genes; columns represent samples. Besides a suitable object, a vector identifying the protein-coding genes is required.
object |
a |
codingGenes |
a |
corMethod |
a method for the correlation function; has to be one of |
batchGroups |
a vector naming the batch conditions. The length of this vector has to match the number of samples supplied in object |
nsv |
a single |
rmPC |
a vector of principal components (PCs) which should be removed. PCs are counted starting from 1 up to the maximal count of samples. |
outlier |
a method for the genewise removal of single outliers; has to be one of "esd" or "zscore" |
userFun |
a function or its name that should be used to calculate the correlation between coding and non-coding genes. This argument has to be used in combination with |
verbose |
whether to give messages about the progression of the function |
object can be a matrix, a data.frame or an ExpressionSet with rows corresponding to genes and columns to samples, the assumed co-expression conditions. Genes with duplicated names, genes with zero variance and genes with too many missing or infinite values will be removed from the input. For inputs showing a high inter-sample variance (ANOVA) in combination with many single outliers, a warning message will appear.
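The filtering described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's own code: the helper name filter_genes, the 50% cutoff for "too many" missing or infinite values, and the choice to drop every copy of a duplicated gene name are all hypothetical.

```r
# Hypothetical helper, not part of LINC: drop genes that are unusable
# for a correlation analysis.
filter_genes <- function(expr, max_bad_fraction = 0.5) {
  expr <- as.matrix(expr)
  stopifnot(!is.null(rownames(expr)))

  # Non-finite entries cover NA, NaN, Inf and -Inf.
  bad <- !is.finite(expr)
  too_many_bad <- rowMeans(bad) > max_bad_fraction  # assumed cutoff

  # Variance over the finite values of each gene only.
  clean <- expr
  clean[bad] <- NA
  gene_var <- apply(clean, 1, function(v) {
    v <- v[!is.na(v)]
    if (length(v) < 2) NA_real_ else var(v)
  })
  zero_var <- is.na(gene_var) | gene_var == 0

  # Assumption: all copies of a duplicated name are removed.
  nm <- rownames(expr)
  dup <- duplicated(nm) | duplicated(nm, fromLast = TRUE)

  expr[!(too_many_bad | zero_var | dup), , drop = FALSE]
}

# Toy matrix: rows are genes, columns are samples.
toy <- matrix(c(1, 2, 3, 4,      # g1: kept
                5, 5, 5, 5,      # g2: zero variance
                1, NA, NA, Inf,  # g3: too many non-finite values
                2, 4, 6, 9,      # g4: duplicated name
                1, 3, 2, 5,      # g4: duplicated name
                4, 1, 3, 2),     # g5: kept
              ncol = 4, byrow = TRUE,
              dimnames = list(c("g1", "g2", "g3", "g4", "g4", "g5"),
                              paste0("s", 1:4)))
filtered <- filter_genes(toy)
```

Only the two well-behaved genes remain in filtered; the actual thresholds used by linc may differ.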
By default, Spearman's rank correlation will be computed between protein-coding and non-coding genes. For this method a time-efficient C++ implementation will be called. Longer computation times occur for more than 5000 genes and more than 100 samples. Missing values are handled such that only pairwise complete observations are compared. A customized correlation function can be applied by supplying it in userFun; it requires the formal arguments x and y. This has priority over corMethod.
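As a rough illustration of what such a coding versus non-coding correlation looks like, the sketch below uses base R's cor() rather than the package's C++ routine. The toy data, the logical vector is_coding and the helper cor_with are hypothetical, and the orientation of the result (coding genes in rows) is an assumption of this sketch.

```r
set.seed(1)
# Toy expression matrix: 6 genes (rows) x 8 samples (columns).
expr <- matrix(rnorm(6 * 8), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("s", 1:8)))
expr["gene2", "s3"] <- NA  # one missing value

# Hypothetical assignment of protein-coding genes.
is_coding <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
coding    <- expr[is_coding, , drop = FALSE]
noncoding <- expr[!is_coding, , drop = FALSE]

# cor() correlates columns, so genes are moved to columns first.
cormat <- cor(t(coding), t(noncoding),
              method = "spearman", use = "pairwise.complete.obs")

# A user-defined function with formal arguments x and y, applied pairwise.
cor_with <- function(coding, noncoding, fun) {
  out <- matrix(NA_real_, nrow(coding), nrow(noncoding),
                dimnames = list(rownames(coding), rownames(noncoding)))
  for (i in seq_len(nrow(coding))) {
    for (j in seq_len(nrow(noncoding))) {
      out[i, j] <- fun(coding[i, ], noncoding[j, ])
    }
  }
  out
}
neg_cor <- cor_with(coding, noncoding,
                    function(x, y) -cor(x, y, use = "pairwise.complete.obs"))
```

The double loop is slow for large inputs; it is used here only to make the role of the x and y arguments explicit.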
A number of statistical methods are available in order to remove effects from a given input expression matrix which depend on the used platform or technology and may hide relevant biology.
The argument batchGroups works as a wrapper of the SVA package calling sva::svaseq. The number of hidden surrogate variables is set to nsv = 1 by default; it can be estimated utilizing the function sva::num.sv. For this model to work, the description of at least two different batches is required in batchGroups.
Principal Component Analysis (PCA) can be performed by rmPC = c(...) where ... represents a vector of principal components. The command rmPC = c(2:ncol(object)) will remove the first PC from the input. This method can be used to determine whether observations are due to the main variance in the dataset, i.e. main groups or subtypes.
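The general idea of taking principal components out of an expression matrix can be sketched with prcomp() from base R. This is a minimal sketch, not the implementation used by linc: the helper remove_pcs and its argument drop are hypothetical, and drop simply lists the components to discard before the data are reconstructed.

```r
# Hypothetical helper: reconstruct a genes x samples matrix without
# the principal components listed in 'drop'.
remove_pcs <- function(expr, drop) {
  # prcomp() expects observations (samples) in rows.
  pca  <- prcomp(t(expr), center = TRUE, scale. = FALSE)
  keep <- setdiff(seq_len(ncol(pca$x)), drop)

  # Rebuild the centered data from the remaining components.
  recon <- pca$x[, keep, drop = FALSE] %*%
    t(pca$rotation[, keep, drop = FALSE])

  # Add the gene means back and return to genes x samples.
  recon <- sweep(recon, 2, pca$center, "+")
  out <- t(recon)
  dimnames(out) <- dimnames(expr)
  out
}

set.seed(1)
toy_expr <- matrix(rnorm(10 * 5), nrow = 10,
                   dimnames = list(paste0("gene", 1:10), paste0("s", 1:5)))

unchanged <- remove_pcs(toy_expr, drop = integer(0))  # nothing removed
without1  <- remove_pcs(toy_expr, drop = 1)           # first PC removed

# Total variance across samples before and after dropping the first PC.
total_var <- function(m) sum(apply(m, 1, var))
c(before = total_var(toy_expr), after = total_var(without1))
```

Dropping the first component removes the largest share of the variance, which is why comparing results before and after can indicate whether a finding depends on the dominant grouping in the data.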
Outliers are handled genewise. The extreme Studentized deviate (ESD) test by Bernard Rosner (1983) will detect one to four outliers in a gene and replace them by NA values. The alternative zscore will perform a robust z-score test suggested by Boris Iglewicz and David Hoaglin (1993) and detect a single outlier in a gene if |Z| > 3.5.
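The robust z-score rule can be illustrated with the modified z-score of Iglewicz and Hoaglin, M = 0.6745 * (x - median) / MAD, where MAD is the unscaled median absolute deviation. The sketch below is an assumption about how such a test might be applied to one gene; the helper flag_single_outlier is hypothetical and is not taken from the package.

```r
# Hypothetical helper: replace the single most extreme value of one gene
# by NA if its modified z-score exceeds the cutoff.
flag_single_outlier <- function(x, cutoff = 3.5) {
  med <- median(x, na.rm = TRUE)
  # constant = 1 gives the raw MAD; the R default rescales it by 1.4826.
  mad_raw <- mad(x, center = med, constant = 1, na.rm = TRUE)
  if (is.na(mad_raw) || mad_raw == 0) return(x)  # test not applicable

  z <- 0.6745 * (x - med) / mad_raw
  worst <- which.max(abs(z))  # NA entries are ignored
  if (abs(z[worst]) > cutoff) x[worst] <- NA
  x
}

with_outlier    <- flag_single_outlier(c(1, 2, 3, 4, 100))
without_outlier <- flag_single_outlier(c(1, 2, 3, 4, 5))

# Genewise application to a genes x samples matrix.
toy_genes <- rbind(geneA = c(1, 2, 3, 4, 100),
                   geneB = c(1, 2, 3, 4, 5))
cleaned <- t(apply(toy_genes, 1, flag_single_outlier))
```

In the first vector the value 100 has a modified z-score of about 65 and is replaced by NA; the second vector is returned unchanged.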
A LINCmatrix can be recalculated with the command linc(LINCmatrix, ...) in order to change further arguments. plotlinc(LINCmatrix, ...) will plot a figure depicting the statistical analysis and correlation values. As for most objects of the LINC class, manipulation of the last slot linCenvir will likely result in unexpected errors.
An object of the class 'LINCmatrix' (S4) with 6 slots:
results |
a |
assignment |
a |
correlation |
a |
expression |
the original expression matrix |
history |
a storage environment of important methods, objects and parameters used to create the object |
linCenvir |
a storage environment ensuring the compatibility to other objects of the |
signature(object = "data.frame", codingGenes = "ANY")
(see details)
signature(object = "ExpressionSet", codingGenes = "ANY")
(see details)
signature(object = "LINCmatrix", codingGenes = "missing")
(see details)
signature(object = "matrix", codingGenes = "ANY")
(see details)
plotlinc(LINCmatrix, ...), clusterlinc(LINCmatrix, ...), singlelinc(LINCmatrix, ...), ...
Manuel Goepferich
[1] https://www.bioconductor.org/packages/release/bioc/html/sva.html
[2] Rosner, Bernard (May 1983), Percentage Points for a Generalized ESD Many-Outlier Procedure, Technometrics, 25(2), pp. 165-172.
[3] Boris Iglewicz and David Hoaglin (1993), "Volume 16: How to Detect and Handle Outliers", The ASQC Basic References in Quality Control: Statistical Techniques, Edward F. Mykytka, Ph.D., Editor.
justlinc; clusterlinc; singlelinc
library(LINC)
data(BRAIN_EXPR)
# call 'linc' with no further arguments
crbl_matrix <- linc(cerebellum, codingGenes = pcgenes_crbl)
# remove first seven principal components
crbl_matrix_pc <- linc(cerebellum, codingGenes = pcgenes_crbl, rmPC = c(1:7))
# negative correlation by using 'userFun'
crbl_matrix_ncor <- linc(cerebellum, codingGenes = pcgenes_crbl,
                         userFun = function(x, y) { -cor(x, y) })
# remove outliers using the ESD method
crbl_matrix_esd <- linc(cerebellum, codingGenes = pcgenes_crbl, outlier = "esd")
# plot this object
plotlinc(crbl_matrix_esd)