The function clusterlinc will give an overview of ncRNAs in a dataset. An input LINCmatrix will be converted to a LINCcluster. The following steps are conducted: (I) computation of a correlation test, (II) setup of a distance matrix, (III) calculation of a dendrogram and (IV) selection of co-expressed genes for each query. The result is a cluster of ncRNAs and their associated protein-coding genes.
clusterlinc(linc,
            distMethod = "dicedist",
            clustMethod = "average",
            pvalCutOff = 0.05,
            padjust = "none",
            coExprCut = NULL,
            cddCutOff = 0.05,
            verbose = TRUE)
linc | an object of the class
distMethod | a method to compute the distance between ncRNAs; has to be one of
clustMethod | an algorithm to compute the dendrogram, has to be one of
pvalCutOff | a threshold for the selection of co-expressed genes. Only protein-coding genes showing a significance in the correlation test lower than
padjust | one of
coExprCut | a single
cddCutOff | a threshold that is only relevant for
verbose | whether to give messages about the progression of the function
As a first step, clusterlinc conducts the correlation test (stats::cor.test) using the correlation method and handling of missing values inherited from the input LINCmatrix. The resulting p-values, rather than the absolute correlation values, indicate the statistical robustness of the correlations. Co-expression of ncRNAs and protein-coding genes is assumed if the p-value from cor.test is lower than the given pvalCutOff. An alternative way to select co-expressed genes is provided by coExprCut. This argument has priority over pvalCutOff and can be used to pick the n genes with the lowest p-values for each ncRNA. In contrast to pvalCutOff, this will result in an equal number of assigned co-expressed genes for every ncRNA. The argument padjust can be used for multiple testing correction. In most cases this is not compatible with distMethod = "dicedist".
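The two selection modes described above can be sketched in base R. This is only an illustration on simulated data, not the code used inside clusterlinc: the objects pc_expr, nc_expr, pval_cut_off and co_expr_cut are hypothetical names, and the Spearman method is an arbitrary choice standing in for whatever method the LINCmatrix carries.

```r
# Illustrative sketch on simulated data; not the LINC implementation.
set.seed(1)
n_samples <- 20

# rows = protein-coding genes, columns = samples (hypothetical data)
pc_expr <- matrix(rnorm(50 * n_samples), nrow = 50,
                  dimnames = list(paste0("pc", 1:50), NULL))

# one simulated ncRNA that follows gene "pc1" plus some noise
nc_expr <- pc_expr["pc1", ] + rnorm(n_samples, sd = 0.3)

# p-value of the correlation test between the ncRNA and every gene
pvals <- apply(pc_expr, 1, function(g) {
  stats::cor.test(nc_expr, g, method = "spearman")$p.value
})

# (a) threshold-based selection, analogous to pvalCutOff:
#     the number of selected genes depends on the data
pval_cut_off <- 0.05
coexpr_by_cutoff <- names(pvals)[pvals < pval_cut_off]

# (b) fixed-size selection, analogous to coExprCut:
#     always the n genes with the lowest p-values
co_expr_cut <- 5
coexpr_top_n <- names(sort(pvals))[seq_len(co_expr_cut)]

# optional multiple testing correction, analogous to padjust
pvals_adj <- stats::p.adjust(pvals, method = "hochberg")
coexpr_adjusted <- names(pvals_adj)[pvals_adj < pval_cut_off]
```

Because adjusted p-values are never smaller than the raw ones, coexpr_adjusted can only be a subset of coexpr_by_cutoff, whereas coexpr_top_n always has exactly co_expr_cut entries.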
For the computation of the distance matrix of ncRNA genes, three methods can be applied. The first method, "correlation", uses 1 - correlation as distance measure. In contrast, "pvalue" considers not the absolute correlation values, but the p-values from the correlation test. A third method is termed "dicedist" and takes the Czekanowski-Dice distance [1] as distance measure. Here, the number of shared interaction partners between ncRNAs determines their relation to each other. The argument cddCutOff is an option to decide which p-values in the correlation matrix can be considered as an interaction. A low threshold, for instance, will consider only interactions of ncRNAs and protein-coding genes supported by a p-value lower than the supplied threshold and therefore by a robust correlation of these two genes. Based on the distance matrix, a cluster of the ncRNAs will be computed by stats::hclust. The argument clustMethod defines which clustering method should be applied.
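As a rough illustration of the "dicedist" idea, the following base R sketch thresholds a small hand-made p-value matrix into interactions, computes a set-based Czekanowski-Dice distance and passes the result to stats::hclust. The matrix pmat, the helper dice_dist and the handling of empty partner sets are assumptions made for this example. The definition in [1] also counts each node as a member of its own partner set, which is omitted here, so the values need not match what clusterlinc computes.

```r
# Illustrative sketch; hand-made p-values, not output of the LINC package.
# rows = ncRNAs (queries), columns = protein-coding genes
pmat <- rbind(
  ncA = c(0.001, 0.002, 0.300, 0.900, 0.010),
  ncB = c(0.003, 0.001, 0.400, 0.800, 0.200),
  ncC = c(0.700, 0.600, 0.004, 0.002, 0.500)
)
colnames(pmat) <- paste0("pc", 1:5)

# a p-value below the threshold counts as an interaction (cf. cddCutOff)
cdd_cut_off <- 0.05
interacts <- pmat < cdd_cut_off

# set-based Czekanowski-Dice distance between two logical partner vectors:
# (partners found in exactly one set) / (size of union + size of intersection)
dice_dist <- function(a, b) {
  only_one <- sum(xor(a, b))
  denom <- sum(a | b) + sum(a & b)
  if (denom == 0) return(1)  # assumption: no partners at all -> maximal distance
  only_one / denom
}

# pairwise distance matrix between all ncRNAs
n <- nrow(interacts)
dmat <- matrix(0, n, n,
               dimnames = list(rownames(interacts), rownames(interacts)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    dmat[i, j] <- dice_dist(interacts[i, ], interacts[j, ])
  }
}

# dendrogram from the distance matrix, here with average linkage
tree <- stats::hclust(stats::as.dist(dmat), method = "average")
```

In this toy input ncA and ncB share two of their partners and end up close together, while ncC shares none and joins the dendrogram last. Lowering cdd_cut_off removes weaker interactions and can change these distances.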
A LINCcluster can be recalculated with the command clusterlinc(LINCcluster, ...) in order to change further arguments. plotlinc(LINCcluster, ...) will plot a figure that shows the cluster of ncRNAs (dendrogram) and the number of co-expressed genes with respect to different thresholds. getbio(LINCcluster, ...) will derive the biological terms associated with the co-expressed genes. Due to the correlation test, longer calculation times can occur. A faster alternative to this function is singlelinc(). User-defined correlation test functions are supported for singlelinc() but not for clusterlinc().
an object of the class 'LINCcluster' (S4) with 6 slots
results | a
assignment | a
correlation | a
expression | the original expression matrix
history | a storage environment of important methods, objects and parameters used to create the object
linCenvir | a storage environment ensuring the compatibility to other objects of the
signature(linc = "LINCcluster") (see details)
signature(linc = "LINCmatrix") (see details)
plotlinc(LINCcluster, ...), getbio(LINCcluster, ...), ...
Manuel Goepferich
[1] Christine Brun, Francois Chevenet, David Martin, Jerome Wojcik, Alain Guenoche and Bernard Jacq, "Functional classification of proteins for the prediction of cellular function from a protein-protein interaction network" (2003) Genome Biology, 5:R6.
data(BRAIN_EXPR)
class(crbl_matrix)

# call 'clusterlinc' with no further arguments
crbl_cluster <- clusterlinc(crbl_matrix)

# apply the distance method "correlation" instead of "dicedist"
crbl_cluster_cor <- clusterlinc(crbl_matrix, distMethod = "correlation")
# do the same as a recursive call using the 'LINCcluster' object
# crbl_cluster_cor <- clusterlinc(crbl_cluster, distMethod = "correlation")

# select 25 genes with lowest p-values for each query
crbl_cluster_25 <- clusterlinc(crbl_matrix, coExprCut = 25)

# select only those with a p-value < 5e-5
crbl_cluster_5e5 <- clusterlinc(crbl_matrix, pvalCutOff = 5e-5)

# adjust for multiple testing
crbl_cluster_hochberg <- clusterlinc(crbl_matrix, distMethod = "correlation",
                                     padjust = "hochberg", pvalCutOff = 0.05)

# comparing two distance methods
plotlinc(crbl_cluster)
plotlinc(crbl_cluster_cor)