View source: R/ConnectedComponent-class.R
ConnectedComponents | R Documentation |
Connected components are a useful representation when exploring identification data. They represent the relation between proteins (the connected components) and how they form groups of proteins as defined by shared peptides.
Connected components are stored as ConnectedComponents objects that can be generated using the ConnectedComponents() function.
ConnectedComponents(object, ...)
ccMatrix(x)
connectedComponents(x, i, simplify = TRUE)
## S4 method for signature 'ConnectedComponents'
length(x)
## S4 method for signature 'ConnectedComponents'
dims(x)
## S4 method for signature 'ConnectedComponents'
ncols(x)
## S4 method for signature 'ConnectedComponents'
nrows(x)
## S4 method for signature 'ConnectedComponents,integer,ANY,ANY'
x[i, j, ..., drop = FALSE]
## S4 method for signature 'ConnectedComponents,logical,ANY,ANY'
x[i, j, ..., drop = FALSE]
## S4 method for signature 'ConnectedComponents,numeric,ANY,ANY'
x[i, j, ..., drop = FALSE]
prioritiseConnectedComponents(x)
prioritizeConnectedComponents(x)
## S4 method for signature 'ConnectedComponents'
adjacencyMatrix(object)
object |
For the |
... |
Additional arguments passed to
|
x |
An object of class |
i |
|
simplify |
|
j |
ignored |
drop |
ignore |
The ConnectedComponents() constructor returns an instance of class ConnectedComponents. The Creating and manipulating objects section describes the return values of the functions that manipulate ConnectedComponents objects.
adjMatrix
The sparse adjacency matrix (class Matrix) of dimension p peptides by m proteins that was used to generate the object.
ccMatrix
The sparse connected components matrix (class Matrix) of dimension m by m proteins.
adjMatrices
A List containing the adjacency matrix of each connected component.
Instances of the class are created with the ConnectedComponents() constructor from a PSM() object or directly from a sparse adjacency matrix of class Matrix. Note that if using the latter, the rows and columns must be named.
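The naming requirement can be illustrated with a minimal sketch. It assumes the constructor is provided by the PSMatch package (this page does not name the package), and the `has_names` check is a hypothetical helper for illustration, not part of the package. The constructor call is only attempted if PSMatch is installed.

```r
library(Matrix)

## A sparse peptide-by-protein matrix built without row or column names
adj <- sparseMatrix(i = c(1, 2, 3, 3, 4, 4, 5),
                    j = c(1, 2, 3, 4, 3, 4, 5),
                    x = 1)
is.null(rownames(adj))  ## TRUE: no peptide names yet

## Name the peptides (rows) and proteins (columns) before use
dimnames(adj) <- list(paste0("Pep", seq_len(nrow(adj))),
                      paste0("Prot", seq_len(ncol(adj))))

## Hypothetical check (not part of the package): both dimensions named?
has_names <- !is.null(rownames(adj)) && !is.null(colnames(adj))
has_names

## Assumption: ConnectedComponents() lives in PSMatch; skipped otherwise
if (has_names && requireNamespace("PSMatch", quietly = TRUE)) {
    cc <- PSMatch::ConnectedComponents(adj)
    print(length(cc))
}
```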
The sparse peptide-by-protein adjacency matrix is stored in the ConnectedComponents instance and can be accessed with the adjacencyMatrix() function.
The protein-by-protein connected components sparse matrix of object x can be accessed with the ccMatrix(x) function.
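To build intuition for what a protein-by-protein matrix encodes, the following base R sketch links proteins that share at least one peptide and then labels the resulting groups. It is an illustration under stated assumptions, not the package's implementation: the cross-product used to relate proteins and the `find_components()` helper are both hypothetical choices made for this example.

```r
## Toy peptide-by-protein adjacency matrix (1 = peptide maps to protein)
adj <- matrix(0, nrow = 5, ncol = 5,
              dimnames = list(paste0("Pep", 1:5), paste0("Prot", 1:5)))
adj[cbind(c(1, 2, 3, 3, 4, 4, 5),
          c(1, 2, 3, 4, 3, 4, 5))] <- 1

## Assumption: two proteins are linked when they share >= 1 peptide,
## which is what a non-zero entry of the cross-product indicates
shared <- crossprod(adj) > 0

## Hypothetical helper: breadth-first labelling of linked proteins
find_components <- function(shared) {
    labels <- rep(NA_integer_, ncol(shared))
    names(labels) <- colnames(shared)
    k <- 0L
    for (start in seq_along(labels)) {
        if (!is.na(labels[start])) next  ## already assigned
        k <- k + 1L
        queue <- start
        while (length(queue)) {
            cur <- queue[1]
            queue <- queue[-1]
            if (!is.na(labels[cur])) next
            labels[cur] <- k
            ## enqueue unlabelled proteins sharing a peptide with 'cur'
            queue <- c(queue, which(shared[cur, ] & is.na(labels)))
        }
    }
    labels
}

comp <- find_components(shared)
split(names(comp), comp)  ## proteins grouped by component
```

In this toy matrix, Prot3 and Prot4 share peptides Pep3 and Pep4 and therefore end up in the same group, while the other proteins each form a group of their own.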
The number of connected components of object x can be retrieved with length(x).
The size of the connected components of object x, i.e. the number of proteins in each component, can be retrieved with ncols(x). The number of peptides defining the connected components can be retrieved with nrows(x). Both can be accessed with dims(x).
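What these counts mean can be sketched in base R on a toy matrix. The component labels in `comp` are written by hand for illustration, and the column names of `sizes` simply mirror the accessor names; the layout actually returned by dims(x) is not shown on this page and is not assumed here.

```r
## Toy peptide-by-protein adjacency matrix
adj <- matrix(0, nrow = 5, ncol = 5,
              dimnames = list(paste0("Pep", 1:5), paste0("Prot", 1:5)))
adj[cbind(c(1, 2, 3, 3, 4, 4, 5),
          c(1, 2, 3, 4, 3, 4, 5))] <- 1

## Hand-written component labels (assumption made for this example)
comp <- c(Prot1 = 1L, Prot2 = 2L, Prot3 = 3L, Prot4 = 3L, Prot5 = 4L)
prots <- split(names(comp), comp)

## Number of proteins in each component
n_prot <- lengths(prots)

## Number of peptides mapping to at least one protein of each component
n_pep <- vapply(prots,
                function(p) sum(rowSums(adj[, p, drop = FALSE]) > 0),
                integer(1))

sizes <- cbind(nrows = n_pep, ncols = n_prot)
sizes
nrow(sizes)  ## number of components
```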
The connectedComponents(x, i, simplify = TRUE) function returns the peptide-by-protein sparse adjacency matrix (or List of matrices, if length(i) > 1), i.e. the subset of the adjacency matrix defined by the proteins in connected component(s) i. i is the numeric index (between 1 and length(x)) of the connected component(s). If simplify is TRUE (default), then a matrix is returned instead of a List of matrices of length 1. If set to FALSE, a List is always returned, irrespective of its length.
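The effect of simplify can be mimicked with a small base R sketch. `get_components()` is a hypothetical stand-in written for this example: it works on a dense matrix with hand-written component labels and returns a plain list rather than a List, so it only illustrates the return-type rule described above.

```r
## Toy peptide-by-protein adjacency matrix and hand-written labels
adj <- matrix(0, nrow = 5, ncol = 5,
              dimnames = list(paste0("Pep", 1:5), paste0("Prot", 1:5)))
adj[cbind(c(1, 2, 3, 3, 4, 4, 5),
          c(1, 2, 3, 4, 3, 4, 5))] <- 1
comp <- c(Prot1 = 1L, Prot2 = 2L, Prot3 = 3L, Prot4 = 3L, Prot5 = 4L)

## Hypothetical helper mimicking the documented 'simplify' behaviour
get_components <- function(adj, comp, i, simplify = TRUE) {
    res <- lapply(i, function(k) {
        ## keep the proteins of component k ...
        sub <- adj[, names(comp)[comp == k], drop = FALSE]
        ## ... and only the peptides that map to them
        sub[rowSums(sub) > 0, , drop = FALSE]
    })
    if (simplify && length(res) == 1L) res[[1]] else res
}

get_components(adj, comp, 3)                    ## a single matrix
get_components(adj, comp, 3, simplify = FALSE)  ## a list of length 1
get_components(adj, comp, 1:2)                  ## a list of length 2
```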
To help with the exploration of individual connected components, the prioritiseConnectedComponents() function takes an instance of ConnectedComponents and returns a data.frame where the component indices are ordered based on their potential to clean up/flag some peptides and split protein groups into smaller groups or individual proteins, or simply to explore them. The prioritisation is based on a set of metrics computed from the component's adjacency matrix, including its dimensions, row and column sums maxima and minima, its sparsity, and the number of communities and their modularity, which quantifies how well the communities separate (see modularity.igraph()). Note that trivial components, i.e. those composed of a single peptide and protein, are excluded from the prioritised results. This data.frame is ideally suited for a principal component analysis (using for instance prcomp()) for further inspection, or for component visualisation with plotAdjacencyMatrix().
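The kind of per-component metrics described above, and their use in a principal component analysis, can be sketched in base R. Everything here is an assumption made for illustration: the toy matrices, the `component_metrics()` helper, the column names, and the definition of sparsity as the proportion of zero entries. The community and modularity metrics are left out, and the table actually returned by prioritiseConnectedComponents() may differ.

```r
## Toy component adjacency matrices (peptides in rows, proteins in columns)
comps <- list(
    matrix(1, nrow = 1, ncol = 1),                    ## trivial component
    matrix(1, nrow = 1, ncol = 4),
    matrix(c(1, 1, 0, 1), nrow = 2),
    matrix(c(1, 0, 0, 1, 1, 0, 0, 1, 1), nrow = 3),
    matrix(c(1, 1, 1, 0, 0, 1, 0, 1), nrow = 4)
)

## Hypothetical helper: simple metrics from one adjacency matrix
component_metrics <- function(m) {
    c(nrow = nrow(m), ncol = ncol(m),
      rs_max = max(rowSums(m)), rs_min = min(rowSums(m)),
      cs_max = max(colSums(m)), cs_min = min(colSums(m)),
      sparsity = mean(m == 0))  ## assumed: proportion of zero entries
}

metrics <- as.data.frame(do.call(rbind, lapply(comps, component_metrics)))

## Exclude trivial components (a single peptide and a single protein)
metrics <- metrics[!(metrics$nrow == 1 & metrics$ncol == 1), ]

## Principal component analysis of the remaining components
pca <- prcomp(metrics, center = TRUE, scale. = FALSE)
summary(pca)
```

Scaling is switched off in this sketch because, with so few toy components, some metric columns could be constant; on real data, scaling the metrics before the analysis may be preferable.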
## --------------------------------
## From an adjacency matrix
## --------------------------------
library(Matrix)
library(PSMatch) ## provides ConnectedComponents(), PSM(), ...
adj <- sparseMatrix(i = c(1, 2, 3, 3, 4, 4, 5),
                    j = c(1, 2, 3, 4, 3, 4, 5),
                    x = 1,
                    dimnames = list(paste0("Pep", 1:5),
                                    paste0("Prot", 1:5)))
adj
cc <- ConnectedComponents(adj)
cc
length(cc)
ncols(cc)
adjacencyMatrix(cc) ## same as adj above
ccMatrix(cc)
connectedComponents(cc)
connectedComponents(cc, 3) ## a single matrix
connectedComponents(cc, 1:2) ## a List
## --------------------------------
## From a PSM object
## --------------------------------
f <- msdata::ident(full.names = TRUE, pattern = "TMT")
f
psm <- PSM(f) |>
    filterPsmDecoy() |>
    filterPsmRank()
cc <- ConnectedComponents(psm)
cc
length(cc)
table(ncols(cc))
(i <- which(ncols(cc) == 4))
ccomp <- connectedComponents(cc, i)
## A group of 4 proteins that all share peptide RTRYQAEVR
ccomp[[1]]
## Visualise the adjacency matrix - here, we see how the single
## peptide (white node) 'unites' the four proteins (blue nodes)
plotAdjacencyMatrix(ccomp[[1]])
## A group of 4 proteins formed by 7 peptides: THPAERKPRRRKKR is
## found in the two first proteins, KPTARRRKRK was found twice in
## ECA3389, VVPVGLRALVWVQR was found in all 4 proteins, KLKPRRR
## is specific to ECA3399, ...
ccomp[[3]]
## See how VVPVGLRALVWVQR is shared by ECA3406, ECA3415, ECA3389 and
## links the three other components, namely ECA3399, ECA3389 and
## (ECA3415, ECA3406). Filtering that peptide out would split that
## protein group in three.
plotAdjacencyMatrix(ccomp[[3]])
## Colour protein node based on protein names similarity
plotAdjacencyMatrix(ccomp[[3]], 1)
## To select non-trivial components of size > 1
cc2 <- cc[ncols(cc) > 1]
cc2
## Use components features to prioritise their exploration
pri_cc <- prioritiseConnectedComponents(cc)
pri_cc
plotAdjacencyMatrix(connectedComponents(cc, 1082), 1)