ClusterONE strives to discover densely connected and possibly overlapping regions within the Cytoscape network you are working with. The interpretation of these regions depends on the context (i.e. what the network represents) and it is left up to you. For instance, in protein-protein interaction networks derived from high-throughput AP-MS experiments, these dense regions usually correspond to protein complexes or fractions of them. ClusterONE works by "growing" dense regions out of small seeds (typically one or two vertices), driven by a quality function called cohesiveness.
clusterOneR(inputFile = paste0(system.file("extdata", package =
  "ClusterOneR"), "/Weighted_edge_lists.tsv"),
  inputFormat = c("edge_list", "sif"), outputFormat = c("plain", "csv",
  "genepro"), minDensity = "auto", minSize = 3, fluff = NULL,
  haircut = NULL, maxOverlap = 0.8, mergeMethod = c("single",
  "multi"), similarity = "match", noFluff = TRUE, noMerge = FALSE,
  penalty = 2, seedMethod = NULL)
inputFile |
the name of the network edge file. The columns of this file are separated by tabs, and the first row is treated as column names. |
inputFormat |
specifies the format of the input file ("sif" or "edge_list"). Use this option only if ClusterONE failed to detect the format automatically. |
outputFormat |
specifies the format of the output file ("plain", "csv" or "genepro"). |
minDensity |
sets the minimum density of predicted complexes. "auto" means that the density threshold will be set automatically based on whether the graph is weighted or not, and if not, what its clustering coefficient is. Weighted graphs will have a default density threshold of 0.3, unweighted graphs will have a density threshold of 0.5, unless their global clustering coefficient is less than 0.1, in which case the density threshold is set to 0.6. |
minSize |
sets the minimum size of the predicted complexes. |
fluff |
fluffs the clusters as a post-processing step. This is not used in the published algorithm, but it may be useful for your specific problem. The idea is to check whether the external boundary nodes of each cluster connect to more than two thirds of the internal nodes; if so, such external boundary nodes are added to the cluster. Fluffing is applied before the size and density filters. |
haircut |
applies a haircut transformation as a post-processing step on the detected clusters. This is not used in the published algorithm either, but it may be useful for your specific problem. A haircut transformation removes dangling nodes from a cluster: if the total weight of connections from a node to the rest of the cluster is less than x times the average node weight in the cluster (where x is the value of this argument), the node will be removed. The process is repeated iteratively until there are no more nodes to be removed. Haircut is applied before the size and density filters. |
maxOverlap |
specifies the maximum allowed overlap between two clusters, as measured by the match coefficient, which takes the size of the overlap squared, divided by the product of the sizes of the two clusters being considered, as in the paper of Bader and Hogue. |
mergeMethod |
specifies the method to be used to merge highly overlapping complexes. The accepted values, as listed in the usage above, are "single" and "multi". |
similarity |
specifies the similarity function to be used in
the merging step. More precisely, this switch controls which scoring
function is used to decide whether two complexes overlap significantly
or not. The following values are accepted:
|
noFluff |
do not fluff the clusters; this is the default. For more details about fluffing, see the fluff argument above. |
noMerge |
don't merge highly overlapping clusters (in other words, skip the last merging phase). This is useful for debugging purposes only. |
penalty |
sets a penalty value for the inclusion of each node. When you set this option to x, ClusterONE will assume that each node has an extra boundary weight of x when it considers the addition of the node to a cluster. It can be used to model the possibility of uncharted connections for each node, so nodes with only a single weak connection to a cluster will not be added to the cluster as the penalty value will outweigh the benefits of adding the node. The default penalty value is 2. |
seedMethod |
specifies the seed generation method to use.
The following values are accepted:
|
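The numeric rules described above for minDensity = "auto" and for the match coefficient behind maxOverlap can be illustrated with a small sketch in base R. The function names `auto_min_density` and `match_coefficient` are hypothetical helpers written for this illustration; they are not part of ClusterOneR, and the package may compute these values differently internally.

```r
# Hypothetical helper: density threshold described for minDensity = "auto".
# `weighted` says whether the graph has edge weights; the clustering
# coefficient only matters for unweighted graphs.
auto_min_density <- function(weighted, clustering_coefficient = NA_real_) {
  if (weighted) {
    return(0.3)
  }
  if (!is.na(clustering_coefficient) && clustering_coefficient < 0.1) {
    return(0.6)
  }
  0.5
}

# Hypothetical helper: match coefficient of two clusters given as vectors
# of node IDs: overlap size squared, divided by the product of both sizes.
match_coefficient <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  overlap <- length(intersect(a, b))
  overlap^2 / (length(a) * length(b))
}

cluster_a <- c("P1", "P2", "P3", "P4")
cluster_b <- c("P3", "P4", "P5")

score <- match_coefficient(cluster_a, cluster_b)  # 2^2 / (4 * 3) = 1/3
exceeds_limit <- score > 0.8                      # compared with maxOverlap = 0.8
```

With the default maxOverlap of 0.8, the two example clusters share too few members to exceed the overlap limit.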
The following input file formats are recognised:
*Cytoscape SIF files*
When the extension of the input file is .sif, ClusterONE will
automatically try to parse the file according to the SIF format of
Cytoscape. Each line of the file must have the following
format:
id1 type id2
where id1 and id2 are the IDs of the two interacting proteins and
type is the interaction type (which will silently be ignored by
ClusterONE). Each edge will have unit weight. The columns of the
input file may be separated by spaces or tabs; however, make sure
that you do not mix these separator characters.
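As a rough illustration of the SIF layout just described, the following base R sketch turns SIF lines into an edge table with unit weights. `parse_sif_lines` is a hypothetical helper, not a ClusterOneR function, and it assumes that IDs contain no whitespace. It also splits on any run of spaces or tabs, which is more lenient than ClusterONE itself may be.

```r
# Hypothetical helper: parse "id1 type id2" lines into an edge table.
# The interaction type (second field) is dropped; every edge gets weight 1.
parse_sif_lines <- function(lines) {
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]               # skip blank lines
  fields <- strsplit(lines, "[ \t]+")
  if (any(lengths(fields) != 3)) {
    stop("each SIF line must have the form: id1 type id2")
  }
  data.frame(
    id1 = vapply(fields, function(f) f[1], character(1)),
    id2 = vapply(fields, function(f) f[3], character(1)),
    weight = rep(1, length(fields)),
    stringsAsFactors = FALSE
  )
}

sif_lines <- c("A pp B", "B pp C", "C\tpp\tD")
sif_edges <- parse_sif_lines(sif_lines)
```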
*Weighted edge lists*
This is the default file format assumed by ClusterONE unless the
file extension suggests otherwise. Each line of the file has the
following format:
id1 id2 weight
where id1 and id2 are the IDs of the interacting proteins and weight
is the associated confidence value between 0 and 1. If the weight is
omitted, it is considered to be equal to 1. Lines starting with hash
marks (#) or percentage signs (%) are considered as comments and they
are silently ignored.
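The weighted edge list rules above (optional weight defaulting to 1, comment lines starting with # or %) can be sketched in base R as follows. `parse_edge_list_lines` is a hypothetical helper for illustration only. It assumes the lines carry no header row and that IDs contain no whitespace, and it trims leading whitespace before checking for comment markers.

```r
# Hypothetical helper: parse "id1 id2 [weight]" lines into an edge table.
parse_edge_list_lines <- function(lines) {
  lines <- trimws(lines)
  # Drop blank lines and comment lines starting with '#' or '%'
  keep <- nzchar(lines) & !grepl("^[#%]", lines)
  fields <- strsplit(lines[keep], "[ \t]+")
  if (any(lengths(fields) < 2)) {
    stop("each line must contain at least id1 and id2")
  }
  data.frame(
    id1 = vapply(fields, function(f) f[1], character(1)),
    id2 = vapply(fields, function(f) f[2], character(1)),
    # A missing third field means the weight is taken to be 1
    weight = vapply(
      fields,
      function(f) if (length(f) >= 3) as.numeric(f[3]) else 1,
      numeric(1)
    ),
    stringsAsFactors = FALSE
  )
}

edge_lines <- c("# a comment", "A B 0.9", "B C", "% another comment", "A C 0.25")
weighted_edges <- parse_edge_list_lines(edge_lines)
```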
If ClusterONE fails to recognise the input format of your file, feel
free to specify it using the "inputFormat" option.
The following output file formats are available:
*Plain text output (plain)*
A simple and easy-to-parse output format, where each line represents a
cluster. Members of the clusters are separated by Tab characters.
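Because each line of the plain format is one Tab-separated cluster, it can be read back with a few lines of base R. The sketch below works on in-memory lines; `read_plain_clusters` is a hypothetical helper, not a ClusterOneR function, and the example member IDs are made up.

```r
# Hypothetical helper: split plain-format lines into one character
# vector of member IDs per cluster.
read_plain_clusters <- function(lines) {
  lines <- lines[nzchar(lines)]               # skip blank lines
  strsplit(lines, "\t", fixed = TRUE)
}

plain_lines <- c("A\tB\tC", "C\tD\tE\tF")
clusters <- read_plain_clusters(plain_lines)

cluster_sizes <- lengths(clusters)                      # members per cluster
shared <- intersect(clusters[[1]], clusters[[2]])       # overlapping members
```

For a file on disk, the same helper could be applied to the result of `readLines()` on the output file.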
*CSV output (csv)*
This format is suitable if you need more details about each cluster
and/or you want to import the clusters to Microsoft Excel or OpenOffice.
Each line corresponds to a cluster and contains the size, density, total
internal and boundary weight, the value of the quality function, a P-value
and the list of members for each cluster. Columns are separated by commas,
and each individual column may optionally be quoted within quotation marks
if necessary.
*GenePro output (genepro)*
Use this format if you want to visualize the clusters later on using the [GenePro](http://wodaklab.org/genepro) plugin of Cytoscape.
A matrix of complexes, where each row lists the proteins in a single complex.
{
## Not run:
# Run on the example network edge file shipped with the package
file = paste0(system.file('extdata', package = 'ClusterOneR'),
              "/Weighted_edge_lists.tsv")
# Preview the first rows of the tab-separated edge file
head(read.delim(file))
y = clusterOneR(file)
View(y)
# Run on your own file "/my/path/myEdgeFile.tsv", which is a
# "weighted edge lists" file type.
file = "/my/path/myEdgeFile.tsv"
y = clusterOneR(file, inputFormat = "edge_list")
View(y)
# Run on a SIF file (Simple Interaction Format)
file = "/my/path/myEdgeFile.sif"
y = clusterOneR(file, inputFormat = "sif")
View(y)
## End(Not run)
}