View source: R/2.clusterData.R
clusterData | R Documentation |
Cluster Data Based on Different Methods
clusterData(
obj = NULL,
scaleData = TRUE,
cluster.method = c("mfuzz", "TCseq", "kmeans", "wgcna"),
TCseq_params_list = list(),
object = NULL,
min.std = 0,
cluster.num = NULL,
subcluster = NULL,
seed = 5201314,
...
)
obj |
An input object of one of two types: a cell_data_set object for trajectory analysis, or a matrix or data.frame containing expression data. |
scaleData |
Logical. Whether to scale the data (e.g., z-score normalization). |
cluster.method |
Character. Clustering method to use.
Options are one of "mfuzz", "TCseq", "kmeans", or "wgcna". |
TCseq_params_list |
A list of additional parameters passed to the |
object |
A pre-calculated object required when using the "wgcna" method. |
min.std |
Numeric. Minimum standard deviation for filtering expression data. |
cluster.num |
Integer. The number of clusters to identify. |
subcluster |
A numeric vector of specific cluster IDs to include in the results.
If |
seed |
An integer seed for reproducibility in clustering operations. |
... |
Additional arguments passed to internal functions such as |
Depending on the selected cluster.method, different clustering algorithms are used:
"mfuzz": Applies the Mfuzz soft clustering method, suitable for identifying overlapping clusters.
"TCseq": Uses TCseq clustering for time-series expression data, with support for additional parameters.
"kmeans": Employs standard k-means clustering via base R's stats::kmeans.
"wgcna": Leverages pre-calculated WGCNA (Weighted Gene Co-expression Network Analysis) networks.
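As an illustration of the "kmeans" option and the role of the seed argument, here is a minimal base R sketch. It is not the package's internal code: the toy matrix and the helper run_kmeans are hypothetical, and only stats::kmeans and set.seed are real functions.

```r
# Toy expression matrix: 30 genes (rows) x 6 samples (columns); made-up data.
set.seed(1)
expr <- matrix(rnorm(30 * 6), nrow = 30,
               dimnames = list(paste0("gene", 1:30), paste0("s", 1:6)))

# Hypothetical helper: fix the seed, then cluster the rows with stats::kmeans.
# Setting the seed first makes the random initial centers repeatable.
run_kmeans <- function(mat, k, seed = 5201314) {
  set.seed(seed)
  stats::kmeans(mat, centers = k)
}

km1 <- run_kmeans(expr, k = 3)
km2 <- run_kmeans(expr, k = 3)

# With the same seed, both runs assign every gene to the same cluster.
identical(km1$cluster, km2$cluster)

# Number of genes per cluster.
table(km1$cluster)
```

Without a fixed seed, repeated k-means runs can label or split clusters differently, which is presumably why clusterData exposes seed as an argument.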
The function is designed to be flexible, allowing preprocessing (e.g., filtering by min.std), data scaling (scaleData = TRUE), and generation of results compatible with data visualization pipelines.
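The two preprocessing steps mentioned above can be sketched in base R as follows. This is an assumption-laden illustration rather than the function's actual implementation: the toy data are invented, and keeping rows with a standard deviation strictly greater than the threshold is an assumed rule.

```r
# Toy expression matrix: 5 genes x 4 samples (made-up data).
set.seed(42)
expr <- matrix(rnorm(5 * 4, mean = 10), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("s", 1:4)))
expr["gene3", ] <- 7  # a constant gene, so its standard deviation is 0

# Step 1: filter genes by standard deviation across samples.
# Assumption: rows with sd <= min_std are dropped.
min_std <- 0
row_sd <- apply(expr, 1, stats::sd)
kept <- expr[row_sd > min_std, , drop = FALSE]

# Step 2: z-score each gene across samples.
# scale() standardizes columns, so transpose before and after.
scaled <- t(scale(t(kept)))

rownames(scaled)            # gene3 has been removed
round(rowMeans(scaled), 10) # each remaining gene is centered at 0
```

Filtering before scaling matters here: a constant gene has zero standard deviation, so z-scoring it would divide by zero and produce NaN values.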
A list containing the following clustering results:
wide.res: A wide-format data frame with clusters and normalized expression levels.
long.res: A long-format data frame for visualizations, containing cluster information, normalized values, cluster names, and memberships.
cluster.list: A list where each element contains genes belonging to a specific cluster.
type: The clustering method used ("mfuzz", "TCseq", "kmeans", or "wgcna").
geneMode: Currently set to "none" (reserved for future use).
geneType: Currently set to "none" (reserved for future use).
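To make the difference between the wide, long, and list forms concrete, the sketch below builds all three from a toy matrix in base R. The column names (gene, cluster, sample, value) and the data are hypothetical and are not claimed to match the columns that clusterData actually returns.

```r
# Toy normalized matrix: 3 genes x 2 samples (made-up values).
mat <- matrix(c(-1, 0, 1, 1, 0, -1), nrow = 3,
              dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))

# Hypothetical cluster assignment for each gene.
cluster_id <- c(geneA = 1, geneB = 2, geneC = 1)

# Wide format: one row per gene, one column per sample.
wide <- data.frame(gene = rownames(mat),
                   cluster = unname(cluster_id[rownames(mat)]),
                   mat,
                   row.names = NULL)

# Long format: one row per gene-sample pair, convenient for plotting.
# as.vector() reads the matrix column by column, matching the rep() calls.
long <- data.frame(gene = rep(rownames(mat), times = ncol(mat)),
                   cluster = rep(unname(cluster_id[rownames(mat)]), times = ncol(mat)),
                   sample = rep(colnames(mat), each = nrow(mat)),
                   value = as.vector(mat))

# List format: gene names grouped by cluster ID.
cluster_list <- split(names(cluster_id), cluster_id)

wide
long
cluster_list
```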
If the WGCNA method is selected, the object parameter must contain a pre-calculated WGCNA network object.
This is typically obtained using the WGCNA package functions.
Use the subcluster parameter to focus on specific clusters. Cluster IDs not included in the subcluster vector will be excluded from the final results.
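The effect of subcluster can be pictured as a simple membership filter. The base R sketch below uses a hypothetical results table and is only meant to show the idea; it does not reproduce how clusterData applies the filter internally.

```r
# Hypothetical clustering result: one row per gene with its cluster ID.
res <- data.frame(gene = paste0("gene", 1:6),
                  cluster = c(1, 2, 3, 1, 2, 4))

# Cluster IDs to keep, playing the role of the subcluster argument.
subcluster <- c(1, 3)

# Keep only rows whose cluster ID appears in subcluster.
kept <- res[res$cluster %in% subcluster, , drop = FALSE]

kept$gene             # genes from clusters 1 and 3 only
unique(kept$cluster)  # clusters 2 and 4 are excluded
```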
JunZhang
This function performs clustering on input data using one of four methods: mfuzz, TCseq, kmeans, or wgcna. The clustering results include metadata, normalized data, and cluster memberships.
data("exps")
# k-means clustering into 8 clusters
ck <- clusterData(obj = exps,
                  cluster.method = "kmeans",
                  cluster.num = 8)