generate_clusters: Create the clusters.
In JasonPBennett/GECO: A metric to determine the biological quality of gene clusters


View source: R/generate_clusters.R

This function takes the normalized data (TPM/FPKM and feature scaled) and uses the k-means function to generate clusters over a series of k-values, repeated across several iterations, to identify a potentially optimal number of clusters for the dataset. For reproducible clusters, it is highly recommended to set a seed value with the set.seed function before generating the clusters.
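The effect of seeding can be sketched with base R's kmeans alone. This is a minimal illustration rather than GECO's own code: the toy matrix `mat`, the seed values, and the choice of 5 centers are assumptions made for the example.

```r
# Toy matrix standing in for normalized, feature-scaled expression data
# (200 "genes" x 10 "samples"); the values are random and purely illustrative.
set.seed(1)
mat <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10)

# Setting the same seed before each call makes kmeans pick the same
# random starting centers, so both runs produce the same partition.
set.seed(42)
fit_a <- kmeans(mat, centers = 5, iter.max = 50)
set.seed(42)
fit_b <- kmeans(mat, centers = 5, iter.max = 50)

# Compare the cluster assignments of the two runs
same_partition <- identical(fit_a$cluster, fit_b$cluster)
print(same_partition)
```

Without the second set.seed call, the two fits would start from different random centers and could assign genes to clusters differently.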

generate_clusters(df, kmin, kmax, ktot, num_iter, km_algo)

`df`	A dataframe containing the normalized reads
`kmin`	An integer indicating the minimum number of clusters to generate. By default, this is set to 10.
`kmax`	An integer indicating the maximum number of clusters to generate. By default, this is set to 150.
`ktot`	An integer indicating how many unique k-values to generate. By default, this is set to 15. This produces 15 values ranging from kmin up to kmax. Increasing this number will significantly increase run time.
`num_iter`	An integer indicating the number of clustering iterations to generate. By default, this is set to 10. This will perform the same k-means clustering multiple times to account for the stochastic nature of the k-means algorithm, resulting in a mean quality value in the final step that is more reliable than a single iteration would be. Lowering this value will negatively affect the GECO quality assessment; raising it will increase run time.
`km_algo`	A string indicating which k-means algorithm to use. By default, this is set to 'Hartigan-Wong'.
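To make the relationship between kmin, kmax, and ktot concrete, the following sketch builds a grid of k-values from the documented defaults. The even spacing via seq is an assumption; this page does not state how GECO spaces the values between kmin and kmax.

```r
# Documented defaults for generate_clusters
kmin <- 10
kmax <- 150
ktot <- 15

# Assumed spacing: ktot evenly spaced integers from kmin to kmax.
# unique() guards against duplicates after rounding on narrow ranges.
k_values <- unique(round(seq(kmin, kmax, length.out = ktot)))
print(k_values)

# km_algo is passed as a string; base R's kmeans accepts these names
# for its `algorithm` argument.
valid_algos <- c("Hartigan-Wong", "Lloyd", "Forgy", "MacQueen")
km_algo <- match.arg("Hartigan-Wong", valid_algos)
```

Under this assumption, the total number of k-means fits is ktot times num_iter (150 with the defaults), which is why raising either value increases run time.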

A list containing each iteration of the clustering performed. Each iteration holds the kmeans objects for use in the second step, e.g. score_clusters(clusters).
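The shape of such a return value can be sketched with base R. This is not GECO's implementation: the object name `clusters_sketch`, the element names such as `k3`, and the small grid of k-values are hypothetical and chosen only to keep the example fast.

```r
# Toy data standing in for normalized expression values (60 genes x 5 samples)
set.seed(123)
mat <- matrix(rnorm(60 * 5), nrow = 60, ncol = 5)

k_values <- c(2, 3, 4)  # small illustrative grid, not the package defaults
num_iter <- 2           # number of repeated clustering iterations

# Outer list: one element per iteration.
# Inner list: one kmeans object per k-value.
clusters_sketch <- lapply(seq_len(num_iter), function(i) {
  fits <- lapply(k_values, function(k) {
    kmeans(mat, centers = k, iter.max = 50, algorithm = "Hartigan-Wong")
  })
  names(fits) <- paste0("k", k_values)  # hypothetical naming scheme
  fits
})

# Inspect one fit: the k = 3 clustering from the first iteration
fit <- clusters_sketch[[1]]$k3
print(class(fit))
print(fit$size)
```

Each inner element is a standard kmeans object, so fields such as `cluster`, `centers`, and `tot.withinss` are available to whatever scoring step consumes the list.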

# Create a pseudo RNA-seq counts table
df <- data.frame(replicate(10, sample(-1:10, 200, replace = TRUE)))
rownames(df) <- paste0("Gene.", 1:200)
# Generate clusters
clusters <- generate_clusters(df)

JasonPBennett/GECO documentation built on Aug. 30, 2021, 4:30 p.m.
