score_clusters: Score the k-means clusters.

View source: R/score_clusters.R

Description

This function takes the k-means clusters generated by generate_clusters and computes a GECO score for each gene within each cluster. The resulting scores table is intended as input to assess_quality, the final function in the GECO process.

Usage

score_clusters(km_clusters, GT_dir)

Arguments

km_clusters

The k-means clusters generated by generate_clusters

GT_dir

The directory holding the ground truth .csv files
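
As a rough illustration of the two inputs, the sketch below builds a mock clusters list in the nested shape used in the Examples section (iteration name, then number of centers, then a kmeans result) and writes a small ground truth file. It does not call score_clusters and does not reproduce the GECO scoring itself; it only inspects which clusters the ground truth genes land in. The variable names (km, membership, genes_by_cluster, gt_dir) and the assumed file layout (set name first, then one gene per line) are illustrative assumptions, not part of the package.

```r
# Illustrative sketch only; the object names below are not part of GECO.
set.seed(42)  # make the random example reproducible

# Mock expression data: 200 genes (rows) by 10 samples (columns).
df <- data.frame(replicate(10, sample(-1:10, 200, replace = TRUE)))
rownames(df) <- paste0("Gene.", 1:200)

# Nested list: iteration name -> number of centers -> kmeans result.
clusters <- list()
clusters$`Iteration 1`$`10` <- kmeans(df, centers = 10)

# Pull out one k-means result and its gene-to-cluster assignment.
km <- clusters[["Iteration 1"]][["10"]]
membership <- km$cluster                      # named integer vector
genes_by_cluster <- split(names(membership), membership)

# Write a ground truth file to a temporary directory (assumed layout:
# set name on the first line, one gene per following line).
gt_dir <- file.path(tempdir(), "GECOdata")
dir.create(gt_dir, showWarnings = FALSE)
gt_set <- c("Ground Truth Set X", "Gene.1", "Gene.10", "Gene.50")
write.table(gt_set, file = file.path(gt_dir, "gs_genes.csv"),
            row.names = FALSE, col.names = FALSE)

# Read it back and see which clusters the ground truth genes fall into.
gt_lines <- read.table(file.path(gt_dir, "gs_genes.csv"),
                       stringsAsFactors = FALSE)[[1]]
gt_genes <- gt_lines[-1]                      # drop the set name
gt_membership <- membership[gt_genes]
print(table(gt_membership))
```

Using tempdir() here keeps the sketch from writing into the working directory; a real analysis would point GT_dir at the directory that actually holds the ground truth .csv files.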

Value

The scores table containing the GECO scored clusters

Examples

# This replicates the structure of the clusters returned by 'generate_clusters'
clusters <- list()
df <- data.frame(replicate(10, sample(-1:10, 200, replace = TRUE)))
rownames(df) <- paste0("Gene.", 1:200)
clusters$`Iteration 1`$`10` <- kmeans(df, centers = 10)
# This replicates a ground truth set
gt_set <- c("Ground Truth Set X", "Gene.1", "Gene.10", "Gene.50")
# Store the ground truth set in its own directory
dir.create("./GECOdata", showWarnings = FALSE)
# The path must match the directory created above exactly (case matters on most file systems)
write.table(gt_set, file = "./GECOdata/gs_genes.csv", row.names = FALSE, col.names = FALSE)
# Score the clusters using the ground truth sets found in ./GECOdata
scores <- score_clusters(clusters, "./GECOdata")

JasonPBennett/GECO documentation built on Aug. 30, 2021, 4:30 p.m.