readRDS("benchmark_res.rds") -> res
knitr::opts_chunk$set(
  collapse = TRUE,
  eval = FALSE,
  comment = "#>"
)
\CRANpkg{akc} classifies keywords automatically through community detection, so it can use different algorithms for clustering. This vignette provides a benchmark that compares several of these algorithms on networks of different sizes.
First, we'll load the needed packages.
library(akc)
library(dplyr)
library(stringr)  # provides str_extract(), used when listing the algorithms
Then, we prepare the data. The built-in data table bibli_data_table
is used here.
bibli_data_table %>% keyword_clean() %>% keyword_merge() -> clean_data
Next, we define the combinations of network size and community detection algorithm to be tested:
100:300 -> topn_sample

ls("package:akc") %>%
  str_extract("^group.+") %>%
  na.omit() %>%
  setdiff(c("group_biconnected_component",
            "group_components",
            "group_optimal")) -> com_detect_fun_list
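To get a feel for the size of this design, the grid of combinations can be enumerated with base R alone. The sketch below is illustrative only: the three algorithm names are placeholders standing in for whatever com_detect_fun_list actually contains, and expand.grid is not part of the benchmark itself.

```r
# Placeholder algorithm names (assumed for illustration; the real list is
# built from the functions exported by akc).
algos <- c("group_louvain", "group_fast_greedy", "group_walktrap")

# Same range of network sizes as in the benchmark.
topn_sample <- 100:300

# One row per (algorithm, network size) pair.
design <- expand.grid(com_detect_fun = algos, topn = topn_sample,
                      stringsAsFactors = FALSE)

n_sizes <- length(topn_sample)  # number of distinct network sizes
n_runs  <- nrow(design)         # number of keyword_group() calls implied
```

With three algorithms this already implies several hundred runs, which is why the benchmark results are cached in an .rds file instead of being recomputed when the vignette is built.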
Finally, we'll implement the computation and record the results.
all = tibble()

for (i in com_detect_fun_list) {
  for (j in topn_sample) {
    # Time the grouping step for algorithm i on the top-j keyword network
    system.time({
      clean_data %>%
        keyword_group(top = j, com_detect_fun = get(i)) %>%
        as_tibble -> grouped_network_table
    }) %>% na.omit -> time_info

    # Network size and group-size statistics
    grouped_network_table %>% nrow -> node_no
    grouped_network_table %>% distinct(group) %>% nrow -> group_no
    grouped_network_table %>% count(group) %>%
      summarise(mean(n)) %>% .[[1]] -> group_avg_node_no
    grouped_network_table %>% count(group) %>%
      summarise(sd(n)) %>% .[[1]] -> group_sd_node_no

    # Append one row of results for this combination
    c(com_detect_fun = i, topn = j, node_no = node_no, group_no = group_no,
      avg = group_avg_node_no, sd = group_sd_node_no, time_info[1:3]) %>%
      bind_rows(all, .) -> all
  }
}

# Convert to numeric, keep one row per algorithm and network size,
# and keep only the elapsed time
res = all %>%
  mutate_at(2:9, function(x) as.numeric(x) %>% round(2)) %>%
  distinct(com_detect_fun, node_no, .keep_all = TRUE) %>%
  select(-topn, -contains("self")) %>%
  setNames(c("com_detect_fun", "No. of total nodes", "No. of total groups",
             "Average node number in each group",
             "Standard deviation of node number",
             "Computer running time for keyword_group function"))
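The quantities recorded for each run can be checked by hand on a small example. The following is a minimal base R sketch, assuming a toy table with invented values in place of the real output of keyword_group; it uses table() instead of the dplyr verbs above, and it also shows the names that system.time() attaches to its result.

```r
# Toy stand-in for the grouped network table: one row per node, with the
# group it was assigned to (values invented for illustration).
toy <- data.frame(name = letters[1:6], group = c(1, 1, 1, 2, 2, 3))

group_sizes <- as.vector(table(toy$group))  # nodes per group

node_no  <- nrow(toy)             # total number of nodes
group_no <- length(group_sizes)   # number of groups
avg_size <- mean(group_sizes)     # average number of nodes per group
sd_size  <- sd(group_sizes)       # standard deviation of group size

# system.time() returns a named numeric vector; its first three entries
# are the user, system and elapsed times.
timing <- system.time(for (k in 1:1000) sqrt(k))
timing_names <- names(timing)[1:3]
```

Because the first two timing entries are named user.self and sys.self, dropping the columns whose names contain "self" leaves only the elapsed time in the final table.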
The results are displayed in the following table.
knitr::kable(res)
The session information is shown below:
sessionInfo()