calculate_coclustering | R Documentation |
Calculate coclustering data.
calculate_coclustering(subsample_solutions, sol_df, verbose = FALSE)
subsample_solutions |
A list containing cluster solutions from
distinct subsamples of the data. In the example below, this object is
generated by batch_snf_subsamples(). |
sol_df |
A solutions data frame. In the example below, this object is generated
by batch_snf(). |
verbose |
If TRUE, output time remaining estimates to console. |
A list containing the following components:
cocluster_dfs: A list of data frames, one per cluster solution. Each data frame reports, for every pair of observations in the original cluster solution, the number of times the pair occurred in the same subsample, the number of times the pair clustered together in a subsample, and the corresponding fraction of shared subsamples in which the pair clustered together.
cocluster_ss_mats: The number of times every pair of observations occurred in the same subsample, formatted as a pairwise matrix.
cocluster_sc_mats: The number of times every pair of observations occurred in the same cluster, formatted as a pairwise matrix.
cocluster_cf_mats: The fraction of times every pair of observations occurred in the same cluster, formatted as a pairwise matrix.
cocluster_summary: Among pairs of observations that clustered together in the original full cluster solution, the fraction of those pairs that remained clustered together throughout the subsample solutions. This information is formatted as a data frame with one row per cluster solution.
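The three pairwise quantities described above can be illustrated with a small base R sketch. This is not the package's implementation: the toy `labels` matrix and the variable names `same_subsample`, `same_cluster`, and `cocluster_fraction` are hypothetical and chosen only to mirror the ss, sc, and cf components.

```r
# Hypothetical toy input: cluster labels from three subsamples of five
# observations. NA marks an observation left out of that subsample.
labels <- rbind(
    s1 = c(a = 1, b = 1, c = 2, d = 2, e = NA),
    s2 = c(a = 1, b = 2, c = 2, d = NA, e = 1),
    s3 = c(a = NA, b = 1, c = 1, d = 2, e = 2)
)

ids <- colnames(labels)
n <- length(ids)

# Pairwise counters, analogous in spirit to the "ss" and "sc" matrices
same_subsample <- matrix(0, n, n, dimnames = list(ids, ids))
same_cluster <- same_subsample

for (s in seq_len(nrow(labels))) {
    x <- labels[s, ]
    present <- !is.na(x)
    # TRUE where both observations of a pair were drawn in this subsample
    both_present <- outer(present, present, "&")
    # TRUE where both observations received the same cluster label;
    # comparisons involving an absent observation are NA, so set to FALSE
    together <- outer(x, x, "==")
    together[is.na(together)] <- FALSE
    same_subsample <- same_subsample + both_present
    same_cluster <- same_cluster + together
}

# Fraction of shared subsamples in which each pair clustered together.
# A pair that never shared a subsample would yield NaN (0 / 0).
cocluster_fraction <- same_cluster / same_subsample
```

In this toy data, observations a and b share two subsamples and cluster together in one of them, so their coclustering fraction is 0.5.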
# my_dl <- data_list(
# list(subc_v, "subcortical_volume", "neuroimaging", "continuous"),
# list(income, "household_income", "demographics", "continuous"),
# list(pubertal, "pubertal_status", "demographics", "continuous"),
# uid = "unique_id"
# )
#
# sc <- snf_config(my_dl, n_solutions = 5, max_k = 40)
#
# sol_df <- batch_snf(my_dl, sc)
#
# my_dl_subsamples <- subsample_dl(
# my_dl,
# n_subsamples = 20,
# subsample_fraction = 0.85
# )
#
# batch_subsample_results <- batch_snf_subsamples(
# my_dl_subsamples,
# sc,
# verbose = TRUE
# )
#
# coclustering_results <- calculate_coclustering(
# batch_subsample_results,
# sol_df,
# verbose = TRUE
# )