The package implements the 3C-strategy for the refinement of disease diagnoses in medical research.
The first step in the analysis pipeline is the manual Categorization of the feature set into: (i) current clinical measures [CM], (ii) potential biomarkers [PB] and (iii) the assigned diagnosis [DX] (one variable).
The second step (Clustering) begins by selecting a subset of the clinical measures with a supervised algorithm (Random Forest, LASSO or another), using the assigned diagnosis as the target variable. The selected measures are then used to determine the number of clusters. Finally, a clustering algorithm is applied (K-means, K-medoids or hierarchical clustering).
In the third step, a new model is trained using the potential biomarkers as features, to Classify the data according to the cluster labels created in step 2.
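The Clustering step can be illustrated with a minimal base R sketch. This is not the CCC implementation: it assumes the built-in `iris` data as a stand-in for clinical measures and diagnosis, replaces Random Forest or LASSO with a simple one-way ANOVA F-statistic ranking, and uses the within-cluster sum of squares from `kmeans` as a rough guide to the number of clusters. All variable names are hypothetical.

```r
# Illustrative sketch only: base R stand-ins, not the CCC implementation.
set.seed(1)
cm <- iris[, 1:4]   # stand-in for the clinical measures [CM]
dx <- iris$Species  # stand-in for the assigned diagnosis [DX]

# Supervised selection: rank each measure by its one-way ANOVA F statistic
# against the diagnosis (a simple substitute for Random Forest or LASSO).
f_stat <- sapply(cm, function(v) summary(aov(v ~ dx))[[1]][["F value"]][1])
selected <- names(sort(f_stat, decreasing = TRUE))[1:2]

# Number of clusters: total within-cluster sum of squares for k = 1..6,
# to be inspected for an "elbow".
xs <- scale(cm[, selected])
wss <- sapply(1:6, function(k) kmeans(xs, centers = k, nstart = 10)$tot.withinss)

# Clustering on the selected measures (k = 3 is an assumed choice here).
clusters <- kmeans(xs, centers = 3, nstart = 10)$cluster
table(dx, clusters)
```

The cross-table at the end shows how the data-driven clusters relate to the assigned diagnosis, which is the comparison the Clustering step is meant to enable.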
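The Classification step can be sketched in base R as follows. This is an assumption-laden illustration rather than the package's method: two `iris` columns stand in for the potential biomarkers, a `kmeans` run on the other two columns stands in for the step 2 cluster labels, and a hand-written nearest-centroid classifier replaces the Random Forest used in the package. The helper `predict_nc` is hypothetical.

```r
# Illustrative sketch only: base R stand-ins, not the CCC implementation.
set.seed(1)
pb <- scale(iris[, 1:2])  # stand-in for the potential biomarkers [PB]
new_y <- kmeans(scale(iris[, 3:4]), centers = 3, nstart = 10)$cluster  # stand-in for step 2 labels

# Hold out 30% of the rows to check how well the biomarkers recover the clusters.
test_idx <- sample(nrow(pb), size = round(0.3 * nrow(pb)))
train_x <- pb[-test_idx, , drop = FALSE]
train_y <- new_y[-test_idx]

# Nearest-centroid classifier: one mean vector per cluster label
# (rows = cluster labels, columns = biomarkers).
centroids <- apply(train_x, 2, function(col) tapply(col, train_y, mean))

predict_nc <- function(x) {
  # Squared Euclidean distance from every row of x to every centroid.
  d <- apply(centroids, 1, function(ctr) rowSums(sweep(x, 2, ctr)^2))
  as.integer(rownames(centroids)[max.col(-d, ties.method = "first")])
}

pred <- predict_nc(pb[test_idx, , drop = FALSE])
table(cluster = new_y[test_idx], predicted = pred)
```

A confusion table concentrated on one cell per row would suggest that the biomarkers carry enough information to reproduce the cluster labels.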
You can install CCC from GitHub with:
# install.packages("devtools")
devtools::install_github("HBPMedical/CCC")
This is a basic example of the analysis pipeline:
library(CCC)

# Load the sample data and its variable categories
data(c3_sample1)
data(c3_sample1_categories)
head(c3_sample1)
table(c3_sample1_categories[,"varCategory"])

# Step 2 (Clustering): extract x and y, then cluster
xy <- get_xy_from_DATA_C2(c3_sample1, c3_sample1_categories)
x <- xy$x
y <- xy$y
C2_results <- C2(x, y, feature_selection_method="RF", num_clusters_method="Manhattan", clustering_method="Manhattan", plot.num.clus=TRUE, plot.clustering=TRUE, k=6)
C2_results

# Step 3 (Classification): use the potential biomarkers to classify the new labels
PBx <- get_PBx_from_DATA_C3(c3_sample1, c3_sample1_categories)
new_y <- C2_results[[3]]
C3_results <- C3(PBx = PBx, newy = new_y, feature_selection_method = "RF", classification_method="RF")

# Cross-tabulate the new labels against the second element of the C3 results
table(new_y, C3_results[[2]])