View source: R/treesourceClusterMethod.R
treesource | R Documentation |
Gives insight into a clusterforest solution when there are known sources of variation underlying the forest. For a categorical covariate, it visualizes, for each value of the covariate, the number of trees that belong to each cluster. For a continuous covariate, it returns the mean and standard deviation of the covariate in each cluster.
treesource(clusterforest, solution)
clusterforest |
The clusterforest object |
solution |
The solution |
multiplot |
In case of a categorical covariate, a bar plot for each value of the covariate, showing the number of trees that belong to each cluster |
heatmap |
In case of a categorical covariate, a heatmap showing, for each value of the covariate, the number of trees that belong to each cluster |
clustermeans |
In case of a continuous covariate, the mean of the covariate in each cluster |
clusterstds |
In case of a continuous covariate, the standard deviation of the covariate in each cluster |
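The summaries described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the vectors `cluster`, `treecov` and `treecov_num` are hypothetical stand-ins for the cluster assignment and the covariate value of each tree.

```r
# Hypothetical inputs (not produced by the package): one entry per tree
cluster <- c(1, 1, 1, 2, 2, 1, 2, 2, 2, 2)       # assumed cluster assignment
treecov <- rep(c("Amphet", "Coke"), each = 5)    # assumed categorical covariate

# Categorical covariate: number of trees per covariate value in each cluster
counts <- table(covariate = treecov, cluster = cluster)
print(counts)

# Continuous covariate (hypothetical values): mean and sd within each cluster
treecov_num   <- c(0.2, 0.4, 0.3, 0.9, 1.1, 0.1, 0.8, 1.0, 1.2, 0.7)
cluster_means <- tapply(treecov_num, cluster, mean)
cluster_sds   <- tapply(treecov_num, cluster, sd)
print(cluster_means)
print(cluster_sds)
```

A contingency table such as `counts` is the kind of information a bar plot or heatmap per covariate value would display; the `tapply` results correspond to per-cluster means and standard deviations.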
require(rpart)
library(clusterforest)  # provides clusterforest(), treesource() and the drugs data

data_Amphet <- drugs[, c("Amphet", "Age", "Gender", "Edu", "Neuro", "Extr", "Open",
                         "Agree", "Consc", "Impul", "Sensat")]
data_cocaine <- drugs[, c("Coke", "Age", "Gender", "Edu", "Neuro", "Extr", "Open",
                          "Agree", "Consc", "Impul", "Sensat")]

# Function to draw a bootstrap sample from a dataset
DrawBoots <- function(dataset, i) {
  set.seed(2394 + i)
  Boot <- dataset[sample(1:nrow(dataset), size = nrow(dataset), replace = TRUE), ]
  return(Boot)
}

# Function to grow a tree using rpart on a dataset
GrowTree <- function(x, y, BootsSample, minsplit = 40, minbucket = 20, maxdepth = 3) {
  controlrpart <- rpart.control(minsplit = minsplit, minbucket = minbucket,
                                maxdepth = maxdepth, maxsurrogate = 0, maxcompete = 0)
  # Build the model formula "y ~ x1 + x2 + ..." from the variable names
  tree <- rpart(as.formula(paste(y, "~", paste(x, collapse = "+"))),
                data = BootsSample, control = controlrpart)
  return(tree)
}

# Draw bootstrap samples and grow trees
BootsA <- lapply(1:5, function(k) DrawBoots(data_Amphet, k))
BootsC <- lapply(1:5, function(k) DrawBoots(data_cocaine, k))
Boots <- c(BootsA, BootsC)

TreesA <- lapply(1:5, function(i) GrowTree(
  x = c("Age", "Gender", "Edu", "Neuro", "Extr", "Open", "Agree", "Consc", "Impul", "Sensat"),
  y = "Amphet", BootsA[[i]]))
TreesC <- lapply(1:5, function(i) GrowTree(
  x = c("Age", "Gender", "Edu", "Neuro", "Extr", "Open", "Agree", "Consc", "Impul", "Sensat"),
  y = "Coke", BootsC[[i]]))
Trees <- c(TreesA, TreesC)

# Cluster the trees
ClusterForest <- clusterforest(observeddata = drugs, treedata = Boots, trees = Trees, m = 1,
                               fromclus = 2, toclus = 2,
                               treecov = rep(c("Amphet", "Coke"), each = 5), sameobs = FALSE)

# Link cluster result to known source of variation
treesource(ClusterForest, 2)