View source: R/classification.R
Runs CVPredictionsRandomForest() on each subset of the data. For each subset, a random forest model is fitted and evaluated with the specified number of cross-validation (CV) folds. The result fills a single column of the final matrix, which can be passed to pheatmap. The input data should already have its rows shuffled. To use leave-one-out cross-validation (LOOCV) on each subset, set number.of.folds to -1.
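The subsetting and fold logic described above can be sketched in base R. This is only an illustration under stated assumptions: AssignFolds() is a hypothetical helper and toy.data is made-up data; neither is part of the package, and the package's internal fold assignment may differ.

```r
# Hypothetical helper (not part of the package): assign each row of an
# already-shuffled subset to a CV fold. number.of.folds = -1 is treated
# as LOOCV, i.e. one fold per row.
AssignFolds <- function(n.rows, number.of.folds) {
  if (number.of.folds == -1) {
    number.of.folds <- n.rows
  }
  # Rows are assumed to be shuffled already, so cycling through fold ids suffices.
  rep(seq_len(number.of.folds), length.out = n.rows)
}

# Made-up data: one factor column used for subsetting, one numeric predictor.
toy.data <- data.frame(
  group = factor(c("a", "a", "a", "a", "b", "b", "b")),
  x = c(1.2, 0.7, 3.1, 2.4, 0.3, 1.9, 2.8)
)

# One subset per factor level; each subset would yield one heatmap column.
subsets <- split(toy.data, toy.data$group)

folds.two.fold <- lapply(subsets, function(d) AssignFolds(nrow(d), 2))
folds.loocv <- lapply(subsets, function(d) AssignFolds(nrow(d), -1))
```

With LOOCV every row forms its own fold, so each subset is refitted as many times as it has rows. This can be slow for large subsets.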
CVRandomForestClassificationMatrixForPheatmap(
  input.data,
  factor.name.for.subsetting,
  name.of.predictors.to.use,
  target.column.name,
  seed,
  percentile.threshold.to.keep,
  number.of.folds
)
input.data | A data frame whose rows have already been shuffled.
factor.name.for.subsetting | A string naming the column to use for subsetting. The column should be a factor. Each column of the generated pheatmap corresponds to one level of the factor.
name.of.predictors.to.use | A vector of strings naming the columns to use as predictors for the random forest model. Each column should be numeric.
target.column.name | A string naming the column that the random forest model tries to predict. The column should be a factor.
seed | A number used to seed random number generation.
percentile.threshold.to.keep | A number from 0 to 1 indicating the percentile to use for feature selection. This is used by the CVPredictionsRandomForest() function.
number.of.folds | An integer specifying the number of folds for CV. If this number is set to -1, the function uses LOOCV for each subset of the data.
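As a rough illustration of what a percentile threshold for feature selection could look like, the sketch below keeps predictors whose importance is at or above the chosen quantile. Both the importance values and the "at or above" rule are assumptions made for illustration; the actual selection rule is defined in CVPredictionsRandomForest().

```r
# Made-up importance scores for four predictors (not output from the package).
importance <- c(x = 0.40, y = 0.35, a = 0.05, b = 0.20)

percentile.threshold.to.keep <- 0.5

# Assumed rule: keep predictors at or above the chosen quantile of importance.
cutoff <- quantile(importance, probs = percentile.threshold.to.keep)
kept.predictors <- names(importance)[importance >= cutoff]
```

Under this assumed rule, a threshold of 0.5 keeps roughly the top half of the predictors, and a value closer to 1 keeps fewer.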
A list with three objects:
A numerical matrix that can be used for pheatmap generation.
A list of subset data sets for each column in the heatmap matrix.
A list of vectors containing the predictions for each column in the heatmap matrix. These values are used to calculate the Matthews correlation coefficient (MCC).
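For reference, the MCC for a two-class problem can be computed from predicted and actual labels as in the sketch below. ComputeMCC() is a hypothetical helper written for this illustration; it is not the package's implementation, and it assumes both inputs are factors sharing the same two levels.

```r
# Hypothetical helper: Matthews correlation coefficient for two classes.
# Treats the first level of `actual` as the positive class.
ComputeMCC <- function(predicted, actual) {
  positive <- levels(actual)[1]
  tp <- as.numeric(sum(predicted == positive & actual == positive))
  tn <- as.numeric(sum(predicted != positive & actual != positive))
  fp <- as.numeric(sum(predicted == positive & actual != positive))
  fn <- as.numeric(sum(predicted != positive & actual == positive))
  denominator <- sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  if (denominator == 0) {
    return(0)  # common convention when a row or column of the confusion table is empty
  }
  (tp * tn - fp * fn) / denominator
}

# Made-up labels: five of six predictions are correct.
actual <- factor(c("1", "1", "1", "2", "2", "2"), levels = c("1", "2"))
predicted <- factor(c("1", "1", "2", "2", "2", "2"), levels = c("1", "2"))
mcc <- ComputeMCC(predicted, actual)
```

The MCC ranges from -1 to 1, where 1 indicates perfect agreement and 0 indicates performance no better than chance.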
Other Classification functions:
CVPredictionsRandomForest(), GenerateExampleDataMachinelearnr(), LOOCVPredictionsRandomForestAutomaticMtryAndNtree(), LOOCVRandomForestClassificationMatrixForPheatmap(), RandomForestAutomaticMtryAndNtree(), RandomForestClassificationGiniMatrixForPheatmap(), RandomForestClassificationPercentileMatrixForPheatmap(), eval.classification.results(), find.best.number.of.trees()
example.data <- GenerateExampleDataMachinelearnr()

# Shuffle the rows before calling the function, as it expects shuffled input.
set.seed(1)
example.data.shuffled <- example.data[sample(nrow(example.data)), ]

# Two-fold CV on each subset defined by the levels of "sep.xy.ab".
result.two.fold.CV <- CVRandomForestClassificationMatrixForPheatmap(
  input.data = example.data.shuffled,
  factor.name.for.subsetting = "sep.xy.ab",
  name.of.predictors.to.use = c("x", "y", "a", "b"),
  target.column.name = "actual",
  seed = 2,
  percentile.threshold.to.keep = 0.5,
  number.of.folds = 2
)

# The first element of the returned list is the numeric matrix for the heatmap.
matrix.for.pheatmap <- result.two.fold.CV[[1]]

pheatmap_RF <- pheatmap::pheatmap(matrix.for.pheatmap, fontsize_col = 12, fontsize_row = 12)