Author: Yong-Han Hank Cheng
This package provides functions for: 1. Preprocessing data. 2. Clustering with K-means or hierarchical clustering. 3. Classification with random forest.
The main functions for clustering are:
The main functions for classification are:
RandomForestAutomaticMtryAndNtree(): This uses the randomForest::randomForest function to create a random forest classifier. Additional code is included for optimizing mtry and ntree. Importantly, the default values for mtry and ntree are sensible and usually do not need to be adjusted. For a more rigorous treatment of hyperparameter tuning, see this paper and the R package provided by its authors: https://doi.org/10.1002/widm.1301, “Hyperparameters and tuning strategies for random forest” by Probst et al., 2019.
RandomForestClassificationPercentileMatrixForPheatmap(): This
fits a random forest model on each of several smaller subsets of a
larger data set and outputs the results in the form of a pheatmap, so
that random forest performance, in terms of prediction accuracy
and feature importance, can be compared between subsets.
CVPredictionsRandomForest(): This performs random forest classification with default hyperparameters and evaluates it with cross-validation. The number of cross-validation folds can be specified by the user. The function outputs the performance of the model (its predictions and feature importance).
CVRandomForestClassificationMatrixForPheatmap(): This runs
CVPredictionsRandomForest() on each of several smaller subsets of a
larger data set and outputs the results in the form of a pheatmap, so
that random forest performance, in terms of prediction accuracy
and feature importance, can be compared between subsets.
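The ideas behind the classification functions above (comparing mtry values, cross-validating with default hyperparameters, and inspecting feature importance) can be sketched with the randomForest package alone. This is a minimal illustration under stated assumptions and not the package's own implementation: it uses the built-in iris data set, it picks mtry by out-of-bag error, and it does not call any machinelearnr function, since their arguments are documented in the function reference rather than here. All variable names are chosen for illustration.

```r
# Sketch only: uses randomForest directly, not the machinelearnr wrappers.
library(randomForest)

set.seed(1)
data(iris)

# 1. Compare out-of-bag (OOB) error across candidate mtry values.
p <- ncol(iris) - 1                      # number of predictors
mtry_grid <- seq_len(p)
oob_error <- sapply(mtry_grid, function(m) {
  fit <- randomForest(Species ~ ., data = iris, mtry = m, ntree = 500)
  fit$err.rate[fit$ntree, "OOB"]         # OOB error after the last tree
})
best_mtry <- mtry_grid[which.min(oob_error)]

# 2. k-fold cross-validation with the default hyperparameters.
k <- 5                                   # number of folds (illustrative choice)
fold_id <- sample(rep(seq_len(k), length.out = nrow(iris)))
cv_pred <- factor(rep(NA, nrow(iris)), levels = levels(iris$Species))
for (fold in seq_len(k)) {
  held_out <- fold_id == fold
  fit <- randomForest(Species ~ ., data = iris[!held_out, ])
  cv_pred[held_out] <- predict(fit, newdata = iris[held_out, ])
}
cv_accuracy <- mean(cv_pred == iris$Species)
confusion <- table(predicted = cv_pred, actual = iris$Species)

# 3. Feature importance from a model fitted on the full data set.
full_fit <- randomForest(Species ~ ., data = iris, importance = TRUE)
feature_importance <- importance(full_fit)
```

On a small data set like iris, the OOB error typically differs little between mtry values, which is consistent with the note above that the defaults rarely need adjusting.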
# Install the package from GitHub
devtools::install_github("yhhc2/machinelearnr")
# Load package
library("machinelearnr")
Source code: https://github.com/yhhc2/machinelearnr
Visit the package’s website: https://yhhc2.github.io/machinelearnr/
Function reference is located here: https://yhhc2.github.io/machinelearnr/reference/index.html
See this vignette for example usage and output of each function: https://yhhc2.github.io/machinelearnr/articles/Examples.html
The machinelearnr package is licensed under the GPL (>=3) license. The logo is licensed under the CC BY 4.0 license.