Man pages for SchlossLab/mikropml
User-Friendly R Package for Supervised Machine Learning Pipelines

abort_packages_not_installed: Throw error if required packages are not installed.
bootstrap_performance: Calculate a bootstrap confidence interval for the performance...
bounds: Get the lower and upper bounds for an empirical confidence...
calc_balanced_precision: Calculate balanced precision given actual and baseline...
calc_baseline_precision: Calculate the fraction of positives, i.e. baseline precision...
calc_mean_perf: Generic function to calculate mean performance curves for...
calc_perf_bootstrap_split: Calculate performance for a single split from...
calc_perf_metrics: Get performance metrics for test data
calc_pvalue: Calculate the p-value for a permutation test
change_to_num: Change columns to numeric if possible
check_all: Check all params that don't return a value
check_cat_feats: Check if any features are categorical
check_corr_thresh: Check that corr_thresh is either NULL or a number between 0...
check_dataset: Check that the dataset is not empty and has more than 1...
check_features: Check features
check_group_partitions: Check the validity of the group_partitions list
check_groups: Check grouping vector
check_kfold: Check that kfold is an integer of reasonable size
check_method: Check if the method is supported. If not, throws error.
check_ntree: Check ntree
check_outcome_column: Check that outcome column exists. Pick outcome column if not...
check_outcome_value: Check that the outcome variable is valid. Pick outcome value...
check_packages_installed: Check whether package(s) are installed
check_perf_metric_function: Check perf_metric_function is NULL or a function
check_perf_metric_name: Check perf_metric_name is NULL or a character string
check_permute: Check that permute is a logical
check_remove_var: Check remove_var
check_seed: Check that the seed is either NA or a number
check_training_frac: Check that the training fraction is between 0 and 1
check_training_indices: Check the validity of the training indices
cluster_corr_mat: Cluster a matrix of correlated features
collapse_correlated_features: Collapse correlated features
combine_hp_performance: Combine hyperparameter performance metrics for multiple...
compare_models: Perform permutation tests to compare the performance metric...
create_grouped_data_partition: Split into train and test set while splitting by groups. When...
create_grouped_k_multifolds: Splitting into folds for cross-validation when using groups
define_cv: Define cross-validation scheme and training parameters
find_permuted_perf_metric: Get permuted performance metric difference for a single...
flatten_corr_mat: Flatten correlation matrix to pairs
get_binary_corr_mat: Identify correlated features as a binary matrix
get_caret_dummyvars_df: Get dummyvars dataframe (i.e. design matrix)
get_caret_processed_df: Get preprocessed dataframe for continuous variables
get_corr_feats: Identify correlated features
get_difference: Calculate the difference in the mean of the metric for two...
get_feature_importance: Get feature importance using the permutation method
get_groups_from_clusters: Assign features to groups
get_hp_performance: Get hyperparameter performance metrics
get_hyperparams_from_df: Split hyperparameters dataframe into named lists for each...
get_hyperparams_list: Set hyperparameters based on ML method and dataset...
get_outcome_type: Get outcome type.
get_partition_indices: Select indices to partition the data into training & testing...
get_perf_metric_fn: Get default performance metric function
get_perf_metric_name: Get default performance metric name
get_performance_tbl: Get model performance metrics as a one-row tibble
get_seeds_trainControl: Get seeds for 'caret::trainControl()'
get_tuning_grid: Generate the tuning grid for tuning hyperparameters
group_correlated_features: Group correlated features
is_whole_number: Check whether a numeric vector contains whole numbers.
keep_groups_in_cv_partitions: Whether groups can be kept together in partitions during...
mikropml-package: mikropml: User-Friendly R Package for Robust Machine Learning...
mutate_all_types: Mutate all columns with 'utils::type.convert()'.
otu_data_preproc: Mini OTU abundance dataset - preprocessed
otu_mini_bin: Mini OTU abundance dataset
otu_mini_bin_results_glmnet: Results from running the pipeline with L2 logistic regression...
otu_mini_bin_results_rf: Results from running the pipeline with random forest on...
otu_mini_bin_results_rpart2: Results from running the pipeline with rpart2 on...
otu_mini_bin_results_svmRadial: Results from running the pipeline with svmRadial on...
otu_mini_bin_results_xgbTree: Results from running the pipeline with xgbTree on...
otu_mini_cont_results_glmnet: Results from running the pipeline with glmnet on...
otu_mini_cont_results_nocv: Results from running the pipeline with glmnet on...
otu_mini_cv: Cross validation on 'train_data_mini' with grouped features.
otu_mini_multi: Mini OTU abundance dataset with 3 categorical variables
otu_mini_multi_group: Groups for otu_mini_multi
otu_mini_multi_results_glmnet: Results from running the pipeline with glmnet on...
otu_small: Small OTU abundance dataset
pbtick: Update progress if the progress bar is not 'NULL'.
permute_p_value: Calculate a permuted p-value comparing two models
plot_curves: Plot ROC and PRC curves
plot_hp_performance: Plot hyperparameter performance metrics
plot_model_performance: Plot performance metrics for multiple ML runs with different...
preprocess_data: Preprocess data prior to running machine learning
process_cat_feats: Process categorical features
process_cont_feats: Preprocess continuous features
process_novar_feats: Process features with no variation
radix_sort: Call 'sort()' with 'method = "radix"'
randomize_feature_order: Randomize feature order to eliminate any position-dependent...
reexports: caret contr.ltfr
remove_singleton_columns: Remove columns appearing in only 'threshold' row(s) or fewer.
replace_spaces: Replace spaces in all elements of a character vector with...
rm_missing_outcome: Remove missing outcome values
run_ml: Run the machine learning pipeline
select_apply: Use future apply if available
sensspec: Calculate and summarize performance for ROC and PRC plots
set_hparams_glmnet: Set hyperparameters for regression models for use with glmnet
set_hparams_rf: Set hyperparameters for random forest models
set_hparams_rpart2: Set hyperparameters for decision tree models
set_hparams_svmRadial: Set hyperparameters for SVM with radial kernel
set_hparams_xgbTree: Set hyperparameters for xgbTree (gradient-boosted tree) models
shared_ggprotos: Get plot layers shared by 'plot_mean_roc' and 'plot_mean_prc'
shuffle_group: Shuffle the rows in a column
split_outcome_features: Split dataset into outcome and features
tidy_perf_data: Tidy the performance dataframe
train_model: Train model using 'caret::train()'.
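
Several entries in this index (calc_pvalue, permute_p_value, compare_models) refer to permutation tests on a performance metric. The base R sketch below illustrates the general idea only and is not the package's implementation: the helper name perm_pvalue, its arguments, and the example AUC values are all hypothetical, and the test statistic is assumed to be the absolute difference in mean performance between two models.

```r
# Hypothetical helper, not part of mikropml: two-sided permutation p-value
# for the difference in mean performance between two sets of model runs.
perm_pvalue <- function(perf_a, perf_b, nperm = 1000, seed = 1) {
  set.seed(seed)
  # Observed absolute difference in mean performance
  obs <- abs(mean(perf_a) - mean(perf_b))
  pooled <- c(perf_a, perf_b)
  idx_a <- seq_len(length(perf_a))
  # Shuffle the pooled values, re-split them into two groups of the
  # original sizes, and recompute the statistic for each permutation
  perm_diffs <- replicate(nperm, {
    shuffled <- sample(pooled)
    abs(mean(shuffled[idx_a]) - mean(shuffled[-idx_a]))
  })
  # Add 1 to numerator and denominator so the estimate is never exactly 0
  (sum(perm_diffs >= obs) + 1) / (nperm + 1)
}

# Made-up AUC values for two models, for illustration only
auc_model_1 <- c(0.71, 0.68, 0.74, 0.70, 0.69)
auc_model_2 <- c(0.80, 0.83, 0.79, 0.82, 0.81)
p <- perm_pvalue(auc_model_1, auc_model_2)
print(p)
```

A small p-value here suggests that the observed difference would be unusual if the two sets of runs were exchangeable. For real analyses, consult the man pages for compare_models and permute_p_value for the arguments the package actually expects.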
SchlossLab/mikropml documentation built on Aug. 24, 2023, 9:51 p.m.