Man pages for yhhc2/machinelearnr
Package with various functions for preprocessing, clustering, and classification

AddColBinnedToBinary: Bin the values of a selected continuous column into 2 bins...
AddColBinnedToQuartiles: Bin the values of a selected continuous column into 4 bins...
AddPCsToEnd: Perform PCA
CalcOptimalNumClustersForKMeans: Generate plots to help decide optimal number of clusters for...
captureSessionInfo: Capture session info
ConvertDataToPercentiles: Use percentiles to assess for outliers in multidimensional...
CorAssoTestMultipleWithErrorHandling: Takes multiple vectors and does correlation/association testing...
correlation.association.test: Given two numerical data vectors, determine the correlation
CVPredictionsRandomForest: Create random forest cross-validated model
CVRandomForestClassificationMatrixForPheatmap: Generate a random forest model under cross validation (CV)...
describeNumericalColumns: Describe each numerical feature. Mean, stddev, median,...
describeNumericalColumnsWithLevels: For each level, describe each numerical feature. Mean, sd,...
DownSampleDataframe: Down sample an imbalanced dataset to get a balanced dataset
eval.classification.results: Determine the performance of classification
find.best.number.of.trees: Using the classification error rate for each number of trees,...
generate.2D.clustering.with.labeled.subgroup: Make a 2D scatter plot that shows the data as represented by...
generate.3D.clustering.with.labeled.subgroup: Make a 3D scatter plot that shows the data as represented by...
generate.descriptive.plots: Use histograms and boxplots to get a general idea of what...
generate.descriptive.plots.save.pdf: Use histograms and boxplots to get a general idea of what...
GenerateElbowPlotPCA: Create elbow plot to see how much total variance is explained...
GenerateExampleDataMachinelearnr: Produce example data set for demonstrating package functions
GenerateParcoordForClusters: Generate parallel plot to show each observation and which...
GeneratePC1andPC2PlotsWithAndWithoutOutliers: Generate PC1 vs PC2 plots with and without outliers.
generate.plots.comparing.clusters: Compare clusters
HierarchicalClustering: Automated hierarchical clustering with labeling of...
Log2TargetDensityPlotComparison: Do Log2 transformation on a column, and then compare with and...
LOOCVPredictionsRandomForestAutomaticMtryAndNtree: Create random forest leave-one-out-cross-validated model
LOOCVRandomForestClassificationMatrixForPheatmap: Generate a random forest model under...
LookAtPCFeatureLoadings: Principal component feature loadings
MultipleColumnsNormalCheckThenBoxCox: Checks multiple columns in a dataframe to see if each is...
NormalCheckThenBoxCoxTransform: Checks if the data is normally distributed using Shapiro...
RandomForestAutomaticMtryAndNtree: Create random forest classification model after optimizing...
RandomForestClassificationGiniMatrixForPheatmap: Generate a random forest model for different subsets of the...
RandomForestClassificationPercentileMatrixForPheatmap: Generate a random forest model for different subsets of the...
RanomlySelectOneRowForEach: Randomly select one row
RecodeIdentifier: Recode the identifier column of a dataset
RemoveColWithAllZeros: Remove columns with all zeros
RemoveRowsBasedOnCol: Remove rows from the dataframe if the row contains a value in...
RemoveSamplesWithInstability: Remove samples that have multiple values for a single column...
SplitIntoTrainTest: Split into train and test
StabilityTestingAcrossVisits: Assess stability of values that correspond to a single...
SubsetDataByContinuousCol: Subset data by two bounds on a continuous column
TwoSampleTTest: Performs two sample t-test on multiple features
ZScoreChallengeOutliers: Remove outliers based on Z score of a particular variable
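
As a rough illustration of the kind of operation the binning entries describe, the sketch below bins a continuous column into 4 quartile-based bins using only base R. It does not call the package and is not its implementation; the data frame `df`, the column `value`, and the output column `value_quartile` are hypothetical names chosen for this example.

```r
# Hypothetical illustration in base R; this is NOT machinelearnr's implementation.
df <- data.frame(id = 1:8, value = c(2, 4, 6, 8, 10, 12, 14, 16))

# Quartile boundaries of the continuous column (0%, 25%, 50%, 75%, 100%).
breaks <- quantile(df$value, probs = seq(0, 1, by = 0.25))

# Assign each value to one of 4 bins. include.lowest = TRUE keeps the minimum
# value in the first bin. cut() fails if heavy ties make the breaks non-unique.
df$value_quartile <- cut(df$value, breaks = breaks, labels = 1:4,
                         include.lowest = TRUE)

print(table(df$value_quartile))
```

Consult the man page for `AddColBinnedToQuartiles` for the package's actual arguments and behavior.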
yhhc2/machinelearnr documentation built on Dec. 23, 2021, 7:19 p.m.