%\VignetteEngine{knitr::knitr}
%\VignetteIndexEntry{Classyfire Cheat Sheet}

Install from CRAN

```
install.packages("classyfire")
```

Load the classyfire package within R

```
library(classyfire)
```

Get the classyfire help overview

```
??classyfire
```

Load some test data, for instance the **iris** dataset

```
data(iris)
irisClass <- iris[,5]
irisData <- iris[,-5]
```

Construct a classification ensemble **in parallel** (using 4 cpus in this instance) that consists of 10 independent classification models (classifiers) optimised using 10 bootstrap iterations

```
ens <- cfBuild(inputData = irisData, inputClass = irisClass,
               bootNum = 10, ensNum = 10,
               parallel = TRUE, cpus = 4, type = "SOCK")
```

Similarly, **in sequence**:

```
ens <- cfBuild(inputData = irisData, inputClass = irisClass,
               bootNum = 10, ensNum = 10, parallel = FALSE)
```
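The `cpus = 4` value in the parallel call is specific to one machine. The following is a minimal sketch, assuming base R's `parallel` package is available, of deriving a core count rather than hard-coding it. The names `nCores` and `nCpus` are illustrative and not part of classyfire.

```r
# detectCores() ships with base R's 'parallel' package; it can return NA
# on platforms where the core count cannot be determined.
nCores <- parallel::detectCores()

# Leave one core free for the rest of the system, and fall back to 1
# when the count is unknown. 'nCpus' is an illustrative name.
nCpus <- if (is.na(nCores)) 1L else max(1L, nCores - 1L)
nCpus
```

The result could then be supplied as `cpus = nCpus` in the parallel **cfBuild** call.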

The list of attributes available for each classifier in the ensemble is provided by the function:

```
attributes(ens)
```

Get the **overall** average test and train accuracy

```
getAvgAcc(ens)$Test
getAvgAcc(ens)$Train
```

Get the **individual** test and train accuracies in the ensemble

```
ens$testAcc
ens$trainAcc

# Alternatively
getAcc(ens)$Test
getAcc(ens)$Train
```

Next, randomly generate test data to stand in for a new input dataset of unknown classes, and use the ensemble to predict those classes. The new dataset must have exactly the same number of columns as the inputData passed to **cfBuild**. In the following example, 400 random values are generated and arranged in four columns, which results in 100 samples (rows).

```
testMatr <- matrix(runif(400)*100, ncol = ncol(irisData))
predRes <- cfPredict(ens, testMatr)
```
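The row count can be checked with base R alone, without building an ensemble. This is a small sketch of the same matrix construction; the `set.seed()` call is an addition for reproducibility and does not appear in the original example.

```r
# The iris data has four feature columns once the class column is dropped.
data(iris)
irisData <- iris[, -5]

# set.seed() is an addition here so the random values are reproducible.
set.seed(1)
testMatr <- matrix(runif(400) * 100, ncol = ncol(irisData))

# 400 values spread over 4 columns give 100 rows.
dim(testMatr)
```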

Execute five permutation rounds; in each permutation test, an ensemble of 10 classifiers is constructed, each running 10 bootstrap iterations during the optimization process. By default, ensNum, bootNum and permNum are all set to 100.

```
permObj <- cfPermute(irisData, irisClass, bootNum = 10, ensNum = 10,
                     permNum = 5, parallel = TRUE, cpus = 4, type = "SOCK")
```
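Permutation tests generally compare a model against models trained on shuffled class labels. The sketch below shows a single label shuffle in base R. It assumes that **cfPermute** does something comparable internally, which this cheat sheet does not confirm, and `permClass` is an illustrative name.

```r
data(iris)
irisClass <- iris[, 5]

# Shuffle the class labels; sample() without a size argument returns
# a random permutation of its input.
set.seed(42)
permClass <- sample(irisClass)

# The class counts are unchanged; only the pairing with rows is broken.
table(irisClass)
table(permClass)
```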

Get the vector of averaged accuracies, one for each permutation (each permutation is an independent classification ensemble)

```
permObj$avgAcc
```

Get the overall elapsed time for the permutation process, and the vector of individual execution times for each permutation respectively

```
permObj$totalTime[3]
permObj$execTime
```
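Indexing with `[3]` is consistent with `totalTime` being a timing object of the kind returned by base R's `system.time()`, whose third element is the elapsed wall-clock time. That is an assumption about the object's structure; the sketch below only demonstrates the base R behaviour.

```r
# Time a short pause with base R.
timing <- system.time(Sys.sleep(0.1))

# The third element is the elapsed (wall-clock) time in seconds.
names(timing)[3]
timing[3]

# Indexing by name is equivalent and easier to read.
timing["elapsed"]
```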

Access the first ensemble in the permutation list

```
permObj$permList[[1]]
```

All the functions for descriptive statistics within classyfire start with the prefix "**get**". For example:

Get the average test and/or train accuracy of the ensemble

```
getAvgAcc(ens)
getAvgAcc(ens)$Test
getAvgAcc(ens)$Train
```

Get the vectors of test and/or train accuracies of the classifiers in the ensemble

```
getAcc(ens)
getAcc(ens)$Test
getAcc(ens)$Train
```

Get the confusion matrix summarising the performance of the ensemble

```
getConfMatr(ens)
```

Get the optimal SVM hyperparameters of the classification ensemble

```
optParam <- getOptParam(ens)
optParam
```

Return the "five number summary", a descriptive statistic that consists of the minimum, first (lower) quartile, median, third (upper) quartile and maximum value of a given distribution. In this case, the function is applied directly on the output of permutation testing, generated by the **cfPermute** function.

```
getPerm5Num(permObj)
getPerm5Num(permObj)$median
getPerm5Num(permObj)$minimum
getPerm5Num(permObj)$maximum
getPerm5Num(permObj)$upperQ
getPerm5Num(permObj)$lowerQ
```
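For reference, base R computes the same kind of summary with `fivenum()`. The sketch below uses made-up accuracy values, and **getPerm5Num** may apply a different quartile rule, so the two are not guaranteed to agree on every input.

```r
# Hypothetical averaged accuracies (%), one per permutation round.
permAcc <- c(30, 32, 34, 36, 38)

# fivenum() returns minimum, lower hinge, median, upper hinge, maximum.
fivenum(permAcc)
```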

All the functions for plotting within classyfire start with the prefix "**gg**" since the library **ggplot2** is in use. For example:

The **ggClassPred** function generates a barplot with the per class accuracies (%) for all the correctly classified and misclassified samples in the classification ensemble.

```
# Show the percentages of correctly classified samples in
# a barplot with or without text respectively
ggClassPred(ens)
ggClassPred(ens, showText = TRUE)

# Show the percentages of classified and misclassified samples
# in a barplot simultaneously with and without text
ggClassPred(ens, displayAll = TRUE)
ggClassPred(ens, position = "stack", displayAll = TRUE)
ggClassPred(ens, position = "stack", displayAll = TRUE, showText = TRUE)

# Alternatively, using a dodge position
ggClassPred(ens, position = "dodge", displayAll = TRUE)
ggClassPred(ens, position = "dodge", displayAll = TRUE, showText = TRUE)
```

The **ggEnsTrend** function displays the average test accuracies for every new classifier added to the ensemble, as constructed by the **cfBuild** function.

```
ggEnsTrend(ens)

# Plot with text
ggEnsTrend(ens, showText = TRUE)

# Plot with text; set different limits on y axis
ggEnsTrend(ens, showText = TRUE, ylims = c(90, 100))
```
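One way to read "average test accuracy for every new classifier added" is as a running mean over the classifiers in build order. The sketch below illustrates that reading with made-up accuracies; whether **ggEnsTrend** computes exactly this quantity is an assumption.

```r
# Hypothetical test accuracies (%) of five classifiers, in build order.
testAcc <- c(96, 94, 98, 92, 95)

# Running mean: the average accuracy after each classifier is added.
runningAvg <- cumsum(testAcc) / seq_along(testAcc)
runningAvg
```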

The **ggEnsHist** function generates a histogram of the ensemble results as generated by **cfBuild**.

```
ggEnsHist(ens)

# Density plot of the test accuracies in the ensemble
ggEnsHist(ens, density = TRUE)

# Density plot that highlights additional descriptive statistics
ggEnsHist(ens, density = TRUE, percentiles = TRUE)
ggEnsHist(ens, density = TRUE, percentiles = TRUE, mean = TRUE)
ggEnsHist(ens, density = TRUE, percentiles = TRUE, median = TRUE)
```

The **ggPermHist** function generates a histogram of the permutation results as generated by **cfPermute**.

```
ggPermHist(permObj)

# Density plot
ggPermHist(permObj, density = TRUE)

# Density plot that highlights additional descriptive statistics
ggPermHist(permObj, density = TRUE, percentiles = TRUE, mean = TRUE)
ggPermHist(permObj, density = TRUE, percentiles = TRUE, median = TRUE)
```

Finally, the **ggFusedHist** function generates a histogram for simultaneous visual comparison of the classification and permutation distributions.

```
# 'ens' is the ensemble built earlier with cfBuild
ggFusedHist(ens, permObj)
```
