ExposureClassify: Classify samples by exposure levels

ExposureClassify    R Documentation

Classify samples by exposure levels

Description

Assign unlabeled samples to previously defined groups.

Usage

## S4 method for signature 'SignExp,character'
ExposureClassify(signexp_obj, labels, 
    method="knn", max_instances=200, k=3, weights=NA, plot_to_file=FALSE, 
    file="Classification_barplot.pdf", colors=NA_character_, min_agree=0.75,...)

Arguments

signexp_obj

A SignExp object, as returned by the signeR function.

labels

Sample labels. Every sample labeled as NA will be classified according to its mutational profile and the profiles of labeled samples.

method

Classification algorithm used. Default is k-Nearest Neighbors ("knn"). Any other algorithm may be used, as long as it is adapted to satisfy the following conditions:
Input: a matrix of labeled samples, with one sample per row and one feature per column; a matrix of unlabeled samples to classify, with the same structure; an array of labels, with one entry for each labeled sample.
Output: an array of assigned labels, one for each unlabeled sample.
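
The input/output contract above can be illustrated with a small base-R function. This is a minimal sketch, not part of signeR: the name nearest_centroid, its argument names, and the toy data are all hypothetical, and the exact way ExposureClassify calls a custom method (argument order and names) is an assumption that should be checked against the package source.

```r
# Hypothetical custom classifier following the contract described above:
# labeled matrix, unlabeled matrix (samples in rows, features in columns),
# and a label vector in; one assigned label per unlabeled sample out.
nearest_centroid <- function(train, test, labels, ...) {
  train <- as.matrix(train)
  test <- as.matrix(test)
  labels <- as.character(labels)
  classes <- sort(unique(labels))

  # One centroid per class: the column means of that class's labeled rows.
  centroids <- do.call(rbind, lapply(classes, function(cl) {
    colMeans(train[labels == cl, , drop = FALSE])
  }))

  # Assign each unlabeled row to the class with the closest centroid
  # (squared Euclidean distance).
  assigned <- apply(test, 1, function(x) {
    diffs <- centroids - matrix(x, nrow(centroids), length(x), byrow = TRUE)
    classes[which.min(rowSums(diffs^2))]
  })
  unname(assigned)
}

# Toy data (made up for illustration): 4 labeled and 2 unlabeled samples.
toy_train <- rbind(c(10, 1), c(9, 2), c(1, 10), c(2, 9))
toy_labels <- c("a", "a", "b", "b")
toy_test <- rbind(c(8, 1), c(1, 8))
toy_result <- nearest_centroid(toy_train, toy_test, toy_labels)
print(toy_result)
```

The function accepts and ignores extra arguments through "..." so that additional parameters forwarded by a caller do not cause an error.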

max_instances

Maximum number of exposure matrix (E) instances to be analyzed. If more E instances are available than this value, a random subset of them is selected for analysis.

k

Number of nearest neighbors considered for classification, used only if method="knn". Default is 3.

weights

Vector of weights applied to the signatures when performing classification. Default is NA, which gives every signature a weight of 1.

plot_to_file

Whether to save the plot to the file named by the file parameter. Default is FALSE.

file

Name of the file to which the classification plot is written when plot_to_file=TRUE. Default is "Classification_barplot.pdf".

colors

Array of color names, one for each sample class. Colors will be recycled if the length of this array is less than the number of classes.

min_agree

Minimum frequency of agreement among individual classifications. Samples showing a frequency of agreement below this value are considered as "undefined". Default is 0.75.
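
As an illustration of the agreement rule, the sketch below tallies a set of individual classifications and applies the min_agree threshold. It is not the package's implementation: the votes matrix and all object names are hypothetical, and breaking ties in favor of the first class is an assumption of this example.

```r
# Hypothetical individual classifications: one row per analyzed exposure
# instance, one column per unlabeled sample.
votes <- rbind(
  c("a", "b", "a"),
  c("a", "b", "b"),
  c("a", "b", "a"),
  c("a", "a", "b")
)
colnames(votes) <- c("s1", "s2", "s3")
classes <- c("a", "b")
min_agree <- 0.75

# Assignment frequencies: class labels in rows, samples in columns.
allfreqs <- matrix(0, nrow = length(classes), ncol = ncol(votes),
                   dimnames = list(classes, colnames(votes)))
for (cl in classes) {
  allfreqs[cl, ] <- colMeans(votes == cl)
}

# Most frequent class per sample and its frequency of agreement.
freq <- apply(allfreqs, 2, max)
assigned <- classes[apply(allfreqs, 2, which.max)]
names(assigned) <- colnames(votes)

# Samples whose agreement falls below min_agree are marked "undefined".
assigned[freq < min_agree] <- "undefined"

print(allfreqs)
print(assigned)
```

Here s2 is kept because its agreement equals the threshold exactly, while s3, split evenly between the two classes, is marked "undefined".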

...

Additional parameters passed to the classification algorithm (defined by "method" above).

Value

A list with the following items:

class

The assigned classes for each unlabeled sample.

freq

Classification agreement for each unlabeled sample: the relative frequency of assignment of each sample to the group specified in "class".

allfreqs

Matrix with one column for each unlabeled sample and one row for each class label. Contains the assignment frequencies of each sample to each class.

probs

As above, a matrix with unlabeled samples in columns and class labels in rows. Contains the average probability, among repeated exposure classifications, of each sample belonging to each class.

Examples

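# assuming signatures is the return value of signeR()

# one label per sample; samples labeled NA are the ones to be classified
my_labels <- c("a","a","a","a",NA,"b","b","b","b",NA)
Class <- ExposureClassify(signatures$SignExposures, labels=my_labels)

# assigned classes and their agreement frequencies (see Value)
Class$class
Class$freq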

# see also
vignette(package="signeR")

rvalieris/signeR documentation built on April 20, 2024, 2:08 p.m.