Compute Feature Weights for KeBABS Model
Arguments

model    a KeBABS model object, as returned by kbsvm, from which the
feature weights should be computed.
exrep    optional explicit representation of the support vectors from
which the feature weights should be computed. If no explicit representation
is passed to the function, it is generated internally from the support
vectors stored in the model. Default: NULL
features    feature subset of the specified kernel in the form of a
character vector. When a feature subset is passed to the function, all other
features in the feature space are not considered for the explicit
representation (see below). Default: NULL
weightLimit    a single numeric value that allows pruning of feature
weights. All feature weights with an absolute value below this limit are set
to 0 and are not considered in the feature weights. Default:
.Machine$double.eps
Details

Feature weights represent the contribution to the decision value for a
single occurrence of the feature in the sequence. In this way they give a
hint about the importance of the individual features for a given
classification or regression task. Please consider that for a pattern length
larger than 1, patterns at neighboring sequence positions overlap and are no
longer independent of each other. Apart from this obvious overlap, for e.g.
the gappy pair kernel, motif kernel or mixture kernels, multiple patterns
can be relevant for a single position. Therefore feature weights do not
describe the relevance of individual features exactly.
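For a linear kernel, the contribution described above can be sketched in plain R: the decision value of a sequence is the sum, over all features, of the feature weight times the number of occurrences of that feature, plus the model offset. The weights, counts and offset below are hypothetical values chosen only for illustration; they are not produced by KeBABS.

```r
## hypothetical feature weights for three dimer features
weights <- c(AA = 0.4, AC = -0.2, CA = 0.1)
## hypothetical occurrence counts of these features in one sequence
counts  <- c(AA = 2,   AC = 1,    CA = 0)
## hypothetical model offset
b <- -0.05
## each occurrence of a feature contributes its weight to the decision value
decisionValue <- sum(weights * counts) + b
decisionValue    # 0.4*2 - 0.2*1 + 0.1*0 - 0.05 = 0.55
```

Because neighboring patterns overlap, the per-feature contributions in real sequence kernels are not independent, which is why this decomposition is only an approximation of feature relevance.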
Computation of feature weights
Feature weights can be computed automatically as part of the training (see
parameter featureWeights in method kbsvm). In this case the function
getFeatureWeights is called automatically during training. When this
parameter is not set during training, feature weights can be computed after
training with the function getFeatureWeights. The function also supports
pruning of feature weights (see parameter weightLimit), allowing different
prunings to be tested without retraining.
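The pruning rule implied by the weightLimit parameter can be sketched in plain R, independent of KeBABS (the weight vector below is hypothetical): weights whose absolute value falls below the limit are dropped, only the remaining weights enter the pruned feature weights.

```r
## hypothetical feature weights
fw <- c(AA = 0.8, AC = -0.03, CA = 0.02, CC = -0.6)
## prune all weights with an absolute value below the limit
weightLimit <- 0.05
prunedFw <- fw[abs(fw) >= weightLimit]
names(prunedFw)    # "AA" "CC"
```

Because pruning is applied to the computed weights, trying a different limit only requires a new call to getFeatureWeights, not a new training run.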
Usage of feature weights
Feature weights are used during prediction to speed up the prediction
process. Prediction via feature weights is performed in KeBABS whenever
feature weights are available in the model. When feature weights are not
available, or for multiclass prediction, KeBABS defaults to the native
prediction of the SVM used during training.
Feature weights are also used during generation of prediction profiles (see
getPredictionProfile). While the feature weights reflect the general
relevance of features, the prediction profiles generated from them for a
given set of sequences show the relevance of single sequence positions in
the individual sequences according to the given learning task.
Feature weights for position-dependent kernels

For position-dependent kernels the generation of feature weights is not
possible during training. In this case the featureWeights slot in the model
contains a data representation that allows simple computation of feature
weights during prediction or during generation of prediction profiles.
Value

Upon successful completion, the function returns the feature weights as a
numeric vector. For quadratic kernels a matrix of feature weights is
returned, giving the feature weights for pairs of features. In the
multiclass case the function returns the feature weights of the pairwise
SVMs as a list of numeric vectors (or matrices for quadratic kernels).
Author(s)

Johannes Palme <[email protected]>
References

J. Palme, S. Hochreiter, and U. Bodenhofer (2015) KeBABS: an R package for
kernel-based analysis of biological sequences. Bioinformatics,
31(15):2574-2576. DOI: 10.1093/bioinformatics/btv176.
Examples

## standard method to create feature weights automatically during training
## model <- kbsvm( .... , featureWeights="yes", .....)

## this example describes the case where feature weights were not created
## during training but should be added later to the model

## load example sequences and select a small set of sequences
## to speed up training for demonstration purpose
data(TFBS)
## create sample indices of training and test subset
train <- sample(1:length(yFB), 200)
test <- c(1:length(yFB))[-train]
## determine all labels
allLabels <- unique(yFB)
## create a kernel object
gappyK1M4 <- gappyPairKernel(k=1, m=4)
## model is trained with creation of feature weights
model <- kbsvm(enhancerFB[train], yFB[train], gappyK1M4,
               pkg="LiblineaR", svm="C-svc", cost=20)
## feature weights included in model
featureWeights(model)

## Not run:
## model is originally trained without creation of feature weights
model <- kbsvm(enhancerFB[train], yFB[train], gappyK1M4,
               pkg="LiblineaR", svm="C-svc", cost=20, featureWeights="no")
## no feature weights included in model
featureWeights(model)
## later, after training, add feature weights and model offset
## to the KeBABS model
featureWeights(model) <- getFeatureWeights(model)
modelOffset(model) <- getSVMSlotValue("b", model)
## show a part of the feature weights and the model offset
featureWeights(model)[1:7]
modelOffset(model)
## another scenario for getFeatureWeights is to test the performance
## behavior of different prunings of the feature weights
## show histogram of full feature weights
hist(featureWeights(model), breaks=30)
## show number of features
length(featureWeights(model))
## first predict with full feature weights to see how performance changes
## through pruning; when feature weights are included in the model,
## prediction is always performed with the feature weights
pred <- predict(model, enhancerFB[test])
evaluatePrediction(pred, yFB[test], allLabels=allLabels)
## add feature weights with pruning to absolute values larger than 0.6
## model offset was assigned above and is not impacted by pruning
featureWeights(model) <- getFeatureWeights(model, weightLimit=0.6)
## show histogram of pruned feature weights
hist(featureWeights(model), breaks=30)
## show reduced number of features
length(featureWeights(model))
## now predict with pruned feature weights
pred <- predict(model, enhancerFB, sel=test)
evaluatePrediction(pred, yFB[test], allLabels=allLabels)

## End(Not run)