Compute Feature Weights for KeBABS Model
Arguments

model    a KeBABS model object, as returned by kbsvm, from which the
feature weights should be computed.
exrep    optional explicit representation of the support vectors from
which the feature weights should be computed. If no explicit representation
is passed to the function, it is generated internally from the support
vectors stored in the model. Default: NULL
features    feature subset of the specified kernel in the form of a
character vector. When a feature subset is passed to the function, all other
features in the feature space are not considered for the explicit
representation (see below). Default: NULL
weightLimit    a single numeric value that allows pruning of feature
weights. All feature weights with an absolute value below this limit are set
to 0 and are not considered in the feature weights. Default:
.Machine$double.eps
Details

Feature weights represent the contribution to the decision value for a
single occurrence of the feature in the sequence. In this way they give a
hint about the importance of the individual features for a given
classification or regression task. Please consider that for a pattern length
larger than 1, patterns at neighboring sequence positions overlap and are no
longer independent of each other. Apart from this obvious overlap, for e.g.
the gappy pair kernel, motif kernel or mixture kernels, multiple patterns
can be relevant for a single position. Therefore feature weights do not
describe the relevance of individual features exactly.
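For a linear kernel, the contribution described above can be sketched in plain R: the decision value of a sequence is the sum, over all features, of the feature weight times the number of occurrences of that feature, plus the model offset. The weights, counts and offset below are hypothetical values chosen only for illustration; they are not produced by KeBABS.

```r
## hypothetical feature weights for three dimer features
weights <- c(AA = 0.4, AC = -0.2, CA = 0.1)
## hypothetical occurrence counts of these features in one sequence
counts  <- c(AA = 2,   AC = 1,    CA = 0)
## hypothetical model offset
b <- -0.05
## each occurrence of a feature contributes its weight to the decision value
decisionValue <- sum(weights * counts) + b
decisionValue    # 0.4*2 - 0.2*1 + 0.1*0 - 0.05 = 0.55
```

Because neighboring patterns overlap, the per-feature contributions in real sequence kernels are not independent, which is why this decomposition is only an approximation of feature relevance.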
Computation of feature weights
Feature weights can be computed automatically as part of the training (see
parameter featureWeights in method kbsvm). In this case the function
getFeatureWeights is called automatically during training. When this
parameter is not set during training, feature weights can be computed after
training with the function getFeatureWeights. The function also supports
pruning of feature weights (see parameter weightLimit), allowing different
prunings to be tested without retraining.
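The pruning rule implied by the weightLimit parameter can be sketched in plain R, independent of KeBABS (the weight vector below is hypothetical): weights whose absolute value falls below the limit are dropped, only the remaining weights enter the pruned feature weights.

```r
## hypothetical feature weights
fw <- c(AA = 0.8, AC = -0.03, CA = 0.02, CC = -0.6)
## prune all weights with an absolute value below the limit
weightLimit <- 0.05
prunedFw <- fw[abs(fw) >= weightLimit]
names(prunedFw)    # "AA" "CC"
```

Because pruning is applied to the computed weights, trying a different limit only requires a new call to getFeatureWeights, not a new training run.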
Usage of feature weights
Feature weights are used during prediction to speed up the prediction
process. Prediction via feature weights is performed in KeBABS whenever
feature weights are available in the model. When feature weights are not
available, or for multiclass prediction, KeBABS defaults to the native
prediction of the SVM used during training.
Feature weights are also used during generation of prediction profiles (see
getPredictionProfile). While the feature weights reflect the general
relevance of features, the prediction profiles generated from them for a
given set of sequences show the relevance of single sequence positions in
the individual sequences according to the given learning task.
Feature weights for position-dependent kernels

For position-dependent kernels the generation of feature weights is not
possible during training. In this case the featureWeights slot in the model
contains a data representation that allows simple computation of feature
weights during prediction or during generation of prediction profiles.
Value

Upon successful completion, the function returns the feature weights as a
numeric vector. For quadratic kernels a matrix of feature weights is
returned, giving the feature weights for pairs of features. In the
multiclass case the function returns the feature weights of the pairwise
SVMs as a list of numeric vectors (or matrices for quadratic kernels).
Author(s)

Johannes Palme <[email protected]>
References

J. Palme, S. Hochreiter, and U. Bodenhofer (2015) KeBABS: an R package for
kernel-based analysis of biological sequences. Bioinformatics,
31(15):2574-2576. DOI: 10.1093/bioinformatics/btv176.
Examples

## standard method to create feature weights automatically during training
## model <- kbsvm( .... , featureWeights="yes", .....)

## this example describes the case where feature weights were not created
## during training but should be added later to the model

## load example sequences and select a small set of sequences
## to speed up training for demonstration purpose
data(TFBS)
## create sample indices of training and test subset
train <- sample(1:length(yFB), 200)
test <- c(1:length(yFB))[-train]
## determine all labels
allLabels <- unique(yFB)
## create a kernel object
gappyK1M4 <- gappyPairKernel(k=1, m=4)
## model is trained with creation of feature weights
model <- kbsvm(enhancerFB[train], yFB[train], gappyK1M4,
               pkg="LiblineaR", svm="C-svc", cost=20)
## feature weights included in model
featureWeights(model)

## Not run:
## model is originally trained without creation of feature weights
model <- kbsvm(enhancerFB[train], yFB[train], gappyK1M4,
               pkg="LiblineaR", svm="C-svc", cost=20, featureWeights="no")
## no feature weights included in model
featureWeights(model)
## later, after training, add feature weights and model offset
## to the KeBABS model
featureWeights(model) <- getFeatureWeights(model)
modelOffset(model) <- getSVMSlotValue("b", model)
## show a part of the feature weights and the model offset
featureWeights(model)[1:7]
modelOffset(model)
## another scenario for getFeatureWeights is to test the performance
## behavior of different prunings of the feature weights
## show histogram of full feature weights
hist(featureWeights(model), breaks=30)
## show number of features
length(featureWeights(model))
## first predict with full feature weights to see how performance changes
## through pruning; when feature weights are included in the model,
## prediction is always performed with the feature weights
pred <- predict(model, enhancerFB[test])
evaluatePrediction(pred, yFB[test], allLabels=allLabels)
## add feature weights with pruning to absolute values larger than 0.6
## model offset was assigned above and is not impacted by pruning
featureWeights(model) <- getFeatureWeights(model, weightLimit=0.6)
## show histogram of pruned feature weights
hist(featureWeights(model), breaks=30)
## show reduced number of features
length(featureWeights(model))
## now predict with pruned feature weights
pred <- predict(model, enhancerFB, sel=test)
evaluatePrediction(pred, yFB[test], allLabels=allLabels)

## End(Not run)