predict-methods: KeBABS Prediction Methods
In kebabs: Kernel-Based Analysis Of Biological Sequences

Description Usage Arguments Details Value Author(s) References See Also Examples

predict response values for new biological sequences from a model trained with kbsvm

## S4 method for signature 'KBModel'
predict(object, x, predictionType = "response",
  sel = NULL, raw = FALSE, native = FALSE, predProfiles = FALSE,
  verbose = getOption("verbose"), ...)

`object`	model object of class `KBModel` created by `kbsvm`.
`x`	multiple biological sequences in the form of a `DNAStringSet`, `RNAStringSet`, `AAStringSet` (or as `BioVector`). Also a precomputed kernel matrix (see `getKernelMatrix` or a precomputed explicit representation (see `getExRep` can be used instead. The same type of input that was used for training the model should also be used for prediction. If the parameter `x` is missing the response is computed for the sequences used for SVM training.
`predictionType`	one character string of either "response", "probabilities" or "decision" which indicates the type of data returned by prediction: predicted response, class probabilities or decision values. Class probabilities can only be computed if a probability model was generated during the training (for details see parameter `probModel` in kbsvm). Default=`"response"`
`sel`	subset of indices into `x`. When this parameter is present the training is performed for the specified subset of samples only. Default=`integer(0)`
`raw`	when setting this boolean parameter to TRUE the prediction result is returned in raw form, i.e. in the SVM specific format. Default=FALSE
`native`	when setting this boolean parameter to TRUE the prediction is not preformed via feature weights in the KeBABS model but native in the SVM. Default=FALSE
`predProfiles`	when this boolean parameter is set to TRUE the prediction profiles are computed for the samples passed to `predict`. Default=FALSE
`verbose`	boolean value that indicates whether KeBABS should print additional messages showing the internal processing logic in a verbose manner. The default value depends on the R session verbosity option. Default=getOption("verbose")
`...`	additional parameters which are passed to SVM prediction transparently.

Prediction for KeBABS models

For the samples passed to the predict method the response (which corresponds to the predicted label in case of classification or the predicted target value in case of regression), the decision value (which is the value of decision function separating the classes in classification) or the class probability (probability for class membership in classification) is computed for the given model of class KBModel. (see also parameter predictionType). For sequence data this includes the generation of an explicit representation or kernel matrix dependent on the processing variant that was chosen for the training of the model. When feature weights were computed during training (see parameter featureWeights in kbsvm) the response is computed entirely in KeBABS via the feature weights in the model object. The prediction performance can be evaluated with the function evaluatePrediction.

If feature weights are not available in the model then native prediction is performed via the SVM which was used for training. The parameter native enforces native prediction even when feature weights are available. Instead of sequence data also a precomputed kernel matrix or a precomputed explicit representation can be passed to predict. Prediction via feature weights is not supported for kernel variants which do not support the generation of an explicit representation, e.g. the position dependent kernel variants.

Prediction with precomputed kernel matrix

When training was performed with a precomputed kernel matrix also in prediction a precomputed kernel matrix must be passed to the predict method. In contrast to the quadratic and symmetric kernel matrix used in training the kernel matrix for prediction is rectangular and contains the similarities of test samples (rows) against support vectors (columns). support vector indices can be read from the model with the accessor SVindex. Please not that these indices refer to the sample subset used in training. An example for training and prediction via precomputed kernel matrix is shown below.

Generation of prediction profiles

The parameter predProfiles controls whether prediction profiles (for details see getPredictionProfile) are generated during the prediction process for all predicted samples. They show the contribution of the individual sequence positions to the response value. For a subset of sequences prediction profiles can also be computed independent from predicition via the function getPredictionProfile.

predict.kbsvm: upon successful completion, dependent on the parameter predictionType the function returns either response values, decision values or probability values for class membership. When prediction profiles are also generated a list containing predictions and prediction profiles is passed back to the user.

Johannes Palme <kebabs@bioinf.jku.at>

http://www.bioinf.jku.at/software/kebabs

J. Palme, S. Hochreiter, and U. Bodenhofer (2015) KeBABS: an R package for kernel-based analysis of biological sequences. Bioinformatics, 31(15):2574-2576, 2015. DOI: 10.1093/bioinformatics/btv176.

KBModel, evaluatePrediction, kbsvm, getPredictionProfile, PredictionProfile

## load transcription factor binding site data
data(TFBS)
enhancerFB
## select 70% of the samples for training and the rest for test
train <- sample(1:length(enhancerFB), length(enhancerFB) * 0.7)
test <- c(1:length(enhancerFB))[-train]
## create the kernel object for gappy pair kernel with normalization
gappy <- gappyPairKernel(k=1, m=1)
## show details of kernel object
gappy

## run training with explicit representation
model <- kbsvm(x=enhancerFB[train], y=yFB[train], kernel=gappy,
               pkg="LiblineaR", svm="C-svc", cost=10)

## show feature weights in KeBABS model
featureWeights(model)[1:8]

## predict the test sequences
pred <- predict(model, enhancerFB[test])
evaluatePrediction(pred, yFB[test], allLabels=unique(yFB))
pred[1:10]

## output decision values instead
pred <- predict(model, enhancerFB[test], predictionType="decision")
pred[1:10]

## Not run: 
## example for training and prediction via precomputed kernel matrix

## compute quadratic kernel matrix of training samples
kmtrain <- getKernelMatrix(gappy, x=enhancerFB, selx=train)

## train model with kernel matrix
model <- kbsvm(x=kmtrain, y=yFB[train], kernel=gappy,
               pkg="e1071", svm="C-svc", cost=10)

## compute rectangular kernel matrix of test samples versus
## support vectors
kmtest <- getKernelMatrix(gappy, x=enhancerFB, y=enhancerFB,
                          selx=test, sely=train)

## predict with kernel matrix
pred <- predict(model, kmtest)
evaluatePrediction(pred, yFB[test], allLabels=unique(yFB))

## example for probability model generation during training

## compute probability model via Platt scaling during training
## and predict class membership probabilities
model <- kbsvm(x=enhancerFB[train], y=yFB[train], kernel=gappy,
               pkg="e1071", svm="C-svc", cost=10, probModel=TRUE)

## show parameters of the fitted probability model which are the parameters
## probA and probB for the fitted sigmoid function in case of classification
## and the value sigma of the fitted Laplacian in case of a regression
probabilityModel(model)

## predict class probabilities
prob <- predict(model, enhancerFB[test], predictionType="probabilities")
prob[1:10]

## End(Not run)