getPredictionProfile-methods: Calculation Of Predicition Profiles

Description Usage Arguments Details Value Author(s) References See Also Examples

Description

compute prediction profiles for a given set of biological sequences from a model trained with /codekbsvm

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
## S4 method for signature 'BioVector'
getPredictionProfile(object, kernel, featureWeights, b,
  svmIndex = 1, sel = NULL, weightLimit = .Machine$double.eps)

## S4 method for signature 'XStringSet'
getPredictionProfile(object, kernel, featureWeights, b,
  svmIndex = 1, sel = NULL, weightLimit = .Machine$double.eps)

## S4 method for signature 'XString'
getPredictionProfile(object, kernel, featureWeights, b,
  svmIndex = 1, sel = NULL, weightLimit = .Machine$double.eps)

Arguments

object

a single biological sequence in the form of an DNAString, RNAString or AAString or multiple biological sequences as DNAStringSet, RNAStringSet, AAStringSet (or as BioVector).

kernel

a sequence kernel object of class SequenceKernel.

featureWeights

a feature weights matrix retrieved from a KeBABS model with the accessor featureWeights.

b

model intercept from a KeBABS model.

svmIndex

integer value selecting one of the pairwise SVMs in case of pairwise multiclass classification. Default=1

sel

subset of indices into x as integer vector. When this parameter is present the prediction profiles are computed for the specified subset of samples only. Default=integer(0)

weightLimit

the feature weight limit is a single numeric value and allows pruning of feature weights. All feature weights with an absolute value below this limit are set to 0 and are not considered for the prediction profile computation. This parameter is only relevant when feature weights are calculated in KeBABS during training. Default=.Machine$double.eps

Details

With this method prediction profiles can be generated explicitely for a given set of sequences with a given model represented through its feature weights and the model intercept b. A single prediction profile shows for each position of the sequence the contribution of the patterns at this position to the decision value. The prediciion profile also includes the kernel object used for the generation of the profile and the seqence data.

A single profile or a pair can be plotted with method plot showing the relevance of sequence positions for the prediction. Please consider that patterns occuring at neighboring sequence positions are not statistically independent which means that the relevance of a specific position is not only determined by the patterns at this position but is also influenced by the neighborhood around this position. Prediction profiles can also be generated implicitely during predction for the predicted samples (see parameter predProfiles in predict).

Value

getPredictionProfile: upon successful completion, the function returns a set of prediction profiles for the sequences as class PredictionProfile.

Author(s)

Johannes Palme <kebabs@bioinf.jku.at>

References

http://www.bioinf.jku.at/software/kebabs

(Mahrenholz, 2011) – C.C. Mahrenholz, I.G. Abfalter, U. Bodenhofer, R. Volkmer, and S. Hochreiter. Complex networks govern coiled coil oligomerization - predicting and profiling by means of a machine learning approach.

(Bodenhofer, 2009) – U. Bodenhofer, K. Schwarzbauer, S. Ionescu, and S. Hochreiter. Modeling Position Specificity in Sequence Kernels by Fuzzy Equivalence Relations.

J. Palme, S. Hochreiter, and U. Bodenhofer (2015) KeBABS: an R package for kernel-based analysis of biological sequences. Bioinformatics, 31(15):2574-2576, 2015. DOI: 10.1093/bioinformatics/btv176.

See Also

PredictionProfile, predict, plot, featureWeights, getPredProfMixture

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
## set random generator seed to make the results of this example
## reproducable
set.seed(123)

## load coiled coil data
data(CCoil)
gappya <- gappyPairKernel(k=1,m=11, annSpec=TRUE)
model <- kbsvm(x=ccseq, y=as.numeric(yCC), kernel=gappya,
               pkg="e1071", svm="C-svc", cost=15)

## show feature weights
featureWeights(model)[,1:5]

## define two new sequences to be predicted
GCN4 <- AAStringSet(c("MKQLEDKVEELLSKNYHLENEVARLKKLV",
                      "MKQLEDKVEELLSKYYHTENEVARLKKLV"))
names(GCN4) <- c("GCN4wt", "GCN_N16Y,L19T")
## assign annotation metadata
annCharset <- annotationCharset(ccseq)
annot <- c("abcdefgabcdefgabcdefgabcdefga",
           "abcdefgabcdefgabcdefgabcdefga")
annotationMetadata(GCN4, annCharset=annCharset) <- annot

## compute prediction profiles
predProf <- getPredictionProfile(GCN4, gappya,
           featureWeights(model), modelOffset(model))

## show prediction profiles
predProf

## plot prediction profile of first aa sequence
plot(predProf, sel=1, ylim=c(-0.4, 0.2), heptads=TRUE, annotate=TRUE)

## plot prediction profile of both aa sequences
plot(predProf, sel=c(1,2), ylim=c(-0.4, 0.2), heptads=TRUE, annotate=TRUE)

## prediction profiles can also be generated during prediction
## when setting the parameter predProf to TRUE
## plotting longer sequences to pdf is shown in the examples for the
## plot function

kebabs documentation built on Nov. 8, 2020, 7:38 p.m.