gkmsvm_trainCV: Training the SVM model, using repeated CV to tune parameter C...

gkmsvm_trainCV    R Documentation

Training the SVM model, using repeated CV to tune parameter C and plot ROC curves

Description

Using the kernel matrix created by 'gkmsvm_kernel', this function trains the SVM classifier. It uses repeated cross-validation (CV) to find the optimal value of the SVM parameter C, and also generates ROC and precision-recall (PRC) curves.

Usage

gkmsvm_trainCV(kernelfn, posfn, negfn, svmfnprfx=NA, 
  nCV=5, nrepeat=1, cv=NA, Type="C-svc", C=1, shrinking=FALSE, 
  showPlots=TRUE, outputPDFfn=NA,  outputCVpredfn=NA, outputROCfn=NA, ...)

Arguments

kernelfn

kernel matrix file name

posfn

positive sequences file name

negfn

negative sequences file name

svmfnprfx

(optional) output SVM model file name prefix

nCV

(optional) number of CV folds (default=5)

nrepeat

(optional) number of times the CV is repeated (default=1)

cv

(optional) CV group labels. An array of length (npos+nneg), containing the CV group number (between 1 and nCV) for each sequence

Type

(optional) SVM type (default='C-svc'), see 'kernlab' documentation for more details.

C

(optional) a vector of all values of C (the SVM parameter) to be tested (default=1); see 'kernlab' documentation for more details.

shrinking

(optional) shrinking parameter for kernlab (default=FALSE); see 'kernlab' documentation for more details.

showPlots

generate plots (default=TRUE)

outputPDFfn

filename for output PDF, default=NA (no PDF output)

outputCVpredfn

filename for output cvpred (predicted CV values), default=NA (no output)

outputROCfn

filename for output auROC (Area Under an ROC Curve) and auPRC (Area Under the Precision Recall Curve) values, default=NA (no output)

...

(optional) additional SVM parameters; see 'kernlab' documentation for more details.
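
As a rough illustration of the 'cv' and 'C' arguments, the sketch below builds a vector of CV group labels and a grid of candidate C values using base R only. The sequence counts (npos, nneg), the seed, the variable name Cgrid and the particular grid values are all assumptions made for illustration; in practice the counts must match the sequences in posfn and negfn. The ordering of 'cv' (positives first, then negatives) is also an assumption based on the (npos+nneg) wording and should be checked against the package.

```r
# Hypothetical sequence counts; in practice these must match the FASTA files.
npos <- 120
nneg <- 100
nCV  <- 5

set.seed(1)  # arbitrary seed, only to make this sketch reproducible

# Balanced fold labels in 1..nCV, shuffled separately within each class
# so that every fold receives a similar share of positives and negatives.
cv_pos <- sample(rep(seq_len(nCV), length.out = npos))
cv_neg <- sample(rep(seq_len(nCV), length.out = nneg))
cv <- c(cv_pos, cv_neg)  # assumption: positives first, then negatives

# A log-spaced grid of candidate C values (an illustrative choice).
Cgrid <- 10^seq(-2, 2, by = 1)

# These could then be passed on, for example:
# gkmsvm_trainCV(kernelfn, posfn, negfn, svmfnprfx,
#                nCV = nCV, cv = cv, C = Cgrid)
```

Shuffling within each class keeps the folds stratified, which is one reasonable choice when the positive and negative sets differ in size; a plain random assignment would also satisfy the documented format.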

Details

Trains the SVM classifier and generates two files: [svmfnprfx]_svalpha.out, containing the SVM alphas, and [svmfnprfx]_svseq.fa, containing the corresponding support vector (SV) sequences.
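
The naming scheme for the two output files can be sketched in base R as follows. The prefix value is taken from the Examples section; the variable names (alpha_fn, svseq_fn, found) are hypothetical helpers, and the check only reports whether the files are present in the current working directory.

```r
svmfnprfx <- "test_svmtrain"  # same prefix as used in the Examples section

# File names derived from the prefix, following the pattern described above.
alpha_fn <- paste0(svmfnprfx, "_svalpha.out")  # SVM alphas
svseq_fn <- paste0(svmfnprfx, "_svseq.fa")     # support vector sequences

# Report which of the expected files currently exist.
found <- file.exists(c(alpha_fn, svseq_fn))
names(found) <- c(alpha_fn, svseq_fn)
print(found)
```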

Author(s)

Mahmoud Ghandi

Examples

  #Input file names:  
  posfn= 'test_positives.fa'   #positive set (FASTA format)
  negfn= 'test_negatives.fa'   #negative set (FASTA format)
  testfn= 'test_testset.fa'    #test set (FASTA format)
  
  #Output file names:  
  kernelfn= 'test_kernel.txt' #kernel matrix
  svmfnprfx= 'test_svmtrain'  #SVM files 
  outfn =   'output.txt'      #output scores for sequences in the test set       

#  gkmsvm_kernel(posfn, negfn, kernelfn);                #computes kernel 
#  cvres = gkmsvm_trainCV(kernelfn,posfn, negfn, svmfnprfx, 
#      outputPDFfn='ROC.pdf', outputCVpredfn='cvpred.out');    
#      #trains SVM, plots ROC and PRC curves, and outputs model predictions.
#  gkmsvm_classify(testfn, svmfnprfx, outfn);            #scores test sequences 

gkmSVM documentation built on Aug. 21, 2023, 1:06 a.m.