This function builds a classification model using one of three methods: support vector machine (SVM), random forest (RF), or adaptive boosting (AdaBoost).
buildTrainModel(ASdata, chooseNum = 1000, proTrain = 2/3, proTest = 1/3,
                ASlength = 0, classifier = "rf", use.all = FALSE)
ASdata
A data frame containing the splice-site coordinates, the class label, and the sequence around the splice sites. The "type" column holds the class labels: "AltA", "AltD", "ES", and "IR".
chooseNum
An integer giving the number of AS events sampled from each AS type to build the classification model (default: 1000).
proTrain
The proportion of the data randomly sampled as the training set (default: 2/3).
proTest
The proportion of the data randomly sampled as the testing set (default: 1/3).
ASlength
A length threshold; AS events shorter than this value are trimmed from the data (default: 0).
classifier
A string naming the classification method. Must be one of "svm", "rf", or "adaboost" (not case sensitive; default: "rf").
use.all
Logical; whether to use the entire alternative splicing dataset to build the classification model (default: FALSE).
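To make the roles of chooseNum, proTrain and proTest more concrete, the following base R sketch samples a fixed number of events per AS type and then splits them into training and testing sets. It is only an illustration under stated assumptions, not the package's implementation: the helper name split_by_type and the toy data frame are hypothetical, and the sketch assumes the testing set is simply the remainder after the training share is taken.

```r
# Hypothetical helper for illustration; not part of AStrap.
split_by_type <- function(ASdata, chooseNum = 1000, proTrain = 2/3) {
  # Group row indices by AS type so that every class is sampled separately
  idx_by_type <- split(seq_len(nrow(ASdata)), ASdata$type)
  train_idx <- integer(0)
  test_idx <- integer(0)
  for (idx in idx_by_type) {
    # Keep at most chooseNum randomly chosen events of this type
    n_keep <- min(chooseNum, length(idx))
    chosen <- idx[sample.int(length(idx), n_keep)]
    # Assumption: the first proTrain share is used for training,
    # and the remainder is used for testing
    n_train <- round(n_keep * proTrain)
    train_idx <- c(train_idx, chosen[seq_len(n_train)])
    test_idx <- c(test_idx, chosen[seq_len(n_keep - n_train) + n_train])
  }
  list(trainset = ASdata[train_idx, , drop = FALSE],
       testset = ASdata[test_idx, , drop = FALSE])
}

# Toy data: 10 events for each of the four AS types
set.seed(1)
toy <- data.frame(id = 1:40,
                  type = rep(c("AltA", "AltD", "ES", "IR"), each = 10),
                  stringsAsFactors = FALSE)
parts <- split_by_type(toy, chooseNum = 6, proTrain = 2/3)
table(parts$trainset$type)
table(parts$testset$type)
```

Sampling within each type keeps the classes balanced, which is presumably why chooseNum is defined per AS type rather than for the dataset as a whole.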
This function returns a list with eight elements:
trainset: the training data set.
testset: the testing data set.
model: the fitted model.
predict: the predicted classification results.
accuracy: the prediction accuracy.
confusion: the confusion matrix of the prediction.
evaluate: the evaluation matrix of the classification, including precision, specificity (sp), recall, and F1 score (f1).
ROC: the ROC curve.
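As a rough guide to what the accuracy, confusion and evaluate elements represent, the sketch below derives per-class precision, specificity, recall and F1 from a confusion matrix using the standard definitions. It is not the package's code: the helper name class_metrics and the toy labels are hypothetical, and the sketch assumes rows hold predicted classes and columns hold true classes, which may differ from the orientation AStrap uses.

```r
# Hypothetical helper for illustration; not part of AStrap.
class_metrics <- function(confusion) {
  # Assumption: rows = predicted class, columns = true class
  total <- sum(confusion)
  tp <- diag(confusion)            # correctly predicted events per class
  fp <- rowSums(confusion) - tp    # predicted as this class, but another class
  fn <- colSums(confusion) - tp    # this class, but predicted as another class
  tn <- total - tp - fp - fn       # everything else
  precision <- tp / (tp + fp)
  recall <- tp / (tp + fn)
  sp <- tn / (tn + fp)             # specificity
  f1 <- 2 * precision * recall / (precision + recall)
  list(accuracy = sum(tp) / total,
       evaluate = cbind(precision, sp, recall, f1))
}

# Toy predictions for two AS types
predicted <- factor(c("ES", "ES", "IR", "IR", "ES", "IR"), levels = c("ES", "IR"))
actual    <- factor(c("ES", "IR", "IR", "IR", "ES", "ES"), levels = c("ES", "IR"))
confusion <- table(predicted, actual)
res <- class_metrics(confusion)
res$accuracy
res$evaluate
```

Each class is treated one-vs-rest, so a four-class model such as one trained on "AltA", "AltD", "ES" and "IR" yields one row of metrics per AS type.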
##Loading example alternative splicing data
path <- system.file("extdata","sample_riceAS.txt",package = "AStrap")
rice_ASdata <- read.table(path, sep = "\t", header = TRUE, stringsAsFactors = FALSE)
head(rice_ASdata)
##Loading genome using the BSgenome package
library("BSgenome.Osativa.MSU.MSU7")
rice_ASdata <- extract_IsoSeq_ge(rice_ASdata, Osativa)
names(rice_ASdata)
##Classification model building based on random forest method
library(randomForest)
library(ROCR)
library(ggplot2)
model <- buildTrainModel(rice_ASdata, chooseNum = 100,
                         proTrain = 2/3, proTest = 1/3, ASlength = 0,
                         classifier = "rf", use.all = FALSE)
##Performance evaluation
names(model)
model$evaluate
model$confusion
model$accuracy
##Or classification model building based on SVM method
library(e1071)
library(ROCR)
library(ggplot2)
model <- buildTrainModel(rice_ASdata, chooseNum = 100,
                         proTrain = 2/3, proTest = 1/3, ASlength = 0,
                         classifier = "svm", use.all = FALSE)
##Or classification model building based on AdaBoost method
library(adabag)
library(ROCR)
library(ggplot2)
model <- buildTrainModel(rice_ASdata, chooseNum = 100,
                         proTrain = 2/3, proTest = 1/3, ASlength = 0,
                         classifier = "adaboost", use.all = FALSE)