Wrapper function for calculating classification estimates using pre-defined data partitioning sets (valipars and trainind). This function works with two types of classifiers. First, generic classifiers that fulfil R standards for defining predictive techniques, such as the ones available in packages like MASS, e1071 or randomForest, and nlda, are normally handled by accest: the name of the function (clmeth in the accest call) must be accompanied by an S3 predict method; the latter should return a list with component 'class' (hard classification) and, if possible, 'prob' or 'posterior' for class probabilities. If the algorithm doesn't fulfil these requirements, two options can be adopted: 1) define the algorithm explicitly so that it meets R standards, or 2) define a customised function that returns the necessary information. The second ('quick and dirty') approach is illustrated in an example given below. Unless the classifier can only cope with two-class tasks, this function can handle problems of any complexity. Three types of estimates are given for each replication: accuracy, so-called margin and AUC (see details). Data input can be in the form of a data matrix + class vector, follow the classic formula type, or be derived from dat.sel1.
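The first option (a fitting function paired with an S3 predict method) can be sketched in base R as follows. The classifier name centroid and its internals are hypothetical and serve only to illustrate the convention described above; the pseudo-probabilities are an arbitrary choice for the sketch, and whether accest accepts this exact object has not been verified here.

```r
## Hypothetical nearest-centroid classifier (not part of any package).
## Fitting function: stores one vector of feature means per class.
centroid <- function(x, grouping, ...) {
  x <- as.matrix(x)
  grouping <- as.factor(grouping)
  lev <- levels(grouping)
  centres <- do.call(rbind, lapply(lev, function(k) {
    colMeans(x[grouping == k, , drop = FALSE])
  }))
  rownames(centres) <- lev
  structure(list(centres = centres, lev = lev), class = "centroid")
}

## S3 predict method: returns a list with 'class' (hard classification)
## and 'posterior' (class probabilities), as the description requires.
predict.centroid <- function(object, newdata, ...) {
  newdata <- as.matrix(newdata)
  ## Squared Euclidean distance from each sample to each class centre
  d2 <- sapply(object$lev, function(k) {
    rowSums(sweep(newdata, 2, object$centres[k, ])^2)
  })
  d2 <- matrix(d2, nrow = nrow(newdata), dimnames = list(NULL, object$lev))
  ## Pseudo-probabilities: softmax of the negative distances (an assumption)
  w <- exp(-(d2 - apply(d2, 1, min)))
  posterior <- w / rowSums(w)
  ## Hard class: the nearest centre
  cls <- factor(object$lev[max.col(-d2, ties.method = "first")],
                levels = object$lev)
  list(class = cls, posterior = posterior)
}

fit <- centroid(iris[, 1:4], iris$Species)
out <- predict(fit, iris[, 1:4])
table(truth = iris$Species, predicted = out$class)
```

Under this convention the function name would be passed as clmeth, and the predict method would be found through the usual S3 dispatch.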
accest(...)
## Default S3 method:
accest(dat, cl, clmeth, pars = NULL, tr.idx = NULL, verb=TRUE, clmpi=NULL, seed=NULL, ...)
## S3 method for class 'formula'
accest(formula, data = NULL, ..., subset, na.action = na.omit)
## S3 method for class 'dlist'
accest(dlist, clmeth,pars = NULL, tr.idx = NULL, ...)
formula |
A formula of the form |
data |
Data frame from which variables specified in |
dlist |
A matrix or data frame containing the explanatory variables if no formula is given as the principal argument. |
dat |
A matrix or data frame containing the explanatory variables if no formula is given as the principal argument. |
cl |
A factor specifying the class for each observation if no formula principal argument is given. |
clmeth |
Classifier function.
For details, see |
pars |
A list of parameters used by the resampling method such as
Leave-one-out cross-validation, Cross-validation,
Bootstrap and Randomised validation (holdout).
See |
tr.idx |
User defined index of training samples. Can be generated by |
verb |
Should iterations be printed out? |
clmpi |
snow cluster information |
seed |
Seed. |
... |
Additional parameters to be passed to |
subset |
Optional vector, specifying a subset of observations to be used. |
na.action |
Function which indicates what should happen when the data
contains |
See xxxx for common details.
An object of class accest, including the components:
clmeth |
Classification method used. |
acc |
Average accuracy. |
acc.iter |
Accuracy at each iteration. |
acc.std |
Standard deviation of accuracy. |
mar |
Average predictive margin. |
mar.iter |
Predictive margin of each iteration. |
auc |
Average area under the receiver operating characteristic curve (AUC). |
auc.iter |
AUC of each iteration. |
sampling |
Sampling scheme used. |
niter |
Number of iterations. |
nreps |
Number of replications at each iteration. |
acc.boot |
Detailed bootstrap accuracy estimates when bootstrap validation method is employed. |
argfct |
Arguments passed to the classifier. |
pred.all |
For each iteration, list of the fold/bootstrap id and the true and predicted classes. |
cl.task |
Discrimination task. |
mod |
List of information returned by the user-defined classifier function. |
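To make the relationship between the per-iteration and averaged components concrete, here is a small base R sketch. The toy data are made up, and the AUC is computed with the standard rank-sum identity, which may differ from what accest does internally. The margin is omitted because its definition is not given in this page.

```r
## Toy two-class problem (made-up values, for illustration only)
truth <- factor(c("a", "a", "a", "b", "b", "b"))

## Predicted probability of class "b" for two iterations
prob.b <- list(c(0.1, 0.3, 0.6, 0.8, 0.7, 0.9),
               c(0.2, 0.6, 0.4, 0.9, 0.3, 0.6))

## Hard predictions obtained by thresholding at 0.5
pred.iter <- lapply(prob.b, function(p) {
  factor(ifelse(p > 0.5, "b", "a"), levels = levels(truth))
})

## Analogues of acc.iter, acc and acc.std
acc.iter <- sapply(pred.iter, function(p) mean(p == truth))
acc <- mean(acc.iter)
acc.std <- sd(acc.iter)

## AUC from ranks (Mann-Whitney identity); ties receive average ranks
auc_rank <- function(score, positive) {
  r <- rank(score)
  n1 <- sum(positive)
  n0 <- sum(!positive)
  (sum(r[positive]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

## Analogues of auc.iter and auc
auc.iter <- sapply(prob.b, auc_rank, positive = truth == "b")
auc <- mean(auc.iter)

print(c(acc = acc, acc.std = acc.std, auc = auc))
```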
David Enot dle@aber.ac.uk and Wanchang Lin wll@aber.ac.uk
library(randomForest)  ## provides randomForest(), used throughout this example
## -----------------------------------------------------------------
## simple customised function
## sameasrf simply reproduces the RF modelling task
sameasrf <- function(data,...){
  dots <- list(...)
  ## Build RF model and predict dat.te
  mod <- randomForest(data$tr,data$cl,...)
  ## Soft predictions (optional if ROC/margin analyses required)
  prob <- predict(mod,data$te,type="vote")
  ## Hard predictions
  pred <- predict(mod,data$te)
  ## For illustration, mod does not contain anything
  list(mod=NULL,pred=pred,prob=prob,arg=dots)
}
## -----------------------------------------------------------------
## compare accest with randomForest
## and sameasrf
data(iris)
dat=as.matrix(iris[,1:4])
cl=as.factor(iris[,5])
pars <- valipars(sampling = "boot",niter = 2, nreps=10)
tr.idx <- trainind(iris$Species,pars)
set.seed(71)
acc.1 <- accest(dat,cl, clmeth = "sameasrf",
pars = pars,tr.idx = tr.idx,ntree = 200)
summary(acc.1)
set.seed(71)
acc.2 <- accest(dat,cl, clmeth = "randomForest",
pars = pars,tr.idx = tr.idx,ntree = 200)
summary(acc.2)
### compare acc.1 and acc.2 bootstrap error estimates
print(acc.1$acc.boot-acc.2$acc.boot)
#########################################
## Try formula type
set.seed(71)
acc.3 <- accest(Species~., data = iris, clmeth = "randomForest",
pars = pars,tr.idx = tr.idx,ntree = 200)
summary(acc.3)
## Try dlist type from dat.sel1
set.seed(71)
dat2=dat.sel1(dat,cl,pars=pars)
acc.4 <- accest(dat2[[1]], clmeth = "randomForest",
pars = pars,tr.idx = tr.idx,ntree = 200)
summary(acc.4)