In abcrf: Approximate Bayesian Computation via Random Forests

abcrf

R Documentation

Create an ABC-RF object: a classification random forest built from a reference table for ABC model choice

Description

abcrf constructs a classification random forest from a reference table for ABC model choice. The reference table (i.e. the dataset processed by this package) contains one column holding the index of the models to be compared, plus additional columns holding the values of the simulated summary statistics.
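As a rough illustration of the layout described above, the following sketch builds a toy reference table in base R. The column names `stat1` and `stat2` and the random values are hypothetical placeholders, not output of a real simulator.

```r
# Toy reference table: one row per simulated dataset.
# All names and values here are illustrative assumptions.
set.seed(1)
n <- 6
ref_table <- data.frame(
  modindex = factor(rep(c("1", "2", "3"), each = 2)),  # model index column
  stat1 = rnorm(n),                                    # simulated summary statistic
  stat2 = runif(n)                                     # simulated summary statistic
)
str(ref_table)
```

With such a table, a formula like `modindex ~ .` would put the model index on the left and every remaining column on the right.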

Usage

## S3 method for class 'formula'
abcrf(formula, data, group = list(), lda = TRUE, ntree = 500,
      sampsize = min(1e5, nrow(data)), paral = FALSE,
      ncores = if (paral) max(detectCores() - 1, 1) else 1, ...)

Arguments

`formula`	a formula: left of ~, variable representing the model index; right of ~, summary statistics of the reference table.
`data`	a data frame containing the reference table.
`group`	a list containing at least 2 groups of model(s) on which the model choice will be performed. The groups need not form a partition: one or more models can be left out of every element of the list. By default, no grouping is done.
`lda`	should LDA scores be added to the list of summary statistics?
`ntree`	number of trees to grow in the forest, by default 500 trees.
`sampsize`	size of the sample drawn from the reference table to grow each tree of the classification forest, by default the minimum of the number of rows of the reference table and 100,000.
`paral`	a boolean that indicates if the calculations of the classification random forest (forest used to assign a model to the observed dataset) should be parallelized.
`ncores`	the number of CPU cores to use. If paral=TRUE, the default is the number of detected CPU cores minus 1. If ncores is not specified and `detectCores` fails to detect the number of CPU cores, 1 core is used.
`...`	additional arguments to be passed on to `ranger`, which is used to construct the classification random forest that predicts the selected model.
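The defaults for `sampsize` and `ncores` can be reproduced by hand, which may help when choosing explicit values. The sketch below uses only base R and the `parallel` package; `ref_table` is a hypothetical stand-in for a reference table, and the `NA` check is an assumption about how the documented fallback to 1 core could be written, not the package's own code.

```r
library(parallel)

# Hypothetical stand-in for a reference table with 250 simulations.
ref_table <- data.frame(
  modindex = factor(rep(c("1", "2"), length.out = 250)),
  stat1 = rnorm(250)
)

# Default sampsize as given in Usage: min(1e5, nrow(data)).
sampsize <- min(1e5, nrow(ref_table))

# Default ncores when paral = TRUE: detected cores minus 1, at least 1.
# detectCores() can return NA, in which case this sketch falls back to 1.
paral <- TRUE
detected <- detectCores()
ncores <- if (paral && !is.na(detected)) max(detected - 1, 1) else 1

sampsize
ncores
```

Passing `ncores` explicitly avoids depending on what `detectCores` reports on a given machine.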

Value

An object of class abcrf, which is a list with the following components:

`call`	the original call to `abcrf`,
`lda`	a boolean indicating if LDA scores have been added to the list of summary statistics,
`formula`	the formula used to construct the classification random forest,
`group`	a list containing the groups of model(s) used. This list is empty if no grouping has been performed,
`model.rf`	an object of class `ranger` containing the forest trained on the reference table,
`model.lda`	an object of class `lda` containing the Linear Discriminant Analysis based on the reference table,
`prior.err`	prior error rates of model selection on the reference table, estimated with the "out-of-bag" error of the forest.
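The components listed above can be inspected with ordinary list access. The following sketch assumes the abcrf package is installed and reuses the `snp` dataset from the Examples section; the object name `fit` and the vector `expected` are illustrative.

```r
# Component names as documented in the Value section.
expected <- c("call", "lda", "formula", "group",
              "model.rf", "model.lda", "prior.err")

# Only run the fit when the package is available.
if (requireNamespace("abcrf", quietly = TRUE)) {
  library(abcrf)
  data(snp)
  data1 <- data.frame(modindex = snp$modindex[1:500], snp$sumsta[1:500, ])
  fit <- abcrf(modindex ~ ., data = data1, ntree = 100)

  # Which documented components are present in the fitted object?
  print(expected %in% names(fit))
  # Out-of-bag estimate of the prior error rate.
  print(fit$prior.err)
  # Documented as an empty list when no grouping is requested.
  print(length(fit$group))
}
```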

References

Pudlo P., Marin J.-M., Estoup A., Cornuet J.-M., Gautier M. and Robert C. P. (2016) Reliable ABC model choice via random forests. Bioinformatics. doi: 10.1093/bioinformatics/btv684

Estoup A., Raynal L., Verdu P. and Marin J.-M. (2018) Model choice using Approximate Bayesian Computation and Random Forests: analyses based on model grouping to make inferences about the genetic history of Pygmy human populations. Journal de la Société Française de Statistique. http://journal-sfds.fr/article/view/709

Examples

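library(abcrf)
# Load the example SNP reference table shipped with the package
data(snp)
# Keep only the first 500 simulations
modindex <- snp$modindex[1:500]
sumsta <- snp$sumsta[1:500,]
data1 <- data.frame(modindex, sumsta)
# Forest built on the individual models
model.rf1 <- abcrf(modindex~., data = data1, ntree=100)
model.rf1
# Forest built on two groups: models 1 and 2 together, versus model 3
model.rf2 <- abcrf(modindex~., data = data1, group = list(c("1","2"),"3"), ntree=100)
model.rf2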
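
The `group` argument in the second call can be pictured as a relabeling of the model index. The base R sketch below is an illustration of that idea only; it is not the package's internal code, and the helper `group_of` and the integer group labels are hypothetical.

```r
# Toy model index and the grouping used in the example above.
modindex <- factor(c("1", "2", "3", "1", "3", "2"))
group <- list(c("1", "2"), "3")

# Return the position of the group containing a model label,
# or NA when the label belongs to no group (such a model is left out).
group_of <- function(label) {
  hit <- which(vapply(group, function(g) label %in% g, logical(1)))
  if (length(hit) == 0) NA_integer_ else hit[1]
}

grouped <- vapply(as.character(modindex), group_of, integer(1),
                  USE.NAMES = FALSE)
grouped
```

Here models "1" and "2" fall into the first group and model "3" into the second, so the classification is between two groups rather than three models.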

abcrf documentation built on Aug. 9, 2022, 5:07 p.m.