select.mpt: Model Selection with MPTinR
In MPTinR: Analyze Multinomial Processing Tree Models


This function performs model selection for results produced by MPTinR's fit.mpt. It takes multiple results from fit.mpt as a list and returns a data.frame comparing the models using various model selection criteria (e.g., FIA) as well as AIC and BIC weights. For multi-dataset fits, select.mpt additionally counts how often each model provided the best fit.

select.mpt(mpt.results, output = c("standard", "full"), round.digit = 6, dataset)

`mpt.results`	A `list` containing results from `fit.mpt`.
`output`	`"standard"` or `"full"`. If `"full"`, additionally returns the original FIA, AIC, and BIC values and, for multi-individual fits, compares the model-selection criteria for the aggregated data.
`round.digit`	Integer specifying the decimal place to which the results are rounded. Default is 6. It is also used for rounding FIA, AIC, and BIC values before counting the best-fitting model for each individual dataset.
`dataset`	Integer vector specifying the dataset(s) to which the individual comparison should be restricted. Aggregated results are not displayed if this argument is present.

select.mpt is the second major function of MPTinR, next to fit.mpt. It takes a list of results produced by fit.mpt and returns a data.frame comparing the models using the information criteria obtained by fit.mpt. That is, if FIA was not obtained for the models, select.mpt only uses AIC and BIC. We strongly recommend using FIA for model selection (see e.g., Gruenwald, 2000).

The output follows the same principle for all information criteria. The lowest value is taken as the reference value, and the difference from this value (i.e., the delta) is reported for each model (e.g., delta.FIA). To additionally obtain the original values, set output to "full".

For AIC and BIC, AIC and BIC weights are reported as wAIC and wBIC (Wagenmakers & Farrell, 2004).
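
The delta and weight computations described above can be sketched in base R. This is a minimal illustration and not the code used inside select.mpt: the AIC values and model names are made up, and the weight formula is the one given by Wagenmakers & Farrell (2004).

```r
# Hypothetical AIC values for three models (illustrative numbers only)
aic <- c(full = 204.3, r.equal = 201.1, c.equal = 210.8)

# Delta: difference of each model from the lowest (reference) value
delta.aic <- aic - min(aic)

# Akaike weights: relative likelihoods normalized to sum to 1
rel.lik <- exp(-0.5 * delta.aic)
w.aic <- rel.lik / sum(rel.lik)

round(data.frame(delta.AIC = delta.aic, wAIC = w.aic), 4)
```

The same steps apply to BIC values to obtain BIC weights.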

For multi-individual fits, select.mpt additionally returns how often each model provided the best fit (e.g., FIA.best). Values are rounded before the best-fitting model is determined. Note that ties are possible, in which case two or more models count as providing the best fit for the same dataset. Furthermore, if output is "standard", only results for the summed information criteria are returned (indicated by the postfix .sum). To obtain model selection results for the aggregated data (indicated by the postfix .aggregated), set output to "full".
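
How rounding can produce ties when counting the best-fitting model may be easier to see in a small sketch. The matrix of AIC values below is hypothetical (rows are individual datasets, columns are models), and the counting logic is an assumption about the general approach, not the internals of select.mpt.

```r
# Hypothetical AIC values: rows are individual datasets, columns are models
aic <- rbind(
  c(m1 = 100.1234561, m2 = 100.1234564, m3 = 105.2),
  c(m1 = 98.7,        m2 = 97.3,        m3 = 99.9),
  c(m1 = 110.4,       m2 = 111.0,       m3 = 109.8)
)

# Round first, mirroring the role of the round.digit argument
round.digit <- 6
rounded <- round(aic, round.digit)

# A model counts as best for a dataset if it equals the row minimum after rounding
is.best <- rounded == apply(rounded, 1, min)
AIC.best <- colSums(is.best)
AIC.best
```

In the first dataset, m1 and m2 differ only beyond the sixth decimal place, so both are counted as best and the counts add up to more than the number of datasets.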

select.mpt checks whether the data in all results returned from fit.mpt are equal. If they are not, model selection cannot be performed.

Note that the values in the returned data.frame are rounded to the decimal place given by round.digit.

A data.frame containing the model selection values:
model: Name or number of model (names are either taken from mpt.results or obtained via match.call).
n.parameters: Number of parameters for each model.
G.Squared: G.Squared values of the model (from summed fits for multiple datasets).
df: df values of the model (from summed fits for multiple datasets).
p.value: p values of the model (from summed fits for multiple datasets).
p.smaller.05: How many of the individual data sets have p < .05 (for multiple datasets only).
For each information criterion X (i.e., FIA, AIC, BIC), the following columns are returned:
delta.X: The difference from the reference model (i.e., the model with the lowest value).
X.best: How often each model provided the best fit (multi-individual fits only).
X: The absolute value of the criterion.
wX: The weights (AIC and BIC only).
For multi-individual fits, the postfix indicates whether the results refer to the summed information criteria from the individual fits (.sum) or to the information criteria from the aggregated data (.aggregated).

As of March 2015, BIC and FIA are calculated anew if the results are displayed for multiple datasets, because BIC and FIA cannot be summed directly across participants due to the log(n) terms in their formulas (whereas AIC can be summed). Instead, one first needs to sum the G^2 values, n, and the number of parameters, and only then can BIC and FIA be calculated from those summed values.
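
The effect of the log(n) term can be illustrated with a short sketch. It assumes that BIC has the form G^2 + k * log(n), with k the number of free parameters, and all numbers are made up for illustration. FIA is omitted because its penalty additionally depends on a model-specific complexity term.

```r
# Hypothetical per-participant values (illustrative only)
G2 <- c(3.2, 5.1, 2.7)   # G^2 per participant
n  <- c(120, 120, 120)   # number of observations per participant
k  <- c(4, 4, 4)         # number of free parameters per participant

# Assumed form of the criterion: BIC = G^2 + k * log(n)
bic.naive  <- sum(G2 + k * log(n))              # summing individual BICs
bic.recalc <- sum(G2) + sum(k) * log(sum(n))    # BIC from summed G^2, k, and n

c(naive.sum = bic.naive, recalculated = bic.recalc)
```

Under this assumption the two values differ by sum(k) * log(3), because log(sum(n)) is larger than each individual log(n). For AIC, the penalty 2 * k contains no log(n) term, so summing individual values and recalculating from sums give the same result.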

If any of the models was fitted with fit.aggregated = FALSE, no aggregated results are presented.

Henrik Singmann

Gruenwald, P.D. (2000). Model selection based on minimum description length. Journal of Mathematical Psychology, 44, 133-152.

Wagenmakers, E.J. & Farrell, S. (2004). AIC model selection using Akaike weights. Psychonomic Bulletin & Review, 11, 192-196.

fit.mpt for obtaining the results needed here and an example using multi-individual fit and FIA.

# This example compares the three versions of the model in 
# Riefer and Batchelder (1988, Figure 2)

data(rb.fig2.data)
model2 <- system.file("extdata", "rb.fig2.model", package = "MPTinR")
model2r.r.eq <- system.file("extdata", "rb.fig2.r.equal", package = "MPTinR")
model2r.c.eq <- system.file("extdata", "rb.fig2.c.equal", package = "MPTinR")

# The full (i.e., unconstrained) model
ref.model <- fit.mpt(rb.fig2.data, model2)
# All r equal
r.equal <- fit.mpt(rb.fig2.data, model2, model2r.r.eq)
# All c equal
c.equal <- fit.mpt(rb.fig2.data, model2, model2r.c.eq)

select.mpt(list(ref.model, r.equal, c.equal))



## Not run: 

# Example from Broder & Schutz (2009)

data(d.broeder, package = "MPTinR")
m.2htm <- system.file("extdata", "5points.2htm.model", package = "MPTinR")
r.2htm <- system.file("extdata", "broeder.2htm.restr", package = "MPTinR")
r.1htm <- system.file("extdata", "broeder.1htm.restr", package = "MPTinR")

br.2htm.fia <- fit.mpt(d.broeder, m.2htm, fia = 50000, fit.aggregated = FALSE)
br.2htm.res.fia <- fit.mpt(d.broeder, m.2htm, r.2htm, fia = 50000, fit.aggregated = FALSE)
br.1htm.fia <- fit.mpt(d.broeder, m.2htm, r.1htm, fia = 50000, fit.aggregated = FALSE)

select.mpt(list(br.2htm.fia, br.2htm.res.fia, br.1htm.fia))
# This table shows that the n (number of trials) is too small to correctly compute 
# FIA for the 1HT model (as the penalty for the 1HTM is larger than for the 2HTM, 
# although the former is nested in the latter).
# This problem with FIA can only be overcome by collecting more trials per participant,
# but NOT by collecting more participants (as the penalties are simply summed).

# using the dataset argument we see the same
select.mpt(list(br.2htm.fia, br.2htm.res.fia, br.1htm.fia), dataset = 4, output = "full")

select.mpt(list(br.2htm.fia, br.2htm.res.fia, br.1htm.fia), dataset = 1:10)

## End(Not run)