Function to fit MPT models

Description

This generic function fits binary multinomial processing tree models (MPT models; e.g., Riefer & Batchelder, 1988) from an external model file and (optional) external restrictions. Additionally, measures for model selection (AIC, BIC, FIA) can be computed.

Usage

1
2
3
4
5
6
## S4 method for signature 'character'
fit.mpt(model, data, restrictions.filename = NULL, model.type = c("easy", "eqn", "eqn2"), start.parameters = NULL, ...)
## S4 method for signature 'bmpt.model'
fit.mpt(model, data, ci = 95, n.optim = list("auto", 5), start.parameters = NULL, ...)
## S4 method for signature 'mpt.model'
fit.mpt(model, data, n.optim = 5, ci = 95, start.parameters = NULL, method = c("L-BFGS-B", "nlminb"), multicore = c("none", "individual", "n.optim"), sfInit = FALSE, nCPU = 2, ...)

Arguments

model

Either a character string specifying the position of the model filename or a model object of class bmpt.model or mpt.model as produced by make.mpt.

data

Either a numeric vector for individual fit or a numeric matrix or data.frame for multi-individual fit. The data on each position (column for multi-individual fit) must correspond to the respective response category (e.g., line in model file).

restrictions.filename

NULL or a character vector specifying the location and name of the restrictions file. Default is NULL which corresponds to no restrictions.

model.type

Character vector specifying whether the model file is formatted in the easy way ("easy"; i.e., each line represents all branches corresponding to a response category) or the traditional EQN syntax ("eqn" or "eqn2"; see make.mpt). If model.filename ends with .eqn or .EQN, model.type is automatically set to "eqn". Default is "easy".

start.parameters

A numeric vector or NULL. See Details.

ci

A scalar corresponding to the size of the default confidence intervals for the parameter estimates. Default is 95 which corresponds to 95% confidence intervals. See also parameters-methods.

n.optim

List or numeric. Number of optimization runs. See Details

method

character. Only relevant for models not members of L-BMPT (see make.mpt), when fitting objects of class mpt.model (see also bmpt.model-class). "L-BFGS-B" is a quasi-Newton gradient-based general purpose optimization routine (see optim. "nlminb" is a gradient-based optimization routine that uses the PORT routines (see nlminb).

multicore

Character vector. If not "none", uses snowfall for parallelization (which needs to be installed separately via install.packages(snowfall)). If "individual", parallelizes the optimization for each individual (i.e., data needs to be a matrix or data.frame). If "n.optim", parallelizes the n.optim optimization runs. Default is "none" which corresponds to no parallelization. Note that you need to initialize snowfall in default settings. See sfInit and Details.

sfInit

Logical. Relevant if multicore is not "none". If TRUE, fit.mpt will initialize and close the multicore support. If FALSE, (the default) assumes that sfInit() was initialized before. See Details.

nCPU

Scalar. Only relevant if multicore is not "none" and sfInit is TRUE. Number of CPUs used by snowfall. Default is 2.

...

Used to pass arguments from the method for character to the other methods. Currently ignored in the other methods.

Details

For details on model.filename, restrictions.filename, or model.type, see make.mpt.

When calling fit.mpt with a filename as the model argument, make.mpt is called to create a model object (see bmpt.model-class). For models of class bmpt.model the fast Fortran routine implementing an EM-algorithm is used for fitting and the results are returned in an object of class bmpt. For models of class mpt.model the fitting routine that was already implemented in the first version of MPTinR using optim's L-BFGS-B is used (this version of MPTinR allows to use nlminb as an alternative to L-BFGS-B) an don object of class mpt is returned. Note that most of the advanced features of MPTinR, such as the FIA and parametric and non-parametric bootstrapped CIs (all methods currently not implemented), are only available for models that are members of L-BMPT (Purdy & Batchelder, 2009; i.e., of type bmpt.model).

The index of each datapoint (or the column for matrices) must correspond to the row or number of this response catgeory in the model file.

start.parameters are used as the initial values when fitting the model.
For models of class bmpt.model the start.parameters argument must be either NULL or a numeric vector of at least length = number of free parameters. If length(start.parameters) is larger than the number of free parameters it is truncated. The start.parameters are mapped on the free parameters based on the order (see check(model)[["free.parameters"]] where model can be either a model object or fitted object).

For models of class mpt.model multiple cores can be used for fitting via the multicore argument. Multicore fitting is achieved via the snowfall package and needs to be initialized via sfInit. As initialization needs some time, you can either initialize multicore facilities yourself using sfInit() and setting the sfInit argument to FALSE (the default) or let MPTinR initialize multicore facilities by setting the sfInit argument to TRUE. The former is recommended as initializing snowfall takes some time and only needs to be done once if you run fit.mpt multiple times. If there are any problems with multicore fitting, first try to initialize snowfall outside MPTinR (e.g., sfInit( parallel=TRUE, cpus=2 )). If this does not work, the problem is not related to MPTinR but to snowfall (for support and references visit: http://www.imbi.uni-freiburg.de/parallel/).
When using multicore facilties set argument nCPU to the desired (and available) number of CPUs.
Note that you should close snowfall via sfStop() after using MPTinR.

Value

An object of class bmpt or mpt. See bmpt-class for corresponding methods.

Note

Whenever possible try to write your model as a member of L-BMPT (Purdy & Batchelder, 2009) to use the full functionality of MPTinR.

Author(s)

Henrik Singmann and David Kellen.

The Fortran code was written by Karl Christoph Klauer

References

Baldi, P. & Batchelder, W. H. (2003). Bounds on variances of estimators for multinomial processing tree models. Journal of Mathematical Psychology, 47, 467-470.

Byrd, R. H., Lu, P., Nocedal, J., & Zhu, C. (1995). A limited memory algorithm for bound constrained optimization. SIAM J. Scientific Computing, 16, 1190-1208.

Knapp, B. R., & Batchelder, W. H. (2004). Representing parametric order constraints in multi-trial applications of multinomial processing tree models. Journal of Mathematical Psychology, 48, 215-229.

Moshagen, M. (2010). multiTree: A computer program for the analysis of multinomial processing tree models. Behavior Research Methods, 42, 42-54.

Purdy, B. P., & Batchelder, W. H. (2009). A context-free language for binary multinomial processing tree models. Journal of Mathematical Psychology, 53, 547-561.

Riefer, D. M., & Batchelder, W. H. (1988). Multinomial modeling and the measurement of cognitive processes. Psychological Review, 95, 318-339.

Stahl, C. & Klauer, K. C. (2007). HMMTree: A computer program for latent-class hierarchical multinomial processing tree models. Behavior Research Methods, 39, 267- 273.

Wu, H., Myung, J.I., & Batchelder, W.H. (2010a). Minimum description length model selection of multinomial processing tree models. Psychonomic Bulletin & Review, 17, 275-286.

Wu, H., Myung, J.I., & Batchelder, W.H. (2010b). On the minimum description length complexity of multinomial processing trees. Journal of Mathematical Psychology, 54, 291-303.

See Also

make.mpt for the function to create the model objects and bmpt.model-class for their methods.

bmpt-class for methods of objects returned by this function.

http://www.psychologie.uni-freiburg.de/Members/singmann/R/mptinr for additional information on model files and restriction files

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
## Not run: 
# The first example fits the MPT model presented in Riefer and Batchelder (1988, Figure 1)
# to the data presented in Riefer and Batchelder (1988, Table 1)
# Note that Riefer and Batchelder (1988, pp. 328) did some hypotheses tests, that are not done here.
# Rather, we use each condition (i.e., row in Table 1) as a different individual.


# load the data
data(rb.fig1.data, package = "MPTinR2")

#make model objects, once using the easy format, once using the eqn format.
(model1 <- make.mpt(system.file("extdata", "rb.fig1.model", package = "MPTinR2")))
(model1.eqn <- make.mpt(system.file("extdata", "rb.fig1.model.eqn", package = "MPTinR2")))
#both models are identical:
identical(model1, model1.eqn)

# specify the same model via textConnection
model1.txtCon <- make.mpt(textConnection("p * q * r
p * q * (1-r)
p * (1-q) * r
p * (1-q) * (1-r) + (1-p)"))
identical(model1, model1.txtCon)
# see ?make.mpt for more on how to specify a model and restrictions

# just fit the first "individual":
fit.mpt(model1, rb.fig1.data[1,])

#fit all "individuals":
fit.mpt(model1, rb.fig1.data)

#fit all "individuals" using the .EQN model file:
fit.mpt(model1.eqn, rb.fig1.data)

#fit all "individuals" using the .txtCon model file:
fit.mpt(model1.txtCon, rb.fig1.data)


# The second example fits the MPT model presented in Riefer and Batchelder (1988, Figure 2)
# to the data presented in Riefer and Batchelder (1988, Table 3)
# First, the model without restrictions is fitted: ref.model
# Next, the model with all r set equal is fitted: r.equal
# Then, the model with all c set equal is fitted: c.equal
# Finally, the inferential tests reported by Riefer & Batchelder, (1988, p. 332) are executed.


# get the data
data(rb.fig2.data, package = "MPTinR2")

# make model objects
model2.file <- system.file("extdata", "rb.fig2.model", package = "MPTinR2")
model2 <- make.mpt(model2.file)
model2.r.eq <- make.mpt(model2.file, system.file("extdata", "rb.fig2.r.equal", package = "MPTinR2"))
model2.c.eq <- make.mpt(model2.file, system.file("extdata", "rb.fig2.c.equal", package = "MPTinR2"))

# The full (i.e., unconstrained) model
(ref.model <- fit.mpt(model2, rb.fig2.data))


# All r equal
(r.equal <- fit.mpt(model2.r.eq, rb.fig2.data))

# All c equal
(c.equal <- fit.mpt(model2.c.eq, rb.fig2.data))

# is setting all r equal a good idea?
(g.sq.r.equal <- goodness.of.fit(r.equal)[["G.Squared"]] - goodness.of.fit(ref.model)[["G.Squared"]])
(df.r.equal <- goodness.of.fit(r.equal)[["df.model"]] - goodness.of.fit(ref.model)[["df.model"]])
(p.value.r.equal <- pchisq(g.sq.r.equal, df.r.equal , lower.tail = FALSE))

# is setting all c equal a good idea?
(g.sq.c.equal <- goodness.of.fit(c.equal)[["G.Squared"]] - goodness.of.fit(ref.model)[["G.Squared"]])
(df.c.equal <- goodness.of.fit(c.equal)[["df.model"]] - goodness.of.fit(ref.model)[["df.model"]])
(p.value.c.equal <- pchisq(g.sq.c.equal, df.c.equal , lower.tail = FALSE))

# Example from Broeder & Schuetz (2009)
# We fit the data from the 40 individuals from their Experiment 3
# We fit three different models:
# 1. Their 2HTM model: br.2htm
# 2. A restricted 2HTM model with Dn = Do: br.2htm.res
# 3. A 1HTM model (i.e., Dn = 0): br.1htm
# We fit the models with, as well as without, applied inequality restrictions (see Details)
# that is, for some models (.ineq) we impose: G1 < G2 < G3 < G4 < G5 
# As will be apparent, the inequality restrictions do not hold for all individuals.

data(d.broeder, package = "MPTinR2")
m.2htm <- system.file("extdata", "5points.2htm.model", package = "MPTinR2")
r.2htm <- system.file("extdata", "broeder.2htm.restr", package = "MPTinR2")
r.1htm <- system.file("extdata", "broeder.1htm.restr", package = "MPTinR2")
i.2htm <- system.file("extdata", "broeder.2htm.ineq", package = "MPTinR2")
ir.2htm <- system.file("extdata", "broeder.2htm.restr.ineq", package = "MPTinR2")
ir.1htm <- system.file("extdata", "broeder.1htm.restr.ineq", package = "MPTinR2")

# fit the original 2HTM
br.2htm <- fit.mpt(m.2htm, d.broeder)
br.2htm.ineq <- fit.mpt(m.2htm, d.broeder, i.2htm)

# do the inequalities hold for all participants?
parameters(br.2htm.ineq, sort.alphabetical = TRUE)[["individual"]][,"estimates",]
parameters(br.2htm)[["individual"]][,"estimates",]
# See the difference between forced and non-forced inequality restrictions:
round(parameters(br.2htm)[["individual"]][,"estimates",]-
parameters(br.2htm.ineq, sort.alphabetical = TRUE)[["individual"]][,"estimates",],2)

# The same for the other two models
# The restricted 2HTM
br.2htm.res <- fit.mpt(m.2htm, d.broeder, r.2htm)
br.2htm.res.ineq <- fit.mpt(m.2htm, d.broeder, ir.2htm)
round(parameters(br.2htm.res, sort.alphabetical = TRUE)[["individual"]][,"estimates",]-
parameters(br.2htm.res.ineq, sort.alphabetical = TRUE)[["individual"]][,"estimates",],2)

# The 1HTM
br.1htm <- fit.mpt(m.2htm, d.broeder, r.1htm)
br.1htm.ineq <- fit.mpt(m.2htm, d.broeder, ir.1htm)
round(parameters(br.1htm, sort.alphabetical = TRUE)[["individual"]][,"estimates",]-
parameters(br.1htm.ineq, sort.alphabetical = TRUE)[["individual"]][,"estimates",],2)

## End(Not run)