In difNLR: DIF and DDF Detection by Non-Linear Regression Models

startNLR

R Documentation

Calculates starting values for non-linear regression DIF models.

Description

Calculates starting values for the difNLR() function based on linear approximation.

Usage

startNLR(Data, group, model, constraints = NULL, match = "zscore",
         parameterization = "irt", simplify = FALSE)

Arguments

`Data`	data.frame or matrix: dataset in which rows represent scored examinee answers (`"1"` correct, `"0"` incorrect) and columns correspond to the items.
`group`	numeric: binary vector of group membership (`"0"` for the reference group, `"1"` for the focal group).
`model`	character: generalized logistic regression model for which starting values should be estimated. See Details.
`constraints`	character: which parameters should be the same for both groups. Possible values are any combination of the parameters `"a"`, `"b"`, `"c"`, and `"d"`. Default value is `NULL`.
`match`	character or numeric: matching criterion to be used as an estimate of the trait. It can be either `"zscore"` (default, standardized total score), `"score"` (total test score), or a numeric vector with one value per observation (row) in `Data`.
`parameterization`	character: parameterization of regression coefficients. Possible options are `"irt"` (IRT parameterization, default), `"is"` (intercept-slope), and `"logistic"` (logistic regression as in the `glm` function, available for the `"2PL"` model only). See Details.
`simplify`	logical: should initial values be simplified into a matrix? Applicable only when the parameterization is the same for all items.
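The three forms of the `match` argument can be illustrated with base R. This is a minimal sketch on synthetic data, assuming that the standardized total score is obtained with `scale()`; the exact computation used inside `startNLR()` is not shown in this page, and the object names below are illustrative.

```r
# Synthetic 0/1 responses: 6 respondents (rows) by 4 items (columns).
Data <- matrix(
  c(1, 0, 1, 1,
    0, 0, 1, 0,
    1, 1, 1, 1,
    0, 0, 0, 0,
    1, 0, 0, 1,
    1, 1, 0, 1),
  ncol = 4, byrow = TRUE
)

# "score": total test score of each respondent.
score <- rowSums(Data)

# "zscore": total score standardized to mean 0 and standard deviation 1
# (assumed here to correspond to scale(); not confirmed by this page).
zscore <- as.vector(scale(score))

# A user-supplied numeric criterion needs one value per row of Data.
custom_match <- c(-1.2, -0.5, 1.4, -1.8, 0.1, 0.9)
stopifnot(length(custom_match) == nrow(Data))
```

Any of `score`, `zscore`, or `custom_match` has the shape expected for a numeric `match` vector.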

Details

The unconstrained form of the 4PL generalized logistic regression model for the probability of a correct answer (i.e., Y_{pi} = 1) using IRT parameterization is

P(Y_{pi} = 1|X_p, G_p) = (c_{iR} \cdot (1 - G_p) + c_{iF} \cdot G_p) + (d_{iR} \cdot (1 - G_p) + d_{iF} \cdot G_p - c_{iR} \cdot (1 - G_p) - c_{iF} \cdot G_p) / (1 + \exp(-(a_i + a_{i\text{DIF}} \cdot G_p) \cdot (X_p - b_i - b_{i\text{DIF}} \cdot G_p))),

where X_p is the matching criterion (e.g., standardized total score) and G_p is the group membership variable for respondent p (0 for the reference group, 1 for the focal group). Parameters a_i, b_i, c_{iR}, and d_{iR} are discrimination, difficulty, guessing, and inattention for the reference group for item i. Terms a_{i\text{DIF}} and b_{i\text{DIF}} represent differences between the focal and reference groups in discrimination and difficulty for item i. Terms c_{iF} and d_{iF} are the guessing and inattention parameters for the focal group for item i. When no difference between the reference and focal groups is assumed in the guessing or inattention parameters, the terms c_i and d_i are used instead.

Alternatively, intercept-slope parameterization may be applied:

P(Y_{pi} = 1|X_p, G_p) = (c_{iR} \cdot (1 - G_p) + c_{iF} \cdot G_p) + (d_{iR} \cdot (1 - G_p) + d_{iF} \cdot G_p - c_{iR} \cdot (1 - G_p) - c_{iF} \cdot G_p) / (1 + \exp(-(\beta_{i0} + \beta_{i1} \cdot X_p + \beta_{i2} \cdot G_p + \beta_{i3} \cdot X_p \cdot G_p))),

where parameters \beta_{i0}, \beta_{i1}, \beta_{i2}, and \beta_{i3} are the intercept, the effect of the matching criterion, the effect of group membership, and their mutual interaction, respectively.
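The IRT-parameterized model can be evaluated directly in base R. The sketch below assumes the group coding given for the `group` argument (0 for the reference group, 1 for the focal group), so that c_{iR} and d_{iR} apply when G_p = 0 and c_{iF} and d_{iF} apply when G_p = 1. The function name `p_4pl_irt` and all parameter values are illustrative and are not part of the package.

```r
# Probability of a correct answer under the unconstrained 4PL model in
# IRT parameterization. g is 0 for the reference and 1 for the focal group.
p_4pl_irt <- function(x, g, a, b, aDif, bDif, cR, cF, dR, dF) {
  lower <- cR * (1 - g) + cF * g  # group-specific lower asymptote (guessing)
  upper <- dR * (1 - g) + dF * g  # group-specific upper asymptote (inattention)
  lower + (upper - lower) / (1 + exp(-(a + aDif * g) * (x - b - bDif * g)))
}

# Illustrative (made-up) item parameters.
x <- c(-2, 0, 2)
p_ref <- p_4pl_irt(x, g = 0, a = 1.2, b = 0, aDif = 0.3, bDif = 0.5,
                   cR = 0.20, cF = 0.25, dR = 1, dF = 0.95)
p_foc <- p_4pl_irt(x, g = 1, a = 1.2, b = 0, aDif = 0.3, bDif = 0.5,
                   cR = 0.20, cF = 0.25, dR = 1, dF = 0.95)
```

At the group-specific difficulty (x = b for the reference group, x = b + bDif for the focal group) the probability lies halfway between that group's lower and upper asymptotes.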
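The two parameterizations describe the same linear predictor. Expanding (a_i + a_{i\text{DIF}} G_p)(X_p - b_i - b_{i\text{DIF}} G_p) and using G_p^2 = G_p for a binary group indicator gives \beta_{i0} = -a_i b_i, \beta_{i1} = a_i, \beta_{i2} = -(a_i b_{i\text{DIF}} + a_{i\text{DIF}} b_i + a_{i\text{DIF}} b_{i\text{DIF}}), and \beta_{i3} = a_{i\text{DIF}}. The sketch below checks this algebra numerically; the helper `irt_to_is` is illustrative and is not claimed to be the conversion used inside the package.

```r
# Map IRT parameters (a, b, aDif, bDif) to intercept-slope coefficients.
irt_to_is <- function(a, b, aDif, bDif) {
  c(
    b0 = -a * b,                                 # intercept
    b1 = a,                                      # effect of matching criterion
    b2 = -(a * bDif + aDif * b + aDif * bDif),   # effect of group membership
    b3 = aDif                                    # criterion-by-group interaction
  )
}

# Illustrative (made-up) values.
a <- 1.2; b <- 0.4; aDif <- 0.3; bDif <- -0.5
beta <- irt_to_is(a, b, aDif, bDif)

# Linear predictors under both parameterizations.
eta_irt <- function(x, g) (a + aDif * g) * (x - b - bDif * g)
eta_is <- function(x, g) {
  beta[["b0"]] + beta[["b1"]] * x + beta[["b2"]] * g + beta[["b3"]] * x * g
}

x <- seq(-2, 2, by = 0.5)
same_ref <- all.equal(eta_irt(x, 0), eta_is(x, 0))
same_foc <- all.equal(eta_irt(x, 1), eta_is(x, 1))
```

Both comparisons agree up to floating-point tolerance, for the reference group (g = 0) and the focal group (g = 1).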

The `model` argument offers several predefined models: `"Rasch"` for the 1PL model with the discrimination parameter fixed at 1 for both groups; `"1PL"` for the 1PL model with the discrimination parameter set the same for both groups; `"2PL"` for the logistic regression model; `"3PLcg"` for the 3PL model with the same guessing parameter for both groups; `"3PLdg"` for the 3PL model with the same inattention parameter for both groups; `"3PLc"` (alternatively `"3PL"`) for the 3PL regression model with a guessing parameter; `"3PLd"` for the 3PL model with an inattention parameter; `"4PLcgdg"` for the 4PL model with the same guessing and inattention parameters for both groups; `"4PLcgd"` (alternatively `"4PLd"`) for the 4PL model with the same guessing parameter for both groups; `"4PLcdg"` (alternatively `"4PLc"`) for the 4PL model with the same inattention parameter for both groups; and `"4PL"` for the unconstrained 4PL model.
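As a reading aid, the predefined models can be summarized in a small lookup table. This is an illustrative encoding only, not a structure used by the package. It assumes that `"free"` means the parameter may differ between groups, `"shared"` means it is estimated but equal in both groups, and `"absent"` means the asymptote is not estimated (conventionally 0 for guessing and 1 for inattention, which is an assumption not stated above). Difficulty b is taken to be free in every model.

```r
# Illustrative summary of the predefined models (not a package object).
model_table <- data.frame(
  model = c("Rasch", "1PL", "2PL", "3PLcg", "3PLdg", "3PLc", "3PLd",
            "4PLcgdg", "4PLcgd", "4PLcdg", "4PL"),
  alias = c(NA, NA, NA, NA, NA, "3PL", NA, NA, "4PLd", "4PLc", NA),
  a = c("fixed at 1", "shared", rep("free", 9)),
  c = c("absent", "absent", "absent", "shared", "absent", "free", "absent",
        "shared", "shared", "free", "free"),
  d = c("absent", "absent", "absent", "absent", "shared", "absent", "free",
        "shared", "free", "shared", "free"),
  stringsAsFactors = FALSE
)

# Look up one model, e.g. the 3PL model with shared guessing.
model_table[model_table$model == "3PLcg", ]
```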

The `parameterization` argument accepts three options. `"irt"` returns the IRT parameters of the reference group and the differences in these parameters between the focal and reference groups; asymptote parameters are returned separately for the reference and focal groups. `"is"` returns the intercept-slope parameterization, again with asymptote parameters returned separately for the reference and focal groups. `"logistic"` returns parameters in the logistic regression parameterization as in the `glm` function and is available only for the `"2PL"` model.
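To make "logistic regression parameterization as in the glm function" concrete, the sketch below fits a 2PL-type model with `glm()` on simulated data. The data, the variable names `x`, `g`, and `y`, and the true coefficient values are made up for illustration; the names that `startNLR()` gives to its returned elements may differ from the `glm()` coefficient names shown here.

```r
set.seed(1)
n <- 500
x <- rnorm(n)                # matching criterion (e.g., standardized score)
g <- rep(0:1, each = n / 2)  # 0 = reference group, 1 = focal group

# Simulate item responses from a logistic model with a group effect
# and a criterion-by-group interaction.
eta <- 0.2 + 1.1 * x - 0.4 * g + 0.3 * x * g
y <- rbinom(n, size = 1, prob = plogis(eta))

# Logistic regression with intercept, criterion, group, and interaction.
fit <- glm(y ~ x * g, family = binomial)
coef(fit)
```

The four coefficients play the roles of \beta_{i0}, \beta_{i1}, \beta_{i2}, and \beta_{i3} in the intercept-slope form, with no asymptote parameters.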

Value

A list containing elements representing items. Each element is a named numeric vector with initial values for the chosen generalized logistic regression model.
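A list of named numeric vectors with identical names can be collapsed into a matrix in base R, which is roughly what a simplified result looks like. The list below is a hand-written stand-in with made-up values and hypothetical parameter names; it is not actual `startNLR()` output.

```r
# Hypothetical per-item starting values (names and numbers are made up).
start_list <- list(
  Item1 = c(a = 1.1, b = -0.3, c = 0.20, aDif = 0.0, bDif = 0.1),
  Item2 = c(a = 0.9, b = 0.5, c = 0.15, aDif = 0.1, bDif = 0.0),
  Item3 = c(a = 1.4, b = 1.2, c = 0.25, aDif = -0.2, bDif = 0.3)
)

# One row per item, one column per parameter.
start_mat <- do.call(rbind, start_list)

# Extract a single parameter across all items from the list form.
b_values <- sapply(start_list, `[[`, "b")
```

Row-binding works only when every item has the same set of parameters, which is why simplification is restricted to the case where all items share the same specification.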

Author(s)

Adela Hladka (nee Drabinova)
Institute of Computer Science of the Czech Academy of Sciences
hladka@cs.cas.cz

Patricia Martinkova
Institute of Computer Science of the Czech Academy of Sciences
martinkova@cs.cas.cz

References

Drabinova, A. & Martinkova, P. (2017). Detection of differential item functioning with nonlinear regression: A non-IRT approach accounting for guessing. Journal of Educational Measurement, 54(4), 498–517, doi:10.1111/jedm.12158.

Hladka, A. & Martinkova, P. (2020). difNLR: Generalized logistic regression models for DIF and DDF detection. The R Journal, 12(1), 300–323, doi:10.32614/RJ-2020-014.

Hladka, A. (2021). Statistical models for detection of differential item functioning. Dissertation thesis. Faculty of Mathematics and Physics, Charles University.

Examples

# loading the package and data
library(difNLR)
data(GMAT)
Data <- GMAT[, 1:20] # items
group <- GMAT[, "group"] # group membership variable

# 3PL model with the same guessing for both groups
startNLR(Data, group, model = "3PLcg")
startNLR(Data, group, model = "3PLcg", parameterization = "is")
# simplified into a single table
startNLR(Data, group, model = "3PLcg", simplify = TRUE)
startNLR(Data, group, model = "3PLcg", parameterization = "is", simplify = TRUE)

# 2PL model
startNLR(Data, group, model = "2PL")
startNLR(Data, group, model = "2PL", parameterization = "is")
startNLR(Data, group, model = "2PL", parameterization = "logistic")

# 4PL model with a total score as the matching criterion
startNLR(Data, group, model = "4PL", match = "score")
startNLR(Data, group, model = "4PL", match = "score", parameterization = "is")

# starting values for model specified for each item
startNLR(Data, group,
  model = c(
    rep("1PL", 5), rep("2PL", 5),
    rep("3PL", 5), rep("4PL", 5)
  )
)

# 4PL model with fixed a and c parameters
startNLR(Data, group, model = "4PL", constraints = "ac", simplify = TRUE)

difNLR documentation built on June 30, 2025, 5:06 p.m.