# randomCAT: Random generation of adaptive tests (dichotomous and... In catR: Generation of IRT Response Patterns under Computerized Adaptive Testing

## Description

This command generates a response pattern to an adaptive test, for a given item bank (with either dichotomous or polytomous models), a true ability level, and several lists of CAT parameters (starting items, stopping rule, provisional and final ability estimators).

## Usage

    randomCAT(trueTheta, itemBank, model = NULL, responses = NULL,
       genSeed = NULL, cbControl = NULL, nAvailable = NULL,
       start = list(fixItems = NULL, seed = NULL, nrItems = 1, theta = 0,
          D = 1, randomesque = 1, random.seed = NULL, startSelect = "MFI",
          cb.control = FALSE, random.cb = NULL),
       test = list(method = "BM", priorDist = "norm", priorPar = c(0, 1),
          range = c(-4, 4), D = 1, parInt = c(-4, 4, 33), itemSelect = "MFI",
          infoType = "observed", randomesque = 1, random.seed = NULL, AP = 1,
          proRule = "length", proThr = 20, constantPatt = NULL),
       stop = list(rule = "length", thr = 20, alpha = 0.05),
       final = list(method = "BM", priorDist = "norm", priorPar = c(0, 1),
          range = c(-4, 4), D = 1, parInt = c(-4, 4, 33), alpha = 0.05),
       allTheta = FALSE, save.output = FALSE,
       output = c("path", "name", "csv"))

    ## S3 method for class 'cat'
    print(x, ...)

    ## S3 method for class 'cat'
    plot(x, ci = FALSE, alpha = 0.05, trueTh = TRUE, classThr = NULL,
       save.plot = FALSE, save.options = c("path", "name", "pdf"), ...)

## Arguments

• trueTheta: numeric: the value of the true ability level.

• itemBank: numeric: a suitable matrix of item parameters (possibly augmented by group membership for content balancing). See Details.

• model: either NULL (default) for dichotomous models, or any suitable acronym for polytomous models. Possible values are "GRM", "MGRM", "PCM", "GPCM", "RSM" and "NRM". See Details.

• responses: either NULL (default) or a vector of pre-specified item responses with as many components as the rows of itemBank. See Details.

• genSeed: either a numeric value to fix the random generation of the response pattern, or NULL (default). Ignored if responses is not NULL. See Details.

• cbControl: either a list in the appropriate format to control for content balancing, or NULL. See Details.

• nAvailable: either a boolean vector indicating which items (denoted by 1's) are available at the start of the test and which (denoted by 0's) are not, or NULL (default). See Details.

• start: a list with the options for starting the adaptive test. See Details.

• test: a list with the options for provisional ability estimation and next item selection. See Details.

• stop: a list with the options of the stopping rule. See Details.

• final: a list with the options for final ability estimation. See Details.

• allTheta: logical: should all provisional ability estimates and standard errors be computed and returned (including for the starting items)? Default is FALSE, meaning that provisional ability estimates and standard errors are computed only after the selection of the starting items. Ignored if the $nrItems element of the start list is equal to zero.

• save.output: logical: should the output be saved in an external text file? (default is FALSE).

• output: character: a vector of three components. The first component is either the file path where the output is saved or "path" (default), the second component is the name of the output file or "name" (default), and the third component is the file type, either "txt" or "csv" (default). See Details.

• x: an object of class "cat", typically an output of the randomCAT function.

• ci: logical: should the confidence intervals be plotted for each provisional ability estimate? (default is FALSE).

• alpha: numeric: the significance level for provisional confidence intervals (default is 0.05). Ignored if ci is FALSE.

• trueTh: logical: should the true ability level be drawn by a horizontal line? (default is TRUE).

• classThr: either a numeric value giving the classification threshold to be displayed, or NULL.

• save.plot: logical: should the plot be saved in an external figure? (default is FALSE).

• save.options: character: a vector of three components. The first component is either the file path or "path" (default), the second component is the name of the output file or "name" (default), and the third component is the file extension, either "pdf" (default) or "jpeg". Ignored if save.plot is FALSE. See Details.

• ...: other generic arguments to be passed to the print and plot functions.

## Details

The randomCAT function generates an adaptive test using an item bank specified by the arguments itemBank and model, for a given true ability level specified by the argument trueTheta.

Dichotomous IRT models are considered whenever model is set to NULL (default value). In this case, itemBank must be a matrix with one row per item and four columns, holding the discrimination, difficulty, pseudo-guessing and inattention parameters (in this order). These are the parameters of the four-parameter logistic (4PL) model (Barton and Lord, 1981).

Polytomous IRT models are specified by their respective acronym: "GRM" for Graded Response Model, "MGRM" for Modified Graded Response Model, "PCM" for Partial Credit Model, "GPCM" for Generalized Partial Credit Model, "RSM" for Rating Scale Model and "NRM" for Nominal Response Model. The itemBank still holds one row per item, and the number of columns and their content depend on the model.
See genPolyMatrix for further information and illustrative examples of suitable polytomous item banks.

By default, all item responses are randomly drawn from the parent distribution set by the item parameters of the itemBank matrix (using the genPattern function, for instance). The random generation of the item responses can be fixed (e.g., for replication purposes) by assigning a numeric value to the genSeed argument. By default this argument is NULL, so the random seed is not fixed (and two successive runs of randomCAT will usually lead to different response patterns).

It is possible, however, to provide a full pattern of previously recorded responses to every item of the bank, for instance for post-hoc simulations. This is done by passing to the responses argument a vector of binary entries (without missing values). By default responses is set to NULL and item responses are drawn from the item bank parameters.

With the aforementioned item bank structures, content balancing cannot be controlled and cbControl must be set to NULL (default value); otherwise an error will most often occur. In order to allow for content balancing control:

1. the itemBank must be updated with an additional column holding the group names;

2. the cbControl argument must be set properly as a list with group names and theoretical proportions for content balancing.

See the nextItem function for further details on how to specify cbControl properly and under which conditions it is operational (see Kingsbury and Zara, 1989, for further details). Separation of the item parameters and the vector of group membership is performed internally through the breakBank function (and thus should not be performed prior to CAT generation).
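As a rough illustration of the dichotomous item bank layout and of seeded response generation described above, the following base-R sketch builds a small four-column bank and draws a reproducible response pattern from the 4PL model. The parameter values and the helper names `prob_4pl` and `draw_responses` are made up for this example; in catR the responses are generated by the package itself (for instance through genPattern), not by this code.

```r
# Hypothetical 5-item bank: one row per item, columns in the order
# discrimination, difficulty, pseudo-guessing, inattention.
bank <- cbind(a = c(1.0, 1.2, 0.8, 1.5, 1.1),
              b = c(-1.0, -0.5, 0.0, 0.5, 1.0),
              c = c(0.20, 0.10, 0.25, 0.00, 0.15),
              d = c(1.00, 0.95, 1.00, 0.90, 1.00))

# 4PL probability of a correct response at ability theta
# (D is the metric constant, 1 for the logistic metric).
prob_4pl <- function(theta, bank, D = 1) {
  a <- bank[, 1]; b <- bank[, 2]
  guess <- bank[, 3]; upper <- bank[, 4]
  guess + (upper - guess) / (1 + exp(-D * a * (theta - b)))
}

# Draw one 0/1 response per item; a non-NULL seed makes the draw repeatable,
# which is the role the text gives to genSeed.
draw_responses <- function(theta, bank, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  as.integer(runif(nrow(bank)) < prob_4pl(theta, bank))
}

p  <- prob_4pl(theta = 0, bank)
x1 <- draw_responses(theta = 0, bank, seed = 123)
x2 <- draw_responses(theta = 0, bank, seed = 123)
identical(x1, x2)  # same seed, same response pattern
```

A vector such as `x1`, with one entry per row of the bank, has the shape the responses argument expects for post-hoc simulations.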

The test specification is made by means of four lists of options: one list for the selection of the starting items, one list with the options for provisional ability estimation, one list to define the stopping rule, and one list with the options for final ability estimation. These lists are specified respectively by the arguments start, test, stop and final.

The start list can contain one or several of the following arguments:

• fixItems: either a vector of integer values, setting the items to be administered as first items, or NULL (default) to let the function select the items.

• seed: either a numeric value to fix the random seed for item selection, NA to randomly select the items without fixing the random seed, or NULL (default) to select the items on the basis of their difficulty level. Ignored if fixItems is not NULL.

• nrItems: numeric, the number of starting items to be randomly selected (default is 1). Can be equal to zero to avoid initial selection of items (see below). Used only if fixItems is NULL and seed is not NULL.

• theta: numeric, a vector of the initial ability levels for selecting the first items (default is the single value 0). Ignored if either fixItems or seed is not NULL. See startItems for further details.

• D: numeric, the metric constant. Default is D=1 (for logistic metric); D=1.702 yields approximately the normal metric (Haley, 1952). Ignored if model is not NULL and if startSelect is not "MFI".

• randomesque: integer, the number of 'randomesque' items to be picked up optimally for each value of the theta vector, before random selection of a single one. Ignored if either fixItems or seed is not NULL. See startItems for further details.

• random.seed: either NULL (default) or a numeric value to fix the random seed of randomesque selection of the items. Ignored if either fixItems or seed is not NULL.

• startSelect: the method for selecting the first items of the test, with possible values "bOpt" and "MFI" (default). Ignored if either fixItems or seed is not NULL. See startItems for further details.

• cb.control: logical value indicating whether control for content balancing should also be done when selecting the starting items. Default is FALSE. Ignored if argument cbControl is NULL.

• random.cb: either NULL (default) or a numeric value to fix the selection of subgroups of items in case of content balancing control with starting items. Ignored if either cbControl is NULL or start$cb.control is FALSE.

These arguments are passed to the function startItems to select the first items of the test.

If the argument nrItems is set to zero, then no starting item is selected and the adaptive process starts with a provisional ability level equal to the value of argument theta (or its default). Moreover, the likelihood function is then set as a flat, uniform function on the whole ability range. See the nextItem function for further details.
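To make the interplay of theta, startSelect = "MFI" and randomesque more concrete, here is a minimal base-R sketch of how a single starting item could be chosen. The helpers `info_4pl` and `pick_start_item` and the item parameters are hypothetical; the actual selection is done by startItems, whose internals may differ from this simplified version.

```r
# Fisher information of 4PL items at ability theta (standard 4PL formula).
info_4pl <- function(theta, bank, D = 1) {
  a <- bank[, 1]; b <- bank[, 2]; g <- bank[, 3]; u <- bank[, 4]
  P <- g + (u - g) / (1 + exp(-D * a * (theta - b)))
  D^2 * a^2 * (P - g)^2 * (u - P)^2 / ((u - g)^2 * P * (1 - P))
}

# Pick one starting item for a given initial ability level:
# keep the `randomesque` most informative items, then draw one at random.
pick_start_item <- function(theta, bank, randomesque = 1, random.seed = NULL) {
  info <- info_4pl(theta, bank)
  best <- order(info, decreasing = TRUE)[seq_len(randomesque)]
  if (!is.null(random.seed)) set.seed(random.seed)
  if (length(best) == 1L) best else sample(best, 1L)
}

# Hypothetical bank with no guessing and no inattention (2PL special case).
bank <- cbind(a = c(1.0, 1.2, 0.8, 1.5, 1.1),
              b = c(-1.0, -0.5, 0.0, 0.5, 1.0),
              g = rep(0, 5),
              u = rep(1, 5))

first_item  <- pick_start_item(theta = 0, bank)  # most informative item
random_item <- pick_start_item(theta = 0, bank, randomesque = 3,
                               random.seed = 1)  # one of the top three
```

With randomesque = 1 the choice is deterministic; larger values trade a little information for less predictable (and less over-exposed) starting items.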

The test list can contain one or several of the following arguments:

• method: a character string to specify the method for ability estimation. Possible values are: "BM" (default) for Bayesian modal estimation (Birnbaum, 1969), "ML" for maximum likelihood estimation (Lord, 1980), "EAP" for expected a posteriori (EAP) estimation (Bock and Mislevy, 1982), and "WL" for weighted likelihood estimation (Warm, 1989).

• priorDist: a character string which sets the prior distribution. Possible values are: "norm" (default) for normal distribution, "unif" for uniform distribution, and "Jeffreys" for Jeffreys' noninformative prior distribution (Jeffreys, 1939, 1946). Ignored if method is neither "BM" nor "EAP".

• priorPar: a vector of two numeric components, which sets the parameters of the prior distribution. If (method="BM" or method="EAP") and priorDist="norm", the components of priorPar are respectively the mean and the standard deviation of the prior normal density. If (method="BM" or method="EAP") and priorDist="unif", the components of priorPar are respectively the lower and upper bound of the prior uniform density. Ignored in all other cases. By default, priorPar takes the parameters of the prior standard normal distribution (i.e., priorPar=c(0,1)). In addition, priorPar also provides the prior parameters for the computation of MLWI and MPWI values for next item selection (see nextItem for further details).

• range: the maximal range of ability levels, set as a vector of two numeric components. The ability estimate will always lie within this interval (set by default to [-4, 4]). Ignored if method="EAP".

• D: the value of the metric constant. Default is D=1 for logistic metric. Setting D=1.702 yields approximately the normal metric (Haley, 1952). Ignored if model is not NULL.

• parInt: a numeric vector of three components, holding respectively the values of the arguments lower, upper and nqp of the eapEst, eapSem and MWI commands. It specifies the range of quadrature points for numerical integration, and is used for computing the EAP estimate, its standard error, and the MLWI and MPWI values for next item selection. Default vector is (-4, 4, 33), thus setting the range from -4 to 4 by steps of 0.25. Ignored if method is not "EAP" and if itemSelect is neither "MLWI" nor "MPWI".

• itemSelect: the rule for next item selection, with possible values "MFI" (default) for maximum Fisher information criterion; "bOpt" for optimal ability-difficulty match (or Urry's procedure) (not available if model is not NULL); "thOpt" for optimal theta selection (not available if model is not NULL); "MLWI" and "MPWI" for respectively maximum likelihood and posterior weighted information criterion; "MEPV" for minimum expected posterior variance; "MEI" for maximum expected information; "KL" and "KLP" for Kullback-Leibler and posterior Kullback-Leibler information methods; "progressive" and "proportional" for progressive and proportional methods; and "random" for random selection. For further details, see nextItem.

• infoType: character: the type of information function to be used for next item selection. Possible values are "observed" (default) for observed information function, and "Fisher" for Fisher information function. Ignored if itemSelect is not "MEI".

• randomesque: integer: the number of items to be chosen by the next item selection rule, among which the next item to be administered is randomly picked. Default value is 1 and leads to the usual selection of the optimal item (Kingsbury and Zara, 1989).

• random.seed: either NULL (default) or a numeric value to fix the random seed of randomesque selection of the items. Ignored if either fixItems or seed is not NULL.

• AP: the acceleration parameter required for progressive and proportional methods, with default value 1. Ignored with all other selection methods.

• proRule: the stopping rule considered for progressive and proportional methods, with possible values "length" (default), "precision" or both. Ignored with all other selection methods.

• proThr: the stopping rule threshold considered for progressive and proportional methods. Default value is 20. Ignored with all other selection methods.

• constantPatt: character: the method to estimate ability in case of constant pattern (i.e. only correct or only incorrect responses). Can be either NULL (default), "BM", "EAP", "WL", "fixed4", "fixed7" or "var". Currently only implemented for dichotomous IRT models.

These arguments are passed to the functions thetaEst and semTheta to estimate the ability level and the standard error of this estimate. In addition, some arguments are passed to nextItem to select the next item appropriately.
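The roles of method, priorPar, range and parInt can be sketched in a few lines of base R for a dichotomous bank with a normal prior. The functions `loglik_4pl`, `bm_estimate` and `eap_estimate`, the bank and the response vector are all hypothetical and only mirror the definitions given in the list above; catR performs the actual estimation through thetaEst, which may use different numerical routines.

```r
# Log-likelihood of a 0/1 response vector x under the 4PL model.
loglik_4pl <- function(theta, x, bank, D = 1) {
  a <- bank[, 1]; b <- bank[, 2]; g <- bank[, 3]; u <- bank[, 4]
  P <- g + (u - g) / (1 + exp(-D * a * (theta - b)))
  sum(x * log(P) + (1 - x) * log(1 - P))
}

# Bayesian modal (BM) estimate: maximise log-likelihood + log-prior on `range`.
bm_estimate <- function(x, bank, priorPar = c(0, 1), range = c(-4, 4)) {
  log_post <- function(th) {
    loglik_4pl(th, x, bank) + dnorm(th, priorPar[1], priorPar[2], log = TRUE)
  }
  optimize(log_post, interval = range, maximum = TRUE)$maximum
}

# EAP estimate: posterior mean over the quadrature grid defined by parInt
# (lower bound, upper bound, number of quadrature points).
eap_estimate <- function(x, bank, priorPar = c(0, 1), parInt = c(-4, 4, 33)) {
  nodes <- seq(parInt[1], parInt[2], length.out = parInt[3])
  lik   <- sapply(nodes, function(th) exp(loglik_4pl(th, x, bank)))
  post  <- lik * dnorm(nodes, priorPar[1], priorPar[2])
  sum(nodes * post) / sum(post)
}

bank <- cbind(a = c(1.0, 1.2, 0.8, 1.5, 1.1),
              b = c(-1.0, -0.5, 0.0, 0.5, 1.0),
              g = rep(0, 5), u = rep(1, 5))
x <- c(1, 1, 1, 0, 0)                   # hypothetical responses

nodes  <- seq(-4, 4, length.out = 33)   # default grid: steps of 0.25
th_bm  <- bm_estimate(x, bank)
th_eap <- eap_estimate(x, bank)
```

The BM estimate depends on range through the search interval, whereas the EAP estimate depends only on the quadrature grid, which matches the statement that range is ignored for "EAP".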

The stop list can contain one or several of the following arguments:

• rule: a vector of character strings specifying the stopping rules. Possible values are: "length" (default), to stop the test after a pre-specified number of items administered; "precision", to stop the test when the provisional standard error of ability becomes less than or equal to the pre-specified value; "classification", for which the test ends whenever the provisional confidence interval (set by the alpha argument) does not hold the classification threshold anymore (this is also called the ACI rule; see e.g. Thompson, 2009); and "minInfo" to stop the test if the maximum item information of the available items at the current ability estimate is smaller than the pre-specified threshold. Can take a single value.

• thr: a vector of numeric values fixing the threshold(s) of the stopping rule(s). If rule="length", thr is the maximal number of items to be administered. If rule="precision", thr is the precision level (i.e. the standard error) to be reached before stopping. If rule="classification", thr corresponds to the ability level which serves as a classification rule (i.e. which must not be covered by the provisional confidence interval). Finally, if rule="minInfo", thr corresponds to the minimum item information that can be observed in the bank of remaining available items. The "classification" rule is not available for the progressive and proportional item selection rules.

• alpha: the significance (or α) level for computing the provisional confidence interval of ability. Ignored if rule is not "classification".

Important: the thr values must be sorted in the same order of appearance as the rule methods.
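The four stopping rules and the pairing between rule and thr can be summarised with a small base-R checker. The function `rules_met` and its arguments are hypothetical names introduced for this sketch, which simply restates the definitions above; it is not the code catR uses internally.

```r
# Return the names of the stopping rules that are currently satisfied.
# `rule` and `thr` are parallel vectors: thr[i] belongs to rule[i].
rules_met <- function(n_items, theta_hat, se_hat, max_info,
                      rule = "length", thr = 20, alpha = 0.05) {
  stopifnot(length(rule) == length(thr))
  z   <- qnorm(1 - alpha / 2)
  met <- logical(length(rule))
  for (i in seq_along(rule)) {
    met[i] <- switch(rule[i],
      length         = n_items >= thr[i],
      precision      = se_hat <= thr[i],
      # satisfied once the confidence interval no longer covers the threshold
      classification = (theta_hat - z * se_hat > thr[i]) ||
                       (theta_hat + z * se_hat < thr[i]),
      minInfo        = max_info < thr[i],
      stop("unknown stopping rule: ", rule[i]))
  }
  rule[met]
}

# 12 items administered, estimate 0.8 with standard error 0.35, and the most
# informative remaining item carries 0.6 units of information.
satisfied <- rules_met(12, 0.8, 0.35, 0.6,
                       rule = c("length", "precision", "classification"),
                       thr  = c(20, 0.4, 0))
satisfied
```

Here the thresholds 20, 0.4 and 0 are read in the same order as the rules, which is why the ordering requirement stated above matters.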

Lastly, the final list can contain one or several arguments of the test list (possibly with different values), as well as the additional alpha argument. The latter specifies the α level of the final confidence interval of ability, which is computed as

$$\left[\hat{\theta} - z_{1-\alpha/2} \; se(\hat{\theta}) \; ; \; \hat{\theta} + z_{1-\alpha/2} \; se(\hat{\theta})\right]$$

where $\hat{\theta}$ and $se(\hat{\theta})$ are respectively the ability estimate and its standard error, and $z_{1-\alpha/2}$ is the quantile of order $1-\alpha/2$ of the standard normal distribution. Note that the argument itemSelect of the test list is not used for final estimation of the ability level, and is therefore not allowed in the final list.
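The interval above is straightforward to evaluate in base R. The helper name `ability_ci` and the numeric inputs are illustrative only.

```r
# Confidence interval for an ability estimate, following the formula above.
ability_ci <- function(theta_hat, se_hat, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)   # standard normal quantile z_{1 - alpha/2}
  c(lower = theta_hat - z * se_hat, upper = theta_hat + z * se_hat)
}

ci <- ability_ci(theta_hat = 0.5, se_hat = 0.3)
round(ci, 3)   # lower -0.088, upper 1.088
```

A smaller alpha widens the interval, since the quantile z grows as alpha decreases.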

If some arguments of these lists are missing, they are automatically set to their default value. The contents of the lists are checked with the testList function, and the adaptive test is generated only if the lists are adequately defined. Otherwise, an error message is printed. Note that the testList function works for both dichotomous and polytomous models.

Usually the ability estimates and related standard errors are computed only once all starting items have been administered (that is, if k starting items are administered, the first (k-1) ability levels and standard errors are missing). This can be avoided by setting the argument allTheta to TRUE (it is FALSE by default). In this case, all provisional ability estimates and standard errors are computed and returned, but in the display of the output file, the first (k-1) abilities and standard errors are printed in parentheses (otherwise they are returned as NA values). Note that allTheta is ignored if no starting item was selected (that is, if argument nrItems of the start list is set to zero).

The output of randomCAT, as displayed by the print.cat function, can be stored in a text file provided that save.output is set to TRUE (with the default value FALSE nothing is stored). In this case, the output argument must hold three character values: the path where the output file must be stored, the name of the output file, and the type of output file. If the path is not provided (i.e. left to its default value "path"), the file is saved in the default working directory. The default name is "name", and the default type is "csv". Any other value yields a text file. See the Examples section for an illustration.

The function plot.cat plots the set of provisional and final ability estimates throughout the test. Corresponding confidence intervals (with confidence level defined by the argument alpha) are also drawn if ci=TRUE (which is not the default value), except when stepsize adjustment was made for constant patterns (as the standard error cannot be estimated at this stage). The true ability level can be drawn as a horizontal solid line by specifying trueTh=TRUE (which is the default value); setting it to FALSE removes the line. Finally, any classification threshold can be additionally displayed by passing a numeric value to the argument classThr. The default value NULL does not display any threshold.

Finally, the plot can be saved in an external file, in either PDF or JPEG format. First, the argument save.plot must be set to TRUE (default is FALSE). Then, the file path for figure storage, the name of the figure and its format are specified through the argument save.options, all as character strings. See the Examples section for further information and a practical example.
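Putting the pieces together, a simulation run might look like the sketch below. The item bank, the true ability value and the option values are made up for illustration; only the argument and element names are taken from the Usage and Value sections. The call itself is guarded so that the snippet still runs when catR is not installed.

```r
# Hypothetical 50-item dichotomous bank (columns: a, b, c, d).
set.seed(1)
bank <- cbind(a = runif(50, 0.8, 2),
              b = rnorm(50),
              c = runif(50, 0, 0.2),
              d = rep(1, 50))

# Option lists; anything left out falls back to the defaults shown in Usage.
start_opts <- list(nrItems = 1, theta = 0, startSelect = "MFI")
test_opts  <- list(method = "BM", itemSelect = "MFI", randomesque = 1)
stop_opts  <- list(rule = "length", thr = 15)
final_opts <- list(method = "BM", alpha = 0.05)

# Run the simulation only when catR is available.
if (requireNamespace("catR", quietly = TRUE)) {
  res <- catR::randomCAT(trueTheta = 1, itemBank = bank, genSeed = 1,
                         start = start_opts, test = test_opts,
                         stop = stop_opts, final = final_opts)
  print(res)             # formatted summary from the print method
  print(res$thFinal)     # final ability estimate
  print(res$ciFinal)     # its confidence interval
  plot(res, ci = TRUE)   # provisional estimates with confidence intervals
}
```

Fixing genSeed keeps the generated response pattern the same from one run to the next, which makes it easier to compare different settings of the option lists.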

## Value

The function randomCAT returns a list of class "cat" with the following arguments:

• trueTheta: the value of the trueTheta argument.

• model: the value of the model argument.

• testItems: a vector with the items that were administered during the test.

• itemPar: a matrix with the parameters of the items administered during the test.

• itemNames: either a vector with the names of the selected items during the CAT, or NULL.

• pattern: the generated response pattern (as vector of 0 and 1 entries).

• thetaProv: a vector with the provisional ability estimates.

• seProv: a vector with the standard errors of the provisional ability estimates.

• ruleFinal: either the stopping rule(s) that was (were) satisfied to make the CAT stop, or NULL.

• thFinal: the final ability estimate.

• seFinal: the standard error of the final ability estimate.

• ciFinal: the confidence interval of the final ability estimate.

• genSeed: the value of the genSeed argument.

• startFixItems: the value of the start$fixItems argument (or its default value if missing).

• startSeed: the value of the start$seed argument (or its default value if missing).

• startNrItems: the value of the start$nrItems argument (or its default value if missing).

• startTheta: the value of the start$theta argument (or its default value if missing).

• startD: the value of the start$D argument (or its default value if missing).

• startRandomesque: the value of the start$randomesque argument (or its default value if missing).

• startThStart: the starting ability values used for selecting the first items of the test.

• startSelect: the value of the start$startSelect argument (or its default value if missing).

• startCB: logical value, being TRUE if both cbControl is not NULL and start$cb.control is TRUE, and FALSE otherwise.

• provMethod: the value of the test$method argument (or its default value if missing).

• provDist: the value of the test$priorDist argument (or its default value if missing).

• provPar: the value of the test$priorPar argument (or its default value if missing).

• provRange: the value of the test$range argument (or its default value if missing).

• provD: the value of the test$D argument (or its default value if missing), or NA if model is not NULL.

• itemSelect: the value of the test$itemSelect argument (or its default value if missing).

• infoType: the value of the test$infoType argument (or its default value if missing). Not returned if model is not NULL.

• randomesque: the value of the test$randomesque argument (or its default value if missing).

• AP: the value of the test$AP argument (or its default value if missing).

• constantPattern: the value of the test$constantPatt argument (or its default value if missing).

• cbControl: the value of the cbControl argument (or its default value if missing).

• cbGroup: the value of the itemBank$cbGroup element of the item bank itemBank (for dichotomous IRT models), or the cbGroup element returned by the breakBank function (for polytomous IRT models), or NULL.

• stopRule: the value of the stop$rule argument (or its default value if missing).

• stopThr: the value of the stop$thr argument (or its default value if missing).

• stopAlpha: the value of the stop$alpha argument (or its default value if missing).

• endWarning: a logical indicator of whether the adaptive test stopped because the stopping rule(s) was (were) satisfied, or because all items in the bank were administered.

• finalMethod: the value of the final$method argument (or its default value if missing).

• finalDist: the value of the final$priorDist argument (or its default value if missing).

• finalPar: the value of the final$priorPar argument (or its default value if missing).

• finalRange: the value of the final$range argument (or its default value if missing).

• finalD: the value of the final$D argument (or its default value if missing), or NA if model is not NULL.

• finalAlpha: the value of the final$alpha argument (or its default value if missing).

• save.output: the value of the save.output argument.

• output: the value of the output argument.

• assigned.responses: a logical value, being TRUE if responses was provided or FALSE if responses was set to NULL.

The function print.cat returns similar (but differently organized) results.

## Author(s)

David Magis
Department of Education, University of Liege, Belgium

Department of Psychology and Sociology, Universidad Zaragoza, Spain

## References

Barrada, J. R., Olea, J., Ponsoda, V., and Abad, F. J. (2010). A method for the comparison of item selection rules in computerized adaptive testing. Applied Psychological Measurement, 20, 213-229. doi: 10.1177/0146621610370152

Barton, M.A., and Lord, F.M. (1981). An upper asymptote for the three-parameter logistic item-response model. Research Bulletin 81-20. Princeton, NJ: Educational Testing Service.

Birnbaum, A. (1969). Statistical theory for logistic mental test models with a prior distribution of ability. Journal of Mathematical Psychology, 6, 258-276. doi: 10.1016/0022-2496(69)90005-4

Bock, R. D., and Mislevy, R. J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6, 431-444. doi: 10.1177/014662168200600405

Haley, D.C. (1952). Estimation of the dosage mortality relationship when the dose is subject to error. Technical report no 15. Palo Alto, CA: Applied Mathematics and Statistics Laboratory, Stanford University.

Jeffreys, H. (1939). Theory of probability. Oxford, UK: Oxford University Press.

Jeffreys, H. (1946). An invariant form for the prior probability in estimation problems. Proceedings of the Royal Society of London. Series A, Mathematical and Physical Sciences, 186, 453-461.

Kingsbury, G. G., and Zara, A. R. (1989). Procedures for selecting items for computerized adaptive tests. Applied Measurement in Education, 2, 359-375. doi: 10.1207/s15324818ame0204_6

Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum.

Magis, D., and Barrada, J. R. (2017). Computerized Adaptive Testing with R: Recent Updates of the Package catR. Journal of Statistical Software, Code Snippets, 76(1), 1-18. doi: 10.18637/jss.v076.c01

Magis, D., and Raiche, G. (2012). Random Generation of Response Patterns under Computerized Adaptive Testing with the R Package catR. Journal of Statistical Software, 48(8), 1-31. doi: 10.18637/jss.v048.i08

Thompson, N. A. (2009). Item selection in computerized classification testing. Educational and Psychological Measurement, 69, 778-793. doi: 10.1177/0013164408324460

Urry, V. W. (1970). A Monte Carlo investigation of logistic test models. Unpublished doctoral dissertation. West Lafayette, IN: Purdue University.

van der Linden, W. J. (1998). Bayesian item selection criteria for adaptive testing. Psychometrika, 63, 201-216. doi: 10.1007/BF02294775

Veerkamp, W. J. J., and Berger, M. P. F. (1997). Some new item selection criteria for adaptive testing. Journal of Educational and Behavioral Statistics, 22, 203-226. doi: 10.3102/10769986022002203

Warm, T.A. (1989). Weighted likelihood estimation of ability in item response models. Psychometrika, 54, 427-450. doi: 10.1007/BF02294627