Sbinsmth: Estimate the option probability and surprisal curves.

View source: R/Sbinsmth.R

Sbinsmth R Documentation

Estimate the option probability and surprisal curves.

Description

The surprisal curves for each item are fit to the surprisal transforms of the choice proportions computed within each of a set of bins of the current score index values in index. The error sums of squares are minimized by the surprisal smoothing function smooth.surp in the fda package. The output is a list vector of length n, the number of items, containing the functional data objects defining the curves.

Usage

  Sbinsmth(index, dataList, SfdList=dataList$SfdList, 
           indexQnt=seq(0,100, len=2*nbin+1), wtvec=matrix(1,n,1),
           iterlim=20, conv=1e-4, dbglev=0)

Arguments

index

A vector of length N containing the current score index percentile values.

dataList

A list that contains the objects needed to analyse the test or rating scale.

SfdList

A numbered list object produced by a TestGardener analysis of a test. Its length is equal to the number of items in the test or questions in the scale. Each member of SfdList is a named list containing information computed during the analysis. By default Sbinsmth uses the SfdList member of dataList, which is generated by the initialization with Sbinsmth.init within make.dataList. Usually this is what is required. However, SfdList may occasionally need to be supplied from another source, and so it is included here as a separate argument.

indexQnt

A vector of length 2*nbin+1, where nbin is the number of bins, containing the sequence of bin boundary and bin centre values.

wtvec

A vector of length n of weights on observations. Defaults to all ones.

iterlim

The maximum number of iterations used in optimizing surprisal curves. Defaults to 20.

conv

Convergence tolerance. Defaults to 0.0001.

dbglev

Level of output within Sbinsmth. If 0, no output is produced; if 1, the error sum of squares and slope are displayed on each iteration; and if 2 or higher, results are displayed for each line search iteration within function lnsrch.

Details

The function first bins the data in order to achieve rapid estimation of the option surprisal curves. The argument indexQnt contains the sequence of bin boundaries separated by the bin centers, so that it is of length 2*nbin + 1 where nbin is the number of bins. These bin values are distributed over the percentile interval [0,100] so that the lowest boundary is 0 and highest 100. Prior to the call to Sbinsmth these boundaries are computed so that the numbers of values of index falling in the bins are roughly equal. It is important that the number of bins be chosen so that the bins contain at least about 25 values.
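The layout of indexQnt described above can be sketched in base R. This is only an illustration under stated assumptions: the simulated index values, the choice of nbin, and the use of boundary midpoints as bin centres are all hypothetical, and TestGardener itself may compute the boundaries and centres differently.

```r
# Hypothetical illustration: equal-count bins over the percentile interval [0, 100]
set.seed(1)
N     <- 1000                  # number of examinees (assumed)
nbin  <- 20                    # number of bins (assumed), about 50 values per bin
index <- runif(N, 0, 100)      # simulated score index percentile values

# Boundaries at equally spaced quantiles, with the ends forced to 0 and 100
bdry           <- quantile(index, probs = seq(0, 1, len = nbin + 1), names = FALSE)
bdry[1]        <- 0
bdry[nbin + 1] <- 100

# Bin centres taken here as boundary midpoints (an assumption)
binctr <- (bdry[-1] + bdry[-(nbin + 1)]) / 2

# Interleave: boundary, centre, boundary, ..., boundary
indexQnt <- numeric(2 * nbin + 1)
indexQnt[seq(1, 2 * nbin + 1, by = 2)] <- bdry
indexQnt[seq(2, 2 * nbin,     by = 2)] <- binctr

# Number of index values falling in each bin
bincount <- as.vector(table(cut(index, breaks = bdry, include.lowest = TRUE)))
```

Inspecting bincount is a quick way to check that every bin holds at least about 25 values before smoothing.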

After the values of index are binned, the proportion of examinees within each bin who choose each option is computed for each question. Proportions of zero are given NA values.
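A minimal base R sketch of this step for a single question follows. The simulated responses, the equally spaced boundaries, and all variable names other than Pbin are hypothetical; this is not the package's internal code.

```r
# Hypothetical illustration: within-bin choice proportions for one question
set.seed(2)
N      <- 500                                  # number of examinees (assumed)
nbin   <- 10                                   # number of bins (assumed)
M      <- 4                                    # number of options (assumed)
index  <- runif(N, 0, 100)                     # simulated score index values
choice <- sample(1:M, N, replace = TRUE)       # simulated chosen option per examinee

# Assign each examinee to a bin (equally spaced boundaries for simplicity)
bdry  <- seq(0, 100, len = nbin + 1)
binid <- cut(index, breaks = bdry, include.lowest = TRUE, labels = FALSE)

# Cross-tabulate bins by options, then convert counts to row proportions
counts <- table(factor(binid, levels = 1:nbin), factor(choice, levels = 1:M))
Pbin   <- matrix(as.vector(counts), nbin, M) / as.vector(rowSums(counts))

# Zero proportions are flagged as missing
Pbin[which(Pbin == 0)] <- NA
```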

The positive proportions are then converted to surprisal values where surprisal = -log_M (proportion) where log_M is the logarithm with base M, the number of options associated with a question. Bins with zero proportions are assigned a surprisal that is appropriately large in the sense of being in the range of the larger surprisal values associated with small but positive proportions. This surprisal value is usually about 4.
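The transform can be illustrated with a small hand-made proportion matrix. The values in Pbin and the fill-in constant Slarge are hypothetical; the value 4 is used only because the text above says the assigned surprisal is usually about 4.

```r
# Hypothetical illustration: surprisal transform for a question with M = 4 options
M    <- 4
Pbin <- matrix(c(0.70, 0.20, 0.10,   NA,      # NA marks a zero proportion
                 0.25, 0.25, 0.25, 0.25,
                 0.05, 0.15, 0.30, 0.50), nrow = 3, byrow = TRUE)

# Surprisal is the negative logarithm to base M of the proportion
Sbin <- -log(Pbin, base = M)

# Replace surprisals of zero proportions by a large value (assumed to be 4 here)
Slarge <- 4
Sbin[is.na(Sbin)] <- Slarge

# The inverse transform recovers proportions from surprisals
Pback <- M^(-Sbin)
```

With M = 4, a proportion of 0.25 has surprisal 1, and a surprisal of 4 corresponds to a proportion of 4^(-4) = 1/256.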

The next step is to fit the surprisal values for each question by a functional data object that is smooth, passes as closely as possible to an option's surprisal values, and has values consistent with being a surprisal value. The function smooth.surp() is used for this purpose. The arc length of the item information curve is also computed.
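One way to picture the arc length computation is as the length of the path traced by the M surprisal curves as the score index moves across its range. The sketch below approximates that length by summing the distances between consecutive points on a fine mesh. The helper arclength(), the simulated curves, and the polygonal approximation are all assumptions made for illustration; the package may compute this quantity in a different way.

```r
# Hypothetical helper: polygonal approximation to the length of a curve
# whose points are the rows of Smat (one column per option)
arclength <- function(Smat) {
  sum(sqrt(rowSums(diff(Smat)^2)))
}

# Simulated surprisal curves for a question with M = 3 options
M       <- 3
indfine <- seq(0, 100, len = 101)
eta     <- cbind(0, 0.04 * (indfine - 50), -0.03 * (indfine - 50))
Pmat    <- exp(eta) / rowSums(exp(eta))    # probabilities summing to one
Smat    <- -log(Pmat, base = M)            # corresponding surprisal curves

itemArc <- arclength(Smat)

# Sanity check on a straight line from (0, 0) to (3, 4), whose length is 5
lineArc <- arclength(cbind(seq(0, 3, len = 101), seq(0, 4, len = 101)))
```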

Finally the curves and other results for each question are saved in object SfdList, a list vector of length n, and the list vector is returned.

Value

The optimized numbered list object SfdList of length n that holds the probability and surprisal data and curves for each item. The 12 objects for each item are as follows:

Sfd:

A surprisal functional data object that is used for plotting. It also contains the coefficient matrix and functional data basis that define the object.

M:

The number of options, including if needed a final option which is for the missing and illegitimate responses.

Pbin:

A nbin by M matrix of proportions of choice for each option.

Sbin:

A nbin by M matrix of surprisal values for each option.

indfine:

A fine mesh of 101 equally spaced score index values over the interval [0,100].

Pmatfine:

A 101 by M matrix of probability values at each of the fine mesh points indfine.

Smatfine:

A 101 by M matrix of surprisal values at each of the fine mesh points indfine.

DSmatfine:

A 101 by M matrix of surprisal first derivative values at each of the fine mesh points indfine.

D2Smatfine:

A 101 by M matrix of surprisal second derivative values at each of the fine mesh points indfine.

PSrsErr:

The standard error for probability over the fine mesh.

PSrsErr:

The standard error for surprisal over the fine mesh.

itemScope:

The arc length of the item information curve.
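The fine-mesh matrices listed above are linked by the surprisal definition: probability equals M raised to the power of minus surprisal, and so the derivative of probability is -log(M) times probability times the derivative of surprisal. The base R sketch below checks these relations on simulated curves. The curves, the mesh taken over [0, 100], and the central-difference derivative are assumptions for illustration only; the names Pmatfine, Smatfine and indfine are borrowed from the list above but are built by hand here, not taken from a fitted SfdList.

```r
# Hypothetical illustration: relations among fine-mesh probability and surprisal values
M        <- 4
indfine  <- seq(0, 100, len = 101)
eta      <- outer(indfine - 50, c(-0.04, -0.01, 0.02, 0.05))
Pmatfine <- exp(eta) / rowSums(exp(eta))      # 101 by M probabilities
Smatfine <- -log(Pmatfine, base = M)          # 101 by M surprisals

# Probabilities recovered from surprisals
Pcheck <- M^(-Smatfine)

# Central-difference first derivative of surprisal at the 99 interior mesh points
h     <- indfine[2] - indfine[1]
DSmid <- (Smatfine[3:101, ] - Smatfine[1:99, ]) / (2 * h)

# Chain rule: derivative of probability from derivative of surprisal
DPmid <- -log(M) * Pmatfine[2:100, ] * DSmid
```

Because the probabilities sum to one at every mesh point, the rows of DPmid sum to approximately zero.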

Author(s)

Juan Li and James Ramsay

References

Ramsay, J. O., Li, J. and Wiberg, M. (2020) Full information optimal scoring. Journal of Educational and Behavioral Statistics, 45, 297-315.

Ramsay, J. O., Li, J. and Wiberg, M. (2020) Better rating scale scores with information-based psychometrics. Psych, 2, 347-360.

See Also

ICC_plot, Sbinsmth

Examples

#  Example 1.  Estimate the surprisal curves for the short SweSAT
#  multiple choice test with 24 items and 1000 examinees, and then
#  display the item probability and surprisal curves.
library(TestGardener)
#  The percent rank values for the jittered sum scores
dataList    <- Quant_13B_problem_dataList
index       <- dataList$percntrnk
#  The bin locations for these score index values
indexQnt    <- dataList$indexQnt
#  The five marker percentage locations for (5, 25, 50, 75, 95)
Qvec        <- dataList$Qvec
#  Carry out the surprisal smoothing operation
SfdResult   <- Sbinsmth(index, dataList)
#  Set up the list object for the estimated surprisal curves
SfdList     <- SfdResult$SfdList
#  plot the curves for the first question
binctr    <- dataList$binctr
scrfine   <- seq(0,100,len=101)
ICC_plot(scrfine, SfdList, dataList, Qvec, binctr, plotindex=1, 
         plotrange=c(0,100))

TestGardener documentation built on Nov. 24, 2023, 5:08 p.m.