ksIRT: ksIRT - kernel smoothing in Item Response Theory
In KernSmoothIRT: Nonparametric Item Response Theory


Fits nonparametric item and option characteristic curves using kernel smoothing techniques. Within the KernSmoothIRT package, it provides the data needed for the graphical analysis of multiple choice test and questionnaire data.

ksIRT(responses, key, format, kernel = c("gaussian","quadratic","uniform"), itemlabels,
      weights, miss = c("option","omit","random.multinom","random.unif"), NAweight = 0,
      evalpoints, nevalpoints, bandwidth = c("Silverman","CV"), RankFun = "sum", SubRank,
      thetadist = list("norm",0,1), groups = FALSE, nsubj)

## S3 method for class 'ksIRT'
print(x,...)

`responses`	input data matrix with the option selected by each individual for each item. Rows represent individuals, columns represent items. Alternatively, a data.frame or list can be specified. Missing values must be coded as `NA`.
`key`	a numeric vector or a scalar. If `key` is a vector, its length must match the number of items; if it is a scalar, its value is used for all items. If the items are multiple choice, `key` should contain the option that corresponds to the correct response. If the data are rating-scale, `key` should contain the largest option value for each item. In this case, the weight assigned to each option is equal to its option number. More complicated weighting schemes, such as partial credit, can be specified in the `weights` argument. If `weights` is specified, `key` must be left blank.
`format`	a numeric scalar or vector specifying the type of items. If all of the items are multiple choice, then `format = 1`. If all of the items are rating-scale or partial credit, then `format = 2`. If all of the items are nominal items, then `format = 3`. If the test has a mixture of items of different formats, then format is a vector with length equal to the number of items with entries of 1 for each multiple choice item and 2 for each rating-scale item. For more complicated weighting schemes use the `weights` argument.
`kernel`	a character string specifying the kernel function. `kernel` must be either `"gaussian"`, `"quadratic"` or `"uniform"`. The default is `"gaussian"`.
`itemlabels`	optional list of labels for each item. If omitted, each item will be labelled according to its numerical order. These labels will be used in plotting.
`weights`	optional list that may be used in lieu of `key`. Specifying `weights` allows for more complicated weighting schemes than the default. Its length must be equal to the number of items and each entry must be a matrix with option numbers in the first row and option weights in the second row. If `weights` is omitted and `format=1`, then weights are given according to `key`. If `weights` is omitted and `format=2`, then each option is given a weight equal to its option number. If `weights` is omitted and `format=3`, then weights are set to zero.
`miss`	a character string specifying the method used to manage missing responses. The default value, `miss="option"`, treats missing responses as a further option, labeled `NA`, with zero weight. This `NA` option is added to the plot of the Option Characteristic Curves. A different weight for the `NA` option may be specified through the `NAweight` argument. `miss="random.unif"` replaces each `NA` with an option chosen at random, with equal probability, from the possible options for the corresponding item. `miss="random.multinom"` performs the same substitution, but each option is selected with probability proportional to its relative frequency. `miss="omit"` excludes from the analysis all subjects with at least one omitted response.
`NAweight`	a scalar value that specifies the weight given to missing responses when `miss="option"`. The default is zero.
`evalpoints`	an optional numeric vector that specifies the quantiles at which to estimate the Option Characteristic Curves. If unspecified, the default is `nevalpoints` evenly spaced values with end points determined according to the number of subjects and the distribution specified with the `thetadist` argument.
`nevalpoints`	an optional scalar value that specifies the number of evenly spaced points at which curves are estimated. This value is used as an alternative to a user defined vector in the `evalpoints` argument. The default value is 51. The end points are determined according to the number of subjects and to the distribution specified for the `thetadist` argument. If both `nevalpoints` and `evalpoints` are specified, then `evalpoints` takes precedence.
`bandwidth`	either `"Silverman"`, `"CV"` or a numeric vector specifying, for each item, the bandwidth to use for kernel smoothing. With the default value, `bandwidth="Silverman"`, the bandwidth is computed following Silverman's rule of thumb. If `bandwidth="CV"`, then the bandwidth is chosen for each item through cross-validation.
`RankFun`	a function that is used to rank subjects. The default value is `"sum"`. Another common choice is `"mean"`.
`SubRank`	a numeric vector specifying the rank of each of the subjects. If unspecified and `format=1` or `format=2`, subjects are ranked according to the function passed through the argument `RankFun`. When `format=3`, this argument must be provided.
`thetadist`	a list specifying the distribution used to rank the subjects (see Ramsay, 1991, p. 615). By default a standard normal distribution is used. A different distribution can be adopted by specifying the first element of the list as `"norm"`, `"beta"`, `"unif"`, `"gamma"`, etc., where the character string is the same as used in the corresponding quantile function `qnorm()`, `qbeta()`, `qunif()`, `qgamma()`. The other elements of the list should be the distribution parameters required by the chosen quantile function.
`groups`	an optional vector of length equal to the number of subjects containing the group designation of each subject. Adding this option allows for comparisons between groups using the Differential Item Functioning tools (see details section).
`nsubj`	an optional numeric value with the number of subjects.
`x`	a `ksIRT` object to be printed.
`...`	further parameters
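
As an illustration of the layout described for the `weights` argument, the following base R sketch builds a list with one two-row matrix per item (option numbers in the first row, option weights in the second). The helper `make_weights`, the three-item test and the particular weights are hypothetical and only meant to show the structure; such a list would be passed as `weights` with `key` left unspecified.

```r
# Hypothetical helper: one 2-row matrix per item.
# Row 1 holds the option numbers, row 2 the weight assigned to each option.
make_weights <- function(options, wts) {
  stopifnot(length(options) == length(wts))
  rbind(options, wts, deparse.level = 0)
}

# Hypothetical three-item test, each item with options 1 to 4.
wts <- list(
  make_weights(1:4, c(0, 0, 1, 0)),    # item 1: only option 3 earns credit
  make_weights(1:4, c(0, 0.5, 1, 0)),  # item 2: option 2 earns partial credit
  make_weights(1:4, 1:4)               # item 3: weight equal to option number
)

# The list length must match the number of items (3 here).
length(wts)
```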

When bandwidth="Silverman", the rule of thumb of Silverman (1986, p. 45) is implemented with the formula: 1.06*sigma.hat*nsubj^(-.2), where nsubj is the number of subjects and sigma.hat is the standard deviation of the quantiles associated with the subjects according to the distribution specified with thetadist. Note that when thetadist=list("norm",mean,sd), sigma.hat is the value specified for sd.
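
The rule-of-thumb formula above can be evaluated directly in base R. This is a minimal sketch, not the package's internal code: the function name `silverman_bw` and the choice of 100 subjects are assumptions made for illustration, and sigma.hat is taken from the `sd` element of a normal `thetadist`, as stated above.

```r
# Hypothetical helper implementing 1.06 * sigma.hat * nsubj^(-.2)
silverman_bw <- function(sigma_hat, nsubj) {
  1.06 * sigma_hat * nsubj^(-0.2)
}

# With a normal thetadist, sigma.hat is the sd element of the list.
thetadist <- list("norm", 0, 1)
h <- silverman_bw(sigma_hat = thetadist[[3]], nsubj = 100)
h  # approximately 0.422
```

Because the bandwidth shrinks only at rate nsubj^(-.2), increasing the sample from 100 to 1000 subjects reduces it by a factor of roughly 0.63.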

Printing the ksIRT object shows the point polyserial correlation between each item and the overall test score.

The function returns a ksIRT object, which is a list with the following components:

`nitem`	an integer indicating the number of items.
`nsubj`	an integer indicating the number of subjects.
`nevalpoints`	an integer indicating the number of points for curve estimation.
`binaryresp`	a matrix of binary responses. Each row corresponds to a single option. The first three columns specify the item, the option, and the corresponding weight. Each additional column is a binary indicator of whether a subject selected that option.
`OCC`	a matrix with the first 3 columns the same as `binaryresp` and an additional column for each quantile at which the option characteristic curves have been estimated. The additional columns contain the kernel smoothed probabilities of selecting each option.
`stderrs`	a matrix with the same layout as `OCC`, containing the standard errors of the `OCC` estimates.
`subjscore`	a vector containing the observed score of each subject.
`itemlabels`	a list containing the label for each item.
`thetadist`	a list indicating the distribution used to rank subjects (see `thetadist` in Arguments).
`subjtheta`	a vector of quantile ranks for each subject on the distribution specified in `thetadist`.
`evalpoints`	a vector with the quantiles used in curve estimation.
`subjscoresummary`	a vector of quantiles, at probabilities `.05`, `.25`, `.50`, `.75`, `.95`, of the observed overall scores.
`subjscoresummaryevalpoints`	a vector of the same quantiles as `subjscoresummary`, but computed on `subjtheta`.
`SmthWgts`	a matrix containing the kernel weights.
`scale`	a vector indicating whether each item is multiple-choice, rating-scale or nominal; `1` indicates multiple-choice, `0` indicates rating-scale, `3` indicates nominal.
`format`	returns the `format` argument passed at function call.
`bandwidth`	a vector containing the bandwidths for each item.
`DIF`	a list of `ksIRT` objects created for each of the subgroups specified by `groups`.
`groups`	returns the `groups` argument passed at function call.
`itemcor`	a vector containing the point polyserial correlation for each item.
`RCC`	a list of `nsubj` vectors containing the normalized likelihood for each value in `evalpoints`.
`subjthetaML`	the maximum likelihood estimate for the expected total score of each subject.
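
To make the relationship between `binaryresp`, `SmthWgts` and `OCC` more concrete, the following base R sketch applies Gaussian kernel smoothing (a Nadaraya-Watson estimator) to simulated data for a single three-option item. It is an illustration under stated assumptions, not the package's implementation: the data are simulated, the mapping of ranks to quantiles through `rank/(nsubj + 1)` and the tie handling are assumptions, and the objects `binary`, `W` and `occ` are only loosely analogous to the components listed above.

```r
set.seed(42)
nsubj <- 200

# Simulated total scores, ranked and mapped to standard normal quantiles
# (assumed mapping: rank / (nsubj + 1), ties broken by order of appearance).
score <- rbinom(nsubj, size = 20, prob = 0.6)
theta <- qnorm(rank(score, ties.method = "first") / (nsubj + 1))

# Simulated choices for one item: option 1 is more likely at high theta.
p_opt1 <- plogis(1.5 * theta)
choice <- ifelse(runif(nsubj) < p_opt1, 1L,
                 sample(2:3, nsubj, replace = TRUE))

# Options x subjects indicator matrix (one row per option).
binary <- t(sapply(1:3, function(o) as.integer(choice == o)))

# Evaluation points and rule-of-thumb bandwidth with sigma.hat = 1.
evalpoints <- seq(-3, 3, length.out = 51)
h <- 1.06 * 1 * nsubj^(-0.2)

# Gaussian kernel weights, normalized to sum to 1 at each evaluation point.
K <- dnorm(outer(theta, evalpoints, "-") / h)   # nsubj x 51
W <- sweep(K, 2, colSums(K), "/")

# Smoothed probability of each option at each evaluation point (3 x 51).
occ <- binary %*% W
```

Since every simulated subject selects exactly one option and the weights are normalized, the smoothed probabilities of the three options sum to one at every evaluation point.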

Mazza, A., Punzo, A., McGuire, B. (2014). KernSmoothIRT: An R Package for Kernel Smoothing in Item Response Theory. Journal of Statistical Software, 58(6), 1-34. URL: http://www.jstatsoft.org/v58/i06/.

Ramsay, J.O. (2000). TestGraf: A program for the graphical analysis of multiple choice test and questionnaire data.

Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis. Chapman & Hall, London.

## Psych101 data
data(Psych101)
# Fit on the first 100 subjects; all items are multiple choice (format = 1)
Psych1 <- ksIRT(responses = Psychresponses[1:100,], key = Psychkey, format = 1)
Psych1

plot(Psych1, plottype="OCC", items=c(24,25,92,96))
plot(Psych1, plottype="EIS", items=c(24,25,92,96))
plot(Psych1, plottype="tetrahedron", items=c(24,92))
plot(Psych1, plottype="triangle", items=c(24,92))
plot(Psych1, plottype="PCA")
plot(Psych1, plottype="RCC", subjects=c(33,92))
 
PCA(Psych1)
subjEIS(Psych1)
subjETS(Psych1)
subjOCC(Psych1, stype="ObsScore")
subjscore(Psych1)
subjthetaML(Psych1)
subjscoreML(Psych1)
 
plot(Psych1, plottype="expected")
plot(Psych1, plottype="sd")
plot(Psych1, plottype="density")

## HIV data
data(HIV)
HIVsubset <- HIV[c(1:50, 1508:1558, 2934:2984),]
gr2 <- as.character(HIVsubset$SITE)
# Rating-scale items (format = 2); subjects with any missing response are dropped
DIF2 <- ksIRT(responses=HIVsubset[,-(1:3)], key=HIVkey, format = 2, groups=gr2, miss="omit")

plot(DIF2, plottype="expectedDIF", lwd=2)
plot(DIF2, plottype="densityDIF", lwd=2)
plot(DIF2, plottype="EISDIF", items=c(6,11))

## Ordinal Survey Data
data(BDI)
BDI1 <- ksIRT(responses=BDIresponses, key=BDIkey, format = 2, miss="omit")

plot(BDI1, plottype="OCC", items=1:4)
plot(BDI1, plottype="sd")
plot(BDI1, plottype="density", ylim=c(0,0.1))