mudfold: MUDFOLD: Van Schuur's nonparametric IRT model for dichotomous...

View source: R/mudfold.R

mudfold R Documentation

MUDFOLD: Van Schuur's nonparametric IRT model for dichotomous responses that have been generated by an unfolding process.

Description

This function fits a unidimensional unfolding scale to the responses of individuals on a set of categorically scored attitudinal items. Fitting is done through Van Schuur's scaling algorithm, which determines whether a set of items are indicators of the same unobserved latent construct, such as preference, attitude, or ideology. Central to this model are the scalability coefficients, which are used to assess the fit of the scale and the items to the data.

Diagnostic statistics that are used to test the model assumptions are borrowed from the nonparametric unfolding model of Post (1992). Uncertainty estimates for the scalability coefficients and the diagnostic statistics, both for the scale and for the individual items, are obtained using the ordinary nonparametric bootstrap. A bootstrap estimate of the scale is obtained as the most frequently observed scale across the nboot bootstrap iterations.

Usage

mudfold( data, estimation, lambda1, lambda2, start.scale, 
nboot, missings, nmice, seed, mincor, ...)

Arguments

data

: A binary matrix or data frame containing the responses of nrow(data) persons to ncol(data) items. Missing values in data are not allowed.

estimation

: This argument controls the nonparametric estimation method for person locations. By default this argument equals "rank", which means that Van Schuur's estimator is used to estimate the person parameters. The user can set this argument to "quantile", in which case an estimator proposed by Johnson is applied to obtain the person locations.

lambda1

: User-specified numerical value that is used as a lower boundary for the scalability criterion of the first step of the item selection algorithm, and in the item scalability criterion at the end of the scale expansion. Default value is λ_1 = 0.3 but it can be any value between -∞ and 1 (i.e., λ_1 ∈ (-∞, 1]). The higher the value of λ_1, the stricter the scalability criteria of the algorithm.

lambda2

: User-specified numerical value that explicitly controls the first scalability criterion of the scale expansion. By default λ_2 = 0; the user can choose a negative value for λ_2, which leads to a less strict scalability criterion at the beginning of the scale expansion.

start.scale

: An ordered character vector with item names from colnames(data). The length of this vector should be greater than or equal to 3 and less than or equal to ncol(data). This ordered item set is used as the start set for the scale extension phase of the MUDFOLD method. If start.scale=NULL the standard MUDFOLD method is fitted to the data.

nboot

: Argument that controls the number of bootstrap iterations. If nboot=NULL (default) no bootstrap is applied.

missings

: Argument that controls how the missing values should be treated. If missings="omit" (default) list-wise deletion is applied to data. If missings="impute" then the mice function is applied to data in order to impute the missings nmice times.

nmice

: Argument that controls the number of mice imputations. This argument is used only when missings="impute" and nboot=NULL.

seed

: Argument that is used for reproducibility of bootstrap results.

mincor

: This can be a scalar, a numeric vector (of size ncol(data)) or a numeric matrix (square, of size ncol(data)) specifying the minimum threshold(s) against which the absolute correlation in the data is compared. See ?mice:::quickpred for more details.

...

: Any additional arguments that are passed to the boot function from the package boot. See ?boot::boot.

Details

This function incorporates a two-step algorithm that determines an unfolding scale from observed binary data. In the first step of the algorithm, the best minimal scale consisting of three items is determined. In the second step, the minimal scale from the first step is expanded iteratively by adding the best fitting item in each iteration. The first step of the algorithm can be skipped with the argument start.scale, which can be used to set manually an item rank order that will be extended in the second step of the item selection algorithm. The resulting scale consists of the best m fitting items based on scalability criteria (where m ≤ ncol(data)).
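As a rough illustration of the scalability coefficient that drives both steps, the sketch below computes H = 1 - O/E for a single ordered item triple in base R. It assumes the usual MUDFOLD convention that an error for the ordered triple (h, i, j) is a respondent who endorses h and j but not the middle item i, and that expected errors are computed under statistical independence. The helper triple_H and the toy data are hypothetical and are not part of the mudfold package.

```r
# Hypothetical helper (not part of the mudfold package): scalability
# coefficient H = 1 - O/E for one ordered item triple (h, i, j).
triple_H <- function(x, h, i, j) {
  x <- as.matrix(x)
  n <- nrow(x)
  # Observed errors: respondents endorsing h and j but not the middle item i
  obs <- sum(x[, h] == 1 & x[, i] == 0 & x[, j] == 1)
  # Expected errors if the three items were statistically independent
  p <- colMeans(x[, c(h, i, j)])
  expct <- n * p[[1]] * (1 - p[[2]]) * p[[3]]
  c(observed = obs, expected = expct, H = 1 - obs / expct)
}

# Toy data (assumption): 6 respondents, 3 items, consistent with A < B < C
toy <- cbind(A = c(1, 1, 0, 0, 0, 1),
             B = c(0, 1, 1, 1, 0, 1),
             C = c(0, 0, 0, 1, 1, 1))
res <- triple_H(toy, "A", "B", "C")
res
```

A triple with no observed errors gives H = 1, and more observed errors than expected gives a negative H. A search over all triples for the best starting scale would compare such values against the lower bound lambda1.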

In the mudfold function, the user can specify a value λ_1 that will be used as a lower bound in the scalability criteria of the MUDFOLD algorithm. By default, the lower bound for the scalability coefficients is lambda1=0.3. The user can choose a second value λ_2 that will be used as a lower bound only for the second step of the algorithm (by default, lambda2=0). The parameter λ_2 is used mostly to relax the first scalability criterion of the second step. Generally, values greater than 0.3 for λ_1 and λ_2 lead to very strict criteria, while negative values relax these criteria.

Uncertainty estimates of the MUDFOLD statistics can be calculated with the argument nboot of the mudfold function. When nboot is an integer, nboot bootstrap iterations are run to obtain the variance parameter for each MUDFOLD statistic. Missing values are list-wise deleted when missings="omit"; when missings="impute" and nboot=NULL they are imputed nmice times. If the argument nboot is not NULL and missings="impute", then each resampled dataset in the bootstrap iterations is imputed once before a MUDFOLD scale is fitted.
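The bootstrap idea can be sketched in base R as follows. This is not the package's implementation (which, per the arguments above, passes work to boot::boot); the helpers boot_rows and most_frequent, the toy data, and the stand-in statistic (the proportion of positive responses to one item) are assumptions made only for illustration.

```r
# Hypothetical helper: ordinary nonparametric bootstrap over respondents (rows).
boot_rows <- function(x, stat, R = 200, seed = 1) {
  set.seed(seed)
  n <- nrow(x)
  # Each iteration draws n rows with replacement and recomputes the statistic
  vapply(seq_len(R), function(b) {
    stat(x[sample.int(n, n, replace = TRUE), , drop = FALSE])
  }, numeric(1))
}

# Hypothetical helper: most frequent label among bootstrap replicates
most_frequent <- function(labels) {
  names(which.max(table(labels)))
}

# Toy binary data (assumption): 100 respondents, 2 items
toy <- cbind(A = rep(c(1, 0), times = c(30, 70)),
             B = rep(c(1, 0), times = c(60, 40)))
reps <- boot_rows(toy, function(d) mean(d[, "A"]))
se_A <- sd(reps)  # bootstrap standard error of the stand-in statistic

# Scales stored as strings, one per bootstrap iteration (made-up values)
scales <- c("A B C", "A C B", "A B C", "A B C")
best_scale <- most_frequent(scales)
```

In the actual method the statistic would be a MUDFOLD quantity refitted on each resample, and the bootstrap scale estimate would be the item order that occurs most often across iterations.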

Moreover, the user can choose between two nonparametric estimation methods for the person parameters, which are estimated using the item ranks from the MUDFOLD algorithm. The default setting (i.e., estimation="rank") uses an estimator proposed by Van Schuur (1984) based on item ranks. Alternatively, an estimation method described by Johnson (2006), which uses item quantiles for estimating person parameters, can be used by setting estimation="quantile".
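A minimal sketch of a rank-based person estimate is given below. It assumes that the location of a respondent is taken as the mean rank of the items that respondent endorsed, given the item order of the fitted scale; the package's exact estimator may differ in details such as the handling of respondents with no positive responses. The helper rank_location and the toy data are hypothetical.

```r
# Hypothetical helper: person location as the mean rank of endorsed items.
rank_location <- function(x, scale_order) {
  # Reorder columns to follow the item order of the scale
  x <- as.matrix(x)[, scale_order, drop = FALSE]
  ranks <- seq_along(scale_order)
  n_pos <- rowSums(x)
  # Respondents who endorse no item get NA (location undefined in this sketch)
  ifelse(n_pos > 0, as.vector(x %*% ranks) / n_pos, NA_real_)
}

# Toy data (assumption): 4 respondents, items already in scale order A, B, C
toy <- cbind(A = c(1, 0, 0, 0),
             B = c(1, 1, 0, 0),
             C = c(0, 1, 1, 0))
theta <- rank_location(toy, c("A", "B", "C"))
theta
```

Here a respondent who endorses A and B is placed between ranks 1 and 2, while a respondent with no positive responses receives NA.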

Value

The function mudfold returns a list of class "mdf" with the following components:

CALL

A list whose components provide information about the function call.

CHECK

A list whose components provide information from the data checking step.

DESCRIPTIVES

A list with descriptive statistics for the data.

MUDFOLD_INFO

A list with three main components. The first component is called triple_stats and is a list in which each element contains the observed errors, expected errors, and scalability coefficients for each item triple. The second component is a list called first_step and contains the results of the first step of the MUDFOLD item selection algorithm. The third component is called second_step and is a list with the MUDFOLD statistics and parameter estimates for the given scale.

If the bootstrap is applied, an additional component is included in the output. This component is called BOOTSTRAP and is a list that contains the output of the nboot bootstrap iterations.

Author(s)

Spyros E. Balafas (auth.), Wim P. Krijnen (auth.), Wendy J. Post (contr.), Ernst C. Wit (auth.)

Maintainer: Spyros E. Balafas (s.balafas@rug.nl)

References

W.H. Van Schuur. (1984). Structure in Political Beliefs: A New Model for Stochastic Unfolding with Application to European Party Activists. CT Press.

W.J. Post. (1992). Nonparametric Unfolding Models: A Latent Structure Approach. M & T series. DSWO Press.

W.J. Post and T.A.B. Snijders. (1993). Nonparametric unfolding models for dichotomous data. Methodika.

M.S. Johnson. (2006). Nonparametric Estimation of Item and Respondent Locations from Unfolding-type Items. Psychometrika.

Examples

## Not run: 
#####################################
#### MUDFOLD method on real data ####
#####################################



###########################################################################
###### MUDFOLD method on ANDRICH data (see Post and Snijders pp.147) ######
###########################################################################
data(ANDRICH)
## fit MUDFOLD on ANDRICH data ##
fit_andr <- mudfold(ANDRICH)

## generic functions for the S3 class .mdf object fit ##
## print.mdf
print(fit_andr)
## summary.mdf
summary(fit_andr)
## plot.mdf
plot(fit_andr)


## fit MUDFOLD on ANDRICH data with bootstrap ##
fit_andr_boot <- mudfold(ANDRICH, nboot=100)

## generic functions for the S3 class .mdf object fit ##
## print.mdf
print(fit_andr_boot)
## summary.mdf
summary(fit_andr_boot, boot=TRUE)
## plot.mdf
plot(fit_andr_boot)

############################################
###### MUDFOLD method on EURPAR2 data ######
############################################
data("EURPAR2")

## fit MUDFOLD on EURPAR2 data ##
fit_eurp <- mudfold(EURPAR2)

## print
print(fit_eurp)

## summary
summary(fit_eurp)

## plot
plot(fit_eurp)

###########################################
###### MUDFOLD method on Plato7 data ######
###########################################

data("Plato7")

## transform to binary data
## using as threshold the mean
## per row of Plato7

dat_plato <- pick(Plato7)

## fit MUDFOLD on Plato7 data ##
fit_plato <- mudfold(dat_plato, nboot=1000)

## print
print(fit_plato)

## summary
summary(fit_plato, boot=TRUE)

## plot
plot(fit_plato, plot.type="scale")
plot(fit_plato, plot.type="IRF")
plot(fit_plato, plot.type="persons")


##########################################
#### MUDFOLD method on simulated data ####
##########################################

### Data with the responses of
### n=3000 on N=20 items

simulation1 <- mudfoldsim(N=20, n=3000, gamma1=2, gamma2=-10, zeros=FALSE,seed = 1)
dat_sim1 <- simulation1$dat

## fit MUDFOLD on simulated data ##
fit.sim1 <- mudfold(dat_sim1)

# print
fit.sim1

# summary
summary(fit.sim1)

# plot
plot(fit.sim1)

### Data with the responses of
### n=3000 on N=26 items

simulation2 <- mudfoldsim(N=26, n=3000, gamma1=2, gamma2=-10, zeros=FALSE,seed = 1)
dat_sim2 <- simulation2$dat

## fit MUDFOLD on simulated data ##
fit.sim2 <- mudfold(dat_sim2)

# print
fit.sim2

# summary
summary(fit.sim2)

# plot
plot(fit.sim2, plot.type="scale")
plot(fit.sim2, plot.type="IRF")
plot(fit.sim2, plot.type="persons")


## End(Not run)

mudfold documentation built on Nov. 24, 2022, 5:09 p.m.