polyserial: Polyserial Correlation

Description

Computes the polyserial correlation (and its standard error) between a quantitative variable and an ordinal variable, based on the assumption that the joint distribution of the quantitative variable and a latent continuous variable underlying the ordinal variable is bivariate normal. Either the maximum-likelihood estimator or a quicker “two-step” approximation is available. For the ML estimator the estimates of the thresholds and the covariance matrix of the estimates are also available.

Usage

polyserial(x, y, ML = FALSE, control = list(), 
  std.err = FALSE, maxcor=.9999, bins=4, start, thresholds=FALSE)

Arguments

x

a numerical variable.

y

an ordered categorical variable; can be numeric, logical, a factor, an ordered factor, or a character variable. If y is a factor, its levels should be in proper order; the values of a character variable are ordered alphabetically.

ML

if TRUE, compute the maximum-likelihood estimate; if FALSE, the default, compute a quicker “two-step” approximation.

control

optional arguments to be passed to the optim function.

std.err

if TRUE, return the estimated variance of the correlation (for the two-step estimator) or the estimated covariance matrix of the correlation and thresholds (for the ML estimator); the default is FALSE.

maxcor

maximum absolute correlation (to ensure numerical stability).

bins

the number of bins into which to dissect x for a test of bivariate normality; the default is 4.

start

optional start value(s): if a single number, start value for the correlation; if a list with the elements rho and thresholds, start values for these parameters; start values are supplied automatically if omitted, and are only relevant when the ML estimator or standard errors are selected.

thresholds

if TRUE (the default is FALSE) return estimated thresholds along with the estimated correlation even if standard errors aren't computed.

Details

The ML estimator is computed by maximizing the bivariate-normal likelihood with respect to the thresholds for y (τ^y[j], j = 1,…, c - 1) and the population correlation (ρ). The likelihood is maximized numerically using the optim function, and the covariance matrix of the estimated parameters is based on the numerical Hessian computed by optim.
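The likelihood described above can be sketched in base R. This is an illustrative sketch only, not the package's source code: the helper name polyserial_negloglik, the simulated data, the start values, and the large penalty returned for inadmissible parameters are all assumptions made for the example. It standardizes x, writes the negative log-likelihood in terms of ρ and the thresholds, and passes it to optim with hessian = TRUE so that approximate standard errors can be read from the inverse Hessian.

```r
# Hypothetical helper (not part of polycor): negative log-likelihood
# for par = c(rho, thresholds), given standardized x and integer codes for y.
polyserial_negloglik <- function(par, zx, y_codes) {
  rho <- par[1]
  tau <- par[-1]
  # Assumed safeguard: penalize inadmissible correlations or unordered thresholds
  if (abs(rho) >= 1 || is.unsorted(tau)) return(1e10)
  cuts <- c(-Inf, tau, Inf)
  s <- sqrt(1 - rho^2)
  # P(y = j | x): probability that the latent variable falls between adjacent cuts
  upper <- pnorm((cuts[y_codes + 1] - rho * zx) / s)
  lower <- pnorm((cuts[y_codes] - rho * zx) / s)
  -sum(dnorm(zx, log = TRUE) + log(pmax(upper - lower, .Machine$double.xmin)))
}

# Simulated data: latent variable z has correlation 0.5 with x
set.seed(1)
n <- 2000
x <- rnorm(n)
z <- 0.5 * x + sqrt(1 - 0.5^2) * rnorm(n)
y_codes <- as.integer(cut(z, c(-Inf, -1, 0.5, 1.5, Inf)))

zx <- as.numeric(scale(x))
# Start values: thresholds from the marginal proportions of y,
# and the raw correlation of x with the category codes for rho
tau_start <- qnorm(cumsum(table(y_codes))[1:3] / n)
fit <- optim(c(cor(zx, y_codes), tau_start), polyserial_negloglik,
             zx = zx, y_codes = y_codes, hessian = TRUE)

rho_hat <- fit$par[1]                  # estimated correlation
tau_hat <- fit$par[-1]                 # estimated thresholds
se <- sqrt(diag(solve(fit$hessian)))   # approximate standard errors
```

In practice, polyserial(x, y, ML = TRUE, std.err = TRUE) performs the estimation and reports these quantities; the sketch is only meant to make the structure of the likelihood concrete.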

The two-step estimator is computed by first estimating the thresholds (τ^y[j], j = 1,…, c - 1) from the marginal distribution of y. Then, if the standard error of ρ hat is requested, the one-dimensional likelihood for ρ is maximized numerically using optim; the standard error so computed treats the thresholds as fixed. If the standard error isn't requested, ρ hat is computed directly.
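The direct computation of ρ hat can be illustrated with a short base-R sketch. It assumes the standard closed-form two-step estimator: thresholds are taken as normal quantiles of the cumulative proportions of y, and the product-moment correlation between x and the category codes of y is rescaled by the standard deviation of the codes and the sum of normal densities at the thresholds. The helper name two_step_polyserial and the simulated data are hypothetical, and the sketch ignores missing values.

```r
# Hypothetical helper (not part of polycor): closed-form two-step estimate.
two_step_polyserial <- function(x, y) {
  y <- as.numeric(as.factor(y))   # category codes 1, ..., c
  n <- length(y)
  # Step 1: thresholds from the cumulative marginal proportions of y
  cum_prop <- cumsum(table(y)) / n
  tau <- qnorm(cum_prop[-length(cum_prop)])
  # Step 2: rescale the correlation between x and the category codes
  rho <- sqrt((n - 1) / n) * sd(y) * cor(x, y) / sum(dnorm(tau))
  list(rho = rho, thresholds = unname(tau))
}

# Simulated data: latent variable z has correlation 0.5 with x
set.seed(1)
x <- rnorm(5000)
z <- 0.5 * x + sqrt(1 - 0.5^2) * rnorm(5000)
y <- cut(z, c(-Inf, -1, 0.5, 1.5, Inf))

est <- two_step_polyserial(x, y)
est$rho          # should be close to the latent correlation of 0.5
est$thresholds   # should be close to -1, 0.5, 1.5
```

Because the thresholds are fixed in the first step, no numerical optimization is needed here, which is why this route is quicker than the ML estimator.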

Value

If std.err or thresholds is TRUE, returns an object of class "polycor" with the following components:

type

set to "polyserial".

rho

the polyserial correlation.

cuts

estimated thresholds for the ordinal variable (y), for the ML estimator.

var

the estimated variance of the correlation, or, for the ML estimator, the estimated covariance matrix of the correlation and thresholds.

n

the number of observations on which the correlation is based.

chisq

chi-square test for bivariate normality.

df

degrees of freedom for the test of bivariate normality.

ML

TRUE for the ML estimate, FALSE for the two-step estimate.

Otherwise, returns the polyserial correlation.

Author(s)

John Fox jfox@mcmaster.ca

References

Drasgow, F. (1986) Polychoric and polyserial correlations. Pp. 68–74 in S. Kotz and N. Johnson, eds., The Encyclopedia of Statistics, Volume 7. Wiley.

See Also

hetcor, polychor, print.polycor, optim

Examples

if(require(mvtnorm)){
    set.seed(12345)
    data <- rmvnorm(1000, c(0, 0), matrix(c(1, .5, .5, 1), 2, 2))
    x <- data[,1]
    y <- data[,2]
    cor(x, y)  # sample correlation
    }
if(require(mvtnorm)){
    y <- cut(y, c(-Inf, -1, .5, 1.5, Inf))
    polyserial(x, y)  # 2-step estimate
    }
if(require(mvtnorm)){
    polyserial(x, y, ML=TRUE, std.err=TRUE) # ML estimate
    }

Example output

Loading required package: mvtnorm
[1] 0.5263698
[1] 0.5121031

Polyserial Correlation, ML est. = 0.5083 (0.02466)
Test of bivariate normality: Chisquare = 8.548, df = 11, p = 0.6635

                 1      2       3
Threshold -0.98560 0.4812 1.50700
Std.Err.   0.04408 0.0379 0.05847

polycor documentation built on Dec. 11, 2021, 3:01 a.m.