MCMCmixfactanal: Markov Chain Monte Carlo for Mixed Data Factor Analysis Model
In MCMCpack: Markov Chain Monte Carlo (MCMC) Package

Description

This function generates a sample from the posterior distribution of a mixed data (both continuous and ordinal) factor analysis model. Normal priors are assumed on the factor loadings and factor scores, improper uniform priors are assumed on the cutpoints, and inverse gamma priors are assumed for the error variances (uniquenesses). The user supplies data and parameters for the prior distributions, and a sample from the posterior distribution is returned as an mcmc object, which can be subsequently analyzed with functions provided in the coda package.

Usage

MCMCmixfactanal(
  x,
  factors,
  lambda.constraints = list(),
  data = parent.frame(),
  burnin = 1000,
  mcmc = 20000,
  thin = 1,
  tune = NA,
  verbose = 0,
  seed = NA,
  lambda.start = NA,
  psi.start = NA,
  l0 = 0,
  L0 = 0,
  a0 = 0.001,
  b0 = 0.001,
  store.lambda = TRUE,
  store.scores = FALSE,
  std.mean = TRUE,
  std.var = TRUE,
  ...
)

Arguments

`x`	A one-sided formula containing the manifest variables. Ordinal (including dichotomous) variables must be coded as ordered factors. Each level of these ordered factors must be present in the data passed to the function. NOTE: data input is different in `MCMCmixfactanal` than in either `MCMCfactanal` or `MCMCordfactanal`.
`factors`	The number of factors to be fitted.
`lambda.constraints`	List of lists specifying possible equality or simple inequality constraints on the factor loadings. A typical entry in the list has one of three forms: `varname=list(d,c)`, which constrains the dth loading for the variable named varname to be equal to c; `varname=list(d,"+")`, which constrains the dth loading for the variable named varname to be positive; and `varname=list(d,"-")`, which constrains the dth loading for the variable named varname to be negative. If x is a matrix without column names, default names of "V1", "V2", ..., etc. will be used. Note that, unlike `MCMCfactanal`, the `\Lambda` matrix used here has `factors`+1 columns. The first column of `\Lambda` corresponds to negative item difficulty parameters for ordinal manifest variables and mean parameters for continuous manifest variables and should generally not be constrained directly by the user.
`data`	A data frame.
`burnin`	The number of burn-in iterations for the sampler.
`mcmc`	The number of iterations for the sampler.
`thin`	The thinning interval used in the simulation. The number of iterations must be divisible by this value.
`tune`	The tuning parameter for the Metropolis-Hastings sampling. Can be either a scalar or a `k`-vector (where `k` is the number of manifest variables). `tune` must be strictly positive.
`verbose`	A switch which determines whether or not the progress of the sampler is printed to the screen. If `verbose` is greater than 0, the iteration number and the Metropolis-Hastings acceptance rate are printed to the screen every `verbose`th iteration.
`seed`	The seed for the random number generator. If NA, the Mersenne Twister generator is used with default seed 12345; if an integer is passed, it is used to seed the Mersenne Twister. The user can also pass a list of length two to use the L'Ecuyer random number generator, which is suitable for parallel computation. The first element of the list is the L'Ecuyer seed, which is a vector of length six or NA (if NA, a default seed of `rep(12345,6)` is used). The second element of the list is a positive substream number. See the MCMCpack specification for more details.
`lambda.start`	Starting values for the factor loading matrix Lambda. If `lambda.start` is set to a scalar the starting value for all unconstrained loadings will be set to that scalar. If `lambda.start` is a matrix of the same dimensions as Lambda then the `lambda.start` matrix is used as the starting values (except for equality-constrained elements). If `lambda.start` is set to `NA` (the default) then starting values for unconstrained elements in the first column of Lambda are based on the observed response pattern, the remaining unconstrained elements of Lambda are set to 0, and starting values for inequality constrained elements are set to either 1.0 or -1.0 depending on the nature of the constraints.
`psi.start`	Starting values for the error variance (uniqueness) matrix. If `psi.start` is set to a scalar then the starting values for all diagonal elements of `Psi` that represent error variances for continuous variables are set to this value. If `psi.start` is a `k`-vector (where `k` is the number of manifest variables) then the starting value of `Psi` has `psi.start` on the main diagonal, with the exception that entries corresponding to error variances for ordinal variables are set to 1. If `psi.start` is set to `NA` (the default) the starting values of all the continuous variable uniquenesses are set to 0.5. Error variances for ordinal response variables are always constrained (regardless of the value of `psi.start`) to be 1 in order to achieve identification.
`l0`	The means of the independent Normal prior on the factor loadings. Can be either a scalar or a matrix with the same dimensions as `Lambda`.
`L0`	The precisions (inverse variances) of the independent Normal prior on the factor loadings. Can be either a scalar or a matrix with the same dimensions as `Lambda`.
`a0`	Controls the shape of the inverse Gamma prior on the uniquenesses. The actual shape parameter is set to `a0/2`. Can be either a scalar or a `k`-vector.
`b0`	Controls the scale of the inverse Gamma prior on the uniquenesses. The actual scale parameter is set to `b0/2`. Can be either a scalar or a `k`-vector.
`store.lambda`	A switch that determines whether or not to store the factor loadings for posterior analysis. By default, the factor loadings are all stored.
`store.scores`	A switch that determines whether or not to store the factor scores for posterior analysis. NOTE: This takes an enormous amount of memory, so should only be used if the chain is thinned heavily, or for applications with a small number of observations. By default, the factor scores are not stored.
`std.mean`	If `TRUE` (the default) the continuous manifest variables are rescaled to have zero mean.
`std.var`	If `TRUE` (the default) the continuous manifest variables are rescaled to have unit variance.
`...`	Further arguments to be passed.
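
The argument formats described above can be sketched in plain R. This is only an illustration of how the values are built, not a call to the sampler; the variable names `y1` and `y2` and the object names `constraints`, `n.saved`, and `seeds` are hypothetical.

```r
# Illustrative values only; y1 and y2 are hypothetical manifest variables.
# Lambda has factors + 1 columns, so with two factors column 1 holds the
# difficulty/mean terms, column 2 is factor 1, and column 3 is factor 2.
constraints <- list(
  y1 = list(2, "+"),   # loading of y1 on factor 1 constrained positive
  y1 = list(3, 0),     # loading of y1 on factor 2 fixed at 0
  y2 = list(3, "-")    # loading of y2 on factor 2 constrained negative
)

# thin must divide mcmc; under that condition mcmc / thin draws are kept.
mcmc <- 20000
thin <- 10
stopifnot(mcmc %% thin == 0)
n.saved <- mcmc / thin

# L'Ecuyer seeds for three chains: default seed vector (NA) combined
# with substream numbers 1, 2, and 3.
seeds <- lapply(1:3, function(s) list(NA, s))
```

Each element of `seeds` could then be passed as the `seed` argument of a separate chain.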

Details

The model takes the following form:

Let i=1,\ldots,n index observations and j=1,\ldots,k index response variables within an observation. An observed variable x_{ij} can be either ordinal with a total of C_j categories or continuous. The distribution of X is governed by an n \times k matrix of latent variables X^* and a series of cutpoints \gamma. X^* is assumed to be generated according to:

x^*_i = \Lambda \phi_i + \epsilon_i

\epsilon_i \sim \mathcal{N}(0,\Psi)

where x^*_i is the k-vector of latent variables specific to observation i, \Lambda is the k \times d matrix of factor loadings, and \phi_i is the d-vector of latent factor scores. It is assumed that the first element of \phi_i is equal to 1 for all i.
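
As a rough illustration of the latent-variable equation, the following sketch simulates X^* for made-up parameter values. All numbers are assumptions chosen for the example, and the code is independent of MCMCpack.

```r
set.seed(1)
n <- 200   # observations
k <- 4     # manifest variables
d <- 2     # columns of Lambda: one factor plus the leading constant

# Hypothetical loadings: column 1 multiplies the constant 1 in phi_i,
# column 2 holds the loadings on the single factor.
Lambda <- cbind(c(0.0, 0.5, -0.3, 0.2),
                c(1.0, 0.8, -0.6, 0.4))

# Hypothetical diagonal error covariance matrix Psi.
Psi <- diag(c(0.5, 0.3, 1, 1))

# Factor scores: first element fixed at 1, remaining element standard normal.
phi <- cbind(1, rnorm(n))

# Errors with variances given by diag(Psi); the elementwise sqrt() is a
# valid matrix square root here only because Psi is diagonal.
eps <- matrix(rnorm(n * k), n, k) %*% sqrt(Psi)

# Row i of xstar is x*_i = Lambda phi_i + eps_i.
xstar <- phi %*% t(Lambda) + eps
```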

If the jth variable is ordinal, the probability that it takes the value c in observation i is:

\pi_{ijc} = \Phi(\gamma_{jc} - \Lambda'_j\phi_i) - \Phi(\gamma_{j(c-1)} - \Lambda'_j\phi_i)
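
The category probabilities can be computed directly with `pnorm`. The sketch below uses hypothetical cutpoints and loadings for a single ordinal variable with four categories, and it assumes the usual convention that the outermost cutpoints are -Inf and +Inf.

```r
# Hypothetical cutpoints gamma_{j0}, ..., gamma_{j4} for C_j = 4 categories.
# gamma_{j1} is fixed at 0; the infinite end points are an assumed convention.
gamma_j <- c(-Inf, 0, 0.8, 1.9, Inf)

Lambda_j <- c(-0.4, 1.2)   # row j of Lambda (hypothetical values)
phi_i    <- c(1, 0.5)      # factor scores; first element is always 1

eta <- sum(Lambda_j * phi_i)          # Lambda_j' phi_i

# pi_{ijc} = Phi(gamma_{jc} - eta) - Phi(gamma_{j(c-1)} - eta), c = 1, ..., 4
probs <- diff(pnorm(gamma_j - eta))
```

Because the cutpoints run from -Inf to +Inf, the entries of `probs` sum to one.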

If the jth variable is continuous, it is assumed that x^*_{ij} = x_{ij} for all i.

The implementation used here assumes independent conjugate priors for each element of \Lambda and each \phi_i. More specifically we assume:

\Lambda_{ij} \sim \mathcal{N}(l_{0_{ij}}, L_{0_{ij}}^{-1}), i=1,\ldots,k, j=1,\ldots,d

\phi_{i(2:d)} \sim \mathcal{N}(0, I), i=1,\dots,n
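
To get a feel for the prior scales, the sketch below converts a precision `L0` into a standard deviation and draws from an inverse Gamma distribution with shape `a0/2` and scale `b0/2`. The values `a0 = 4` and `b0 = 2` are assumptions chosen so the draws are numerically well behaved; they are not the function defaults.

```r
set.seed(2)

# Normal prior on a loading: L0 is a precision, so the prior standard
# deviation is 1 / sqrt(L0). L0 = 0.25 gives a standard deviation of 2.
l0 <- 0
L0 <- 0.25
prior.sd <- 1 / sqrt(L0)
lambda.draws <- rnorm(10000, mean = l0, sd = prior.sd)

# Inverse Gamma prior on a uniqueness, with illustrative (non-default) values.
# If G ~ Gamma(shape = a0/2, rate = b0/2), then 1/G is inverse Gamma
# with shape a0/2 and scale b0/2.
a0 <- 4
b0 <- 2
psi.draws <- 1 / rgamma(10000, shape = a0 / 2, rate = b0 / 2)

quantile(lambda.draws, c(0.025, 0.975))
quantile(psi.draws, c(0.025, 0.5, 0.975))
```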

MCMCmixfactanal simulates from the posterior distribution using a Metropolis-Hastings within Gibbs sampling algorithm. The algorithm employed is based on work by Cowles (1996). Note that the first element of \phi_i is a 1. As a result, the first column of \Lambda can be interpreted as negative item difficulty parameters for the ordinal variables (and as mean parameters for the continuous variables). Further, the first cutpoint \gamma_1 is normalized to zero, and thus not returned in the mcmc object. The simulation proper is done in compiled C++ code to maximize efficiency. Please consult the coda documentation for a comprehensive list of functions that can be used to analyze the posterior sample.

As is the case with all measurement models, make sure that you have plenty of free memory, especially when storing the scores.
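
A back-of-the-envelope estimate shows why storing the scores is expensive. The sketch assumes one double-precision value (8 bytes) per stored score and a hypothetical data set of 100 observations; the actual object may hold additional columns and overhead, so treat the result as a rough lower bound.

```r
n.obs   <- 100      # hypothetical number of observations
factors <- 2
mcmc    <- 500000
thin    <- 100

n.draws <- mcmc / thin                         # retained draws

# One double (8 bytes) per observation, factor, and retained draw.
score.bytes <- n.draws * n.obs * factors * 8

score.bytes / 2^20                             # approximate size in MiB
```

Halving `thin` doubles this figure, which is why heavy thinning is recommended when `store.scores = TRUE`.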

Value

An mcmc object that contains the posterior sample. This object can be summarized by functions provided by the coda package.
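
The following sketch shows the kind of coda functions that apply to such an object. It builds a stand-in mcmc object from random numbers rather than from a fitted model, and the column names `param.a` and `param.b` are placeholders, not names the sampler produces.

```r
if (requireNamespace("coda", quietly = TRUE)) {
  set.seed(3)
  burnin  <- 1000
  thin    <- 10
  n.saved <- 500

  # Placeholder draws standing in for a posterior sample.
  draws <- matrix(rnorm(n.saved * 2), ncol = 2,
                  dimnames = list(NULL, c("param.a", "param.b")))

  # An mcmc object whose first retained iteration follows the burn-in.
  post <- coda::mcmc(draws, start = burnin + 1, thin = thin)

  print(summary(post))                          # means, SDs, quantiles
  print(coda::effectiveSize(post))              # effective sample sizes
  print(coda::HPDinterval(post, prob = 0.95))   # highest posterior density intervals
}
```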

References

Kevin M. Quinn. 2004. “Bayesian Factor Analysis for Mixed Ordinal and Continuous Responses.” Political Analysis. 12: 338-353.

Andrew D. Martin, Kevin M. Quinn, and Jong Hee Park. 2011. “MCMCpack: Markov Chain Monte Carlo in R.” Journal of Statistical Software. 42(9): 1-21. doi:10.18637/jss.v042.i09.

M. K. Cowles. 1996. “Accelerating Monte Carlo Markov Chain Convergence for Cumulative-link Generalized Linear Models.” Statistics and Computing. 6: 101-110.

Valen E. Johnson and James H. Albert. 1999. “Ordinal Data Modeling.” Springer: New York.

Daniel Pemstein, Kevin M. Quinn, and Andrew D. Martin. 2007. Scythe Statistical Library 1.0. http://scythe.wustl.edu.s3-website-us-east-1.amazonaws.com/.

Martyn Plummer, Nicky Best, Kate Cowles, and Karen Vines. 2006. “Output Analysis and Diagnostics for MCMC (CODA).” R News. 6(1): 7-11. https://CRAN.R-project.org/doc/Rnews/Rnews_2006-1.pdf.

Examples


## Not run: 
data(PErisk)

post <- MCMCmixfactanal(~courts+barb2+prsexp2+prscorr2+gdpw2,
                        factors=1, data=PErisk,
                        lambda.constraints = list(courts=list(2,"-")),
                        burnin=5000, mcmc=1000000, thin=50,
                        verbose=500, L0=.25, store.lambda=TRUE,
                        store.scores=TRUE, tune=1.2)
plot(post)
summary(post)




library(MASS)
data(Cars93)
# select the variables of interest without attach()/detach()
new.cars <- Cars93[, c("Price", "MPG.city", "MPG.highway",
                       "Cylinders", "EngineSize", "Horsepower",
                       "RPM", "Length", "Wheelbase", "Width",
                       "Weight", "Origin")]
rownames(new.cars) <- paste(Cars93$Manufacturer, Cars93$Model)

# drop obs 57 (Mazda RX 7) b/c it has a rotary engine
new.cars <- new.cars[-57,]
# drop 3 cylinder cars
new.cars <- new.cars[new.cars$Cylinders!=3,]
# drop 5 cylinder cars
new.cars <- new.cars[new.cars$Cylinders!=5,]

new.cars$log.Price <- log(new.cars$Price)
new.cars$log.MPG.city <- log(new.cars$MPG.city)
new.cars$log.MPG.highway <- log(new.cars$MPG.highway)
new.cars$log.EngineSize <- log(new.cars$EngineSize)
new.cars$log.Horsepower <- log(new.cars$Horsepower)

new.cars$Cylinders <- ordered(new.cars$Cylinders)
new.cars$Origin    <- ordered(new.cars$Origin)



post <- MCMCmixfactanal(~log.Price+log.MPG.city+
                 log.MPG.highway+Cylinders+log.EngineSize+
                 log.Horsepower+RPM+Length+
                 Wheelbase+Width+Weight+Origin, data=new.cars,
                 lambda.constraints=list(log.Horsepower=list(2,"+"),
                 log.Horsepower=list(3,0), Weight=list(3,"+")),
                 factors=2,
                 burnin=5000, mcmc=500000, thin=100, verbose=500,
                 L0=.25, tune=3.0)
plot(post)
summary(post)


## End(Not run)

MCMCpack documentation built on Sept. 11, 2024, 8:13 p.m.
