Description Usage Arguments Details Value Reporting Progress (Log Protocol) Warning Note Author(s) References See Also Examples
This function provides Bayesian multinomial logit regression using auxiliary mixture sampling. See
Fruehwirth-Schnatter and Fruehwirth (2010). That is an MCMC sampler that is also used for the mixtures-of-experts
extension of Dirichlet Multinomial (dmClustExtended
) and Markov chain clustering
(mcClustExtended
). It requires four mandatory arguments: Data
, Prior
, Initial
and
Mcmc
; each representing a list of (mandatory) arguments: Data
contains data information, Prior
contains prior information, Initial
contains information about starting conditions (initial values) and
Mcmc
contains the setup for the MCMC sampler.
1 2 3 4 5 6 7 | MNLAuxMix(
Data = list( storeDir = "try01",
X = stop("X (matrix of covariates) must be specified")),
Prior = list( H = 4, betaPrior = "informative",
betaPriorMean = 0, betaPriorVar = 1),
Initial = list( S.i.start = rep(1:H, N), Beta.start = NULL),
Mcmc = list( M = 50, M0 = 20, mOut = 5, mSave = 10, seed = 12345))
|
Data |
a list consisting of: |
Prior |
a list consisting of: |
Initial |
a list consisting of: |
Mcmc |
a list consisting of: |
Note that the values of the arguments indicated here have nothing to do with default values! For a call of these functions this lists-of-arguments structure requires a complete specification of all arguments!
The following arguments which are lists have to be completely provided (note that there are no such things as default values within lists!):
Data
contains:
storeDir
A character indicating the name of the directory (will be created if not already existing) where the results are to be stored.
X
The matrix of covariates (with N rows) including the unit vector for the intercept to be included in the multinomial logit model.
Prior
contains (see also Section Prior Data):
H
An integer >= 1 indicating the number of response categories.
betaPrior
A character. If "uninformative" (improper) prior parameters are used for the regression coefficients (i.e. betaPriorVar = ∞). Otherwise mean and variance of the normal prior distribution for the regression coefficients have to be specified.
betaPriorMean, betaPriorVar
Numerical values specifying the parameters of the normal prior distribution for the regression coefficients, only if betaPrior!="uninformative"
.
Initial
contains:
S.i.start
A vector of length N giving initial response categories.
Beta.start
A matrix of dimension ncol(X)
x H
giving start values for the regression coefficients including the zero vector in the first column
representing the baseline response category.
Mcmc
contains:
M
An integer indicating the overall number of iterations.
M0
An integer indicating the number of the first iteration after the burn-in phase.
mOut
An integer indicating that after each mOut
-th iteration a report line is written to the output window/screen.
mSave
An integer indicating that after each mSave
-th iteration an intermediate storage of the workspace is carried out.
seed
An integer indicating a random seed.
A list containing (/the output file contains):
workspaceFile |
A character indicating the name of and the path (based on the currend working directory) to the output file, wherein all the
results are saved.
The name of the output file starts with “ |
Data |
The argument |
Prior |
The argument |
Initial |
The argument |
Mcmc |
The argument |
Beta.m |
A 3-dimensional array of dimension ncol(X) x H x M containing the draws for the regression coefficients β_h in each m-th iteration step. |
bk0 |
The prior parameters for the mean vectors of the normal (prior) distributions of the regression coefficients. |
Bk0inv |
The prior parameters for the inverse variance-covariance matrices of the normal (prior) distributions of the regression coefficients. |
fileName |
A character value indicating the name of the output file (see also |
N |
An integer indicating N, the number of individuals/units/objects. |
totalTime |
A numeric value indicating the total time (in secs) used for the function call. |
bkN |
The posterior parameters (in the last iteration step) for the mean vectors of the normal (posterior) distributions from which the regression coefficients were drawn. |
BkN |
The posterior parameters (in the last iteration step) for the variance-covariance matrices of the normal (posterior) distributions from which the regression coefficients were drawn. |
logLike |
A vector containing the values of the log-likelihood calculated in each iteration step. |
The log protocol keeps record of the progress of the estimation procedure and is shown on the screen. At first the
name of the workspace file is documented. Then for each mOut
-th iteration step (at least for m=1,
…, 5, 10, 20, 50, 100, 200, 500) information about the elapsed time and the expected time to the end is
reported. Finally the total time is shown.
For example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | workspaceFile: MNLAuxMix-Example-Outfiles\mnLogit_newAux_H2_M100_20111129_083023.RData (within current working directory!)
m = 1 ; duration of iter proc so far: 0.25 sec.
m = 2 ; duration of iter proc so far: 0.33 sec., exp time to end: 0.54 min.
m = 3 ; duration of iter proc so far: 0.44 sec., exp time to end: 0.36 min.
m = 4 ; duration of iter proc so far: 0.52 sec., exp time to end: 0.28 min.
m = 5 ; duration of iter proc so far: 0.6 sec., exp time to end: 0.24 min.
m = 10 ; duration of iter proc so far: 1.04 sec., exp time to end: 0.18 min.
m = 20 ; duration of iter proc so far: 1.93 sec., exp time to end: 0.14 min.
m = 30 ; duration of iter proc so far: 2.8 sec., exp time to end: 0.11 min.
m = 40 ; duration of iter proc so far: 3.79 sec., exp time to end: 0.1 min.
m = 50 ; duration of iter proc so far: 4.79 sec., exp time to end: 0.08 min.
m = 60 ; duration of iter proc so far: 5.89 sec., exp time to end: 0.07 min.
m = 70 ; duration of iter proc so far: 6.8 sec., exp time to end: 0.05 min.
m = 80 ; duration of iter proc so far: 7.68 sec., exp time to end: 0.03 min.
m = 90 ; duration of iter proc so far: 8.63 sec., exp time to end: 0.02 min.
m = 100 ; duration of iter proc so far: 9.52 sec., exp time to end: 0 min.
Total time: 0 hours 0 min
|
Note that there are no such things as default values (see Section Arguments)!
Note that the required data files have to be provided in the current working directory and that the results (see
Section Value) are to be saved in the directory provided by storeDir
within the current working
directory. Make sure that the current working directory is set appropriately before the function is called.
Note, that in contrast to the literature (see References), the numbering (labelling) of the states of the categorical outcome variable (time series) in this package is sometimes 0,...,K (instead of 1,...,K), however, there are K+1 categories (states)!
Christoph Pamminger <christoph.pamminger@gmail.com>
Sylvia Fruehwirth-Schnatter, Christoph Pamminger, Andrea Weber and Rudolf Winter-Ebmer, (2011), "Labor market entry and earnings dynamics: Bayesian inference using mixtures-of-experts Markov chain clustering". Journal of Applied Econometrics. DOI: 10.1002/jae.1249 http://onlinelibrary.wiley.com/doi/10.1002/jae.1249/abstract
Christoph Pamminger and Sylvia Fruehwirth-Schnatter, (2010), "Model-based Clustering of Categorical Time Series". Bayesian Analysis, Vol. 5, No. 2, pp. 345-368. DOI: 10.1214/10-BA606 http://ba.stat.cmu.edu/journal/2010/vol05/issue02/pamminger.pdf
Sylvia Fruehwirth-Schnatter and Rudolf Fruehwirth, (2010), "Data augmentation and MCMC for binary and multinomial logit models". In T. Kneib and G. Tutz (eds): Statistical Modelling and Regression Structures: Festschrift in Honour of Ludwig Fahrmeir. Physica Verlag, Heidelberg, pp. 111-132. DOI: 10.1007/978-3-7908-2413-1_7 http://www.springerlink.com/content/t4h810017645wh68/. See also: IFAS Research Paper Series 2010-48 (http://www.jku.at/ifas/content/e108280/e108491/e108471/e109880/ifas_rp48.pdf).
mcClustExtended
, dmClustExtended
, MCCExtExampleData
,
calcAllocationsMNL
, calcRegCoeffs
, calcSegmentationPower
,
calcEntropy
, plotLikeliPaths
, calcNumEff
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 | #rm(list=ls(all=TRUE))
# ==================================================================================
if ( FALSE ) {
# ==================================================================================
# set working directory
oldDir <- getwd()
curDir <- tempdir()
setwd(curDir)
if ( !file.exists("bayesMCClust-wd") ) dir.create("bayesMCClust-wd")
setwd("bayesMCClust-wd")
myOutfilesDir <- "MNLAuxMix-Example-Outfiles"
data(MCCExtExampleData)
if (!is.element("MCCExtExampleData$covariates", search())) {
attach(MCCExtExampleData$covariates)
}
# ==================================================================================
response <- MCCExtExampleData[[ sample(5:7, 1) ]] # MCCExtExampleData$MNLresponse2gr
# MCCExtExampleData$MNLresponse3gr # MCCExtExampleData$MNLresponse4gr #
groupNr <- max(response) # 3
# ==================================================================================
# ==================================================================================
require(nnet, quietly = TRUE)
H <- groupNr
X = cbind( intercept=1, alrateBezNew, unskilled, skilled, angStart )
N <- dim(X)[1]
mX <- data.frame( cbind(group=as.factor( response ), X[,-1],
matrix(sample(1:H,H*N,replace=TRUE),N,H)) )
colnames(mX)[6:(6+groupNr-1)] <- c( "as.1", "as.2", "as.3", "as.4" )[1:groupNr]
tempMNom <- multinom(group ~ alrateBezNew+ unskilled+ skilled+ angStart,
data=as.data.frame(mX))
toStartBeta <- t(rbind(0,coef( tempMNom )))
# ==================================================================================
system.time(
outList <- MNLAuxMix(
Data = list( storeDir = myOutfilesDir,
# will be created if not existing (in current working directory!)
X = cbind( intercept=1, alrateBezNew, unskilled, skilled, angStart ) ),
Prior = list( H = groupNr, # number of alternatives 1,...,H
betaPrior = "informative",
# 'uninformative' (improper) prior pars for beta (betaPriorVar = infty)
betaPriorMean = 0,
betaPriorVar = 1), # 'informative' prior pars for beta -> N(0,1)
Initial = list( S.i.start = response, # vector of multinomial outcomes / choice made
Beta.start = toStartBeta ),
Mcmc = list( M = 100,
M0 = 50,
mOut = 10,
mSave = 50,
seed = sample(1:100000, 1) # 6984684
)
)
)
str(outList)
#outFileName <- outList$workspaceFile
#outFileName
#results <- load(outFileName)
#results
allocList <- calcAllocationsMNL(outList, thin=1, maxi=50)
str(allocList)
myRegCoeffs <- calcRegCoeffs(outList, hBase=2, thin=1)
#, M0=Mcmc$M0, grLabels=paste("Group", 1:Prior$H),
# printHPD=TRUE, plotPaths=TRUE, plotACFs=TRUE
str(myRegCoeffs)
mySegPower <- calcSegmentationPower(outList, classProbs=allocList$classProbs,
class=allocList$class, printXtable=TRUE, calcSharp=TRUE, printSharpXtable=TRUE )
# , grLabels=paste("Group", 1:Prior$H)
str(mySegPower)
myEntropy <- calcEntropy(outList, classProbs=allocList$classProbs,
class=allocList$class, printXtable=TRUE )
# , grLabels=paste("Group", 1:Prior$H)
myEntropy
plotLikeliPaths(outList, from=10, by=1 )
myNumEffTables <- calcNumEff( outList, thin=1, printXi=TRUE, printE=TRUE,
printBeta=TRUE, grLabels=paste("Group", 1:outList$Prior$H) )
str(myNumEffTables)
setwd(oldDir)
# ==================================================================================
if ( is.element("MCCExtExampleData$covariates", search())) {
detach(MCCExtExampleData$covariates)
}
# ==================================================================================
} # end if
# ==================================================================================
# ==================================================================================
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.