Scaling mixed binary and count data while estimating the underlying latent dimensionality.
M | Matrix to be scaled.
missing.mat | Matrix indicating missing data. Should have the same dimensions as M, with a 1 denoting a missing observation and a 0 otherwise. Defaults to all zeroes.
gibbs | Number of posterior samples to draw.
burnin | Number of burn-in samples.
max.optim | Number of iterations in which the cutpoints are fit using optim. This is generally faster than the Hamiltonian Monte Carlo estimates and is useful for the first part of the burn-in phase.
thin | Extent of thinning of the MCMC chain. Only every thin-th draw is saved to the output.
save.curr | Name of the file in which to save the object.
save.each | Whether to save under a new name at each thinned draw.
thin.save | Number of thinned draws to wait between saves of the output.
maxdim | Number of latent dimensions to fit. Should be greater than the number of estimated dimensions.
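The missing.mat coding described above can be sketched as follows. This is a minimal illustration, not package code: M.toy, missing.toy, and M.filled are hypothetical names, and filling the missing cells of M with 0 is an assumption, since this page does not say whether sfa accepts NA entries in M.

```r
## Hypothetical toy matrix with two missing cells (not from the package).
M.toy <- matrix(c(1, 0, NA, 1,
                  3, NA, 7, 2), nrow = 2, byrow = TRUE)

## Indicator matrix with the same dimensions as M.toy:
## 1 marks a missing observation, 0 an observed one.
missing.toy <- 1 * is.na(M.toy)

## Assumption: replace NA with a finite placeholder before fitting.
M.filled <- M.toy
M.filled[is.na(M.filled)] <- 0

## Not run here, since it needs the package and MCMC time:
## fit <- sfa(M.filled, missing.mat = missing.toy)
```

The indicator is built from the original matrix before any placeholder is written, so the positions of the missing cells are not lost.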
The function sfa is the main function in the package SparseFactorAnalysis. It takes a matrix in which every entry of a given row has the same data type, either binary or count. For example, each row may consist of roll call votes or word counts, and the columns may correspond to legislators. The method combines the two data types, scales both, and selects the underlying latent dimensionality.
dim.sparse | Sparse estimates of dimensionality.
dim.mean | Non-sparse estimates of the posterior mean of dimensionality.
rowdim1 | Posterior samples of the first dimension of spatial locations for the row unit of observation.
rowdim2 | Posterior samples of the second dimension of spatial locations for the row unit of observation.
coldim1 | Posterior samples of the first dimension of spatial locations for the column unit of observation.
coldim2 | Posterior samples of the second dimension of spatial locations for the column unit of observation.
lambda.lasso | Posterior samples of the tuning parameter used for dimension selection.
Z | Posterior mean of fitted values, on a z-scale.
rowdims.all | Posterior mean of all row spatial locations.
coldims.all | Posterior mean of all column spatial locations.
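Posterior samples such as rowdim1 are typically summarized by a posterior mean and a credible interval per unit. The sketch below uses a simulated stand-in matrix rather than real sfa output. The layout, one row per unit and one column per saved draw, is an assumption that this page does not confirm, so check dim() on the fitted object before adapting it.

```r
set.seed(2)

## Stand-in for posterior draws: 5 row units by 200 saved draws.
## (Assumed layout; the true means are -1, -0.5, 0, 0.5, 1 by row.)
true.loc <- c(-1, -0.5, 0, 0.5, 1)
draws <- matrix(rnorm(5 * 200, mean = rep(true.loc, times = 200), sd = 0.2),
                nrow = 5)

## Posterior mean and 95% credible interval for each row unit.
post.mean <- rowMeans(draws)
post.ci <- t(apply(draws, 1, quantile, probs = c(0.025, 0.975)))

summary.tab <- cbind(mean = post.mean,
                     lower = post.ci[, 1],
                     upper = post.ci[, 2])
print(round(summary.tab, 2))
```

If the samples turn out to be stored with draws in rows instead, swapping rowMeans for colMeans and the apply margin from 1 to 2 gives the same summary.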
Marc Ratkovic and Yuki Shiraito
In Song Kim, John Londregan, and Marc Ratkovic. 2015. "Voting, Speechmaking, and the Dimensions of Conflict in the US Senate." Working paper.
plot.sfa, summary.sfa
## Not run:
##Sample size and dimensions.
set.seed(1)
n.sim<-50
k.sim<-500
##True vector of dimension weights.
d.sim<-rep(0,n.sim)
d.sim[1:3]<-c(2, 1.5, 1)*3
##Formulate true latent dimensions.
U.sim<-matrix(rnorm(n.sim^2,sd=.5), nrow=n.sim, ncol=n.sim)
V.sim<-matrix(rnorm(n.sim*k.sim,sd=.5), nrow=k.sim, ncol=n.sim)
Theta.sim<-U.sim%*%diag(d.sim)%*%t(V.sim)
##Generate binary outcome and count data.
probs.sim<-pnorm((-1+Theta.sim+rep(1,n.sim)%*%t(rnorm(k.sim,sd=.5)) +
rnorm(n.sim,sd=.5)%*%t(rep(1,k.sim)) ))
votes.mat<-
apply(probs.sim[1:25,],c(1,2),FUN=function(x) rbinom(1,1,x))
count.mat<-
apply(probs.sim[26:50, ],c(1,2),FUN=function(x) rpois(1,20*x))
M<-rbind(votes.mat,count.mat)
## Run sfa
sparse1<-sfa(M, maxdim=10)
##Analyze results.
summary(sparse1)
plot(sparse1,type="dim")
plot(sparse1,type="scatter")
##Compare to true data generating process
plot(sparse1$Z,Theta.sim)
abline(a=0, b=1)
## End(Not run)
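Because only the first three entries of d.sim are nonzero, the simulated Theta.sim has three latent dimensions, which is the target sfa should recover. The following sketch rebuilds Theta.sim as in the example and confirms this with a singular value decomposition. It runs without the package; the names sv and n.nonzero and the tolerance 1e-8 are illustrative choices, not part of sfa.

```r
set.seed(1)
n.sim <- 50
k.sim <- 500

## Dimension weights: only three nonzero entries.
d.sim <- rep(0, n.sim)
d.sim[1:3] <- c(2, 1.5, 1) * 3

## Latent dimensions and the resulting n.sim x k.sim matrix.
U.sim <- matrix(rnorm(n.sim^2, sd = .5), nrow = n.sim, ncol = n.sim)
V.sim <- matrix(rnorm(n.sim * k.sim, sd = .5), nrow = k.sim, ncol = n.sim)
Theta.sim <- U.sim %*% diag(d.sim) %*% t(V.sim)

## Count singular values that are not numerically zero.
sv <- svd(Theta.sim)$d
n.nonzero <- sum(sv > 1e-8 * sv[1])
n.nonzero
```

After fitting, this count can be compared with the dimensionality reported by summary(sparse1) and plot(sparse1, type="dim").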