| corassign | R Documentation |
We extract latent factors from the log of mat using an SVD, then
generate an underlying group-assignment variable from a conditional
normal distribution (conditional on the latent factors). This underlying
group-assignment variable is used to assign groups.
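The procedure described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: `sketch_assign` is a hypothetical name, the 0.5 pseudocount, the column centering, and the rescaling of the singular vectors are illustrative choices, and groups are assumed to be formed by thresholding the underlying variable at zero.

```r
## Hypothetical helper for illustration only; not the package's code.
sketch_assign <- function(mat, nfac = 1, corvec = 0.5) {
  stopifnot(length(corvec) == nfac, sum(corvec^2) <= 1)
  n <- nrow(mat)

  ## Log-transform the counts (pseudocount of 0.5 is an assumption)
  ## and center each gene (column).
  lmat <- scale(log2(mat + 0.5), center = TRUE, scale = FALSE)

  ## Latent factors: leading left singular vectors, rescaled so that
  ## each factor has sample variance 1.
  facmat <- svd(lmat, nu = nfac, nv = 0)$u * sqrt(n - 1)

  ## Conditional normal draw: given the factors, the underlying variable
  ## has mean facmat %*% corvec and variance 1 - sum(corvec^2).
  groupfac <- drop(facmat %*% corvec) +
    sqrt(1 - sum(corvec^2)) * stats::rnorm(n)

  ## Assumed rule: threshold at zero to obtain two groups.
  x <- as.integer(groupfac > 0)

  list(x = x, nfac = nfac, facmat = facmat,
       groupfac = groupfac, corvec = corvec)
}

set.seed(1)
mat <- matrix(stats::rpois(1000 * 10, lambda = 50), nrow = 1000)
out <- sketch_assign(mat, nfac = 1, corvec = 0.9)
table(out$x)
```

Because the factors are standardized and the noise term is scaled by `sqrt(1 - sum(corvec^2))`, the underlying variable has unit variance and correlation `corvec[k]` with the k-th factor.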
corassign(mat, nfac = NULL, corvec = NULL, return = c("group", "full"))
mat |
A matrix of count data. The rows index the individuals and the columns index the genes. |
nfac |
The number of latent factors. If |
corvec |
The vector of correlations. |
return |
What should we return? Just the group assignment
( |
If nfac is provided, then corvec must have length nfac.
If nfac is not provided and turns out to be smaller than the length of
corvec, then the first nfac elements of corvec are taken as the
underlying correlations. If nfac turns out to be larger than the length
of corvec, then the factors without defined correlations are assumed to
have correlation 0.
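The truncate-or-pad rule for the case where nfac is not supplied might be sketched as follows. `match_corvec` is a hypothetical helper written only to illustrate the rule; it is not a function from the package.

```r
## Hypothetical helper for illustration only.
match_corvec <- function(corvec, nfac) {
  if (length(corvec) >= nfac) {
    ## More correlations than factors: keep the first nfac.
    corvec[seq_len(nfac)]
  } else {
    ## Fewer correlations than factors: remaining factors get correlation 0.
    c(corvec, rep(0, nfac - length(corvec)))
  }
}

match_corvec(c(0.9, 0.3, 0.1), nfac = 2)  # returns c(0.9, 0.3)
match_corvec(0.9, nfac = 3)               # returns c(0.9, 0, 0)
```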
A list with some or all of the following elements:
x |
The vector of group assignments. 0L indicates membership to one group and 1L indicates membership to the other group. |
nfac |
The number of assumed latent factors. |
facmat |
A matrix whose columns contain the latent factors. |
groupfac |
The underlying group-assignment factor. |
corvec |
The correlation vector. Note that this is the correlation between random variables observed in groupfac and facmat. |
If return = "group", then the list only contains x.
David Gerard
A. Onatski (2010), Determining the number of factors from empirical distribution of eigenvalues. The Review of Economics and Statistics 92(4).
## Simulate a matrix of counts (genes in rows, samples in columns)
## In practice, you would obtain Y from a real dataset, not simulate it.
set.seed(1)
nsamp <- 1000
ngene <- 10
Y <- matrix(stats::rpois(nsamp * ngene, lambda = 50), nrow = ngene)
## Set target correlation to be 0.9 and nfac to be 1
corvec <- 0.9
nfac <- 1
## Group assignment (mat needs individuals in rows, so transpose Y)
cout <- corassign(mat    = t(Y),
                  nfac   = nfac,
                  corvec = corvec,
                  return = "full")
## Correlation between facmat and groupfac should be about 0.9
cor(cout$facmat, cout$groupfac)
## Correlation between facmat and x should be about 0.9 * sqrt(2 / pi)
cor(cout$facmat, cout$x)
corvec * sqrt(2 / pi)
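The factor sqrt(2 / pi) in the last comparison can be checked with a small stand-alone simulation. The sketch assumes the binary assignment comes from thresholding a standard normal variable at zero; the names `z`, `w`, and `rho` are illustrative stand-ins for a latent factor, groupfac, and an element of corvec. For standard bivariate normal variables with correlation rho, the covariance between `z` and the indicator `w > 0` is rho / sqrt(2 * pi), and the indicator has standard deviation 1 / 2, so the correlation is rho * sqrt(2 / pi).

```r
set.seed(1)
n   <- 1e5
rho <- 0.9

z <- stats::rnorm(n)                                # stand-in for a latent factor
w <- rho * z + sqrt(1 - rho^2) * stats::rnorm(n)    # stand-in for groupfac
x <- as.integer(w > 0)                              # thresholded group indicator

cor(z, w)            # close to rho
cor(z, x)            # close to rho * sqrt(2 / pi)
rho * sqrt(2 / pi)
```

Dichotomizing the underlying variable therefore attenuates the correlation with the factor by a fixed factor of about 0.8, which is why the correlation with x is lower than the target given in corvec.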