# jackstraw_lfa: Non-Parametric Jackstraw for Logistic Factor Analysis

From the jackstraw package: Statistical Inference for Unsupervised Learning.

## Description

Test association between the observed variables and their latent variables captured by logistic factors (LFs).

## Usage

```r
jackstraw_lfa(
  dat,
  FUN = function(x) lfa(x, r)[, , drop = FALSE],
  devR = FALSE,
  r = NULL,
  r1 = NULL,
  s = NULL,
  B = NULL,
  covariate = NULL,
  verbose = TRUE,
  seed = NULL
)
```

## Arguments

| Argument | Description |
| --- | --- |
| `dat` | a genotype matrix with `m` rows as variables and `n` columns as observations. |
| `FUN` | a function to use for LFA (by default, it uses the lfagen package). |
| `devR` | whether to use an R function to compute the deviance. By default, FALSE (uses C++). |
| `r` | the number of significant LFs. |
| `r1` | a numeric vector of LFs of interest (implying you are not interested in all `r` LFs). |
| `s` | the number of "synthetic" null variables. Out of `m` variables, `s` variables are independently permuted. |
| `B` | the number of resampling iterations. There will be a total of `s*B` null statistics. |
| `covariate` | a data matrix of covariates with corresponding `n` observations (do not include an intercept term). |
| `verbose` | a logical specifying whether to print the computational progress. |
| `seed` | a seed for the random number generator. |
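To make the roles of `s` and `B` concrete, the following is a minimal base-R sketch of a single resampling step as the argument descriptions state it: `s` of the `m` variables are chosen and each is permuted independently across the `n` observations. The toy sizes and the names `null_rows` and `jackstraw_dat` are assumptions made for illustration only, and the package's internal resampling may differ in detail.

```r
set.seed(1)

# Toy sizes, chosen only for illustration
m <- 20   # variables (rows)
n <- 10   # observations (columns)
s <- 3    # synthetic null variables per iteration
B <- 5    # resampling iterations

# Toy genotype matrix coded 0/1/2
dat <- matrix(rbinom(m * n, 2, 0.3), nrow = m, ncol = n)

# One resampling step: choose s rows and permute each one independently,
# which breaks any association between those rows and the latent structure
null_rows <- sample(m, s)
jackstraw_dat <- dat
for (i in null_rows) {
  jackstraw_dat[i, ] <- sample(dat[i, ])
}

# Each iteration contributes s null statistics, so B iterations give s * B
n_null <- s * B
```

A smaller `s` leaves more of the original data intact at each step, while a larger `B` compensates by accumulating more null statistics overall.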

## Details

This function uses logistic factor analysis (LFA) from Wei et al. (2014). In particular, the deviance `dev` from logistic regression (comparing the full model with `r` LFs against the intercept-only model) is used to assess the association between each variable and the LFs.
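As a rough illustration of this statistic, the sketch below computes a deviance difference for one simulated variable using base R's `glm`. It is not the package's own implementation (which is in C++ by default); the single simulated factor `lf`, the effect size 0.8, and the variable names are assumptions made for the example.

```r
set.seed(1)

n  <- 100
lf <- rnorm(n)              # hypothetical single logistic factor
p  <- plogis(0.8 * lf)      # per-individual allele frequency (inverse logit)
x  <- rbinom(n, 2, p)       # one genotype variable coded 0/1/2

# Binomial response with two trials per individual:
# (count of one allele, count of the other allele)
y <- cbind(x, 2 - x)

fit_full <- glm(y ~ lf, family = binomial())  # intercept + logistic factor
fit_null <- glm(y ~ 1,  family = binomial())  # intercept only

# Deviance difference between the nested models; larger values indicate
# a stronger association between the variable and the factor
dev <- deviance(fit_null) - deviance(fit_full)
```

Because the two models are nested, `dev` is non-negative, and it is this kind of statistic that is compared against its resampled null counterparts.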

## Value

`jackstraw_lfa` returns a list consisting of

| Component | Description |
| --- | --- |
| `p.value` | `m` p-values of association tests between variables and their LFs |
| `obs.stat` | `m` observed deviances |
| `null.stat` | `s*B` null deviances |
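One way to see how the three components relate is to compare each observed statistic against the pooled null statistics. The sketch below uses one common empirical p-value convention (adding 1 to numerator and denominator); this is an assumption for illustration, the package may pool or handle ties differently, and the numbers are made up rather than produced by `jackstraw_lfa`.

```r
set.seed(1)

# Made-up values standing in for the returned components
obs_stat  <- c(0.2, 5.1, 12.7)        # stand-in for obs.stat
null_stat <- rchisq(1000, df = 1)     # stand-in for the s*B values in null.stat

# Empirical p-value: share of null statistics at least as large as the
# observed one, with a +1 correction so that no p-value is exactly zero
p_value <- sapply(obs_stat, function(d) {
  (1 + sum(null_stat >= d)) / (1 + length(null_stat))
})
```

Larger observed deviances yield smaller p-values, and the smallest attainable value is bounded by the number of null statistics, which is one reason the total `s*B` matters.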

## Author(s)

Neo Christopher Chung nchchung@gmail.com

## References

Chung and Storey (2015) Statistical significance of variables driving systematic variation in high-dimensional data. Bioinformatics, 31(4): 545-554 https://academic.oup.com/bioinformatics/article/31/4/545/2748186

Chung (2020) Statistical significance of cluster membership for unsupervised evaluation of cell identities https://academic.oup.com/bioinformatics/article/36/10/3107/5788523

## Examples

```r
library(jackstraw)
set.seed(1234)
## Not run: 
## simulate genotype data from a logistic factor model:
## draw rbinom with success probability given by the inverse logit of BL
m = 5000; n = 100; pi0 = .9
m0 = round(m*pi0)
m1 = m - round(m*pi0)
B = matrix(0, nrow=m, ncol=1)
## only the first m1 variables have non-zero effects (one value per variable)
B[1:m1,] = matrix(runif(m1, min=-.5, max=.5), nrow=m1, ncol=1)
L = matrix(rnorm(n), nrow=1, ncol=n)
BL = B %*% L
prob = exp(BL)/(1+exp(BL))

dat = matrix(rbinom(m*n, 2, as.numeric(prob)), m, n)

## apply the jackstraw_lfa (r must be named, since the second argument is FUN)
out = jackstraw_lfa(dat, r = 2)

## apply the jackstraw_lfa using self-contained R functions
out = jackstraw_lfa(dat, FUN = function(x) lfa.corpcor(x, 2)[, , drop = FALSE], r = 2, devR = TRUE)

## End(Not run)
```