View source: R/jackstraw_lfa.R
jackstraw_lfa | R Documentation |
Test association between the observed variables and their latent variables captured by logistic factors (LFs).
jackstraw_lfa(
dat,
r,
FUN,
r1 = NULL,
s = NULL,
B = NULL,
covariate = NULL,
permute_alleles = TRUE,
verbose = TRUE
)
dat |
either a genotype matrix with |
r |
the number of significant LFs. |
FUN |
a function to use for LFA. |
r1 |
a numeric vector of LFs of interest (implying you are not interested in all |
s |
a number of “synthetic” null variables. Out of |
B |
a number of resampling iterations. There will be a total of |
covariate |
a data matrix of covariates with corresponding |
permute_alleles |
If TRUE (default), alleles (rather than genotypes) are permuted, which results in a more Binomial synthetic null when data is highly structured. Changing to FALSE is not recommended, except for research purposes to confirm that it performs worse than the default. |
verbose |
a logical specifying whether to print the computational progress. |
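To illustrate the difference that permute_alleles describes, here is a minimal base-R sketch of permuting alleles rather than genotypes at a single locus. The helper name permute_alleles_sketch is hypothetical and is not the package's internal code; it assumes genotypes coded 0, 1 or 2 with no missing values.

```r
# Hypothetical helper (an illustration, not the package's implementation):
# permute alleles, rather than genotypes, at one locus.
# x: vector of genotypes coded 0, 1 or 2 (assumed to have no missing values).
permute_alleles_sketch <- function(x) {
  n <- length(x)
  # count of the alternative allele among the 2n alleles at this locus
  n_alt <- sum(x)
  # expand into 2n individual alleles (1 = alternative, 0 = reference)
  alleles <- c(rep(1L, n_alt), rep(0L, 2L * n - n_alt))
  # shuffle the alleles, then pair them up again into n genotypes
  alleles <- sample(alleles)
  colSums(matrix(alleles, nrow = 2L, ncol = n))
}

set.seed(1)
x <- c(0, 0, 0, 0, 2, 2, 2, 2)       # highly structured locus: no heterozygotes
x_geno <- sample(x)                   # genotype permutation: still only 0s and 2s
x_alle <- permute_alleles_sketch(x)   # allele permutation: heterozygotes can appear
```

Both permutations preserve the allele count at the locus, but only the allele permutation can produce heterozygotes from this input, which is why its synthetic null looks more Binomial.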
This function uses logistic factor analysis (LFA) from Hao et al. (2016).
Particularly, the deviance in logistic regression (the full model with r
LFs vs. the intercept-only model) is used to assess significance.
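As a rough illustration of the deviance statistic described above, the sketch below fits the two nested logistic models for a single locus with base R's glm. This stands in for the gcatest computation and is not the package's code; the latent factor lf and the effect size 0.8 are simulated assumptions.

```r
set.seed(1)
n  <- 100
lf <- rnorm(n)                 # stand-in for one logistic factor (assumption)
p  <- plogis(0.8 * lf)         # per-individual allele frequency
x  <- rbinom(n, 2, p)          # genotype at one locus, coded 0/1/2

# Binomial response with two trials (alleles) per individual
y <- cbind(x, 2 - x)

# full model: intercept plus one LF (r = 2 when counting the intercept)
fit_full <- glm(y ~ lf, family = binomial())
# null model: intercept only
fit_null <- glm(y ~ 1, family = binomial())

# deviance statistic: the drop in deviance from the null to the full model
dev_stat <- deviance(fit_null) - deviance(fit_full)
```

A larger dev_stat indicates a stronger association between the locus and the latent factor.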
This function requires the gcatest
package, and in practice also the lfa
package, to be installed from Bioconductor.
The random outputs of the regular matrix versus the BEDMatrix
versions are equal in distribution.
However, fixing a seed and providing the same data to both versions does not result in the same exact outputs.
This is because the BEDMatrix
version permutes loci in a different order by necessity.
jackstraw_lfa
returns a list consisting of
p.value |
|
obs.stat |
|
null.stat |
|
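The sketch below shows one common way to turn observed and null statistics into empirical p-values, in the spirit of the jackstraw method. It is an assumption-laden illustration: the values are simulated stand-ins for obs.stat and null.stat, and the package's exact formula may differ.

```r
set.seed(1)
m <- 1000
obs_stat  <- rchisq(m, df = 1)     # stand-in for obs.stat (simulated)
null_stat <- rchisq(500, df = 1)   # stand-in for pooled null.stat (simulated)

# empirical p-value: share of null statistics at least as large as the
# observed one, with a +1 correction so that no p-value is exactly zero
emp_p <- vapply(
  obs_stat,
  function(s) (1 + sum(null_stat >= s)) / (1 + length(null_stat)),
  numeric(1)
)
```

With this formula the smallest attainable p-value is 1 / (1 + length(null_stat)), so the resolution depends on the number of null statistics, which the s and B arguments determine.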
Neo Christopher Chung nchchung@gmail.com
Alejandro Ochoa alejandro.ochoa@duke.edu
Chung and Storey (2015) Statistical significance of variables driving systematic variation in high-dimensional data. Bioinformatics, 31(4): 545-554. doi:10.1093/bioinformatics/btu674
jackstraw_pca, jackstraw, jackstraw_subspace
## Not run:
## simulate genotype data from a logistic factor model: drawing rbinom from logit(BL)
m <- 5000; n <- 100; pi0 <- .9
m0 <- round(m*pi0)
m1 <- m - round(m*pi0)
B <- matrix(0, nrow=m, ncol=1)
# B has a single column, so draw m1 coefficients (one per non-null locus)
B[1:m1,] <- matrix(runif(m1, min=-.5, max=.5), nrow=m1, ncol=1)
L <- matrix(rnorm(n), nrow=1, ncol=n)
BL <- B %*% L
prob <- exp(BL)/(1+exp(BL))
dat <- matrix(rbinom(m*n, 2, as.numeric(prob)), m, n)
# load lfa package (install from Bioconductor)
library(lfa)
# choose the number of logistic factors, including the intercept
r <- 2
# define the function this way, a function of the genotype matrix only
FUN <- function(x) lfa::lfa( x, r )
## apply the jackstraw_lfa
out <- jackstraw_lfa( dat, r, FUN )
# if you had very large genotype data in plink BED/BIM/FAM files,
# use BEDMatrix and save memory by reading from disk (at the expense of speed)
library(BEDMatrix)
dat_BM <- BEDMatrix( 'filepath' ) # assumes filepath.bed, .bim and .fam exist
# run jackstraw!
out <- jackstraw_lfa( dat_BM, r, FUN )
## End(Not run)