View source: R/ligera2_f_multi.R
ligera2_f_multi | R Documentation |
This function performs multiple genetic association scans, adding significant loci to the model as covariates after each iteration (only the most significant one if one_per_iter is true, otherwise all of them) to increase power in the final model.
The function returns a tibble containing association statistics and several intermediates.
This version calculates p-values using an F-test, which gives calibrated statistics under both quantitative and binary traits.
Compared to ligera(), which uses the faster Wald test (calibrated for quantitative but not binary traits), this F-test version is considerably slower. It is optimized for m >> n and is still a work in progress.
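To illustrate what an F-test between nested models computes, the sketch below compares a null model and a full model for a single locus using ordinary least squares in base R. It is only a toy under strong assumptions: it ignores kinship entirely, it is not how ligera2_f_multi is implemented, and all variable names are hypothetical.

```r
# Toy F-test for one locus (no kinship correction; illustration only)
set.seed(1)
n <- 50
x <- rbinom(n, 2, 0.5)        # genotypes at one locus
y <- rnorm(n) + 0.5 * x       # quantitative trait with a genetic effect

fit0 <- lm(y ~ 1)             # null model: intercept only
fit1 <- lm(y ~ x)             # full model: intercept + locus

rss0 <- sum(resid(fit0)^2)    # residual sum of squares, null model
rss1 <- sum(resid(fit1)^2)    # residual sum of squares, full model

# individuals minus number of parameters of the full model
df <- n - 2
# one extra parameter in the full model, so numerator df is 1
f_stat <- (rss0 - rss1) / (rss1 / df)
pval <- pf(f_stat, 1, df, lower.tail = FALSE)
beta <- unname(coef(fit1)["x"])
```

The same statistic is what anova(fit0, fit1) reports for this pair of models, which is a convenient way to check the arithmetic.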
ligera2_f_multi(
  X,
  trait,
  mean_kinship,
  q_cut = 0.05,
  one_per_iter = FALSE,
  covar = NULL,
  loci_on_cols = FALSE,
  mem_factor = 0.7,
  mem_lim = NA,
  m_chunk_max = 1000,
  tol = 1e-15
)
X |
The |
trait |
The length- |
mean_kinship |
An estimate of the mean kinship produced externally, to ensure internal estimates of kinship are unbiased. |
q_cut |
The q-value threshold to admit new loci into the polygenic model. |
one_per_iter |
If true, only the most significant locus per iteration is added to the model of the next iteration. Otherwise, all significant loci per iteration are added to the model of the next iteration. |
covar |
An optional |
loci_on_cols |
If |
mem_factor |
Proportion of available memory to use when loading and processing genotypes.
Ignored if |
mem_lim |
Memory limit in GB, used to break up genotype data into chunks for very large datasets.
Note memory usage is somewhat underestimated and is not controlled strictly.
Default in Linux and Windows is |
m_chunk_max |
Sets the maximum number of loci to process at a time. The actual number of loci loaded may be lower if memory is limiting. |
tol |
Tolerance value passed to conjugate gradient method solver. |
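The interplay between q_cut and one_per_iter can be sketched as follows. The helper select_loci is hypothetical (it is not part of the package), and the strict "less than" comparison against q_cut is an assumption; the sketch only mirrors the argument descriptions above.

```r
# Hypothetical helper: given the q-values of one scan, return the indexes
# of the loci that would be added to the model of the next iteration.
# The strict `<` comparison is an assumption, not confirmed by the docs.
select_loci <- function(qval, q_cut = 0.05, one_per_iter = FALSE) {
    # loci passing the q-value threshold (NA q-values are dropped by which)
    sig <- which(qval < q_cut)
    # optionally keep only the most significant of those
    if (one_per_iter && length(sig) > 0)
        sig <- sig[which.min(qval[sig])]
    sig
}

qval <- c(0.20, 0.01, 0.03, 0.50)
sel_all <- select_loci(qval)                       # all significant loci
sel_one <- select_loci(qval, one_per_iter = TRUE)  # only the top locus
```

An empty result would correspond to no locus being admitted, which is presumably when the iterations stop.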
Suppose there are n individuals and m loci.
A tibble containing the following association statistics from the last scan for non-selected loci.
For selected loci, these are the values from the scan before each was added to the model (as after addition they get beta ~= 0 and pval ~= 1).
pval: The p-value of the last association scan.
beta: The estimated effect size coefficient for the trait vector at this locus.
f_stat: The F statistic.
df: Degrees of freedom: number of non-missing individuals minus number of parameters of the full model.
qval: The q-value of the last association scan.
sel: The order in which loci were selected, or zero if they were not selected.
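As a sketch of how the sel column might be used, the code below builds a mock result with the documented columns and lists the selected loci in the order they entered the model. The values are made up purely for illustration, a plain data frame stands in for the tibble, and it is assumed that rows follow the locus order of X.

```r
# Mock result with the documented columns (all values are made up)
tib <- data.frame(
    pval   = c(1e-5, 0.40, 1e-8, 0.75),
    beta   = c(-0.8, 0.1, 1.2, 0.05),
    f_stat = c(25.3, 0.7, 60.1, 0.1),
    df     = c(96, 95, 97, 95),
    qval   = c(2e-5, 0.53, 4e-8, 0.75),
    sel    = c(2, 0, 1, 0)
)

# row indexes of selected loci (sel > 0), sorted by selection order
i_sel <- which(tib$sel > 0)
i_sel <- i_sel[order(tib$sel[i_sel])]

# statistics of the selected loci, first-selected locus first
tib_sel <- tib[i_sel, ]
tib_sel
```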
The popkin package.
# Construct random data
# number of individuals we want
n_ind <- 5
# number of loci we want
m_loci <- 100
# a not so small random genotype matrix
X <- matrix(
    rbinom( m_loci * n_ind, 2, 0.5 ),
    nrow = m_loci
)
# random trait
trait <- rnorm( n_ind )
# add a genetic effect from first locus
trait <- trait + X[ 1, ]
# a required parameter
mean_kinship <- mean( diag( n_ind ) / 2 ) # unstructured case

tib <- ligera2_f_multi( X, trait, mean_kinship )
tib