score.calc.MC: Calculate -log10(p) for single-SNP GWAS (multi-cores)

View source: R/functions_for_RGWAS.R

score.calc.MC    R Documentation

Calculate -log10(p) for single-SNP GWAS (multi-cores)

Description

Calculate -log10(p) of each SNP by the Wald test.

Usage

score.calc.MC(
  M.now,
  ZETA.now,
  y,
  X.now,
  package.MM = "gaston",
  Hinv,
  n.core = 2,
  parallel.method = "mclapply",
  P3D = TRUE,
  eigen.G = NULL,
  optimizer = "nlminb",
  min.MAF = 0.02,
  count = TRUE
)

Arguments

M.now

An n \times m genotype matrix, where n is the sample size and m is the number of markers.

ZETA.now

A list containing a variance (relationship) matrix (K; m \times m) and its design matrix (Z; n \times m) for the random effects. Only one kernel matrix can be used. For example, ZETA = list(A = list(Z = Z, K = K)). The elements of the inner list must be named "Z" and "K".
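The list structure described above could be assembled as in the following minimal sketch. The toy genotype data, the identity design matrix, and the cross-product relationship matrix are illustrative assumptions, not a recommended way to compute K.

```r
set.seed(1)
n <- 5                                   # toy number of samples (one per genotype)
geno <- matrix(sample(c(-1, 0, 1), n * 20, replace = TRUE), nrow = n)

K <- tcrossprod(geno) / ncol(geno)       # simple toy relationship matrix (assumption)
Z <- diag(n)                             # each sample maps to exactly one genotype

# The inner list elements must be named "Z" and "K"
ZETA <- list(A = list(Z = Z, K = K))
```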

y

An n \times 1 vector of phenotypic values. NA is allowed.

X.now

An n \times p matrix containing the mean vector (rep(1, n)) and any covariates. NA is not allowed.
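As a small illustration, a matrix of this form might be built as follows. The covariate values and the column names are hypothetical.

```r
n <- 5
covariate <- c(0.2, 1.5, -0.3, 0.8, 1.1)   # hypothetical covariate values

# First column is the mean vector rep(1, n), followed by the covariate
X.now <- cbind(intercept = rep(1, n), covariate = covariate)

# NA is not allowed in X.now, so check before passing it on
stopifnot(!anyNA(X.now))
```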

package.MM

The name of the package used to solve the mixed-effects model. Only the following three packages are offered: "RAINBOWR", "MM4LMM" and "gaston". The default is "gaston". See EM3.general for more details.

Hinv

The inverse of H = ZKZ' + \lambda I where \lambda = \sigma^2_e / \sigma^2_u.
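The definition above can be written out directly in base R. This is only a sketch with toy matrices; the variance components sigma2.e and sigma2.u are hypothetical values, whereas in practice they would come from a fitted mixed model.

```r
n <- 4
Z <- diag(n)                      # toy design matrix
K <- 0.5 * diag(n) + 0.5          # toy kernel: 1 on the diagonal, 0.5 elsewhere

sigma2.e <- 1.0                   # hypothetical residual variance
sigma2.u <- 2.0                   # hypothetical genetic variance
lambda <- sigma2.e / sigma2.u

H <- Z %*% K %*% t(Z) + lambda * diag(n)   # H = ZKZ' + lambda * I
Hinv <- solve(H)
```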

n.core

Setting n.core > 1 enables parallel execution on a machine with multiple cores. This argument has no effect when 'parallel.method = "furrr"'.

parallel.method

The method used for parallel computation. Three methods are offered: "mclapply", "furrr", and "foreach".

When 'parallel.method = "mclapply"', the pbmclapply function in the 'pbmcapply' package is used with 'count = TRUE', and the mclapply function in the 'parallel' package is used with 'count = FALSE'.

When 'parallel.method = "furrr"', the future_map function in the 'furrr' package is used. With 'count = TRUE', the progressor function in the 'progressr' package is also used to show the progress bar, so please install the 'progressr' package from GitHub (https://github.com/HenrikBengtsson/progressr). With 'parallel.method = "furrr"', you can perform multi-thread parallelization with shared memory, which saves memory but is considerably slower than 'parallel.method = "mclapply"'.

When 'parallel.method = "foreach"', the foreach function in the 'foreach' package is used together with the makeCluster function in the 'parallel' package and the registerDoParallel function in the 'doParallel' package. With 'count = TRUE', the setTxtProgressBar and txtProgressBar functions in the 'utils' package are also used to show the progress bar.

We recommend 'parallel.method = "mclapply"', but this method is not supported on Windows. Windows users should use 'parallel.method = "foreach"' instead.
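The platform recommendation above could be applied automatically, as in this sketch. Leaving one core free is an arbitrary choice made for illustration, not a package requirement.

```r
# Pick the recommended method for the current operating system
parallel.method <- if (.Platform$OS.type == "windows") "foreach" else "mclapply"

# detectCores() can return NA on some systems, so fall back to a single core
n.detected <- parallel::detectCores()
n.core <- if (is.na(n.detected)) 1L else max(1L, n.detected - 1L)
```

The resulting values can then be supplied as the parallel.method and n.core arguments.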

P3D

When P3D = TRUE, variance components are estimated by REML only once, without any markers in the model. When P3D = FALSE, variance components are estimated by REML for each marker separately.

eigen.G

A list with

$values

Eigenvalues

$vectors

Eigenvectors

The result of the eigen decomposition of G = ZKZ'. You can use the "spectralG.cpp" function in RAINBOWR. If this argument is NULL, the eigen decomposition is performed inside this function. We recommend computing the eigen decomposition beforehand and passing it in to save time.
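A list of the expected shape can be produced with base R's eigen function, as sketched below on toy matrices. Treating this output as interchangeable with that of "spectralG.cpp" is an assumption; this block only illustrates the $values / $vectors structure.

```r
n <- 4
Z <- diag(n)                      # toy design matrix
K <- 0.5 * diag(n) + 0.5          # toy kernel: 1 on the diagonal, 0.5 elsewhere
G <- Z %*% K %*% t(Z)             # G = ZKZ'

# Base R eigen decomposition of the symmetric matrix G
res <- eigen(G, symmetric = TRUE)
eigen.G <- list(values = res$values, vectors = res$vectors)
```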

optimizer

The function used in the optimization process. We offer "optim", "optimx", and "nlminb" functions. This argument is only valid when 'package.MM = "RAINBOWR"'.

min.MAF

Specifies the minimum minor allele frequency (MAF). If a marker has a MAF less than min.MAF, it is assigned a zero score.
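The MAF filter can be illustrated with a short sketch. It assumes genotypes are coded as -1, 0, 1 (an assumption about the input coding); how the function computes MAF internally may differ.

```r
n <- 100
M.now <- cbind(
  snp1 = c(1, rep(-1, n - 1)),         # rare allele: carried by one sample only
  snp2 = rep(c(-1, 0, 1, 1), n / 4)    # common allele
)

# Allele frequency under the assumed -1/0/1 coding, then fold to the minor allele
freq <- colMeans(M.now + 1) / 2
MAF <- pmin(freq, 1 - freq)

min.MAF <- 0.02
low.MAF <- MAF < min.MAF               # markers flagged here would receive a zero score
```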

count

When count is TRUE, the progress of RGWAS is displayed as a percentage.

Value

-log10(p) for each marker
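As a rough illustration of how a -log10(p) value for one marker can be obtained from a Wald-type test, the sketch below fits a generalized least squares model using a given Hinv. The simulated data, the identity Hinv, and the use of an F distribution with 1 numerator degree of freedom are assumptions made for this example; this is not the package's implementation.

```r
set.seed(42)
n <- 50
snp <- sample(c(-1, 0, 1), n, replace = TRUE)   # toy marker genotypes
y <- 0.8 * snp + rnorm(n)                       # toy phenotype
X <- cbind(1, snp)                              # intercept plus tested marker (last column)
Hinv <- diag(n)                                 # toy choice: no random-effect structure

XtHinv <- crossprod(X, Hinv)                    # X' Hinv
W.inv <- solve(XtHinv %*% X)                    # (X' Hinv X)^{-1}
beta <- W.inv %*% (XtHinv %*% y)                # GLS estimate of the fixed effects

resid <- y - X %*% beta
df.resid <- n - ncol(X)
s2 <- as.numeric(crossprod(resid, Hinv %*% resid)) / df.resid

q <- ncol(X)                                    # index of the marker effect
wald <- beta[q]^2 / (s2 * W.inv[q, q])          # Wald statistic for the marker
p <- pf(wald, df1 = 1, df2 = df.resid, lower.tail = FALSE)
score <- -log10(p)
```

With Hinv set to the identity matrix, this reduces to the ordinary least squares t-test on the marker coefficient, which gives a simple way to sanity-check the sketch.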

References

Kennedy, B.W., Quinton, M. and van Arendonk, J.A. (1992) Estimation of effects of single genes on quantitative traits. J Anim Sci. 70(7): 2000-2012.

Kang, H.M. et al. (2008) Efficient Control of Population Structure in Model Organism Association Mapping. Genetics. 178(3): 1709-1723.

Kang, H.M. et al. (2010) Variance component model to account for sample structure in genome-wide association studies. Nat Genet. 42(4): 348-354.

Zhang, Z. et al. (2010) Mixed linear model approach adapted for genome-wide association studies. Nat Genet. 42(4): 355-360.


RAINBOWR documentation built on July 4, 2024, 1:11 a.m.