Description Usage Arguments Details Value Author(s) References See Also Examples
Fit the null model in the mixed model framework with genetic relationship matrix (GRM).
1 2 3 4 5 6 | seqFitNullGLMM_SPA(formula, data, gdsfile, trait.type=c("binary", "quantitative"),
sample.col="sample.id", maf=0.005, missing.rate=0.01, max.num.snp=1000000L,
variant.id=NULL, inv.norm=TRUE, X.transform=TRUE, tol=0.02, maxiter=20L,
nrun=30L, tolPCG=1e-5, maxiterPCG=500L, num.marker=30L, tau.init=c(0,0),
traceCVcutoff=0.0025, ratioCVcutoff=0.001, geno.sparse=TRUE, num.thread=1L,
model.savefn="", seed=200L, fork.loading=FALSE, verbose=TRUE)
|
formula |
an object of class |
data |
a data frame for the formulas |
gdsfile |
a SeqArray GDS filename, or a GDS object |
trait.type |
"binary" for binary outcomes, "quantitative" for continuous outcomes |
sample.col |
the column name of sample IDs corresponding to the GDS file |
maf |
minor allele frequency for imported genotypes (checking >= maf),
if |
missing.rate |
threshold of missing rate (checking <= missing.rate),
if |
max.num.snp |
the maximum number of SNPs used, or -1 for no limit |
variant.id |
a list of variant IDs, used to construct GRM |
inv.norm |
if |
X.transform |
if |
tol |
overall tolerance for model fitting |
maxiter |
the maximum number of iterations for model fitting |
nrun |
the number of random vectors in the trace estimation |
tolPCG |
tolerance of PCG iterations |
maxiterPCG |
the maximum number of PCG iterations |
num.marker |
the number of SNPs used to calculate the variance ratio |
tau.init |
a 2-length numeric vector, the initial values for variance
components, tau; for binary traits, the first element is always be set
to 1. if |
traceCVcutoff |
the threshold for coefficient of variation (CV) for the trace estimator, and the number of runs for trace estimation will be increased until the CV is below the threshold |
ratioCVcutoff |
the threshold for coefficient of variation (CV) for estimating the variance ratio, and the number of randomly selected markers will be increased until the CV is below the threshold |
geno.sparse |
if |
num.thread |
the number of threads |
model.savefn |
the filename of model output, R data file '.rda', '.RData', or '.rds' |
seed |
an integer as a seed for random numbers |
fork.loading |
load genotypes via forking or not; forking processes in Unix can reduce loading time of genotypes, but may double the memory usage; not applicable on Windows |
verbose |
if |
Utilizing the sparse structure of genotypes could significantly improve the computational efficiency of model fitting, but it also increases the memory usage. For more details of SAIGE algorithm, please refer to the SAIGE paper [Zhou et al. 2018] (see the reference section).
Returns a list with the following components:
coefficients |
the beta coefficients for fixed effects; |
tau |
a numeric vector of variance components 'Sigma_E' and 'Sigma_G'; |
linear.predictors |
the linear fit on link scale; |
fitted.values |
fitted values from objects returned by modeling
functions using |
residuals |
residuals; |
cov |
covariance matrix of beta coefficients; |
converged |
whether the model is fitted or not; |
obj.noK |
internal use, returned object from the SPAtest package; |
var.ratio |
a data.frame with columns 'id' (variant.id), 'maf' (minor allele frequency), 'mac' (minor allele count), 'var1' (the variance of score statistic), 'var2' (a variance estimate without accounting for estimated random effects) and 'ratio' (var1/var2, estimated variance ratio for variance approximation); |
trait.type |
either "binary" or "quantitative"; |
sample.id |
the sample IDs used in the model fitting; |
variant.id |
the variant IDs used in the model fitting. |
Xiuwen Zheng
Zhou W, Nielsen JB, Fritsche LG, Dey R, Gabrielsen ME, Wolford BN, LeFaive J, VandeHaar P, Gagliano SA, Gifford A, Bastarache LA, Wei WQ, Denny JC, Lin M, Hveem K, Kang HM, Abecasis GR, Willer CJ, Lee S. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat Genet (2018). Sep;50(9):1335-1341.
T Sofer, X Zheng, SM Gogarten, CA Laurie, etc. A fully adjusted two-stage procedure for rank-normalization in genetic association studies. 2019. Genetic Epidemiology 43(3), 263-275
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | # open a GDS file
fn <- system.file("extdata", "grm1k_10k_snp.gds", package="SAIGEgds")
gdsfile <- seqOpen(fn)
# load phenotype
phenofn <- system.file("extdata", "pheno.txt.gz", package="SAIGEgds")
pheno <- read.table(phenofn, header=TRUE, as.is=TRUE)
head(pheno)
# fit the null model
glmm <- seqFitNullGLMM_SPA(y ~ x1 + x2, pheno, gdsfile, trait.type="binary")
glmm
# close the GDS file
seqClose(gdsfile)
|
Loading required package: gdsfmt
Loading required package: SeqArray
Loading required package: Rcpp
sample.id y yy x1 x2
1 s1 0 4.5542 1.5118 1
2 s2 0 3.7941 0.3898 1
3 s3 0 5.0411 -0.6212 1
4 s4 0 5.6394 -2.2147 1
5 s5 0 4.2134 1.1249 1
6 s6 0 4.6145 -0.0449 1
SAIGE association analysis:
Fri Jun 4 16:06:42 2021
Filtering variants:
[..................................................] 0%, ETC: ---
[==================================================] 100%, completed, 1s
# of selected variants: 9,976
Fit the null model: y ~ x1 + x2 + var(GRM)
# of samples: 1,000
# of variants: 9,976
using 1 thread
Transform on the design matrix with QR decomposition:
new formula: y ~ x0 + x1 + x2 - 1
Start loading SNP genotypes:
[..................................................] 0%, ETC: ---
[==================================================] 100%, completed, 0s
using 6.6M (sparse matrix)
Binary outcome: y
y Number Proportion
0 902 0.902
1 98 0.098
Initial fixed-effect coefficients:
x0 x1 x2
2.520514 -0.7666948 -0.4557928
Initial variance component estimates, tau:
Sigma_E: 1, Sigma_G: 0.499412
Iteration 1:
tau: (1, 0.4994116)
fixed coeff: (2.520514, -0.7666948, -0.4557928)
Iteration 2:
tau: (1, 0.3287896)
fixed coeff: (2.521231, -0.776603, -0.4592503)
Iteration 3:
tau: (1, 0.2817812)
fixed coeff: (2.525954, -0.7738757, -0.4579659)
Iteration 4:
tau: (1, 0.3211452)
fixed coeff: (2.525719, -0.7730823, -0.4577413)
Iteration 5:
tau: (1, 0.3361534)
fixed coeff: (2.527166, -0.7739766, -0.4579633)
Final tau: (1, 0.3322063)
fixed coeff: (2.527666, -0.774237, -0.4580237)
Calculate the average ratio of variances:
Fri Jun 4 16:06:49 2021
1, maf: 0.0370, mac: 74, ratio: 0.9325 (var1: 0.0792, var2: 0.0849)
2, maf: 0.0645, mac: 129, ratio: 0.9459 (var1: 0.0723, var2: 0.0764)
3, maf: 0.4390, mac: 878, ratio: 0.9398 (var1: 0.0422, var2: 0.0449)
4, maf: 0.0115, mac: 23, ratio: 0.9202 (var1: 0.1, var2: 0.109)
5, maf: 0.0135, mac: 27, ratio: 0.9439 (var1: 0.0823, var2: 0.0872)
6, maf: 0.0505, mac: 101, ratio: 0.9391 (var1: 0.0716, var2: 0.0763)
7, maf: 0.0425, mac: 85, ratio: 0.9417 (var1: 0.073, var2: 0.0775)
8, maf: 0.2290, mac: 458, ratio: 0.9421 (var1: 0.0563, var2: 0.0598)
9, maf: 0.0270, mac: 54, ratio: 0.9381 (var1: 0.0761, var2: 0.0812)
10, maf: 0.0205, mac: 41, ratio: 0.9390 (var1: 0.0824, var2: 0.0878)
11, maf: 0.1560, mac: 312, ratio: 0.9384 (var1: 0.0666, var2: 0.071)
12, maf: 0.0285, mac: 57, ratio: 0.9343 (var1: 0.0803, var2: 0.0859)
13, maf: 0.4110, mac: 822, ratio: 0.9376 (var1: 0.0458, var2: 0.0488)
14, maf: 0.4530, mac: 906, ratio: 0.9421 (var1: 0.0398, var2: 0.0422)
15, maf: 0.0930, mac: 186, ratio: 0.9396 (var1: 0.0715, var2: 0.0761)
16, maf: 0.0220, mac: 44, ratio: 0.9387 (var1: 0.0668, var2: 0.0712)
17, maf: 0.1655, mac: 331, ratio: 0.9410 (var1: 0.0641, var2: 0.0681)
18, maf: 0.4520, mac: 904, ratio: 0.9375 (var1: 0.0438, var2: 0.0467)
19, maf: 0.0105, mac: 21, ratio: 0.9406 (var1: 0.0821, var2: 0.0873)
20, maf: 0.0350, mac: 70, ratio: 0.9332 (var1: 0.0833, var2: 0.0893)
21, maf: 0.0235, mac: 47, ratio: 0.9377 (var1: 0.0744, var2: 0.0793)
22, maf: 0.3730, mac: 746, ratio: 0.9406 (var1: 0.0474, var2: 0.0504)
23, maf: 0.0130, mac: 26, ratio: 0.9530 (var1: 0.0629, var2: 0.066)
24, maf: 0.0345, mac: 69, ratio: 0.9459 (var1: 0.0684, var2: 0.0724)
25, maf: 0.1905, mac: 381, ratio: 0.9370 (var1: 0.0628, var2: 0.0671)
26, maf: 0.4080, mac: 816, ratio: 0.9392 (var1: 0.0422, var2: 0.0449)
27, maf: 0.0105, mac: 21, ratio: 0.9338 (var1: 0.0902, var2: 0.0966)
28, maf: 0.1350, mac: 270, ratio: 0.9422 (var1: 0.0624, var2: 0.0663)
29, maf: 0.1600, mac: 320, ratio: 0.9393 (var1: 0.0648, var2: 0.069)
30, maf: 0.0335, mac: 67, ratio: 0.9396 (var1: 0.0738, var2: 0.0785)
ratio avg. is 0.9391186, sd: 0.005418568
Fri Jun 4 16:06:49 2021
Done.
List of 12
$ coefficients : Named num [1:3] -2.98 0.752 0.916
..- attr(*, "names")= chr [1:3] "(Intercept)" "x1" "x2"
$ tau : Named num [1:2] 1 0.332
..- attr(*, "names")= chr [1:2] "Sigma_E" "Sigma_G"
$ linear.predictors: num [1:1000] -1.04 -1.85 -2.66 -3.88 -1.26 ...
$ fitted.values : num [1:1000] 0.2605 0.1356 0.0652 0.0202 0.2204 ...
$ residuals : num [1:1000] -0.2605 -0.1356 -0.0652 -0.0202 -0.2204 ...
$ cov : num [1:3, 1:3] 0.01799 -0.00764 -0.00497 -0.00764 0.014 ...
$ converged : logi TRUE
$ obj.noK :List of 7
..$ y : Named num [1:1000] 0 0 0 0 0 0 0 0 0 0 ...
.. ..- attr(*, "names")= chr [1:1000] "1" "2" "3" "4" ...
..$ mu : Named num [1:1000] 0.2823 0.1458 0.0744 0.024 0.2278 ...
.. ..- attr(*, "names")= chr [1:1000] "1" "2" "3" "4" ...
..$ res : Named num [1:1000] -0.2823 -0.1458 -0.0744 -0.024 -0.2278 ...
.. ..- attr(*, "names")= chr [1:1000] "1" "2" "3" "4" ...
..$ V : Named num [1:1000] 0.2026 0.1245 0.0689 0.0234 0.1759 ...
.. ..- attr(*, "names")= chr [1:1000] "1" "2" "3" "4" ...
..$ X1 : num [1:1000, 1:3] -1 -1 -1 -1 -1 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:1000] "1" "2" "3" "4" ...
.. .. ..$ : chr [1:3] "x0" "x1" "x2"
.. ..- attr(*, "assign")= int [1:3] 1 2 3
..$ XV : num [1:3, 1:1000] -0.2026 -0.3058 -0.1914 -0.1245 -0.0503 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:3] "x0" "x1" "x2"
.. .. ..$ : chr [1:1000] "1" "2" "3" "4" ...
.. ..- attr(*, "assign")= int [1:3] 1 2 3
..$ XXVX_inv: num [1:1000, 1:3] -0.0019 -0.01009 -0.01746 -0.02909 -0.00472 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : chr [1:1000] "1" "2" "3" "4" ...
.. .. ..$ : chr [1:3] "x0" "x1" "x2"
..- attr(*, "class")= chr "SA_NULL"
$ var.ratio :'data.frame': 30 obs. of 6 variables:
..$ id : int [1:30] 904 987 1008 1408 1566 2013 2221 3767 3809 3981 ...
..$ maf : num [1:30] 0.452 0.135 0.453 0.1655 0.0505 ...
..$ mac : num [1:30] 904 270 906 331 101 44 74 54 21 70 ...
..$ var1 : num [1:30] 0.0438 0.0624 0.0398 0.0641 0.0716 ...
..$ var2 : num [1:30] 0.0467 0.0663 0.0422 0.0681 0.0763 ...
..$ ratio: num [1:30] 0.937 0.942 0.942 0.941 0.939 ...
$ trait.type : chr "binary"
$ sample.id : chr [1:1000] "s1" "s2" "s3" "s4" ...
$ variant.id : int [1:9976] 1 2 3 4 5 6 7 8 9 10 ...
- attr(*, "class")= chr "ClassSAIGE_NullModel"
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.