Description Usage Arguments Details Value Note Author(s) References See Also Examples
This function is a replacement for prepScores, prepScoresX and
prepCox. It computes and organizes the neccesary output to efficiently
meta-analyze SKAT and other tests. Note that the tests are *not* computed
by these functions. The output must be passed to one of
skatMeta
, burdenMeta
, or
singlesnpMeta
.
Unlike the SKAT package which operates on one gene at a time, these functions are intended to operate on many genes, e.g. a whole exome, to facilitate meta analysis of whole genomes or exomes.
1 2 3 |
Z |
A genotype matrix (dosage matrix) - rows correspond to individuals and columns correspond to SNPs. Use 'NA' for missing values. The column names of this matrix should correspond to SNP names in the SNP information file. |
formula |
Base formula, of the kind used in glm() - typically of the form y~covariate1 + covariate2. For Cox models, the formula follows that of the coxph() function. |
family |
either 'gaussian', for continuous data, 'binomial' for 0/1 outcomes or 'cox' for survival models. Family data not currently supported for binomial or survival outcomes. Male also not supported for survival outcomes. See Details. |
SNPInfo |
SNP Info file - must contain fields given in 'snpName' and 'aggregateBy'. |
snpNames |
The field of SNPInfo where the SNP identifiers are found. Default is 'Name'. See Details. |
aggregateBy |
The field of SNPInfo on which the skat results were aggregated. Default is 'gene'. For single snps which are intended only for single variant analyses, it is recomended that they have a unique identifier in this field. |
kins |
the kinship matrix for related individuals. Only supported for family=gaussian(). See lmekin in the kinship2 package for more details. |
sparse |
whether or not to use a sparse Matrix approximation for dense kinship matrices (defaults to TRUE). |
data |
data frame in which to find variables in the formula |
male |
For analyzing the X chromosome, with prepScoresX, ‘male’ is the gender (0/1 or F/T) indicating female/male. See details. |
verbose |
logical. whether or not to print the progress bar. |
This function is a drop in replacement for prepScores, prepScoresX, and prepCox. If family is 'cox' then the call is equivalent to prepCox and an error will occur if either male or kins is provided. When family is 'gaussian' or 'binomial' and male is not provided then the call is equivalent to prepScores. Whereas if male is provided then the call is equivalent to prepScoresX.
This function computes the neccesary information to meta analyze SKAT
analyses: the individual SNP scores, their MAF, and a covariance matrix for
each unit of aggregation. Note that the SKAT test is *not* calculated by
this function. The output must be passed to one of
skatMeta
, burdenMeta
, or
singlesnpMeta
.
A crucial component of SKAT and other region-based tests is a common unit
of aggregation accross studies. This is given in the SNP information file
(argument SNPInfo
), which pairs SNPs to a unit of aggregation
(typically a gene). The additional arguments snpNames
and
aggregateBy
specify the columns of the SNP information file which
contain these pairings. Note that the column names of the genotype matrix
Z
must match the names given in the snpNames
field.
Using prepScores2
, users are strongly recommended to use all SNPs,
even if they are monomorphic in your study. This is for two reasons;
firstly, monomorphic SNPs provide information about MAF across all studies;
without providing the information we are unable to tell if a missing SNP
data was monomorphic in a study, or simply failed to genotype adequately in
that study. Second, even if some SNPs will be filtered out of a particular
meta-analysis (e.g., because they are intronic or common) constructing
seqMeta objects describing all SNPs will reduce the workload for subsequent
follow-up analyses.
Note: to view results for a single study, one can pass a single seqMeta object to a function for meta-analysis.
an object of class 'seqMeta'. This is a list, not meant for human
consumption, but to be fed to skatMeta()
or another function. The
names of the list correspond to gene names. Each element in the list
contains
scores |
The scores (y-yhat)^t g |
cov |
The variance of the scores. When no covariates are used, this is the LD matrix. |
n |
The number of subjects |
maf |
The alternate allele frequency |
sey |
The residual standard error. |
For survival models, the signed likelihood ratio statistic is used
instead of the score, as the score test is anti-conservative for
proportional hazards regression. The code for this routine is based on the
coxph.fit
function from the survival
package.
Please see the package vignette for more details.
Brian Davis, Arie Voorman, Jennifer Brody
Wu, M.C., Lee, S., Cai, T., Li, Y., Boehnke, M., and Lin, X. (2011) Rare Variant Association Testing for Sequencing Data Using the Sequence Kernel Association Test (SKAT). American Journal of Human Genetics.
Chen H, Meigs JB, Dupuis J. Sequence Kernel Association Test for Quantitative Traits in Family Samples. Genetic Epidemiology. (To appear)
Lin, DY and Zeng, D. On the relative efficiency of using summary statistics versus individual-level data in meta-analysis. Biometrika. 2010.
prepScores
prepScoresX
prepCox
skatMeta
burdenMeta
singlesnpMeta
skatOMeta
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | ###load example data for two studies:
### see ?seqMetaExample
data(seqMetaExample)
####run on each cohort:
cohort1 <- prepScores2(Z=Z1, y~sex+bmi, SNPInfo = SNPInfo, data = pheno1)
cohort2 <- prepScores2(Z=Z2, y~sex+bmi, SNPInfo = SNPInfo, kins = kins, data = pheno2)
#### combine results:
##skat
out <- skatMeta(cohort1, cohort2, SNPInfo = SNPInfo)
head(out)
##T1 test
out.t1 <- burdenMeta(cohort1,cohort2, SNPInfo = SNPInfo, mafRange = c(0,0.01))
head(out.t1)
##single snp tests:
out.ss <- singlesnpMeta(cohort1,cohort2, SNPInfo = SNPInfo)
head(out.ss)
## Not run:
########################
####binary data
cohort1 <- prepScores2(Z=Z1, formula = ybin~1, family = "binomial",
SNPInfo = SNPInfo, data = pheno1)
out <- skatMeta(cohort1, SNPInfo = SNPInfo)
head(out)
####################
####survival data
cohort1 <- prepScores2(Z=Z1, formula = Surv(time,status)~strata(sex)+bmi,
family = "cox", SNPInfo = SNPInfo, data = pheno1)
out <- skatMeta(cohort1, SNPInfo = SNPInfo)
head(out)
## End(Not run)
|
Loading required package: survival
gene p Qmeta cmaf nmiss nsnps
1 gene1 0.858651041 97982.36 0.2050000 600 15
2 gene10 0.385906585 236430.85 0.3112500 1200 21
3 gene100 0.310106989 201665.04 0.2537500 600 16
4 gene11 0.348914794 298153.58 0.3766667 1800 26
5 gene12 0.005833266 364830.51 0.2175000 0 15
6 gene13 0.716506668 193644.56 0.3095833 600 22
gene p beta se cmafTotal cmafUsed nsnpsTotal
1 gene1 0.88817354 0.02859985 0.2033902 0.2050000 0.009583333 15
2 gene10 0.06436376 -0.37607834 0.2033239 0.3112500 0.009583333 21
3 gene100 NA NA NA 0.2537500 0.000000000 16
4 gene11 NA NA NA 0.3766667 0.000000000 26
5 gene12 NA NA NA 0.2175000 0.000000000 15
6 gene13 0.21375342 0.21098043 0.1696925 0.3095833 0.016250000 22
nsnpsUsed nmiss
1 1 0
2 1 0
3 0 0
4 0 0
5 0 0
6 2 600
gene Name p maf caf nmiss ntotal beta
1 gene1 1000001 0.8283362 0.012916667 0.012916667 0 1200 -0.03808783
2 gene1 1000002 0.4096389 0.014583333 0.014583333 0 1200 -0.13724623
3 gene1 1000003 0.8881735 0.009583333 0.009583333 0 1200 0.02859985
4 gene1 1000004 0.8392653 0.015833333 0.015833333 0 1200 -0.03222936
5 gene1 1000005 0.4976409 0.012083333 0.012083333 0 1200 0.12424703
6 gene1 1000006 0.4737698 0.014166667 0.014166667 0 1200 0.12167820
se beta.cohort1 se.cohort1 beta.cohort2 se.cohort2
1 0.1756527 -0.06556674 0.2475084 -0.01020464 0.2493225
2 0.1664540 -0.26824737 0.2276350 0.01329406 0.2440214
3 0.2033902 -0.28387603 0.2848673 0.35353121 0.2904894
4 0.1588958 -0.13005243 0.2410367 0.04295373 0.2113112
5 0.1831995 -0.11249089 0.2845093 0.29193007 0.2394457
6 0.1698565 0.11089457 0.2339095 0.13370805 0.2470560
gene p Qmeta cmaf nmiss nsnps
1 gene1 0.60357255 16429.79 0.2058333 0 15
2 gene10 0.43880917 26284.57 0.3083333 600 20
3 gene100 0.60085665 18150.47 0.2400000 0 16
4 gene11 0.32005395 34476.96 0.3475000 1200 24
5 gene12 0.06862814 30937.60 0.2233333 0 15
6 gene13 0.35458006 29535.92 0.2983333 600 21
Warning messages:
1: In coxlr.fit(dat, m$y, m$strata, NULL, init = c(0, m$coef), coxph.control(iter.max = 100), :
Loglik converged before variable 1 ; beta may be infinite.
2: In coxlr.fit(dat, m$y, m$strata, NULL, init = c(0, m$coef), coxph.control(iter.max = 100), :
Loglik converged before variable 1 ; beta may be infinite.
3: In coxlr.fit(dat, m$y, m$strata, NULL, init = c(0, m$coef), coxph.control(iter.max = 100), :
Loglik converged before variable 1 ; beta may be infinite.
gene p Qmeta cmaf nmiss nsnps
1 gene1 0.9851753 23122.75 0.2058333 0 15
2 gene10 0.1619854 123747.04 0.3083333 600 20
3 gene100 0.5545533 63668.09 0.2400000 0 16
4 gene11 0.5233723 102691.94 0.3475000 1200 24
5 gene12 0.8078825 45068.24 0.2233333 0 15
6 gene13 0.7262212 66556.76 0.2983333 600 21
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.