Description Usage Arguments Details Value References Examples
A fast kernelbased regional association analysis in related or population samples
1 2 3 4 5 6 7  FFBSKAT(formula, phenodata, genodata, kin = NULL, nullmod,
regions = NULL, sliding.window = c(20, 10), mode = "add",
ncores = 1, return.time = FALSE, kernel = "linear.weighted",
beta.par = c(1, 25), weights = NULL, method = "kuonen",
acc = 1e8, lim = 1e+6, return.variance.explained = FALSE,
reml = TRUE, flip.genotypes = FALSE, impute.method = 'mean',
rho = FALSE, write.file = FALSE, ...)

formula 
referring to the column(s) in 
phenodata 
a data frame containing columns mentioned in 
genodata 
an object with genotypes to analyze. Several formats are allowed: 
kin 
a square symmetric matrix giving the pairwise kinship coefficients between analyzed
individuals. Under default 
nullmod 
an object containing parameter estimates under the null model. Setting 
regions 
an object assigning regions to be analyzed. This can be: 
sliding.window 
the sliding window size and step. Has no effect if 
mode 
the mode of inheritance: "add", "dom" or "rec" for additive, dominant or recessive mode, respectively. For dominant (recessive) mode genotypes will be recoded as AA = 0, Aa = 1 and aa = 1 (AA = 0, Aa = 0 and aa = 1), where a is a minor allele. Default mode is additive. 
ncores 
number of CPUs for parallel calculations. Default = 1. 
return.time 
a logical value indicating whether the running time should be returned. 
kernel 
one of "linear.weighted" (default), "quadratic", "IBS", "IBS.weighted", "2wayIX".
See Details for "linear.weighted" kernel description and [Wu, et al., 2011] for other kernel types.
"2wayIX" kernel considers SNPSNP interaction terms along with main effects. For
"linear.weighted" and "IBS.weighted" kernels, weights can be varied by defining

beta.par 
two positive numeric shape parameters in the beta distribution to assign weights
for each SNP in weighted kernels (see Details). Default = c(1, 25) is recommended for analysis
of rare variants. Has no effect for unweighted kernels or if 
weights 
a numeric vector or a function of minor allele frequency (MAF) to assign weights for each genetic variant in the weighted kernels. Has no effect if one of unweighted kernels was chosen. If NULL, the weights will be calculated using the beta distribution (see Details). 
method 
either "kuonen" or "davies". Method for computing the P value (see Details). Default = "kuonen". 
acc 
accuracy parameter for "davies" method. 
lim 
limit parameter for "davies" method. 
return.variance.explained 
a logical value indicating whether the (marginal) variance explained by each region should be returned. Default = FALSE for faster performance. 
reml 
a logical value indicating whether the restricted maximum likelihood should be used to estimate
the variance explained by each region. Default = TRUE for faster performance. 
flip.genotypes 
a logical value indicating whether the genotypes of some genetic variants should be flipped (relabeled) to ensure that all minor allele frequencies (MAFs) < 0.5. Default = FALSE, with warning of any MAF > 0.5. 
impute.method 
a method for imputation of missing genotypes. It can be either "mean" (default) or "blue". If "mean" the genotypes will be imputed by the simple mean values. If "blue" the best linear unbiased estimates (BLUEs) of mean genotypes will be calculated taking into account the relationships between individuals [McPeek, et al., 2004, DOI: 10.1111/j.0006341X.2004.00180.x] and used for imputation. 
rho 
If TRUE the "optimal" test (SKATO) is performed [Lee, et al., 2012]. 
write.file 
output file name to write results as they come (sequential mode only). 
... 
other arguments that could be passed to 
By default, FFBSKAT uses the linear weighted kernel function to set the interindividual
similarity matrix K = GWWG^T for SKAT and K = GW(Iρ + (1  ρ)ee^T)WG^T for SKATO, where \mathit{G} is the \mathit{n\times p}
genotype matrix for \mathit{n} individuals and \mathit{p} genetic variants in the region,
\mathit{W} is the \mathit{p\times p} diagonal weight matrix, I is the \mathit{p\times p} identity matrix, ρ is pairwise correlation
coefficient between genetic effects (which will be adaptively selected from given rho
), and e is the \mathit{p\times 1} vector of ones. Given the shape parameters
of the beta function, beta.par = c(a, b)
,
the weights are defined using probability density function of the beta distribution:
W_{i}=(B(a,b))^{^{1}}MAF_{i}^{a1}(1MAF_{i})^{b1} ,
where MAF_{i} is a minor allelic frequency for the i^{th} genetic variant in the region,
which is estimated from genotypes, and B(a,b) is the beta function. This way of defining weights
is the same as in original SKAT (see [Wu, et al., 2011] for details). beta.par = c(1, 1)
corresponds
to the unweighted SKAT.
The formula:
Q=0.5\tilde{y}^{T}Ω^{1}KΩ^{1}\tilde{y}
is used to calculate score statistic, where \tilde{y} and Ω are environmental
residuals and covariance matrix obtained under the null hypothesis, respectively. Depending on
the method option chosen, either Kuonen or Davies method is used to calculate P values from the
score statistic Q. Both an Applied Statistics algorithm that inverts the characteristic function
of the mixture chisq [Davies, 1980] and a saddlepoint approximation [Kuonen, 1999] are nearly exact,
with the latter usually being a bit faster.
For other kernel types, see [Wu, et al., 2011].
A list with values:
results 
a data frame containing P values, numbers of variants and polymorphic variants for each of analyzed regions. 
nullmod 
an object containing the estimates of the null model parameters: heritability (h2), total variance (total.var), estimates of fixed effects of covariates (alpha), the gradient (df), and the total loglikelihood (logLH). 
sample.size 
the sample size after omitting NAs. 
time 
If 
Svishcheva G.R., Belonogova N.M. and Axenovich T.I. (2014) FFBSKAT: Fast FamilyBased Sequence Kernel Association Test. PLoS ONE 9(6): e99407. doi:10.1371/journal.pone.0099407
Davies R.B. (1980) Algorithm AS 155: The Distribution of a Linear Combination of chi2 Random Variables, Journal of the Royal Statistical Society. Series C (Applied Statistics), Vol. 29, N 3, P. 323333.
Kuonen D. (1999) Saddlepoint Approximations for Distributions of Quadratic Forms in Normal Variables. Biometrika, Vol. 86, No. 4, P. 929935.
Wu M.C., et al. (2011) Rarevariant association testing for sequencing data with the sequence kernel association test. Am. J. Hum. Genet., Vol. 89, P. 8293.
Lee S., et al. (2012) Optimal unified approach for rare variant association testing with application to small sample casecontrol wholeexome sequencing studies. American Journal of Human Genetics, 91, 224237.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32  data(example.data)
## Run FFBSKAT with sliding window (default):
out < FFBSKAT(trait ~ age + sex, phenodata, genodata, kin)
## Run FFBSKAT with regions defined in snpdata$gene and with
## null model obtained in first run:
out < FFBSKAT(trait ~ age + sex, phenodata, genodata, kin,
out$nullmod, regions = snpdata$gene)
## Run FFBSKAT parallelized on two cores (this will require
## 'foreach' and 'doParallel' Rpackages installed and
## cores available):
out < FFBSKAT(trait ~ age + sex, phenodata, genodata, kin,
out$nullmod, ncores = 2)
## Run FFBSKAT with genotypes in VCF format:
VCFfileName < system.file(
"testfiles/1000g.phase1.20110521.CFH.var.anno.vcf.gz",
package = "FREGAT")
geneFile < system.file("testfiles/refFlat_hg19_6col.txt.gz",
package = "FREGAT")
phe < data.frame(trait = rnorm(85))
out < FFBSKAT(trait, phe, VCFfileName, geneFile = geneFile,
reg = "CFH", annoType = "Nonsynonymous",
flip.genotypes = TRUE)
## Run FFBSKAT with genotypes in PLINK binary data format:
bedFile < system.file("testfiles/sample.bed",
package = "FREGAT")
phe < data.frame(trait = rnorm(120))
out < FFBSKAT(trait, phe, bedFile)

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.