Genotype.Kernels: Kernelize the Genotype matrix for the region to be tested.


Description

This function computes the kernelized genotype matrix for the region to be tested.

Usage

Genotype.Kernels(Z, obj.res, kernel = "linear.weighted", Is.Common = FALSE,
  weights.beta = c(1, 25), weights = NULL, impute.method = "fixed",
  r.corr = 0, is_check_genotype = TRUE, is_dosage = FALSE,
  missing_cutoff = 0.15, estimate_MAF = 1, max_maf = 1, verbose = TRUE)

Arguments

Z

a numeric genotype matrix in which each row is a different individual and each column is a separate gene/SNP. Genotypes should be coded as 0, 1, 2, and 9 (or NA) for AA, Aa, aa, and missing, respectively, where A is the major allele and a is the minor allele. Missing genotypes are imputed by simple Hardy-Weinberg equilibrium (HWE) based imputation.

obj.res

an output object of the MultiSKAT_NULL function.

kernel

the type of kernel (default= "linear.weighted"). See the Details section.

Is.Common

a binary variable indicating whether a variant has the same effect on all the phenotypes (default=FALSE).

weights.beta

a numeric vector of parameters for the beta weights used by the weighted kernels. To supply your own weights, use the "weights" parameter instead. This argument is ignored if the "weights" parameter is not NULL.
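As a rough illustration of what beta weights look like, the sketch below evaluates the Beta(1, 25) density at a few minor allele frequencies using base R only. The MAF values are made up for illustration, and it is an assumption here (not confirmed by this page) that the weights are the beta density evaluated at each variant's MAF.

```r
# Hypothetical MAFs for five variants (illustrative values only)
maf <- c(0.001, 0.01, 0.05, 0.2, 0.4)

# Beta density weights with parameters c(1, 25), mirroring the default weights.beta
# (assumption: weights are the beta density evaluated at the MAF)
weights.beta <- c(1, 25)
w <- dbeta(maf, weights.beta[1], weights.beta[2])

# Rarer variants receive larger weights than common ones
print(round(w, 4))
```

With these parameters the weight decreases monotonically in the MAF, so rare variants are up-weighted relative to common ones.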

weights

a numeric vector of weights for the weighted kernels.

impute.method

a method to impute missing genotypes (default= "fixed"). "bestguess" imputes missing genotypes as most likely values (0,1,2), "random" imputes missing genotypes by generating binomial(2,p) random variables (p is the MAF), and "fixed" imputes missing genotypes by assigning the mean genotype values (2p).
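The three imputation rules can be sketched for a single SNP in base R as follows. This is a minimal illustration, not the package's implementation: impute_column is a hypothetical helper, the genotype vector is made up, and "bestguess" is approximated here by rounding the mean genotype 2p to the nearest integer.

```r
# Hypothetical helper: impute the missing entries of one genotype column
impute_column <- function(g, method = c("fixed", "bestguess", "random")) {
  method <- match.arg(method)
  miss <- is.na(g)
  p <- mean(g, na.rm = TRUE) / 2           # MAF from the non-missing genotypes
  g[miss] <- switch(method,
    fixed     = 2 * p,                      # mean genotype value
    bestguess = round(2 * p),               # assumption: nearest integer genotype
    random    = rbinom(sum(miss), 2, p))    # binomial(2, p) draws
  g
}

set.seed(1)
# Toy genotype column with two missing values (illustrative data)
g <- c(0, 1, NA, 0, 2, 0, NA, 0, 0, 0)

g.fixed  <- impute_column(g, "fixed")
g.best   <- impute_column(g, "bestguess")
g.random <- impute_column(g, "random")
print(g.fixed)
```

Only "fixed" can produce non-integer genotype values; the other two rules always return 0, 1, or 2.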

r.corr

the ρ parameter for the compound symmetric correlation structure kernels (default=0).

is_check_genotype

a logical value indicating whether to check the validity of the genotype matrix Z (default= TRUE). If Z has non-SNP data, please set it FALSE, otherwise you will get an error message. If it is FALSE and you use weighted kernels, the weights should be given through the "weights" parameter.

is_dosage

a logical value indicating whether the matrix Z is a dosage matrix. If it is TRUE, the function will ignore "is_check_genotype".

missing_cutoff

a cutoff of the missing rates of SNPs (default=0.15). Any SNPs with missing rates higher than the cutoff will be excluded from the analysis.

estimate_MAF

a numeric value indicating how to estimate MAFs for the weight calculation and the missing genotype imputation. If estimate_MAF=1 (default), all samples are used to estimate MAFs. If estimate_MAF=2, only samples with non-missing phenotypes and covariates are used to estimate MAFs.

max_maf

a cutoff of the maximum minor allele frequency (MAF) (default=1, no cutoff). Any SNPs with MAF > cutoff will be excluded from the analysis.
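The two exclusion rules (missing_cutoff and max_maf) can be pictured with the base R sketch below. The genotype matrix and the max_maf value of 0.3 are illustrative assumptions, and the filtering code is a simplified stand-in rather than the package's own logic.

```r
# Toy genotype matrix: 10 individuals x 3 SNPs, NA marks a missing genotype
Z <- cbind(
  snp1 = c(0, 1, 0, 0, NA, 0, 0, 1, 0, 0),
  snp2 = c(NA, NA, 0, 1, NA, 0, 0, 0, 0, 0),
  snp3 = c(1, 2, 1, 0, 1, 2, 1, 1, 0, 1)
)

missing_cutoff <- 0.15   # default from the Usage section
max_maf <- 0.3           # illustrative value; the default of 1 applies no cutoff

miss.rate <- colMeans(is.na(Z))          # per-SNP missing rate
maf <- colMeans(Z, na.rm = TRUE) / 2     # per-SNP MAF from non-missing genotypes

# Keep SNPs whose missing rate and MAF do not exceed the cutoffs
keep <- miss.rate <= missing_cutoff & maf <= max_maf
Z.kept <- Z[, keep, drop = FALSE]
print(colnames(Z.kept))
```

Here snp2 is dropped for its 30% missing rate and snp3 for its MAF of 0.5, leaving only snp1.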

verbose

a binary indicator of whether to display messages (default=TRUE, displays messages).

Details

There are 6 types of pre-specified kernels: "linear", "linear.weighted", "IBS", "IBS.weighted", "quadratic" and "2wayIX". Among them, "2wayIX" is a product kernel consisting of main effects and SNP-SNP interaction terms. If users want to use dosage values instead of genotypes, set is_dosage=TRUE.
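To make the difference between kernel types concrete, the sketch below computes an unweighted linear kernel and an IBS kernel for a tiny genotype matrix in base R. The IBS formula used (shared alleles averaged over 2m, with m SNPs) is the usual definition from the SKAT literature and is assumed, not confirmed by this page, to match what the function computes; ibs_kernel is a hypothetical helper.

```r
# Toy genotype matrix: 4 individuals x 3 SNPs (illustrative data)
Z <- rbind(c(0, 1, 2),
           c(0, 1, 2),
           c(2, 1, 0),
           c(1, 0, 0))

# Linear kernel: inner products of genotype vectors
K.linear <- tcrossprod(Z)

# Hypothetical helper: IBS kernel, the proportion of alleles shared
# identical-by-state between each pair of individuals
ibs_kernel <- function(Z) {
  n <- nrow(Z)
  m <- ncol(Z)
  K <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      K[i, j] <- sum(2 - abs(Z[i, ] - Z[j, ])) / (2 * m)
    }
  }
  K
}

K.ibs <- ibs_kernel(Z)
print(K.linear)
print(round(K.ibs, 3))
```

Both kernels are n x n and symmetric; individuals 1 and 2 have identical genotypes, so their IBS similarity is 1.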

The r.corr represents a ρ parameter of the unified test, Q_{ρ} = (1-ρ) Q_S + ρ Q_B, where Q_S is a SKAT test statistic, and Q_B is a weighted burden test statistic. Therefore, ρ=0 results in the original weighted linear kernel SKAT, and ρ=1 results in the weighted burden test (default: ρ=0).
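The identity Q_{ρ} = (1-ρ) Q_S + ρ Q_B can be checked numerically with a compound symmetric kernel. The sketch below is a single-phenotype illustration in base R under stated assumptions: the kernel is taken to be Z W R_ρ W Z' with R_ρ = (1-ρ) I + ρ 11', the genotype matrix and residual vector r are made up, and kernel_rho is a hypothetical helper rather than a package function.

```r
# Toy genotype matrix: 6 individuals x 3 SNPs (illustrative data)
Z <- cbind(c(0, 1, 0, 0, 2, 0),
           c(0, 0, 1, 0, 0, 0),
           c(1, 0, 0, 1, 0, 2))

# Beta(1, 25) weights evaluated at the sample MAFs (assumed weighting scheme)
maf <- colMeans(Z) / 2
w <- dbeta(maf, 1, 25)
ZW <- sweep(Z, 2, w, "*")   # weighted genotypes, Z %*% diag(w)

# Hypothetical helper: kernel with compound symmetric correlation rho
kernel_rho <- function(ZW, rho) {
  m <- ncol(ZW)
  R <- (1 - rho) * diag(m) + rho * matrix(1, m, m)
  ZW %*% R %*% t(ZW)
}

# Made-up residuals from a null model for a single phenotype
r <- c(0.5, -1.2, 0.3, 0.8, -0.4, 0.0)

score <- drop(crossprod(ZW, r))   # weighted per-variant scores
Q.S <- sum(score^2)               # SKAT-type statistic (rho = 0)
Q.B <- sum(score)^2               # weighted burden statistic (rho = 1)

rho <- 0.3
Q.rho <- drop(t(r) %*% kernel_rho(ZW, rho) %*% r)
print(c(Q.rho = Q.rho, combined = (1 - rho) * Q.S + rho * Q.B))
```

Because the quadratic form is linear in the kernel, the statistic built from the ρ kernel equals the weighted combination of the two endpoint statistics.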

To silence the messages from the function, set verbose=FALSE.

By default, SKAT uses impute.method="fixed", which imputes missing genotypes as the mean genotype values (2p). When variants are very rare and missing rates between cases and controls are highly unbalanced, impute.method="fixed" can yield an inflated type I error rate. In this case, we recommend using impute.method="bestguess", which does not suffer from the same problem.

Value

This function returns the kernelized genotype matrix.

Author(s)

Seunggeun Lee

References

Lee, S., Wu, M. C., and Lin, X. (2012) Optimal tests for rare variant effects in sequencing association studies. Biostatistics, 13, 762-775.

Wu, M. C.*, Lee, S.*, Cai, T., Li, Y., Boehnke, M., and Lin, X. (2011) Rare Variant Association Testing for Sequencing Data Using the Sequence Kernel Association Test (SKAT). American Journal of Human Genetics, 89, 82-93. (* contributed equally.)

Examples

data(MultiSKAT.example.data)
attach(MultiSKAT.example.data)

obj.null <- MultiSKAT_NULL(Phenotypes, Cov)
G.Kernel <- Genotype.Kernels(Genotypes, obj.null)

diptavo/MultiSKAT documentation built on May 22, 2019, 1:36 p.m.