GGI: Gene-Gene Interaction Analysis of a set of genes

View source: R/GGI.R

Description

GGI allows the search for Gene-Gene Interactions by testing all possible pairs of genes in a set of genes.

Usage

GGI(Y, snpX, genes.length = NULL, genes.info = NULL, method = c("minP","PCA", "CCA",
 "KCCA","CLD","PLSPM","GBIGM","GATES", "tTS", "tProd"), ...)

Arguments

Y

numeric, integer, character or factor vector with exactly two different values. Y is the response variable and should be of length equal to the number of rows of snpX (number of individuals).

snpX

SnpMatrix object. Must have a number of rows equal to the length of Y. See details.

genes.length

(optional) a numeric vector. genes.length gives the length of each gene, expressed as a number of columns (SNPs) of snpX.

genes.info

(optional) a data frame. genes.info must have four columns named Genenames, SNPnames, Position and Chromosome. Each row describes a SNP and missing values are not allowed.

method

a string matching one of the following: PCA, CCA, KCCA, CLD, PLSPM, GBIGM, minP, GATES, tTS or tProd. Only one method can be passed.

...

Other optional arguments to be passed to the functions associated with the method chosen. See more in elementary methods help.
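
The constraints on Y, snpX and genes.info listed above can be checked before an analysis is started. The following is a minimal sketch in base R, not code from the package: check_ggi_inputs is a hypothetical helper, and a plain integer matrix stands in for a SnpMatrix object so that the example runs without any additional package.

```r
# Hypothetical helper (not part of GeneGeneInteR): checks the documented
# constraints on Y, snpX and genes.info.
check_ggi_inputs <- function(Y, snpX, genes.info = NULL) {
  # Y must take exactly two different values.
  if (length(unique(Y)) != 2) {
    stop("Y must have exactly two different values")
  }
  # One response per individual, i.e. per row of snpX.
  if (length(Y) != nrow(snpX)) {
    stop("length(Y) must equal nrow(snpX)")
  }
  if (!is.null(genes.info)) {
    required <- c("Genenames", "SNPnames", "Position", "Chromosome")
    if (!all(required %in% names(genes.info))) {
      stop("genes.info needs columns: ", paste(required, collapse = ", "))
    }
    if (any(is.na(genes.info[required]))) {
      stop("genes.info must not contain missing values")
    }
  }
  invisible(TRUE)
}

# Toy data: a plain integer matrix stands in for a SnpMatrix here.
set.seed(1)
toy_snpX <- matrix(sample(0:2, 40, replace = TRUE), nrow = 10,
                   dimnames = list(NULL, paste0("rs", 1:4)))
toy_Y <- rep(c(0, 1), each = 5)
toy_info <- data.frame(
  Genenames  = c("GeneA", "GeneA", "GeneB", "GeneB"),
  SNPnames   = colnames(toy_snpX),
  Position   = c(100, 250, 900, 1200),
  Chromosome = c(1, 1, 2, 2)
)
ok <- check_ggi_inputs(toy_Y, toy_snpX, toy_info)
```

GGI performs its own argument checks; this sketch only restates the documented requirements in executable form.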

Details

This function is a wrapper for all Gene-Gene Interaction analysis methods and drives the overall analysis: it splits the dataset into one matrix per gene and runs the elementary analysis for each pair of genes.
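
The pairwise structure of the analysis can be sketched in base R as follows. This is an illustration only, not the package's implementation: toy_pair_stat is a made-up placeholder that ignores Y and is not one of the available methods, and the gene matrices are random toy data.

```r
# Toy stand-in for the elementary tests: NOT one of the package's methods.
# It returns the largest absolute correlation between a SNP of gene 1 and
# a SNP of gene 2, only so the loop below has something to store.
toy_pair_stat <- function(G1, G2) max(abs(cor(G1, G2)))

set.seed(42)
n <- 50
# Three toy "genes" given as numeric genotype matrices (rows = individuals).
genes <- list(
  GeneA = matrix(sample(0:2, n * 3, replace = TRUE), nrow = n),
  GeneB = matrix(sample(0:2, n * 2, replace = TRUE), nrow = n),
  GeneC = matrix(sample(0:2, n * 4, replace = TRUE), nrow = n)
)

G <- length(genes)
stat <- matrix(NA_real_, G, G, dimnames = list(names(genes), names(genes)))

# combn() enumerates each unordered pair once: G * (G - 1) / 2 pairs.
pairs <- combn(G, 2)
for (k in seq_len(ncol(pairs))) {
  i <- pairs[1, k]
  j <- pairs[2, k]
  s <- toy_pair_stat(genes[[i]], genes[[j]])
  stat[i, j] <- s
  stat[j, i] <- s  # mirror the value so the matrix is symmetric
}
```

With G genes the loop runs G(G-1)/2 times, so the cost of an analysis grows quadratically with the number of genes.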

SNPs from the same gene are assumed to be ordered along the chromosome. See selectSnps.

If genes.length is provided, it contains the number of SNPs of each gene. For example, if genes.length is the vector c(20, 35, 15), then gene 1 is interpreted as the first 20 columns/SNPs of snpX, gene 2 as the following 35 columns/SNPs, and so on. Each declared gene is considered contiguous with the one before and the one after it. genes.length can be named if you want the dimensions of the returned matrices to carry those names. If no names are given, generic names are generated following the pattern Gene.n (n being the gene's index).
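
The mapping from genes.length to column ranges described above can be reproduced with a few lines of base R. This is a sketch of the documented rule, not the package's internal code; the variable names first, last and ranges are chosen here for illustration.

```r
genes.length <- c(20, 35, 15)

# Last and first column of each gene: genes are contiguous blocks of snpX.
last  <- cumsum(genes.length)
first <- last - genes.length + 1

# Generic names following the documented Gene.n pattern when none are given.
gene.names <- names(genes.length)
if (is.null(gene.names)) {
  gene.names <- paste0("Gene.", seq_along(genes.length))
}

ranges <- data.frame(gene = gene.names, first = first, last = last)
print(ranges)
```

Here gene 1 covers columns 1 to 20, gene 2 columns 21 to 55 and gene 3 columns 56 to 70, so snpX is expected to have 70 columns.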

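The following methods are available to perform the interaction test for a single pair of genes, each with its own help page: PCA (PCA.test), CCA (CCA.test), KCCA (KCCA.test), CLD (CLD.test), PLSPM (PLSPM.test), GBIGM (GBIGM.test), minP (minP.test), GATES (gates.test), tTS (tTS.test) and tProd (tProd.test).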

Missing values are not allowed: passing an incomplete SnpMatrix object as an argument results in an error. Imputation can be performed prior to the analysis with the imputeSnpMatrix function.

Value

A list with class "GGInetwork" containing the following components:

statistic

a symmetric matrix of size G*G where G is the number of genes studied. The general term of the matrix is the statistic of the interaction between the two genes.

p.value

a symmetric matrix of size G*G where G is the number of genes studied. The general term of the matrix is the p-value of the interaction between the two genes.

df

(only for method="PCA") a symmetric matrix of size G*G where G is the number of genes studied. The general term of the matrix is the degrees of freedom of the interaction test.

method

The method used to perform the Gene-Gene interaction test.

parameter

A list of the parameters used to perform the Gene-Gene Interaction test.
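
Because p.value is a symmetric matrix, each gene pair appears twice in it. The sketch below shows one way to list each pair once, ordered by p-value. It uses a hand-built list that only mirrors the documented component names; the numbers are made up, and the content of the diagonal (NA here) is an assumption.

```r
# Hand-built mock mirroring the documented components; the values are
# invented for illustration and do not come from any real analysis.
gene.names <- c("GeneA", "GeneB", "GeneC")
mock.res <- list(
  p.value = matrix(c(NA,   0.03, 0.40,
                     0.03, NA,   0.70,
                     0.40, 0.70, NA),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(gene.names, gene.names)),
  method = "PCA"
)

p <- mock.res$p.value

# Each unordered pair appears exactly once in the upper triangle.
idx <- which(upper.tri(p), arr.ind = TRUE)
pairs <- data.frame(
  gene1   = rownames(p)[idx[, "row"]],
  gene2   = colnames(p)[idx[, "col"]],
  p.value = p[idx]
)

# Most significant pairs first.
pairs <- pairs[order(pairs$p.value), ]
print(pairs)
```

The same indexing applies to the statistic matrix, and to df when method="PCA" is used.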

References

M. Emily. AGGrEGATOr: A Gene-based GEne-Gene interActTiOn test for case-control association studies. Statistical Applications in Genetics and Molecular Biology, 15(2): 151-171, 2016.
J. Li et al. Identification of gene-gene interaction using principal components. BMC Proceedings, 3 (Suppl. 7): S78, 2009.
Qianqian Peng, Jinghua Zhao, and Fuzhong Xue. A gene-based method for detecting gene-gene co-association in a case-control study. European Journal of Human Genetics, 18(5): 582-587, 2010.
Yuan, Z. et al. (2012): Detection for gene-gene co-association via kernel canonical correlation analysis, BMC Genetics, 13, 83.
Larson, N. B. et al. (2013): A kernel regression approach to gene-gene interaction detection for case-control studies, Genetic Epidemiology, 37, 695-703.
Indika Rajapakse, Michael D. Perlman, Paul J. Martin, John A. Hansen, and Charles Kooperberg. Multivariate detection of gene-gene interactions. Genetic Epidemiology, 36(6):622-630, 2012.
X. Zhang et al. A PLSPM-based test statistic for detecting gene-gene co-association in genome-wide association study with case-control design. PLoS ONE, 8(4):e62129, 2013.
J. Li et al. A gene-based information gain method for detecting gene-gene interactions in case-control studies. European Journal of Human Genetics, 23: 1566-1572, 2015.
M.X. Li et al. GATES: A Rapid and Powerful Gene-Based Association Test Using Extended Simes Procedure, American Journal of Human Genetics, 88(3): 283-293, 2011.
B. Jiang, X. Zhang, Y. Zuo and G. Kang. A powerful truncated tail strength method for testing multiple null hypotheses in one dataset. Journal of Theoretical Biology 277: 67-73, 2011.
D.V. Zaykin, L.A. Zhivotovsky, P.H. Westfall and B.S. Weir. Truncated product method for combining P-values. Genetic Epidemiology, 22: 170-185, 2002.

See Also

PCA.test, CCA.test, KCCA.test, CLD.test, PLSPM.test, GBIGM.test, plot.GGInetwork, minP.test, gates.test, tTS.test, tProd.test, imputeSnpMatrix

Examples

## Not run: 
## Dataset is included in the package
ped <- system.file("extdata/example.ped", package="GeneGeneInteR")
info <- system.file("extdata/example.info", package="GeneGeneInteR")
posi <- system.file("extdata/example.txt", package="GeneGeneInteR")

## Importation of the genotypes
data.imported <- importFile(file=ped, snps=info, pos=posi, pos.sep="\t")
## Filtering of the data: SNPs with MAF < 0.05 or p.value for HWE < 1e-3 or SNPs with 
## call.rate < 0.9 are removed. 
data.scour <- snpMatrixScour(snpX=data.imported$snpX,genes.info=data.imported$genes.info,min.maf=0.05,
                              min.eq=1e-3,call.rate=0.9)
## Imputation of the missing genotypes
data.imputed <- imputeSnpMatrix(data.scour$snpX, genes.info = data.scour$genes.info)

## End(Not run)
## Equivalent loading of the genotypes
load(system.file("extdata/dataImputed.Rdata", package="GeneGeneInteR"))

## Importation of the phenotype
resp <- system.file("extdata/response.txt", package="GeneGeneInteR")
Y  <- read.csv(resp, header=FALSE)

## Estimation of the interaction between the 17 genes with the CLD method -- can take a few minutes
## Not run: 
GGI.res <- GGI(Y=Y, snpX=data.imputed$snpX, genes.info=data.imputed$genes.info,method="CLD")

## End(Not run)

## Estimation of the interaction between 12 of the 17 genes with the PCA method
## Selection of 12 genes among 17
dta <- selectSnps(data.imputed$snpX, data.imputed$genes.info, c("bub3","CDSN","Gc","GLRX",
                  "PADI1","PADI2","PADI4","PADI6","PRKD3","PSORS1C1","SERPINA1","SORBS1"))
GGI.res <- GGI(Y=Y, snpX=dta$snpX, genes.info=dta$genes.info,method="PCA")

GeneGeneInteR documentation built on Nov. 8, 2020, 6:28 p.m.