# SNP Genotype Association Testing with Mixed Models

### Description

`assocTestMM`

performs SNP genotype association tests using the null model fit with `fitNullMM`

.

### Usage

1 2 3 |

### Arguments

`genoData` |
An object of class |

`nullMMobj` |
A null model object returned by |

`test` |
A character string specifying the type of test to be performed. The possibilities are "Wald" (default) or "Score"; only "Score" can be used when the family of the null model fit with |

`snp.include` |
A vector of SNP IDs to include in the analysis. If NULL, see |

`chromosome` |
A vector of integers specifying which chromosomes to analyze. This parameter is only considred when |

`impute.geno` |
A logical indicator of whether sporadic missing genotype values should be mean imputed. The default is TRUE. See 'Details' for further information. |

`snp.block.size` |
The number of SNPs to read-in/analyze at once. The default value is 5000. See 'Details' for further information regarding how this parameter works when |

`ivars` |
A vector of character strings specifying the names of the variables for which a genotype interaction term should be included. If NULL (default) no genotype interactions are included. See 'Details' for further information. |

`ivar.return.betaCov` |
Logical indicator of whether the estimated covariance matrix of the effect size estimates (betas) for the genotype and genotype interaction terms should be returned; the default is FALSE. |

`verbose` |
Logical indicator of whether updates from the function should be printed to the console; the default is TRUE. |

### Details

When `impute.geno`

is TRUE, sporadic missing genotype values are mean imputed using the minor allele frequency (MAF) calculated on all other samples at that SNP. When `impute.geno`

is FALSE, samples with missing values for all of the SNP genotypes in the current SNP block are removed from the analysis for the block; this may significantly slow down computation time because many pre-computed matrices need to be re-computed each time the sample set changes. Also note: when `impute.geno`

is FALSE, sporadic missingness for a sample inside of a SNP block will lead to an error.

The input `ivars`

can be used to perform GxE tests. Multiple interaction variables may be specified, but all interaction variables specified must have been included as covariates in fitting the null model with `fitNullMM`

. When performing GxE analyses, `assocTestMM`

will report two tests: (1) the joint test of all genotype interaction terms in the model (this is the test for any genotype interaction effect), and (2) the joint test of the genotype term along with all of the genotype interaction terms (this is the test for any genetic effect). Individual genotype interaction terms can be tested by creating Wald test statistics from the reported effect size estimates and their standard errors (Note: when `ivars`

contains a single continuous or binary covariate, this test is the same as the test for any genotype interaction effect mentioned above). In order to test more complex hypotheses regarding subsets of multiple genotype interaction terms, `ivar.return.betaCov`

can be used to retrieve the estimated covariance matrix of the effect size estimates.

### Value

A data.frame where each row refers to a different SNP with the columns:

`snpID` |
The SNP ID |

`chr` |
The numeric chromosome value |

`n` |
The number of samples used to analyze the SNP |

`MAF` |
The estimated minor allele frequency |

`minor.allele` |
Either "A" or "B" indicating which allele is the minor allele |

If `test`

is "Score":

`Score` |
The value of the score function |

`Var` |
The variance of the score function |

`Score.Stat` |
The score chi-squared test statistic |

`Score.pval` |
The score p-value |

If `test`

is "Wald" and `ivars`

is `NULL`

:

`Est` |
The effect size estimate for each additional copy of the "A" allele |

`SE` |
The estimated standard error of the effect size estimate |

`Wald.Stat` |
The Wald chi-squared test statistic |

`Wald.pval` |
The Wald p-value |

If `test`

is "Wald" and `ivars`

is not `NULL`

:

`Est.G` |
The effect size estimate for the genotype term |

`Est.G:ivar` |
The effect size estimate for the genotype*ivar interaction term. There will be as many of these terms as there are interaction variables, and "ivar" will be replaced with the variable name. |

`SE.G` |
The estimated standard error of the genotype term effect size estimate |

`SE.G:ivar` |
The estimated standard error of the genotype*ivar effect size estimate. There will be as many of these terms as there are interaction variables, and "ivar" will be replaced with the variable name. |

`GxE.Stat` |
The Wald chi-squared test statistic for the test of all genotype interaction terms. When there is only one genotype interaction term, this is the test statistic for that term. |

`GxE.pval` |
The Wald p-value for the test of all genotype interaction terms; i.e. the test of any genotype interaction effect |

`Joint.Stat` |
The Wald chi-squared test statistic for the joint test of the genotype term and all of the genotype interaction terms |

`Joint.pval` |
The Wald p-value for the joint test of the genotype term and all of the genotype interaction terms; i.e. the test of any genotype effect |

When `ivars`

is not `NULL`

, if `ivar.return.betaCov`

is `TRUE`

, then the output is a list with two elements. The first, "results", is the data.frame described above. The second, "betaCov", is a list with length equal to the number of rows of "results", where each element of the list is the covariance matrix of the effect size estimates (betas) for the genotype and genotype interaction terms.

If `genoData`

is a `SeqVarData`

object, the effect size estimate is for each copy of the alternate allele.

### Note

The `GenotypeData`

function in the `GWASTools`

package should be used to create the input `genoData`

. Input to the `GenotypeData`

function can easily be created from an R matrix or GDS file. PLINK .bed, .bim, and .fam files can easily be converted to a GDS file with the function `snpgdsBED2GDS`

in the `SNPRelate`

package. Alternatively, the `SeqVarData`

function in the `SeqVarTools`

package can be used to create the input `genodata`

when working with sequencing data.

### Author(s)

Matthew P. Conomos

### See Also

`fitNullMM`

for fitting the null mixed model needed as input to `assocTestMM`

.
`qqPlot`

for a function to make QQ plots and `manhattanPlot`

for a function to make Manhattan plots of p-values.
`GWASTools`

for a description of the package containing the following functions: `GenotypeData`

for a description of creating a `GenotypeData`

class object for storing sample and SNP genotype data, `MatrixGenotypeReader`

for a description of reading in genotype data stored as a matrix, and `GdsGenotypeReader`

for a description of reading in genotype data stored as a GDS file. Also see `snpgdsBED2GDS`

in the `SNPRelate`

package for a description of converting binary PLINK files to GDS.

### Examples

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 | ```
library(GWASTools)
# file path to GDS file
gdsfile <- system.file("extdata", "HapMap_ASW_MXL_geno.gds", package="GENESIS")
# read in GDS data
HapMap_geno <- GdsGenotypeReader(filename = gdsfile)
# create a GenotypeData class object
HapMap_genoData <- GenotypeData(HapMap_geno)
# load saved matrix of KING-robust estimates
data("HapMap_ASW_MXL_KINGmat")
# run PC-AiR
mypcair <- pcair(genoData = HapMap_genoData, kinMat = HapMap_ASW_MXL_KINGmat,
divMat = HapMap_ASW_MXL_KINGmat)
# run PC-Relate
mypcrel <- pcrelate(genoData = HapMap_genoData, pcMat = mypcair$vectors[,1],
training.set = mypcair$unrels)
# generate a phenotype
set.seed(4)
pheno <- 0.2*mypcair$vectors[,1] + rnorm(mypcair$nsamp, mean = 0, sd = 1)
# make ScanAnnotationDataFrame
scanAnnot <- ScanAnnotationDataFrame(data.frame(scanID = mypcrel$sample.id,
pc1 = mypcair$vectors[,1], pheno = pheno))
# make covMatList
covMatList <- list("Kin" = pcrelateMakeGRM(mypcrel))
# fit the null mixed model
nullmod <- fitNullMM(scanData = scanAnnot, outcome = "pheno", covars = "pc1", covMatList = covMatList)
# run the association test
myassoc <- assocTestMM(genoData = HapMap_genoData, nullMMobj = nullmod)
close(HapMap_genoData)
# make a QQ plot
qqPlot(myassoc$Wald.pval)
``` |