Score tests with SNP genotypes as dependent variable


Description

Under the assumption of Hardy-Weinberg equilibrium, a SNP genotype is a binomial variate with two trials for an autosomal SNP, or with one or two trials (depending on sex) for a SNP on the X chromosome. With each SNP in an input "SnpMatrix" as dependent variable, this function first fits a "base" logistic regression model and then carries out a score test for the addition of further term(s). The Hardy-Weinberg assumption can be relaxed by use of a "robust" option.


Usage

snp.lhs.tests(snp.data, base.formula, add.formula, subset, snp.subset,
                data = sys.parent(), robust = FALSE, uncertain = FALSE,
                control = glm.test.control(), score = FALSE)


Arguments

snp.data: The SNP data, as an object of class "SnpMatrix" or "XSnpMatrix".

base.formula: A formula object describing the base model, with the dependent variable omitted.

add.formula: A formula object describing the additional terms to be tested, also with the dependent variable omitted.

subset: An array describing the subset of observations to be considered.

snp.subset: An array describing the subset of SNPs to be considered. The default action is to test all SNPs.

data: The data frame in which base.formula, add.formula and subset are to be evaluated.

robust: If TRUE, a test which does not assume Hardy-Weinberg equilibrium will be used.

uncertain: If TRUE, uncertain genotypes are used and scored by their posterior expectations. Otherwise they are treated as missing. If set, this option forces robust variance estimates.

control: An object giving parameters for the IRLS algorithm used to fit the base model and for the acceptable aliasing amongst new terms to be tested. See glm.test.control.

score: Is extended score information to be returned?


Details

The tests used are asymptotic chi-squared tests based on the first and second derivatives of the log-likelihood with respect to the parameters of the additional model. The "robust" form is a generalized score test in the sense discussed by Boos (1992). If a data argument is supplied, the snp.data and data objects are aligned by rowname. Otherwise all variables in the model formulae are assumed to be stored in the same order as the rows of the snp.data object.
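The idea behind the non-robust test can be illustrated in base R, without the package. The sketch below is only an analogy under stated assumptions: the data are simulated, the names n, region, cc and g are made up for illustration, and the Rao score test from anova() is used as a stand-in; it is not a description of how snp.lhs.tests computes its statistics internally.

```r
# Simulated data; all names here (n, region, cc, g) are illustrative only.
set.seed(1)
n      <- 500
region <- factor(sample(c("A", "B", "C"), n, replace = TRUE))
cc     <- rbinom(n, 1, 0.5)   # case/control indicator
g      <- rbinom(n, 2, 0.3)   # autosomal genotype coded as 0, 1 or 2 alleles

# Base model: the genotype is a binomial response with two trials per subject,
# so the response is given as (successes, failures) = (g, 2 - g).
base <- glm(cbind(g, 2 - g) ~ cc, family = binomial)

# Extended model: the base model plus the term(s) to be tested.
full <- update(base, . ~ . + region)

# Rao score test for the added term; the statistic is referred to a
# chi-squared distribution with as many degrees of freedom as added parameters.
tab <- anova(base, full, test = "Rao")
print(tab)
```

Here ~cc plays the role of base.formula and ~region the role of add.formula; region has three levels, so the test has two degrees of freedom.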


Value

An object of class GlmTests or GlmTestsScore, depending on whether score is set to FALSE or TRUE in the call.


Note

A factor (or several factors) may be included as arguments to the function strata(...) in the base.formula. This fits all interactions of the factors so included, but leads to faster computation than fitting these in the normal way. Additionally, a cluster(...) call may be included in the base model formula. This identifies clusters of potentially correlated observations (e.g. members of the same family); in this case, an appropriate robust estimate of the variance of the score test is used. No more than one strata() call may be used, and neither strata(...) nor cluster(...) calls may appear in the add.formula.

A known bug is that the function fails when no data argument is supplied and the base model formula contains no variables (~1). A workaround is to create a data frame to hold the variables in the models and pass this as data=.
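A minimal sketch of passing the model variables through data= with an intercept-only base model follows. The first part uses made-up subject IDs (s1, s2, s3) and a hypothetical data frame covars to show what alignment by rowname means; the second part assumes the package's testdata example data (Autosomes, subject.data) and runs only if the snpStats package is installed.

```r
# Alignment by rowname, shown with base R on hypothetical IDs.
geno_ids <- c("s3", "s1", "s2")                 # row order of the genotype data
covars   <- data.frame(cc = c(1, 0, 1),
                       row.names = c("s1", "s2", "s3"))
aligned  <- covars[geno_ids, , drop = FALSE]    # reorder to match the genotypes
print(aligned)

# Intercept-only base model with the variables supplied through data=.
# Assumes the snpStats package and its 'testdata' example data are available.
if (requireNamespace("snpStats", quietly = TRUE)) {
  library(snpStats)
  data(testdata)                                # Autosomes, subject.data
  vars <- data.frame(cc = subject.data$cc,
                     row.names = rownames(subject.data))
  res <- snp.lhs.tests(Autosomes[, 1:10], ~1, ~cc, data = vars)
  print(res)
}
```

Keeping the subject identifiers as row names of the data frame lets the function match subjects by name rather than by position.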


Author(s)

David Clayton


References

Boos, Dennis D. (1992) On generalized score tests. The American Statistician, 46:327-333.

See Also

GlmTests-class, GlmTestsScore-class, glm.test.control, snp.rhs.tests, single.snp.tests, SnpMatrix-class, XSnpMatrix-class


Examples

data(testdata)
snp.lhs.tests(Autosomes[,1:10], ~cc, ~region, data=subject.data)
snp.lhs.tests(Autosomes[,1:10], ~strata(region), ~cc, data=subject.data)
