seqhap: Sequential Haplotype Scan Association Analysis for...
In haplo.stats: Statistical Analysis of Haplotypes with Traits and Covariates when Linkage Phase is Ambiguous

seqhap

R Documentation

Sequential Haplotype Scan Association Analysis for Case-Control Data

Description

seqhap implements sequential haplotype scan methods for association analysis of case-control data. When each locus is evaluated, loci that contribute additional information about haplotype association with disease status are added sequentially. This conditional evaluation is based on the Mantel-Haenszel (MH) test. Two sequential methods are provided, a sequential haplotype method and a sequential summary method, along with results from the traditional single-locus method. Currently, seqhap works only with biallelic loci (single nucleotide polymorphisms, or SNPs) and binary traits.

Usage

seqhap(y, geno, pos, locus.label=NA, weight=NULL, 
       mh.threshold=3.84, r2.threshold=0.95, haplo.freq.min=0.005, 
       miss.val=c(0, NA), sim.control=score.sim.control(),
       control=haplo.em.control())
## S3 method for class 'seqhap'
print(x, digits=max(options()$digits-2, 5), ...)

Arguments

`y`	vector of binary response (1=case, 0=control). The length is equal to the number of rows in geno.
`geno`	matrix of alleles, such that each locus has a pair of adjacent columns of alleles, and the order of columns corresponds to the order of loci on a chromosome. If there are K loci, then ncol(geno)=2*K. Each row holds the alleles for one subject. Currently, only biallelic loci (SNPs) are allowed.
`pos`	vector of physical positions (or relative physical positions) for loci. If there are K loci, length(pos)=K. The scale (kb, bp, etc.) does not affect the results.
`locus.label`	vector of labels for the set of loci
`weight`	weights for observations (rows of geno matrix).
`mh.threshold`	threshold for the Mantel-Haenszel statistic that evaluates whether a locus contributes additional information about haplotype association with disease, conditional on the current haplotypes. The default is 3.84, which is the 95th percentile of the chi-square distribution with 1 degree of freedom.
`r2.threshold`	threshold for a locus to be skipped. When scanning locus k, loci whose r-squared (the squared Pearson correlation) with locus k is greater than r2.threshold are ignored, so that the haplotype growing process continues with markers that are further away from locus k.
`haplo.freq.min`	the minimum haplotype frequency for a haplotype to be included in the association tests. The haplotype frequency is based on the EM algorithm that estimates haplotype frequencies independent of trait.
`miss.val`	vector of values that represent missing alleles.
`sim.control`	A list of control parameters to determine how simulations are performed for permutation p-values, similar to the strategy in haplo.score. The list is created by the function score.sim.control and the default values of this function can be changed as desired. Permutations are performed until a p.threshold accuracy rate is met for the three region-based p-values calculated in seqhap. See score.sim.control for details.
`control`	A list of parameters that control the EM algorithm for estimating haplotype frequencies when phase is unknown. The list is created by the function haplo.em.control - see this function for more details.
`x`	a seqhap object to print
`digits`	Number of significant digits to print for numeric values
`...`	Additional parameters for the print method
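
The layout required for geno and the two default thresholds can be checked with base R alone. The sketch below is an illustration under stated assumptions, not seqhap's internal code: the data are simulated, the helper objects a1, a2, and dose are hypothetical, and r-squared is computed on per-subject allele counts, which is one common way to obtain a squared Pearson correlation between SNPs.

```r
# Illustration only: base R with simulated data (not seqhap internals).
set.seed(1)
n <- 200   # subjects
K <- 3     # loci

# Two alleles (coded 1/2) per subject per locus. Locus 2 is copied from
# locus 1 so that this pair is perfectly correlated.
a1 <- matrix(sample(1:2, n * K, replace = TRUE), n, K)
a2 <- matrix(sample(1:2, n * K, replace = TRUE), n, K)
a1[, 2] <- a1[, 1]
a2[, 2] <- a2[, 1]

# Interleave so that columns (1,2) hold locus 1, (3,4) hold locus 2, etc.
geno <- matrix(NA_integer_, n, 2 * K)
geno[, seq(1, 2 * K, by = 2)] <- a1
geno[, seq(2, 2 * K, by = 2)] <- a2
pos <- c(100, 250, 900)          # hypothetical positions, one per locus
stopifnot(ncol(geno) == 2 * length(pos))

# Count of allele "2" per subject and locus (0, 1, or 2)
dose <- (a1 == 2) + (a2 == 2)

# Squared Pearson correlation between every pair of loci
r2 <- cor(dose)^2
skip <- r2 > 0.95                # pairs above the default r2.threshold

# Default mh.threshold: 95th percentile of chi-square with 1 df
mh.default <- qchisq(0.95, df = 1)
```

Here skip[1, 2] is TRUE because loci 1 and 2 are identical, whereas locus 3 was simulated independently and falls below the threshold. A stricter MH threshold can be obtained the same way, for example qchisq(0.99, df = 1).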

Details

No further details

Value

list with components:

`converge`	indicator of convergence of the EM algorithm (see haplo.em); 1 = converged, 0 = failed
`locus.label`	vector of labels for loci
`pos`	chromosome positions for loci, same as input.
`n.sim`	number of permutations performed for empirical p-values
`inlist`	matrix that shows which loci are combined for association analysis in the sequential scan. The non-zero values of the kth row of inlist are the indices of the loci combined when scanning locus k.
`chi.stat`	chi-square statistics of single-locus analysis.
`chi.p.point`	permuted pointwise p-values of single-locus analysis.
`chi.p.region`	permuted regional p-value of single-locus analysis.
`hap.stat`	chi-square statistics of sequential haplotype analysis.
`hap.df`	degrees of freedom of sequential haplotype analysis.
`hap.p.point`	permuted pointwise p-values of sequential haplotype analysis.
`hap.p.region`	permuted regional p-value of sequential haplotype analysis.
`sum.stat`	chi-square statistics of sequential summary analysis.
`sum.df`	degrees of freedom of sequential summary analysis.
`sum.p.point`	permuted pointwise p-values of sequential summary analysis.
`sum.p.region`	permuted regional p-value of sequential summary analysis.
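
The inlist component is easiest to read once its rows are translated into locus labels. The following is a minimal sketch using a hand-made list in place of a real seqhap result: the object res, its values, and the ordering of indices within each row are assumptions for illustration only. Only the component names come from the list above.

```r
# Hypothetical stand-in for a seqhap result with 4 loci; values are made up.
res <- list(
  locus.label = c("m1", "m2", "m3", "m4"),
  inlist = rbind(c(1, 2, 0, 0),
                 c(2, 0, 0, 0),
                 c(3, 2, 4, 0),
                 c(4, 3, 0, 0)),
  hap.p.point = c(0.012, 0.40, 0.003, 0.08)
)

# Loci combined when scanning locus k: the non-zero entries of row k
combined <- lapply(seq_len(nrow(res$inlist)), function(k) {
  idx <- res$inlist[k, ]
  res$locus.label[idx[idx != 0]]
})
names(combined) <- res$locus.label

# Locus with the smallest pointwise p-value in the sequential haplotype scan
best <- res$locus.label[which.min(res$hap.p.point)]
combined[[best]]
```

With a real result, the same indexing applies to the object returned by seqhap, and hap.p.point can be replaced by sum.p.point or chi.p.point to inspect the other two analyses.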

References

Yu Z, Schaid DJ. (2007) Sequential haplotype scan methods for association analysis. Genet Epidemiol, in print.

Examples


# load the package, then the example data with response and genotypes
library(haplo.stats)
data(seqhap.dat)
mydata.y <- seqhap.dat[,1]
mydata.x <- seqhap.dat[,-1]
# load positions
data(seqhap.pos)
pos <- seqhap.pos$pos
# run seqhap with default settings
## Not run: 
  # this example takes 5-10 seconds to run
  myobj <- seqhap(y=mydata.y, geno=mydata.x, pos=pos)
  # print() dispatches to the seqhap method
  print(myobj)

## End(Not run)
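
A run with non-default settings might look like the sketch below. It assumes haplo.stats is installed and that score.sim.control accepts min.sim and max.sim as in its own help page. The specific values (a 99th-percentile MH threshold, 200 to 1000 permutations) are arbitrary choices for illustration, and set.seed is called on the assumption that the permutations draw on R's random number generator.

```r
# Sketch only: runs if haplo.stats is available, otherwise does nothing.
if (requireNamespace("haplo.stats", quietly = TRUE)) {
  library(haplo.stats)
  data(seqhap.dat)
  data(seqhap.pos)

  set.seed(123)  # assumed to make the permutation p-values repeatable
  fit <- seqhap(y = seqhap.dat[, 1],
                geno = seqhap.dat[, -1],
                pos = seqhap.pos$pos,
                # stricter criterion for adding a locus (99th percentile)
                mh.threshold = qchisq(0.99, df = 1),
                # fewer permutations than the defaults, for a quicker run
                sim.control = score.sim.control(min.sim = 200,
                                                max.sim = 1000))

  fit$n.sim          # number of permutations actually performed
  fit$hap.p.region   # regional p-value, sequential haplotype analysis
  fit$sum.p.region   # regional p-value, sequential summary analysis
}
```

Capping max.sim shortens the run at the cost of less precise permutation p-values, so the defaults are preferable for a final analysis.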

haplo.stats documentation built on May 29, 2024, 9:53 a.m.