seqhap: Sequential Haplotype Scan Association Analysis for...

View source: R/seqhap.q

seqhap    R Documentation

Sequential Haplotype Scan Association Analysis for Case-Control Data

Description

Seqhap implements sequential haplotype scan methods to perform association analyses for case-control data. When evaluating each locus, loci that contribute additional information to haplotype associations with disease status are added sequentially. This conditional evaluation is based on the Mantel-Haenszel (MH) test. Two sequential methods are provided, a sequential haplotype method and a sequential summary method, as well as results based on the traditional single-locus method. Currently, seqhap only works with biallelic loci (single nucleotide polymorphisms, or SNPs) and binary traits.

Usage

seqhap(y, geno, pos, locus.label=NA, weight=NULL, 
       mh.threshold=3.84, r2.threshold=0.95, haplo.freq.min=0.005, 
       miss.val=c(0, NA), sim.control=score.sim.control(),
       control=haplo.em.control())
## S3 method for class 'seqhap'
print(x, digits=max(options()$digits-2, 5), ...)

Arguments

y

vector of binary response (1=case, 0=control). The length is equal to the number of rows in geno.

geno

matrix of alleles, such that each locus has a pair of adjacent columns of alleles, and the order of columns corresponds to the order of loci on a chromosome. If there are K loci, then ncol(geno)=2*K. Rows represent the alleles for each subject. Currently, only bi-allelic loci (SNPs) are allowed.
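
As a rough illustration of this column layout, the sketch below builds a toy geno matrix for K = 3 loci and 4 subjects. The allele values, positions, and object names are made up for illustration and are not taken from the package data.

```r
# Toy example: 4 subjects, K = 3 SNP loci, alleles coded 1/2 (made-up values).
# Columns 1-2 hold locus 1, columns 3-4 hold locus 2, columns 5-6 hold locus 3.
geno <- matrix(c(1, 1,  1, 2,  2, 2,
                 1, 2,  2, 2,  1, 2,
                 2, 2,  1, 1,  1, 1,
                 1, 2,  1, 2,  NA, 2),   # NA marks a missing allele
               nrow = 4, byrow = TRUE)
K   <- ncol(geno) / 2        # number of loci
y   <- c(1, 0, 1, 0)         # binary response, one value per row of geno
pos <- c(100, 250, 900)      # hypothetical positions, one per locus

# Consistency checks implied by the argument descriptions
stopifnot(ncol(geno) == 2 * K, length(pos) == K, length(y) == nrow(geno))

# The two alleles of locus k occupy columns 2k-1 and 2k
k <- 2
locus2 <- geno[, c(2 * k - 1, 2 * k)]
```

Checking these dimensions before calling seqhap can catch a geno matrix that was accidentally supplied with one column per locus.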

pos

vector of physical positions (or relative physical positions) for loci. If there are K loci, length(pos)=K. The scale (kb, bp, etc.) does not affect the results.

locus.label

vector of labels for the set of loci

weight

weights for observations (rows of geno matrix).

mh.threshold

threshold for the Mantel-Haenszel statistic that evaluates whether a locus contributes additional information of haplotype association to disease, conditional on current haplotypes. The default is 3.84, which is the 95th percentile of the chi-square distribution with 1 degree of freedom.
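
The default value can be reproduced with base R's qchisq. The stricter alternative shown here is only an illustrative choice, not a recommendation from the package.

```r
# 95th percentile of the chi-square distribution with 1 degree of freedom
default.threshold <- qchisq(0.95, df = 1)
round(default.threshold, 2)   # 3.84

# A stricter, purely illustrative choice: the 99th percentile
strict.threshold <- qchisq(0.99, df = 1)
```

A larger threshold makes it harder for a neighboring locus to be added to the haplotype being grown.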

r2.threshold

threshold for a locus to be skipped. When scanning locus k, any locus whose r-squared (the squared Pearson correlation) with locus k exceeds r2.threshold is ignored, so that the haplotype growing process continues with markers that are further away from locus k.
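
To make the skipping rule concrete, the sketch below computes a squared Pearson correlation between two SNPs and compares it with the threshold. It assumes genotypes coded as minor-allele counts with made-up values; seqhap computes r-squared internally and may use a different coding, so this only illustrates the comparison.

```r
# Made-up minor-allele counts (0, 1, or 2) for 8 subjects at two SNPs
snp.k <- c(0, 1, 2, 1, 0, 2, 1, 0)
snp.j <- c(0, 1, 2, 1, 0, 2, 1, 1)   # differs from snp.k in the last subject

r2 <- cor(snp.k, snp.j)^2            # squared Pearson correlation
r2.threshold <- 0.95

# TRUE would mean locus j is skipped when scanning locus k
skip.j <- r2 > r2.threshold
```

Here r2 is about 0.82, below the default of 0.95, so under this illustrative coding locus j would remain a candidate.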

haplo.freq.min

the minimum haplotype frequency for a haplotype to be included in the association tests. The haplotype frequency is based on the EM algorithm that estimates haplotype frequencies independent of trait.

miss.val

vector of values that represent missing alleles.

sim.control

A list of control parameters to determine how simulations are performed for permutation p-values, similar to the strategy in haplo.score. The list is created by the function score.sim.control and the default values of this function can be changed as desired. Permutations are performed until a p.threshold accuracy rate is met for the three region-based p-values calculated in seqhap. See score.sim.control for details.

control

A list of parameters that control the EM algorithm for estimating haplotype frequencies when phase is unknown. The list is created by the function haplo.em.control - see this function for more details.

x

a seqhap object to print

digits

Number of significant digits to print for numeric values

...

Additional parameters for the print method

Details

No further details

Value

list with components:

converge

indicator of convergence of the EM algorithm (see haplo.em); 1 = converged, 0 = failed

locus.label

vector of labels for loci

pos

chromosome positions for loci, same as input.

n.sim

number of permutations performed for empirical p-values

inlist

matrix that shows which loci are combined for association analysis in the sequential scan. The non-zero values of the kth row of inlist are the indices of the loci combined when scanning locus k.
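
The sketch below shows one way to read a row of inlist. The matrix is hypothetical and hand-made for illustration; it is not output from seqhap.

```r
# Hypothetical inlist for 4 loci; zeros pad the rows to a common width
inlist <- matrix(c(1, 2, 0,
                   2, 1, 3,
                   3, 4, 0,
                   4, 0, 0), nrow = 4, byrow = TRUE)

# Indices of the loci combined when scanning locus k
k <- 2
row.k <- inlist[k, ]
combined <- row.k[row.k != 0]
```

In this toy matrix, scanning locus 2 combines loci 2, 1, and 3, while locus 4 is analyzed on its own.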

chi.stat

chi-square statistics of single-locus analysis.

chi.p.point

permuted pointwise p-values of single-locus analysis.

chi.p.region

permuted regional p-value of single-locus analysis.

hap.stat

chi-square statistics of sequential haplotype analysis.

hap.df

degrees of freedom of sequential haplotype analysis.

hap.p.point

permuted pointwise p-values of sequential haplotype analysis.

hap.p.region

permuted regional p-value of sequential haplotype analysis.

sum.stat

chi-square statistics of sequential summary analysis.

sum.df

degrees of freedom of sequential summary analysis.

sum.p.point

permuted pointwise p-values of sequential summary analysis.

sum.p.region

permuted regional p-value of sequential summary analysis.

References

Yu Z, Schaid DJ. (2007) Sequential haplotype scan methods for association analysis. Genet Epidemiol, in print.

See Also

haplo.em, print.seqhap, plot.seqhap, score.sim.control

Examples


# load the package, then the example data with response and genotypes
library(haplo.stats)
data(seqhap.dat)
mydata.y <- seqhap.dat[,1]
mydata.x <- seqhap.dat[,-1]
# load positions
data(seqhap.pos)
pos <- seqhap.pos$pos
# run seqhap with default settings
## Not run: 
  # this example takes 5-10 seconds to run
  myobj <- seqhap(y=mydata.y, geno=mydata.x, pos=pos)
  # print through the generic, which dispatches to the seqhap method
  print(myobj)

## End(Not run)

haplo.stats documentation built on Jan. 22, 2023, 1:40 a.m.