sumstat_ihh: Summary Statistic: Integrated Extended Haplotype Homozygosity

Description

This summary statistic calculates several values based on extended haplotype homozygosity (EHH), including iHH, iES and, optionally, iHS. Coala relies on scan_hh from the rehh package to calculate this statistic; refer to the rehh documentation for details on the implementation. If you use the statistic in a publication, please cite the corresponding publication (see References).
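To make the quantities concrete, the following is a minimal base R sketch of EHH and its integral on one side of a core SNP. It follows the usual definition of EHH as the probability that two randomly drawn carriers of an allele are identical over an interval. The helper ehh and the toy data are hypothetical and not part of coala or rehh, and scan_hh applies further conventions (for example cutoffs on EHH) that are not reproduced here.

```r
# Hypothetical helper (not part of coala or rehh): EHH of the haplotypes that
# carry `allele` at SNP `core`, over the SNP interval from `core` to `target`.
ehh <- function(haps, core, target, allele) {
  carriers <- haps[haps[, core] == allele, , drop = FALSE]
  span <- carriers[, core:target, drop = FALSE]
  # Number of carriers sharing each distinct haplotype on the interval
  counts <- as.vector(table(apply(span, 1, paste, collapse = "")))
  # Probability that two randomly drawn carriers are identical
  sum(choose(counts, 2)) / choose(nrow(carriers), 2)
}

# Toy data, made up for illustration: 4 haplotypes (rows), 3 SNPs (columns)
haps <- rbind(c(1, 1, 0),
              c(1, 1, 0),
              c(1, 0, 0),
              c(0, 1, 1))
positions <- c(0, 10, 30)

# EHH of allele 1 at SNP 1, evaluated at each SNP to the right of the core
ehh_values <- sapply(1:3, function(target) ehh(haps, 1, target, 1))

# Trapezoidal integration of EHH over position (one side of the core only)
ihh_right <- sum(diff(positions) *
                 (head(ehh_values, -1) + tail(ehh_values, -1)) / 2)
print(ehh_values)
print(ihh_right)
```

The sketch does not handle alleles carried by fewer than two haplotypes, for which the ratio is undefined.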

Usage

sumstat_ihh(
  name = "ihh",
  population = 1,
  max_snps = 1000,
  calc_ihs = FALSE,
  transformation = identity
)

Arguments

name

The name of the summary statistic. When simulating a model, the values of the statistic are written to an entry of the returned list with this name. Summary statistic names must be unique in a model.

population

The population for which the statistic is calculated. Can also be "all" to calculate it from all populations. Default is population 1.

max_snps

The maximum number of SNPs per locus used for the calculation. If a locus has more SNPs, only a random subset of them is used, which improves performance. Set to Inf to use all SNPs.
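The effect of max_snps can be pictured with a small base R sketch. The helper subsample_snps and the toy data are hypothetical, and coala's own implementation may differ in its details. The sketch only assumes that haplotypes are stored as a matrix with one column per SNP.

```r
# Hypothetical helper, not part of coala: keep at most `max_snps` SNP columns
subsample_snps <- function(haplotypes, positions, max_snps = 1000) {
  n_snps <- ncol(haplotypes)
  if (n_snps <= max_snps) {
    # Nothing to do; this branch also covers max_snps = Inf
    return(list(haplotypes = haplotypes, positions = positions))
  }
  # Draw a random subset of SNP indices and keep them in positional order
  keep <- sort(sample.int(n_snps, max_snps))
  list(haplotypes = haplotypes[, keep, drop = FALSE],
       positions = positions[keep])
}

set.seed(1)
# Toy data: 4 haplotypes (rows) and 10 SNPs (columns), alleles coded 0/1
haps <- matrix(rbinom(40, 1, 0.5), nrow = 4)
pos <- sort(runif(10))

sub <- subsample_snps(haps, pos, max_snps = 5)
print(dim(sub$haplotypes))
```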

calc_ihs

If set to TRUE, the standardized iHS is calculated in addition.

transformation

An optional function for transforming the results of the statistic. If specified, the results of the transformation are returned instead of the original values.
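As an illustration, a transformation might reduce the per-SNP table to a single number. The sketch below assumes that the function receives the data frame described under Value (the calc_ihs = FALSE case); the name mean_ies and the mock values are hypothetical and serve only to show the shape of such a function.

```r
# Hypothetical transformation: reduce the per-SNP table to one number
mean_ies <- function(ihh) mean(ihh$IES, na.rm = TRUE)

# Mock result with made-up values, for illustration only
mock_ihh <- data.frame(CHR = 1,
                       POSITION = c(100, 250, 700),
                       IES = c(120.5, NA, 80.5))
print(mean_ies(mock_ihh))

# With coala loaded, the function would be passed as:
# sumstat_ihh(transformation = mean_ies)
```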

Value

If calc_ihs = FALSE, a data.frame with values for iHH and iES is returned. Otherwise, a list of two data frames is returned: one with the iHH and iES values and one with the iHS values.

In all data frames, rows are SNPs and the columns contain the following values for each SNP:

  • CHR: The SNP's locus

  • POSITION: The SNP's absolute position on its locus

  • FREQ_A: The frequency of the ancestral allele

  • FREQ_D: The frequency of the derived allele

  • IHH_A: integrated EHH for the ancestral allele

  • IHH_D: integrated EHH for the derived allele

  • IES: integrated EHHS

  • INES: integrated normalized EHHS

  • IHS: iHS, normalized over all loci.
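The relation between the IHH_A, IHH_D and IHS columns can be sketched in base R. Following Voight et al., the unstandardized iHS is the log ratio of the ancestral to the derived iHH. The standardization below is a deliberate simplification over all SNPs at once, whereas rehh standardizes within allele frequency bins. The data frame holds made-up values, and the column names UNIHS and IHS_SKETCH are hypothetical.

```r
# Mock per-SNP table with made-up values, for illustration only
snps <- data.frame(
  POSITION = c(100, 250, 700, 900),
  IHH_A = c(200, 150, 90, 300),
  IHH_D = c(100, 150, 180, 75)
)

# Unstandardized iHS: log ratio of ancestral to derived iHH
snps$UNIHS <- log(snps$IHH_A / snps$IHH_D)

# Simplified standardization over all SNPs (assumption: no frequency binning)
snps$IHS_SKETCH <- (snps$UNIHS - mean(snps$UNIHS)) / sd(snps$UNIHS)
print(snps)
```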

References

  • Mathieu Gautier and Renaud Vitalis, rehh: an R package to detect footprints of selection in genome-wide SNP data from haplotype structure. Bioinformatics, 28(8):1176-1177, 2012. doi:10.1093/bioinformatics/bts115

  • Voight et al., A map of recent positive selection in the human genome. PLoS Biol, 4(3):e72, Mar 2006.

Author(s)

Paul Staab

See Also

To create a demographic model: coal_model

To calculate this statistic from data: calc_sumstats_from_data

Other summary statistics: sumstat_dna(), sumstat_file(), sumstat_four_gamete(), sumstat_jsfs(), sumstat_mcmf(), sumstat_nucleotide_div(), sumstat_omega(), sumstat_seg_sites(), sumstat_sfs(), sumstat_tajimas_d(), sumstat_trees()

Examples

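  library(coala)

  # Sample size 20, one locus of length 1000, mutation rate 1000
  model <- coal_model(20, 1, 1000) +
    feat_mutation(1000) +
    sumstat_ihh()

  # Simulate the model and print the statistic stored under its name
  stat <- simulate(model)
  print(stat$ihh)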

statgenlmu/coala documentation built on March 5, 2024, 10:49 p.m.