Inference of haplotype origin

Description

Given one test population and two parental populations, the function assigns haplotype alleles in the test population to one of the parental populations.

Usage

 ghap.ancestral(hapstats.test, hapstats.parent1, hapstats.parent2, freq = 0.05,
  prob.assign = 0.60)

Arguments

hapstats.test

A data.frame containing haplotype statistics for the population to be tested, as generated by the ghap.hapstats function.

hapstats.parent1

A data.frame containing haplotype statistics for the first parental population.

hapstats.parent2

A data.frame containing haplotype statistics for the second parental population.

freq

The haplotype frequency threshold used to filter haplotype alleles in the parental populations (default = 0.05). For each parental population, if the allele frequency is lower than the threshold, the probability of origin is automatically set to zero.

prob.assign

The probability threshold used for the assignment test (default = 0.60). Haplotype alleles that cannot be assigned to either parental population are marked as unassigned (UNK).

Details

This function calculates the probability that a haplotype allele observed in the test population was inherited from each of the two parental populations. It follows the method described by Bolormaa et al. (2011).
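The interplay of the two thresholds can be sketched in base R. This is an illustration under stated assumptions, not the package's code: assign_origin is a hypothetical helper, the probability of origin is assumed to be the parental frequency divided by the sum of both parental frequencies, and an allele is assumed to be assigned when that probability is greater than or equal to prob.assign. Consult Bolormaa et al. (2011) and the package source for the exact computation.

```r
# Hypothetical helper, NOT part of GHap. Assumes
# P(parent i) = f_i / (f_1 + f_2) and assignment when P >= prob.assign.
assign_origin <- function(freq.parent1, freq.parent2,
                          freq = 0.05, prob.assign = 0.60) {
  # Parental frequencies below the threshold contribute zero probability
  f1 <- ifelse(freq.parent1 < freq, 0, freq.parent1)
  f2 <- ifelse(freq.parent2 < freq, 0, freq.parent2)
  total <- f1 + f2
  # If both filtered frequencies are zero, neither parent is supported
  p1 <- ifelse(total > 0, f1 / total, 0)
  p2 <- ifelse(total > 0, f2 / total, 0)
  origin <- ifelse(p1 >= prob.assign, "PARENT1",
                   ifelse(p2 >= prob.assign, "PARENT2", "UNK"))
  data.frame(PROB.PARENT1 = p1, PROB.PARENT2 = p2, ORIGIN = origin,
             stringsAsFactors = FALSE)
}

# Toy frequencies invented for illustration
toy <- assign_origin(freq.parent1 = c(0.40, 0.02, 0.20, 0.01),
                     freq.parent2 = c(0.10, 0.30, 0.25, 0.03))
print(toy)
```

In this toy input the first allele is common only in the first parent, the second is filtered out of the first parent by freq, the third is similarly frequent in both parents, and the fourth is rare in both, so the last two end up as UNK.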

Value

The function returns a data.frame and a file with the following columns:

BLOCK

Block alias.

CHR

Chromosome name.

BP1

Block start position.

BP2

Block end position.

ALLELE

Haplotype allele identity.

FREQ.TEST

Haplotype frequency in the test population.

FREQ.PARENT[1 and 2]

Haplotype frequency in the first and second parental populations, respectively.

PROB.PARENT[1 and 2]

Assignment probabilities calculated following Bolormaa et al. (2011).

ORIGIN

Haplotype origin (PARENT1, PARENT2 or UNK).
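A result of this shape can be inspected with base R. The sketch below uses a toy stand-in that carries only a subset of the documented columns, with values invented for illustration. It does not call ghap.ancestral.

```r
# Toy stand-in for the returned data.frame (subset of columns, invented values)
ancestry <- data.frame(
  BLOCK     = c("B1", "B1", "B2", "B2", "B3"),
  ALLELE    = c("AACG", "AATG", "CCTA", "CTTA", "GGAC"),
  FREQ.TEST = c(0.50, 0.30, 0.60, 0.25, 0.40),
  ORIGIN    = c("PARENT1", "PARENT2", "PARENT1", "UNK", "PARENT2"),
  stringsAsFactors = FALSE
)

# Number of haplotype alleles per inferred origin
origin.counts <- table(ancestry$ORIGIN)
print(origin.counts)

# Keep only alleles assigned to one of the parental populations
assigned <- ancestry[ancestry$ORIGIN != "UNK", ]
print(assigned)
```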

Author(s)

Marco Milanesi <marco.milanesi.mm@gmail.com>

References

S. Bolormaa et al. Detection of chromosome segments of zebu and taurine origin and their effect on beef production and growth. J. Anim. Sci. 2011. 89:2050-2060.

Examples

##### DO NOT RUN IF NOT NECESSARY ###
#
## Copy the example data in the current working directory
#ghap.makefile()
#
## Load data
#phase <- ghap.loadphase("human.samples", "human.markers", "human.phase")
#
#
#### RUN ###
#
## Subset data - randomly select 3000 markers with maf > 0.02
#ASW.ids <- unique(phase$id[phase$pop=="ASW"])
#YRI.ids <- unique(phase$id[phase$pop=="YRI"])
#CEU.ids <- unique(phase$id[phase$pop=="CEU"])
#phase <- ghap.subsetphase(phase, c(ASW.ids,YRI.ids,CEU.ids), phase$marker)
#maf <- ghap.maf(phase, ncores = 2)
#set.seed(1988)
#markers <- sample(phase$marker[maf > 0.02], 3000, replace = FALSE)
#phase <- ghap.subsetphase(phase, c(ASW.ids,YRI.ids,CEU.ids), markers)
#rm(maf,markers)
#
## Generate block coordinates based on windows of 10 markers, sliding 5 markers at a time
#blocks <- ghap.blockgen(phase, 10, 5, "marker")
#
## Generate matrix of haplotype genotypes
#ghap.haplotyping(phase, blocks, batchsize = 100, ncores = 2, freq = 0, outfile = "example")
#
## Load haplotype genotypes
#haplo <- ghap.loadhaplo("example.hapsamples", "example.hapalleles", "example.hapgenotypes")
#
## Compute haplotype allele statistics for each group
#haplo <- ghap.subsethaplo(haplo,YRI.ids,haplo$allele.in)
#YRI.hapstats <- ghap.hapstats(haplo,ncores = 2)
#haplo <- ghap.subsethaplo(haplo,CEU.ids,haplo$allele.in)
#CEU.hapstats <- ghap.hapstats(haplo,ncores = 2)
#haplo <- ghap.subsethaplo(haplo,ASW.ids,haplo$allele.in)
#ASW.hapstats <- ghap.hapstats(haplo,ncores = 2)

## Find haplotype origin
## ASW is the test population. YRI and CEU are used as parental populations
## The frequency threshold is set to 0.05 and the probability of assignment to 0.60
#ancestry <- ghap.ancestral(ASW.hapstats, YRI.hapstats, CEU.hapstats, 0.05, 0.60)