Haplotype genotypes

Description

Generate a matrix of HapGenotypes for user-defined haplotype blocks.

Usage

ghap.haplotyping(phase, blocks, outfile, freq = 0.05,
 batchsize = 500, ncores = 1, verbose = TRUE)

Arguments

phase

A GHap.phase object.

blocks

A data frame containing block boundaries, such as the one returned by the ghap.blockgen function.

outfile

A character value specifying the base name for the output files; the suffixes listed under Value are appended to it.

freq

A numeric value specifying the minimum haplotype allele frequency (default = 0.05).

batchsize

A numeric value controlling the number of haplotype blocks to be processed and written to output at a time (default = 500).

ncores

A numeric value specifying the number of cores to be used in parallel computations (default = 1).

verbose

A logical value specifying whether log messages should be printed (default = TRUE).

Value

The function writes three files, named after outfile with the following suffixes:

  • .hapsamples: space-delimited file without header containing two columns: Population and Individual ID.

  • .hapalleles: space-delimited file without header containing five columns: Block Name, Chromosome, Start and End Position (in bp), and HapAllele.

  • .hapgenotypes: space-delimited file without header containing the HapGenotype matrix (coded as 0, 1 or 2 copies of the HapAllele). The dimension of the matrix is m x n, where m is the number of HapAlleles and n is the number of individuals.
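
As an illustration of the layout described above, the following is a minimal sketch in base R that reads the three files back and checks that their dimensions agree. It does not call any GHap function. The toy files, the "toy" prefix, the column names (pop, id, block, chr, bp1, bp2, allele) and all values are invented for the example; real files would come from a ghap.haplotyping run.

```r
# Toy files mimicking the three output formats (all values are invented).
prefix <- file.path(tempdir(), "toy")
writeLines(c("POP1 IND1", "POP1 IND2", "POP2 IND3"),
           paste0(prefix, ".hapsamples"))
writeLines(c("BLK1 1 1000 5000 ACGT",
             "BLK1 1 1000 5000 ACGA",
             "BLK2 1 3000 8000 TTGA"),
           paste0(prefix, ".hapalleles"))
writeLines(c("0 1 2", "2 1 0", "1 0 1"),
           paste0(prefix, ".hapgenotypes"))

# Two columns: population and individual ID. Read as character so that
# numeric-looking IDs are not converted.
samples <- read.table(paste0(prefix, ".hapsamples"), header = FALSE,
                      col.names = c("pop", "id"),
                      colClasses = "character")

# Five columns: block name, chromosome, start, end (bp), HapAllele.
# The allele string is read as character so that "T" or "F" is not
# interpreted as a logical value.
alleles <- read.table(paste0(prefix, ".hapalleles"), header = FALSE,
                      col.names = c("block", "chr", "bp1", "bp2", "allele"),
                      colClasses = c("character", "character",
                                     "numeric", "numeric", "character"))

# m x n matrix of 0/1/2 copies: rows are HapAlleles, columns are individuals.
geno <- unname(as.matrix(read.table(paste0(prefix, ".hapgenotypes"),
                                    header = FALSE)))

# Rows must match .hapalleles and columns must match .hapsamples.
stopifnot(nrow(geno) == nrow(alleles), ncol(geno) == nrow(samples))

# Sample frequency of each HapAllele: copies divided by 2n chromosomes.
hapfreq <- rowSums(geno) / (2 * ncol(geno))
```

For large outputs, reading the whole .hapgenotypes file into memory this way may not be practical; the sketch is only meant to make the file layout concrete.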

Author(s)

Yuri Tani Utsunomiya <ytutsunomiya@gmail.com>

Marco Milanesi <marco.milanesi.mm@gmail.com>

Examples

# #### DO NOT RUN IF NOT NECESSARY ###
# 
# # Copy the example data in the current working directory
# ghap.makefile()
# 
# # Load data
# phase <- ghap.loadphase("human.samples", "human.markers", "human.phase")
# 
# # Subset data - randomly select 3000 markers with maf > 0.02
# maf <- ghap.maf(phase, ncores = 2)
# set.seed(1988)
# markers <- sample(phase$marker[maf > 0.02], 3000, replace = FALSE)
# phase <- ghap.subsetphase(phase, unique(phase$id), markers)
# rm(maf,markers)
# 
# # Generate block coordinates based on windows of 10 markers, sliding 5 markers at a time
# blocks <- ghap.blockgen(phase, 10, 5, "marker")
# 
# ### RUN ###
# 
# # Generate matrix of haplotype genotypes
# ghap.haplotyping(phase, blocks, batchsize = 100, ncores = 2, freq = 0.05, outfile = "example")