haplotyping: Haplotype genotypes

ghap.haplotyping R Documentation

Haplotype genotypes

Description

Generate a matrix of HapGenotypes for user-defined blocks.

Usage

ghap.haplotyping(object, blocks, outfile,
                 freq = c(0, 1), drop.minor = FALSE,
                 only.active.samples = TRUE,
                 only.active.markers = TRUE,
                 batchsize = NULL, binary = TRUE,
                 ncores = 1, verbose = TRUE)

Arguments

object

A GHap.phase object.

blocks

A data frame containing block boundaries, such as supplied by the ghap.blockgen function.

outfile

A character value specifying the name for the output files.

freq

A numeric vector of length 2 specifying the range of haplotype allele frequencies to be included in the output. Default is c(0,1), which includes all alleles.

drop.minor

A logical value specifying whether the minor allele should be excluded from the output (default = FALSE).

only.active.samples

A logical value specifying whether only active samples should be included in the output (default = TRUE).

only.active.markers

A logical value specifying whether only active markers should be used for haplotyping (default = TRUE).

batchsize

A numeric value controlling the number of haplotype blocks to be processed and written to output at a time (default = NULL, in which case nblocks/10 is used).

binary

A logical value specifying whether the output file should be binary (default = TRUE).

ncores

A numeric value specifying the number of cores to be used in parallel computations (default = 1).

verbose

A logical value specifying whether log messages should be printed (default = TRUE).
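
The effect of freq, drop.minor and batchsize can be pictured with a small base R sketch. This is not the GHap implementation: the helpers select_hapalleles and make_batches are hypothetical, and the sketch assumes that both ends of the freq range are inclusive, that the "minor allele" is the least frequent allele retained in a block, and that a NULL batchsize falls back to nblocks/10 rounded up.

```r
# Hypothetical helper (not part of GHap): choose which haplotype alleles of
# one block to keep, given its phased haplotypes (two per individual).
select_hapalleles <- function(haplotypes, freq = c(0, 1), drop.minor = FALSE) {
  counts <- table(haplotypes)
  # Relative frequency of each distinct haplotype allele in the block
  p <- setNames(as.vector(counts) / length(haplotypes), names(counts))
  # Assumption: both ends of the freq range are inclusive
  keep <- names(p)[p >= freq[1] & p <= freq[2]]
  # Assumption: drop the least frequent retained allele, and only when
  # more than one allele is left
  if (drop.minor && length(keep) > 1) {
    keep <- keep[-which.min(p[keep])]
  }
  keep
}

# Hypothetical helper: split block indices into consecutive batches.
make_batches <- function(nblocks, batchsize = NULL) {
  # Assumption: NULL means nblocks/10, rounded up, and never below 1
  if (is.null(batchsize)) batchsize <- max(1, ceiling(nblocks / 10))
  split(seq_len(nblocks), ceiling(seq_len(nblocks) / batchsize))
}

# Toy block: 10 haplotypes from 5 individuals (made-up data)
haplotypes <- c("AAG", "AAG", "AAG", "AAG", "ACG",
                "ACG", "ACG", "TCG", "TCG", "TTG")

kept_all   <- select_hapalleles(haplotypes)
kept_range <- select_hapalleles(haplotypes, freq = c(0.15, 1))
kept_major <- select_hapalleles(haplotypes, freq = c(0.15, 1), drop.minor = TRUE)
batches    <- make_batches(25)
```

In this toy block the allele frequencies are 0.4, 0.3, 0.2 and 0.1, so a lower bound of 0.15 removes only the rarest allele, and drop.minor = TRUE then removes the least frequent of the remaining three.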

Value

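The function outputs three files with the following suffixes (the HapGenotype matrix is written as either .hapgenotypes or .hapgenotypesb, depending on binary):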

  • .hapsamples: space-delimited file without header containing two columns: Population and Individual ID.

  • .hapalleles: space-delimited file without header containing five columns: Block Name, Chromosome, Start and End Position (in bp), and HapAllele.

  • .hapgenotypes: if binary = FALSE, a space-delimited file without header containing the HapGenotype matrix (coded as 0, 1 or 2 copies of the HapAllele). The dimension of the matrix is m x n, where m is the number of HapAlleles and n is the number of individuals.

  • .hapgenotypesb: if binary = TRUE (default), the same matrix as described above compressed into bits. For seamless compatibility with software that uses PLINK binary files, the compression is performed using the SNP-major bed format.
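
As an illustration of the 0/1/2 coding and the m x n layout described above, the following base R sketch builds a toy HapGenotype matrix for a single block, writes it as a space-delimited file without header, and reads it back. All sample names, alleles and the file name toy.hapgenotypes are made up for the example. It covers only the text layout (binary = FALSE) and does not reproduce GHap's own writer.

```r
# Toy phased data for one block: two haplotypes per individual
# (names and values are made up for illustration)
hap1 <- c(ind1 = "AAG", ind2 = "AAG", ind3 = "ACG", ind4 = "TCG")
hap2 <- c(ind1 = "AAG", ind2 = "ACG", ind3 = "TCG", ind4 = "TCG")
hapalleles <- c("AAG", "ACG", "TCG")

# m x n matrix: one row per HapAllele, one column per individual;
# each cell counts the copies (0, 1 or 2) of that allele
hapgeno <- t(sapply(hapalleles, function(a) (hap1 == a) + (hap2 == a)))

# Write in the text layout: space-delimited, no header, no row names
outfile <- file.path(tempdir(), "toy.hapgenotypes")
write.table(hapgeno, outfile, quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read it back; rows are HapAlleles and columns are individuals
back <- as.matrix(read.table(outfile, header = FALSE))
dim(back)
```

When every allele of a block is retained, each column sums to 2, because each individual carries exactly two haplotypes. Columns can sum to less than 2 once alleles are filtered out by freq or drop.minor.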

Author(s)

Yuri Tani Utsunomiya <ytutsunomiya@gmail.com>

Marco Milanesi <marco.milanesi.mm@gmail.com>

Examples


# #### DO NOT RUN IF NOT NECESSARY ###
# 
# # Copy phase data in the current working directory
# exfiles <- ghap.makefile(dataset = "example",
#                          format = "phase",
#                          verbose = TRUE)
# file.copy(from = exfiles, to = "./")
# 
# # Load data
# phase <- ghap.loadphase("example")
# 
# ### RUN ###
# 
# # Generate blocks
# blocks <- ghap.blockgen(phase, windowsize = 5,
#                         slide = 5, unit = "marker")
# 
# # Haplotyping
# ghap.haplotyping(phase, blocks = blocks,
#                  outfile = "example",
#                  binary = TRUE, ncores = 1)


GHap documentation built on July 2, 2022, 1:07 a.m.