getgenotypes: fetch genotype character matrix for specified markers

View source: R/mega2rcreate.R

getgenotypes    R Documentation

Description

This function calls a C++ function that does the heavy lifting. It passes the arguments that the C++ function needs: some come from the caller's arguments and some from data frames in the "global" environment, envir. From the markers_arg argument, it takes the locus_index and the index into the unified_genotype_table. It also passes along the allele nucleotide separator argument. From the "global" environment, envir, it takes a bit vector of compressed genotype information, the alleles for each marker, and some bookkeeping data. Note: this function also dispatches on the type of compression used in the genotype vector; one C++ function is called when the data are compressed and a different one when they are not.

Usage

getgenotypes(markers_arg, sepstr = "", envir = ENV)

Arguments

markers_arg

a data.frame with the following 5 columns:

locus_link

is the ordinal ranking of this marker among all loci

locus_link_fill

is the position of corresponding genotype data in the unified_genotype_table

MarkerName

is the text name of the marker

chromosome

is the integer chromosome number

position

is the integer base pair position of the marker

sepstr

separator string inserted between the alleles (default is none). When present, this is typically a space, a tab or "/".

envir

an environment that contains all the data frames created from the SQLite database.

Details

The unified_genotype_table contains one raw vector for each person. Within each vector, every genotype occupies two bits. This function builds the output matrix by fixing a marker, collecting the genotype for each person, and then repeating for every requested marker. (Currently, this appears slightly faster than a scan that fixes the person and iterates over markers.)
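The two-bit packing and the marker-by-marker fill described above can be illustrated with a small base R sketch. It is not the Mega2R C++ code: the helper names unpack2bit and decode, the bit order within a byte, and the mapping from two-bit codes to allele pairs are all assumptions made for illustration only.

```r
# Hypothetical 2-bit decoder: NOT Mega2R's actual layout or C++ code.
# Assumption: four genotype codes per byte, lowest-order bits first.
unpack2bit <- function(rawvec, n) {
  bytes <- as.integer(rawvec)
  idx   <- seq_len(n) - 1L              # zero-based genotype index
  byte  <- bytes[idx %/% 4L + 1L]       # byte holding each genotype
  shift <- (idx %% 4L) * 2L             # bit offset within that byte
  bitwAnd(bitwShiftR(byte, shift), 3L)  # isolate the two bits
}

# Assumed code-to-genotype mapping for a marker with alleles a1 and a2:
# 0 = a1/a1, 1 = a1/a2, 2 = a2/a2, 3 = missing.
decode <- function(codes, a1, a2, sepstr = "") {
  lut <- c(paste(a1, a1, sep = sepstr),
           paste(a1, a2, sep = sepstr),
           paste(a2, a2, sep = sepstr),
           NA_character_)
  lut[codes + 1L]
}

# Two people, three markers each, packed into one byte per person.
people  <- list(as.raw(0x24), as.raw(0x09))
alleles <- data.frame(a1 = c("A", "C", "G"), a2 = c("T", "G", "A"),
                      stringsAsFactors = FALSE)

# Marker-major fill: fix the marker, then visit every person.
geno <- matrix(NA_character_, nrow = length(people), ncol = nrow(alleles))
for (m in seq_len(nrow(alleles))) {
  for (p in seq_along(people)) {
    code <- unpack2bit(people[[p]], nrow(alleles))[m]
    geno[p, m] <- decode(code, alleles$a1[m], alleles$a2[m], sepstr = "/")
  }
}
```

Here geno has one row per person and one column per marker, matching the shape described under Value. Swapping the two loops gives the person-major scan that the text compares against.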

Value

a character matrix of genotypes, each represented as a pair of alleles. The matrix has one column for each marker in the markers_arg argument and one row for each person in the family (fam) table.

Examples

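# Locate the example database shipped with Mega2R and load it into an environment
db = system.file("exdata", "seqsimm.db", package="Mega2R")
ENV = read.Mega2DB(db)

# Genotype matrix for all markers: one row per person, one column per marker
getgenotypes(ENV$markers)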


Mega2R documentation built on May 29, 2024, 1:14 a.m.