getgenotypesgenabel: process the genotype matrix for specified markers and return...

View source: R/mega2rcreate.R

getgenotypesgenabelR Documentation

process the genotype matrix for specified markers and return the corresponding GenABEL genotype matrix

Description

This function calls a C++ function that does all the heavy lifting. It passes the arguments necessary for the C++ function: some from the caller's arguments and some from data frames that are in the "global" environment, envir. From its markers_arg argument, it gets the locus_index and the index in the unified_genotype_table. From the "global" environment, envir, it gets a bit vector of compressed genotype information, allele information, and some bookkeeping related data. Note: This function also contains a dispatch/switch on the type of compression in the genotype vector. A different C++ function is called when there is compression versus when there is no compression.

Usage

getgenotypesgenabel(markers_arg, envir = ENV)

Arguments

markers_arg

a data.frame with the following 5 observations:

locus_link

is the ordinal ranking of this marker among all loci

locus_link_fill

is the position of corresponding genotype data in the unified_genotype_table

MarkerName

is the text name of the marker

chromosome

is the integer chromosome number

position

is the integer base pair position of marker

envir

an environment that contains all the data frames created from the SQLite database.

Details

This function reads the genotype data in Mega2 compressed format and converts it to the GenABEL compressed format. The unified_genotype_table contains one raw vector for each person. In the vector, there are two bits for each genotype; each byte has the data for 4 markers. In GenABEL, there is one raw vector per marker, and each byte has the data for 4 persons. The C++ function does the conversion as well as adjusts the bits' contents. For example, in GenABEL the genotype represented by bits == 0, is what Mega2 represents with 2. Doing the conversion in C++ is 10 - 20 times faster than converting the Mega2 data to PLINK .tped files and then having GenABEL read in and process/convert those files.

Value

the GenABEL gwaa.data-class object component that contains the genotype data.

Note

This function is called from Mega2ENVGenABEL; it is not intended to be called by the programmer.

Examples

db = system.file("exdata", "seqsimm.db", package="Mega2R")
ENV = read.Mega2DB(db)

aa = getgenotypesgenabel(ENV$markers[ENV$markers$chromosome == 1,])

aa


Mega2R documentation built on May 29, 2024, 1:14 a.m.