ped2geno: Transformation of Ped-File


View source: R/ped2geno.R

Description

Transforms a ped-file into a genotype file as required by, e.g., the functions for computing the genotypic TDT.

Usage

ped2geno(ped, snpnames = NULL, coded = c("12", "AB", "ATCG", "1234"), 
   naVal = 0, cols4ID = FALSE)

Arguments

ped

a data frame in ped format, i.e. the first six columns must contain information on the families as typically presented in ped files, where the column names of these six columns must be "famid", "pid", "fatid", "motid", "sex", "affected". The last two of these six columns are ignored. The IDs of individuals in the second column must be unique (not only within the family, but among all individuals). The columns following the first six are assumed to contain the alleles of the SNPs, where the alleles are coded using the letters/numbers in coded, and missing values are coded by naVal. Thus, the seventh and the eighth column contain the two alleles for the first SNP, the ninth and tenth the two alleles for the second SNP, and so on. Unlike the names of the first six columns, the names of the columns representing the SNPs are ignored, and SNP names can be specified using snpnames.
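To make the expected layout concrete, here is a minimal sketch of a toy data frame in this format, built in base R. All IDs and allele values are made up, the SNP column names are arbitrary (they are ignored anyway), and using 0 for the parent IDs of the founders is an assumption based on the usual ped-file convention rather than something stated above.

```r
# Toy ped-format data frame: one trio (father, mother, child), two SNPs.
# All values are invented for illustration.
ped <- data.frame(
  famid    = c("F1", "F1", "F1"),
  pid      = c("F1_1", "F1_2", "F1_3"),  # unique among all individuals
  fatid    = c("0", "0", "F1_1"),         # assumption: 0 = parent not in file
  motid    = c("0", "0", "F1_2"),
  sex      = c(1, 2, 1),                  # ignored by ped2geno
  affected = c(1, 1, 2),                  # ignored by ped2geno
  snp1_a1  = c(1, 1, 1),                  # columns 7-8: alleles of the first SNP
  snp1_a2  = c(2, 1, 2),
  snp2_a1  = c(2, 0, 2),                  # columns 9-10: alleles of the second SNP
  snp2_a2  = c(2, 0, 1),                  # 0 is the missing code (naVal = 0)
  stringsAsFactors = FALSE
)

# Number of SNPs implied by the layout: two allele columns per SNP
n_snps <- (ncol(ped) - 6) / 2
```

With coded = "12" and naVal = 0, the mother in this toy example has a missing genotype at the second SNP.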

snpnames

a character vector containing the names of the SNPs. If not specified, generic names are assigned (i.e. SNP1, SNP2, ...). Ignored if ped contains only one SNP.

coded

the coding used for the alleles of the SNPs. coded = "12", e.g., means that one of the alleles is coded by 1, and the other by 2. coded = "ATCG" means that the alleles are coded by the actual bases.

naVal

the value used for specifying missing values.

cols4ID

logical indicating whether columns containing the family ID and the individual ID should be added to the output matrix. If FALSE, the individual IDs are used as the row names of the output matrix.

Value

A vector (if ped contains alleles for only one SNP) or a matrix (otherwise) with one column per SNP holding the genotypes of that SNP, where the genotypes are coded by 0, 1, 2 (i.e. the number of minor alleles), and missing values are represented by NA. For t trios, the vector or matrix contains 3 * t values per SNP, where each block of 3 values is composed of the genotypes of the father, the mother, and the offspring (in this order) of a specific trio. If a family has more than one child, each child is treated as a separate trio.
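The 0/1/2 coding and the father, mother, offspring block layout can be illustrated with a small base R sketch. This is not the code used inside ped2geno: the helper count_minor is hypothetical, and choosing the minor allele as the least frequent allele among all non-missing alleles in the data is an assumption made for the illustration.

```r
# Hypothetical helper: minor-allele count for one SNP, given its two
# allele columns. Illustration only, not the ped2geno implementation.
count_minor <- function(a1, a2, naVal = 0) {
  a1[a1 == naVal] <- NA                  # recode missing alleles
  a2[a2 == naVal] <- NA
  freq  <- table(c(a1, a2))              # allele counts, NAs dropped
  minor <- names(freq)[which.min(freq)]  # least frequent allele
  (a1 == minor) + (a2 == minor)          # 0, 1, or 2; NA if an allele is missing
}

# Two trios (t = 2), rows ordered father, mother, offspring within each trio.
# Alleles are coded as in coded = "12"; 0 marks a missing allele.
a1 <- c(1, 0, 1,   1, 2, 1)
a2 <- c(2, 0, 1,   1, 2, 2)

geno <- count_minor(a1, a2)
geno
# 1 NA 0 0 2 1   (3 * t = 6 values: one block of 3 per trio)
```

Here allele 2 is the minor allele, so the first father (alleles 1 and 2) gets genotype 1, the first mother (both alleles missing) gets NA, and the second mother (alleles 2 and 2) gets genotype 2.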

Author(s)

Holger Schwender, holger.schwender@udo.edu

See Also

tdt, tdt2way, trio.check

Examples

## Not run: 
# Assuming there is a ped-file called pedfile.ped in the 
# R working directory, this file can be read into R by
ped <- read.pedfile("pedfile.ped")

# The resulting data frame is in the typical ped format
# which needs to be transformed into the genotype format
# for applications of most of the functions in the trio
# package. This transformation can be done by
geno <- ped2geno(ped)

# This transformation can also be done directly when
# reading the ped-file into R by
geno2 <- read.pedfile("pedfile.ped", p2g = TRUE)

## End(Not run)

trio documentation built on Nov. 8, 2020, 7:41 p.m.