getSegInfo: Computes segregation information for different mode of...


Description

Computes variant-based and gene-based segregation information for different modes of inheritance.

Usage

getSegInfo(pednew, dataPed, mapInfo, mode="recessive")

Arguments

pednew

A data frame of the complete pedigree information for all families in the dataset. The required column names of this data frame include: FID (family ID), IID (individual ID, must be of class character), faID (father ID, NA if unavailable), moID (mother ID, NA if unavailable), and sex.
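As an illustration of the layout described above, here is a minimal sketch of a pedigree data frame for a single hypothetical family. All IDs and values are invented, and the sex coding (1 = male, 2 = female) is an assumption, since this page does not state the coding expected by the package.

```r
# Hypothetical toy pedigree: two founders and their two children.
# Column names follow the argument description; all values are invented.
toy_ped <- data.frame(
  FID  = c("F1", "F1", "F1", "F1"),
  IID  = c("1", "2", "3", "4"),   # individual IDs stored as character
  faID = c(NA, NA, "1", "1"),     # NA where the father is unavailable
  moID = c(NA, NA, "2", "2"),     # NA where the mother is unavailable
  sex  = c(1, 2, 1, 2),           # assumed coding: 1 = male, 2 = female
  stringsAsFactors = FALSE        # keep IDs as character, not factor
)

# Check the stated requirements before passing the data frame on.
required_cols <- c("FID", "IID", "faID", "moID", "sex")
stopifnot(all(required_cols %in% names(toy_ped)))
stopifnot(is.character(toy_ped$IID))
```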

dataPed

A data frame in the raw file format generated by PLINK. The number of rows equals the number of subjects in the data, and the number of columns equals the number of markers M + 6. The first six columns, with specific column names, are: the family ID (FID), individual ID (IID), father ID (PAT), mother ID (MAT), sex (SEX) and affection status (PHENOTYPE). The remaining columns contain the genotypes for the variants listed in the corresponding mapInfo file. It is also important to make sure that the recoding is with respect to the minor allele in the population. The affection status in this file will be used as the phenotype.
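The following sketch builds a small data frame in this layout and checks its shape. The variant column names (var1, var2) and all values are hypothetical. The phenotype coding shown (1 = unaffected, 2 = affected) is PLINK's default and is assumed here rather than confirmed by this page.

```r
# Hypothetical toy genotype data in PLINK raw layout: 4 subjects, 2 markers.
toy_raw <- data.frame(
  FID       = c("F1", "F1", "F1", "F1"),
  IID       = c("1", "2", "3", "4"),
  PAT       = c("0", "0", "1", "1"),
  MAT       = c("0", "0", "2", "2"),
  SEX       = c(1, 2, 1, 2),
  PHENOTYPE = c(1, 1, 2, 2),   # assumed PLINK default: 1 = control, 2 = case
  var1      = c(1, 0, 1, 1),   # minor-allele counts (0, 1 or 2)
  var2      = c(0, 1, 1, 1),
  stringsAsFactors = FALSE
)

# The first six columns must carry the expected names.
first_six <- c("FID", "IID", "PAT", "MAT", "SEX", "PHENOTYPE")
stopifnot(identical(names(toy_raw)[1:6], first_six))

# Number of markers M is the column count minus the six leading columns.
n_markers <- ncol(toy_raw) - 6

# Genotypes should be allele counts in {0, 1, 2} (or missing).
geno <- as.matrix(toy_raw[, -(1:6)])
stopifnot(all(geno %in% c(0, 1, 2, NA)))
```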

mapInfo

A data frame of at least two columns (required column names): variant ID (SNP) and gene name (GENE). The number of rows equals the number of SNPs/markers to be considered (M).
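A minimal sketch of such a mapping table, with invented variant and gene names, is shown below. Counting variants per gene is included only as a quick sanity check and is not part of the package.

```r
# Hypothetical variant-to-gene map: three variants in two genes.
toy_map <- data.frame(
  SNP  = c("var1", "var2", "var3"),
  GENE = c("GENE_A", "GENE_A", "GENE_B"),
  stringsAsFactors = FALSE
)

# Both required columns must be present; M is the number of rows.
stopifnot(all(c("SNP", "GENE") %in% names(toy_map)))
n_variants <- nrow(toy_map)

# Number of variants mapped to each gene.
variants_per_gene <- table(toy_map$GENE)
```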

mode

The mode of inheritance assumed when computing the segregation information. The options are "dominant", "recessive", and "CH" (compound heterozygous). The default value is "recessive".

Details

This function is used to compute the segregation information for different modes of inheritance without computing the GESE test. The modes of inheritance supported here are: dominant, recessive and compound heterozygous (CH).

For the dominant mode of inheritance, a variant is segregating if all the cases in the family carry at least one alternative allele (genotype X>0), and all the controls in the family do not carry any alternative allele (X=0).

For the recessive mode of inheritance, a variant is segregating if all the cases in the family carry two alternative alleles (X=2), and all the controls in the family carry fewer than two alternative alleles (X=0 or X=1).

For the compound heterozygous mode of inheritance, a pair of variants is segregating if all the cases in the family carry at least one alternative allele at both positions (X1>0 and X2>0), and no control in the family carries alternative alleles at both positions (X1 = 0 or X2 = 0).
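The genotype conditions above can be sketched for a single family as follows. This is an illustrative reading of the stated rules, not the package's implementation. The function names are hypothetical, missing genotypes are not handled, and the control condition for the CH mode follows the parenthetical formula (X1 = 0 or X2 = 0).

```r
# x, x1, x2: allele counts (0/1/2) for the members of one family.
# is_case:   logical vector, TRUE for affected members.

# Dominant: every case has X > 0, every control has X = 0.
seg_dominant <- function(x, is_case) {
  all(x[is_case] > 0) && all(x[!is_case] == 0)
}

# Recessive: every case has X = 2, every control has X < 2.
seg_recessive <- function(x, is_case) {
  all(x[is_case] == 2) && all(x[!is_case] < 2)
}

# Compound heterozygous: every case has X1 > 0 and X2 > 0,
# every control has X1 = 0 or X2 = 0.
seg_ch <- function(x1, x2, is_case) {
  all(x1[is_case] > 0 & x2[is_case] > 0) &&
    all(x1[!is_case] == 0 | x2[!is_case] == 0)
}

# Toy family: two unaffected parents followed by two affected children.
is_case <- c(FALSE, FALSE, TRUE, TRUE)

dom_result <- seg_dominant(c(0, 0, 1, 1), is_case)
rec_result <- seg_recessive(c(1, 1, 2, 2), is_case)
ch_result  <- seg_ch(c(1, 0, 1, 1), c(0, 1, 1, 1), is_case)
```

In the toy family, each parent carries one of the two variants and both children carry both, so the pair satisfies the CH conditions even though neither variant alone satisfies the dominant rule.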

Value

varSeg

For the dominant and recessive modes of inheritance, this is a data frame containing the information about whether each variant is segregating in each family. The number of columns equals the number of families + 3. The last column is the number of families the variant is segregating in. The number of rows equals the number of variants. For the compound heterozygous mode of inheritance, this is a data frame containing the information about whether each pair of variants is segregating in each of the families. All pairs in the dataset are considered; a pair of variants that is not included in this data frame is not segregating in any family.

geneSeg

For the dominant and recessive modes of inheritance, this is a data frame containing the information about whether each gene is segregating in each family. The number of columns equals the number of families + 3. The last column is the number of families the gene is segregating in. The number of rows equals the number of genes. For the compound heterozygous mode of inheritance, this is a data frame containing the information about whether any pair of variants in this gene is segregating in each of the families. The last column is the number of families in which any pair of variants in the gene is segregating.

genePairSeg

This data frame is returned only for the compound heterozygous mode of inheritance. It considers every pair of genes in the data and contains the information about whether any pair of variants, one in each gene, is segregating in each of the families considered. Each row gives the information for one gene pair, summed over all possible pairs of variants with one variant in each of the two genes.

Author(s)

Dandi Qiao

References

Qiao, D., Lange, C., Laird, N.M., Won, S., Hersh, C.P., et al. (2017). Gene-based segregation method for identifying rare variants for family-based sequencing studies. Genet Epidemiol 41(4):309-319. DOI:10.1002/gepi.22037.

See Also

GESE

Examples

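library(GESE)

# Load the example data sets
data(pednew)
data(mapInfo)
data(dataRaw)
data(database)

# Default mode of inheritance ("recessive", see Usage)
result <- getSegInfo(pednew, dataRaw, mapInfo)
result$varSeg
result$geneSeg

# Recessive mode, specified explicitly
result <- getSegInfo(pednew, dataRaw, mapInfo, mode="recessive")
result$varSeg
result$geneSeg

# Compound heterozygous mode; genePairSeg is returned only in this mode
result <- getSegInfo(pednew, dataRaw, mapInfo, mode="CH")
result$varSeg
result$geneSeg
result$genePairSeg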

GESE documentation built on May 2, 2019, 3:59 a.m.