mdr: Perform an MDR.
In fischuu/GenomicTools: Collection of Tools for Genomic Data Analysis

View source: R/mdr.R

mdr	R Documentation

Perform an MDR.

Description

This function performs a Multifactor Dimensionality Reduction (MDR).

Usage

  mdr(X, status, fold=2, t=NULL, top=3, NAvalues=NA, cvc=0,
      fix=NULL, verbose=FALSE)

Arguments

`X`	Matrix with genotype information, see details.
`status`	Vector with group information of individuals in `X`.
`fold`	Maximum dimension of used contingency tables, see details.
`t`	Threshold for high/low risk.
`top`	Length of each top list.
`NAvalues`	Label of missing data.
`cvc`	Number of cross-validation splits.
`fix`	Column number of the SNP to be fixed.
`verbose`	Logical, if detailed status messages shall be given.

Details

The matrix X contains the genotype information: each column corresponds to a SNP and each row to an individual. SNPs need to be coded as 0,1,2. If the matrix X is not given in 0,1,2 format, the function recodeData recodes the data into the required 0,1,2 format.
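As an illustration of what a 0,1,2 coding can look like, the following base R sketch recodes two-letter genotype strings of a single SNP. The helper `recode_snp` is hypothetical and not part of GenomicTools, and it assumes that 0,1,2 counts the copies of the less frequent allele; recodeData may use a different rule.

```r
# Hypothetical helper (not part of GenomicTools): recode genotype strings
# such as "AA", "AG", "GG" into counts of the less frequent allele.
recode_snp <- function(g) {
  # All observed alleles of this SNP, ignoring missing genotypes
  alleles <- unlist(strsplit(g[!is.na(g)], ""))
  counts <- table(alleles)
  # Assumption: the rarer allele is the one being counted
  minor <- names(counts)[which.min(counts)]
  # Number of minor-allele copies per individual (0, 1 or 2)
  out <- vapply(strsplit(g, ""), function(a) sum(a == minor), integer(1))
  out[is.na(g)] <- NA
  out
}

geno <- c("AA", "AG", "GG", "AA", NA, "AG")
recoded <- recode_snp(geno)
recoded
```

Missing genotypes stay NA in this sketch, which matches the default of the NAvalues option.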

The status vector has one entry per individual (row) of X and specifies the group label of each individual. Healthy individuals need to be encoded as 0 and cases as 1. If the labeling is different, the smaller value is used for the controls and the larger one for the cases.
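The labeling rule can be sketched in base R as follows. The helper `recode_status` is hypothetical and only mirrors the rule described above for a status vector with exactly two distinct labels.

```r
# Hypothetical helper: map a two-level status vector to 0 (control) and
# 1 (case), treating the smaller label as the control group.
recode_status <- function(status) {
  lev <- sort(unique(status))
  stopifnot(length(lev) == 2)
  as.integer(status == lev[2])
}

status_raw <- c(1, 2, 2, 1, 2)
status01 <- recode_status(status_raw)
status01
```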

The fold option specifies the maximum dimension of the contingency tables that are used. The current maximum is four.
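To get a feeling for the cost of a larger fold value, the number of SNP combinations per dimension can be computed with `choose`. This assumes that every combination of SNPs is evaluated for each dimension up to fold, which is the usual MDR approach but is not stated explicitly here.

```r
# Number of SNP combinations per table dimension for 20 SNPs,
# assuming an exhaustive search over all combinations.
p <- 20
k <- 1:4
n_comb <- choose(p, k)
data.frame(dimension = k, combinations = n_comb)
```

For 20 SNPs this gives 20, 190, 1140 and 4845 combinations for dimensions one to four.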

The t option gives the threshold for the classification into high and low risk classes. The default is the ratio of the group sizes.

The top option sets the number of results reported in each top list.
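The thresholding step can be sketched on toy data as follows. The sketch assumes that the ratio is taken as cases divided by controls and that a genotype combination counts as high risk when its own case/control ratio reaches the threshold; the exact rule used inside mdr may differ.

```r
# Toy data: two SNPs coded 0/1/2 and a 0/1 status vector.
snp1   <- c(0, 0, 1, 1, 2, 2, 0, 1, 2, 1)
snp2   <- c(0, 1, 0, 1, 0, 1, 1, 0, 1, 1)
status <- c(0, 0, 0, 1, 1, 1, 0, 0, 1, 1)

# Assumed default threshold: number of cases divided by number of controls.
t_default <- sum(status == 1) / sum(status == 0)

# Two-dimensional contingency table, split by status.
tab      <- table(snp1, snp2, status)
cases    <- tab[, , "1"]
controls <- tab[, , "0"]

# Cells without controls give Inf (or NaN if they are empty altogether).
ratio     <- cases / controls
high_risk <- ratio >= t_default
high_risk
```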

Setting the fix option to a column number forces the mdr function to include that particular SNP in the result.

Missing data can be labeled in different ways. The label used for missing data is passed to the NAvalues option. By default, missing data is encoded as NA; another possible label is e.g. 3. The downstream analysis then ignores missing data. Missing data is internally coded as 9999, so do not use this value to encode other genotypes.
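If the missing-data label should be checked before calling mdr, the genotype matrix can be inspected in base R. The following sketch uses a toy matrix in which 3 marks a missing genotype; the matrix and its column names are made up for illustration.

```r
# Toy genotype matrix in which the value 3 marks a missing genotype.
X <- matrix(c(0, 1, 3, 2, 3, 0), nrow = 3,
            dimnames = list(NULL, c("snp1", "snp2")))

# Replace the custom label by NA and count missing values per SNP.
X_na <- X
X_na[X_na == 3] <- NA
n_missing <- colSums(is.na(X_na))
n_missing

# With mdr itself the label would instead be passed directly, e.g.
# mdr(X, status, NAvalues=3)
```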

The number of cross-validation splits can be set with the cvc option. If cvc is larger than 0, the original data is split into cvc equally large subsets and the mdr function is called for each of them. Then, for each entry in the top lists of the full data, it is checked in how many splits that entry also appears in the top list.
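The counting step can be sketched as follows. The top lists are made-up character vectors, not real mdr output, and the way the individuals are assigned to the splits is an assumption.

```r
# Assign 12 individuals at random to cvc = 3 equally large subsets.
cvc <- 3
n <- 12
fold_id <- sample(rep(seq_len(cvc), length.out = n))
table(fold_id)

# Hypothetical top lists: SNP combinations written as strings.
top_full <- c("snp3:snp7", "snp1:snp4", "snp2:snp9")
top_splits <- list(
  c("snp3:snp7", "snp2:snp9", "snp5:snp6"),
  c("snp3:snp7", "snp1:snp4", "snp8:snp9"),
  c("snp1:snp2", "snp3:snp7", "snp2:snp9")
)

# For each entry of the full-data top list, count the splits in which
# it also appears in the top list.
consistency <- vapply(
  top_full,
  function(m) sum(vapply(top_splits, function(s) m %in% s, logical(1))),
  integer(1)
)
consistency
```

An entry that is found in every split, such as the first one here, is the most consistent result.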

Value

An object of class mdr.

Author(s)

Daniel Fischer

References

Moore JH, Gilbert JC, Tsai CT, Chiang FT, Holden T, Barney N, White BC. (2006): A flexible computational framework for detecting, characterizing, and interpreting statistical patterns of epistasis in genetic studies of human disease susceptibility. J Theor Biol. 241(2):252-61.

Examples

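# Load the example data set
data(mdrExample)
# Use the first 20 columns as SNP genotypes
mdrSNP <- mdrExample[,1:20]
# Use tables up to dimension four and top lists of length five
fit.mdr <- mdr(mdrSNP, mdrExample$Class, fold=4, top=5)
fit.mdr
# Same data with the default settings (fold=2, top=3)
fit.mdr <- mdr(mdrSNP, mdrExample$Class)
fit.mdr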

fischuu/GenomicTools documentation built on Feb. 15, 2025, 2:13 p.m.