This function provides several approaches to handling missing genotypes in MDR (multifactor dimensionality reduction) analysis of incomplete data. It is intended for identifying gene-gene interactions from biallelic marker data in genetic association studies.
impute.mdr(dataset, colresp, cs, combi, cv.fold = 10, na.method = 0, max_iter = 30, randomize = FALSE)
dataset | A matrix of SNP data with a class variable (response; phenotype; disease status). Genotypes must be coded as allele counts (0, 1, 2). Missing genotypes should be coded as 3.
colresp | Column index of the class variable in the dataset. Missing values are not allowed in the class variable.
cs | The value used to indicate "case (affected)" in the class variable.
combi | The number of SNPs considered simultaneously as predictor variables (the order of interaction to analyze).
cv.fold | The number of folds k for k-fold cross-validation.
na.method | Option for handling missing genotypes. na.method = 0 for complete data; na.method = 1 treats missing genotypes as an additional genotype category; na.method = 2 uses the available data for the SNPs under consideration in each model; na.method = 3 imputes missing information using the EM (expectation-maximization) algorithm.
max_iter | The maximum number of iterations in the EM imputation approach (na.method = 3). To apply the one-step EM approach, set this argument to 1.
randomize | Logical. If TRUE, the cross-validation sets are randomized.
min.comb | Marker combinations with the minimum error rate in each cross-validation fold.
train.erate | Training errors for the selected marker combinations.
test.erate | Test errors for the selected marker combinations.
best.combi | The combination selected most frequently across the k cross-validation folds.
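The dataset argument expects missing genotypes to be coded as 3 rather than NA. The following is a minimal sketch of that recoding in base R, assuming a hypothetical matrix `geno` whose first column is the class variable and whose missing genotypes are stored as NA. The object name, column names, and toy values are illustrative only and are not part of the package.

```r
# Hypothetical toy data (an assumption for illustration):
# column 1 is the class variable (1 = case, 0 = control);
# the remaining columns are SNP genotypes as allele counts, NA = missing.
geno <- matrix(
  c(1, 0,  2,  NA,
    0, 1,  NA, 2,
    1, 2,  0,  1,
    0, NA, 1,  0),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, c("status", "snp1", "snp2", "snp3"))
)

# The class variable must not contain missing values.
stopifnot(!anyNA(geno[, "status"]))

# Recode missing genotypes from NA to 3, the coding described for `dataset`.
recoded <- geno
recoded[is.na(recoded)] <- 3

# Count the missing genotypes per SNP after recoding.
missing_per_snp <- colSums(recoded[, -1, drop = FALSE] == 3)
```

With data prepared this way, the class column index would be passed as `colresp` and the case value as `cs`.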
Junghyun Namkung, Taeyoung Hwang, MinSeok Kwon, Sunggon Yi and Wonil Chung
Maintainer: Junghyun Namkung <jh.namkung@gmail.com>
Namkung J, Elston RC, Yang JM, Park T. "Identification of gene-gene interactions in the presence of missing data using the multifactor dimensionality reduction method" Genet Epidemiol. 2009 Nov;33(7):646-56.
## sample data with missing values
data(incomplete)
## analysis example of 2nd order gene-gene interaction test
impute.mdr(incomplete, colresp = 1, cs = 1, combi = 2, cv.fold = 10, na.method = 2)
$cv.result
      SNP1   SNP2    train.err           test.err
 [1,] "snp9" "snp10" "0.375438596491228" "0.392857142857143"
 [2,] "snp9" "snp10" "0.375"             "0.393939393939394"
 [3,] "snp9" "snp10" "0.372759856630824" "0.411764705882353"
 [4,] "snp9" "snp10" "0.382978723404255" "0.354838709677419"
 [5,] "snp9" "snp10" "0.387900355871886" "0.375"
 [6,] "snp9" "snp10" "0.379432624113475" "0.354838709677419"
 [7,] "snp9" "snp10" "0.362007168458781" "0.5"
 [8,] "snp9" "snp10" "0.371024734982332" "0.433333333333333"
 [9,] "snp9" "snp10" "0.368421052631579" "0.464285714285714"
[10,] "snp4" "snp6"  "0.393333333333333" "0.676470588235294"

$best
[1] "snp9"  "snp10"
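
The error rates in the printed `$cv.result` are character strings, because the matrix stores them alongside the SNP names. The following is a minimal sketch of how such output could be summarized in base R, assuming the result is a list with the structure printed above. The `res` object is rebuilt by hand from the printed values as a stand-in and is not produced by the package here.

```r
# Hand-built stand-in for the printed output (an assumption for illustration).
res <- list(
  cv.result = cbind(
    SNP1 = c(rep("snp9", 9), "snp4"),
    SNP2 = c(rep("snp10", 9), "snp6"),
    train.err = c("0.375438596491228", "0.375", "0.372759856630824",
                  "0.382978723404255", "0.387900355871886",
                  "0.379432624113475", "0.362007168458781",
                  "0.371024734982332", "0.368421052631579",
                  "0.393333333333333"),
    test.err = c("0.392857142857143", "0.393939393939394",
                 "0.411764705882353", "0.354838709677419", "0.375",
                 "0.354838709677419", "0.5", "0.433333333333333",
                 "0.464285714285714", "0.676470588235294")
  ),
  best = c("snp9", "snp10")
)

# Convert the error columns from character to numeric.
train_err <- as.numeric(res$cv.result[, "train.err"])
test_err <- as.numeric(res$cv.result[, "test.err"])

# Average the training and test error rates across folds.
mean_train <- mean(train_err)
mean_test <- mean(test_err)

# Tally how often each SNP pair was selected across folds.
pair <- paste(res$cv.result[, "SNP1"], res$cv.result[, "SNP2"], sep = ":")
pair_counts <- sort(table(pair), decreasing = TRUE)
most_frequent <- names(pair_counts)[1]
```

In this stand-in, the most frequently selected pair agrees with `$best`, which is consistent with the best combination being the one chosen in the most folds.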