
Finds optimal sets of genes for classification


`x` |
data matrix |

`y` |
class vector |

`x.test` |
test data matrix if available |

`y.test` |
test class vector if available |

`probe.ID` |
probe set IDs; if NULL, row numbers are assigned. |

`rule` |
classification rule: "lda","qda","logistic","svmlin","svmrbf"; the default is "lda". |

`method.cut` |
method for pre-selection; t-test is available. |

`percent.cut` |
proportion of pre-selected genes; the default is 0.01. |

`model.sMiPP.margin` |
smallest set of genes s.t. sMiPP <= (max sMiPP-model.sMiPP.margin); the default is 0.01. |

`min.sMiPP` |
Adding genes stops if max sMiPP is at least min.sMiPP; the default is 0.85. |

`n.drops` |
Adding genes stops if sMiPP decreases n.drops times, in addition to the min.sMiPP criterion; the default is 2. |

`n.fold` |
number of folds; default is 5. |

`p.test` |
proportion of samples held out as the test set when no test samples are supplied; the default is 1/3. |

`n.split` |
number of splits; the default is 20. |

`n.split.eval` |
number of splits for evaluation; the default is 100. |
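
To make the `method.cut` and `percent.cut` arguments more concrete, here is a minimal base-R sketch of what a t-test pre-selection step could look like. It is not taken from the package source: the helper name `preselect_by_ttest`, the use of Welch's t-test, and ranking by absolute t-statistic are all assumptions made for illustration.

```r
# Hypothetical helper (not part of MiPP): keep the top `percent.cut`
# fraction of genes (rows of x), ranked by absolute two-sample t-statistic.
preselect_by_ttest <- function(x, y, percent.cut = 0.01) {
  y <- factor(y)
  stopifnot(nlevels(y) == 2, ncol(x) == length(y))
  g1 <- y == levels(y)[1]
  # Welch t-statistic for each gene (assumed choice of test)
  t.stat <- apply(x, 1, function(g) t.test(g[g1], g[!g1])$statistic)
  n.keep <- max(1, round(percent.cut * nrow(x)))
  # Row indices of the retained genes, strongest first
  order(abs(t.stat), decreasing = TRUE)[seq_len(n.keep)]
}

set.seed(1)
x.sim <- matrix(rnorm(200 * 20), nrow = 200)        # 200 simulated genes, 20 samples
y.sim <- rep(c("A", "B"), each = 10)
x.sim[5, y.sim == "B"] <- x.sim[5, y.sim == "B"] + 10  # make gene 5 clearly differential
keep <- preselect_by_ttest(x.sim, y.sim, percent.cut = 0.05)
length(keep)   # 10 genes retained (5% of 200)
```

With `percent.cut = 0.05` and 200 genes, 10 row indices are returned, and the artificially shifted gene is among them.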

`model` |
candidate genes (for each split if no independent test set is available) |

`model.eval` |
optimal sets of genes for each split when no independent test set is available |
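
The size of each candidate model depends on when gene addition stops, which the `min.sMiPP` and `n.drops` arguments control. The sketch below shows one possible reading of those two descriptions: stop once the best sMiPP so far has reached `min.sMiPP` and sMiPP has decreased `n.drops` times. The helper `stop_step` and the sMiPP values are hypothetical, and the package's actual logic may differ.

```r
# Hypothetical helper (not MiPP code): given the sMiPP value obtained after
# each gene is added, return the step at which this reading of the rule stops.
stop_step <- function(smipp, min.sMiPP = 0.85, n.drops = 2) {
  drops <- 0
  for (k in seq_along(smipp)) {
    # Count a drop whenever sMiPP falls relative to the previous step
    if (k > 1 && smipp[k] < smipp[k - 1]) drops <- drops + 1
    # Stop once the best sMiPP so far is high enough and enough drops were seen
    if (max(smipp[1:k]) >= min.sMiPP && drops >= n.drops) return(k)
  }
  length(smipp)  # criteria never met: every candidate step was used
}

# Made-up sMiPP trajectory for illustration only
smipp <- c(0.40, 0.70, 0.88, 0.86, 0.90, 0.89, 0.91)
stop_step(smipp)               # 6: second drop occurs at step 6
stop_step(smipp, n.drops = 1)  # 4: first drop occurs at step 4
```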

Soukup M, Cho H, and Lee JK

Soukup M, Cho H, and Lee JK (2005). Robust classification modeling on microarray data using misclassification penalized posterior, Bioinformatics, 21 (Suppl): i423-i430.

Soukup M and Lee JK (2004). Developing optimal prediction models for cancer classification using gene expression data, Journal of Bioinformatics and Computational Biology, 1(4): 681-694.

```
##########
#Example 1: When an independent test set is available
data(leukemia)
#Normalize combined data
leukemia <- cbind(leuk1, leuk2)
leukemia <- mipp.preproc(leukemia, data.type="MAS4")
#Train set
x.train <- leukemia[,1:38]
y.train <- factor(c(rep("ALL",27),rep("AML",11)))
#Test set
x.test <- leukemia[,39:72]
y.test <- factor(c(rep("ALL",20),rep("AML",14)))
#Compute MiPP
out <- mipp(x=x.train, y=y.train, x.test=x.test, y.test=y.test, probe.ID = 1:nrow(x.train), n.fold=5, percent.cut=0.05, rule="lda")
#Print candidate models
out$model
##########
#Example 2: When an independent test set is not available
data(colon)
#Normalize data
x <- mipp.preproc(colon)
y <- factor(c("T", "N", "T", "N", "T", "N", "T", "N", "T", "N",
"T", "N", "T", "N", "T", "N", "T", "N", "T", "N",
"T", "N", "T", "N", "T", "T", "T", "T", "T", "T",
"T", "T", "T", "T", "T", "T", "T", "T", "N", "T",
"T", "N", "N", "T", "T", "T", "T", "N", "T", "N",
"N", "T", "T", "N", "N", "T", "T", "T", "T", "N",
"T", "N"))
#Deleting contaminated chips
x <- x[,-c(51,55,45,49,56)]
y <- y[ -c(51,55,45,49,56)]
#Compute MiPP
out <- mipp(x=x, y=y, probe.ID = 1:nrow(x), n.fold=5, p.test=1/3, n.split=5, n.split.eval=100,
percent.cut= 0.1, rule="lda")
#Print candidate models for each split
out$model
#Print optimal models and independent evaluation for each split
out$model.eval
```
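
Example 2 relies on repeated random partitions of the samples, controlled by `p.test` and `n.split`. The following base-R sketch shows what such partitions might look like for the 57 chips left after the contaminated ones are removed. The helper `random_splits` is hypothetical, and plain unstratified sampling is an assumption; the package may partition the samples differently.

```r
# Hypothetical helper (not MiPP code): draw `n.split` random partitions of
# `n` samples, holding out a fraction `p.test` as the test set each time.
random_splits <- function(n, p.test = 1/3, n.split = 20) {
  n.test <- round(n * p.test)
  lapply(seq_len(n.split), function(i) {
    test <- sort(sample(n, n.test))  # unstratified draw (assumption)
    list(train = setdiff(seq_len(n), test), test = test)
  })
}

set.seed(1)
splits <- random_splits(n = 57, p.test = 1/3, n.split = 5)
length(splits)             # 5 partitions
length(splits[[1]]$test)   # 19 held-out samples
length(splits[[1]]$train)  # 38 training samples
```

Each element holds disjoint `train` and `test` index vectors that could be used to subset the columns of `x` and the entries of `y`.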
