cv: Crossvalidation for the parameter k

Description

This function selects the value of k, the number of top scoring pairs used by the k-TSP classifier, through crossvalidation.

Usage

cv(dat, grp, cross = 5, display = FALSE, length = 40, seed = NULL, med = FALSE, healthy = NULL)

Arguments

dat

Can be either (a) a matrix with m rows (the gene expressions) and n columns (the observations) or (b) an eSet object.

grp

Can be either (a) a character (or numeric) vector indicating the group of each observation or (b) an integer indicating the column of pData(dat) that holds the group of the observations.

cross

The number of folds to use in the crossvalidation.

display

Controls whether the warning messages of ktspcalc() are printed during the crossvalidation loop; it lets the user suppress them.

length

This parameter allows the user to control the length of the list used in the C code.

seed

Allows the user to set a seed, so that the results can be reproduced.

med

Whether, for each gene, the mean of the medians of the two groups should be subtracted from the dataset.

healthy

The label of the group to be considered the healthy group. It determines which group the specificity refers to.
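The centering controlled by med can be illustrated with a minimal base-R sketch. The helper center_by_group_medians() and the toy data are hypothetical and not part of ktspair; the sketch only follows the description above (rows are genes, columns are observations) and may differ from what the package does internally.

```r
# Hypothetical helper, not part of ktspair: centre each gene (row) by the
# mean of its two within-group medians.
center_by_group_medians <- function(dat, grp) {
  labels <- unique(grp)
  stopifnot(length(labels) == 2)
  med1 <- apply(dat[, grp == labels[1], drop = FALSE], 1, median)
  med2 <- apply(dat[, grp == labels[2], drop = FALSE], 1, median)
  # A vector of length nrow(dat) is recycled down the columns,
  # so row i has its own centre subtracted.
  dat - (med1 + med2) / 2
}

# Toy data: 2 genes, 6 observations, 3 per group
toy <- matrix(c(1, 2, 3, 7, 8, 9,
                0, 0, 2, 4, 4, 6),
              nrow = 2, byrow = TRUE)
toy_grp <- c("a", "a", "a", "b", "b", "b")
centered <- center_by_group_medians(toy, toy_grp)
centered
```

For the first toy gene the group medians are 2 and 8, so 5 is subtracted from every value in that row.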

Details

This function computes the value of k through crossvalidation. The number of folds is given by the argument cross, which defaults to 5. For each possible value of k, it also computes the percentage of correct predictions on the same partition as the one used for the crossvalidation.
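As an illustration of how a partition into cross folds can be drawn reproducibly from a seed, here is a minimal base-R sketch. The helper make_folds() is hypothetical; how cv() actually builds its partition (for example, whether it balances the groups across folds) is not stated here, so this should not be read as the package's implementation.

```r
# Hypothetical helper, not part of ktspair: assign each of n observations
# to one of `cross` folds at random, reproducibly when a seed is supplied.
make_folds <- function(n, cross = 5, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  sample(rep(seq_len(cross), length.out = n))
}

folds <- make_folds(n = 23, cross = 5, seed = 1)
table(folds)                     # fold sizes differ by at most one

test_idx  <- which(folds == 1)   # observations held out in the first round
train_idx <- which(folds != 1)   # observations used for fitting in that round
```

Each round would then fit the classifier on train_idx and predict on test_idx, reusing the same folds for every candidate k.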

Value

k

The selected value for k

accuracy_k

The estimated percentage of correct prediction achieved by the k-TSP with the selected k.

accuracy

A vector of the estimated percentage of correct prediction reached by the k-TSP with k = 1,3,5,7,9.

sensitivity

A vector of the estimated sensitivity reached by the k-TSP with k = 1,3,5,7,9.

specificity

A vector of the estimated specificity reached by the k-TSP with k = 1,3,5,7,9.
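The quantities listed above can be illustrated with a short base-R sketch. The helper class_metrics() and the toy labels are hypothetical and not part of ktspair. The sketch assumes that specificity is the proportion of healthy observations predicted as healthy and sensitivity the proportion of the other group predicted correctly. The selection rule shown (largest accuracy) is also an assumption; it is consistent with the example output on this page, but the handling of ties is not documented.

```r
# Hypothetical helper, not part of ktspair.
# truth, pred: label vectors; healthy: label of the healthy group.
class_metrics <- function(truth, pred, healthy) {
  is_healthy <- truth == healthy
  c(accuracy    = mean(pred == truth),
    sensitivity = mean(pred[!is_healthy] == truth[!is_healthy]),
    specificity = mean(pred[is_healthy] == healthy))
}

truth <- c("sick", "sick", "sick", "healthy", "healthy")
pred  <- c("sick", "sick", "healthy", "healthy", "healthy")
class_metrics(truth, pred, healthy = "healthy")

# Choosing k from an accuracy vector over k = 1, 3, 5, 7, 9
# (values taken from the example output on this page)
k_grid   <- c(1, 3, 5, 7, 9)
accuracy <- c(0.84, 0.92, 0.96, 0.88, 0.58)
best_k   <- k_grid[which.max(accuracy)]
best_k
```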

Author(s)

Julien Damond julien.damond@gmail.com

References

D. Geman, C. d'Avignon, D. Naiman and R. Winslow, "Classifying gene expression profiles from pairwise mRNA comparisons," Statist. Appl. in Genetics and Molecular Biology, 3, 2004.

A.C. Tan, D.Q. Naiman, L. Xu, R.L. Winslow, D. Geman, "Simple decision rules for classifying human cancers from gene expression profiles," Bioinformatics, 21: 3896-3904, 2005.

J. Damond, supervised by S. Morgenthaler and S. Hosseinian, "Presentation and study of robustness for several methods to classify individuals based on their gene expressions", Master thesis, Swiss Federal Institute of Technology Lausanne (Switzerland), 2011.

J. Damond, S. Morgenthaler, S. Hosseinian, "The robustness of the TSP and the k-TSP and the computation of ROC curves", submitted to Bioinformatics, December 2011.

Jeffrey T. Leek <jtleek@jhu.edu> (). tspair: Top Scoring Pairs for Microarray Classification. R package version 1.10.0.

See Also

ktspcalc, ktspplot, predict.ktsp, summary.ktsp

Examples

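  ## Not run: 
  library(ktspair)
  ## Load the example data
  data(ktspdata)
  ## Select k by 10-fold crossvalidation; the result is stored under a
  ## different name so that it does not mask the function cv()
  cv_res <- cv(dat, grp, cross = 10)
  ## Fit the k-TSP classifier with the selected k
  ktsp <- ktspcalc(dat, grp, cv_res$k)
  ktsp
  cv_res
 
## End(Not run)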

Example output

Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, basename, cbind, colMeans, colSums, colnames,
    dirname, do.call, duplicated, eval, evalq, get, grep, grepl,
    intersect, is.unsorted, lapply, lengths, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, rank, rbind,
    rowMeans, rowSums, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which, which.max, which.min

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

k-TSP object with: 5 TSPs
Pair:		TSP Score		Indices
TSP 1 : 	 0.96 			 657 793 
TSP 2 : 	 0.92 			 74 704 
TSP 3 : 	 0.92 			 224 298 
TSP 4 : 	 0.92 			 588 758 
TSP 5 : 	 0.92 			 34 266 
$k
[1] 5

$accuracy_k
[1] 0.96

$accuracy
[1] 0.84 0.92 0.96 0.88 0.58

$sensitivity
[1] 0.8416667 0.9750000 1.0000000 0.9000000 0.6000000

$specificity
[1] 0.8240741 0.9074074 0.9444444 0.8611111 0.5750000

attr(,"class")
[1] "cv"

ktspair documentation built on May 2, 2019, 3:25 a.m.