Computation and Representation of the Scale Curve

Description

scalecurve computes the scale curve of a given group of observations, based on the modified band depth. At a given value p, the scale curve is the area of the band delimited by the [np] most central observations, where [np] is the largest integer smaller than np.

Usage

scalecurve(x,y=NULL,xlab="p",ylab="A(p)",main="Scale curve",lwd=2,
           ...)

Arguments

x

a data matrix containing the observations (samples) by rows and the variables (genes) by columns

y

an optional vector (numeric or factor) of length equal to the number of rows in x, containing the class of each observation. If not provided, all the rows of x are assumed to belong to a single class

xlab

label for the x axis

ylab

label for the y axis

main

plot title

lwd

line widths for the corresponding scale curve(s)

...

graphical parameters to be passed to 'plot'

Details

The scale curve measures the increase in the area of the band determined by the fraction p of most central curves, as p moves from 0 to 1, thus providing a measure of the sample dispersion. The data set is treated as a set of curves in parallel coordinates, and the area of each band is computed using the trapezoid formula.
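
The computation described above can be illustrated with a minimal, self-contained sketch. It is not the package's implementation: the helper names mbd_sketch, band_area and scalecurve_sketch are hypothetical, the modified band depth is computed with a rank-based shortcut that assumes bands formed by pairs of curves and no ties, the parallel coordinates are assumed to be one unit apart, and the curve is evaluated at p = k/n for k = 1, ..., n.

```r
# Modified band depth using bands delimited by pairs of curves.
# Rank-based shortcut; assumes there are no ties within a column.
mbd_sketch <- function(x) {
  n <- nrow(x)
  # rank of each observation within each variable (column)
  r <- apply(x, 2, rank)
  # number of pairs whose band contains the observation at each coordinate,
  # counting the n - 1 pairs that include the observation itself
  covered <- (r - 1) * (n - r) + (n - 1)
  rowMeans(covered) / choose(n, 2)
}

# Area of the band delimited by the rows of x, using the trapezoid
# formula on parallel coordinates assumed to be one unit apart.
band_area <- function(x) {
  width <- apply(x, 2, max) - apply(x, 2, min)
  d <- length(width)
  sum((width[-1] + width[-d]) / 2)
}

# Band area for the k most central observations, k = 1, ..., n.
scalecurve_sketch <- function(x) {
  n <- nrow(x)
  ord <- order(mbd_sketch(x), decreasing = TRUE)  # most central first
  sapply(seq_len(n), function(k) {
    band_area(x[ord[seq_len(k)], , drop = FALSE])
  })
}

set.seed(0)
x <- matrix(rnorm(100), 10, 10)
A <- scalecurve_sketch(x)
```

Because the sets of most central observations are nested, the values in A never decrease, and the first value is 0 since a single curve delimits a band of zero width. Plotting A against (1:10)/10 gives a curve of the same kind as the one drawn by scalecurve, although the exact values may differ from the package's output under the assumptions listed above.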

Value

r

the value of the scale curve at equidistant values of p, determined by the number of observations within each class. If y is not provided, then r is a vector; otherwise it is a list with as many components as classes described by y.

Author(s)

Sara Lopez-Pintado sl2929@columbia.edu and

Aurora Torrente etorrent@est-econ.uc3m.es

References

Lopez-Pintado, S. et al. (2010). Robust depth-based tools for the analysis of gene expression data. Biostatistics, 11 (2), 254-264.

Examples

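## scale curve of a single data set
  ## assumption: scalecurve and the prostate data are provided by the depthTools package
  library(depthTools)
  ## simulated data
  set.seed(0)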
  x <- matrix(rnorm(100),10,10)
  scalecurve(x)

  ## real data
  data(prostate)
  prost.x<-prostate[,1:100]
  prost.y<-prostate[,101]
  scalecurve(prost.x[prost.y==0,])  ## scale curve of normal samples
  scalecurve(prost.x[prost.y==1,])  ## scale curve of tumoral samples
  
## scale curves of different groups
  ## simulated data
  x <- matrix(rnorm(100),10,10)
  y <- c(rep("tumoral",5),rep("normal",5))
  scalecurve(x,y)

  ## real data
  labels<-prost.y 
  labels[prost.y==0]<-"normal"; labels[prost.y==1]<-"tumoral"
  scalecurve(prost.x,labels)