colon: Simplified gene expression data from Alon et al. (1999)

colonR Documentation

Simplified gene expression data from Alon et al. (1999)

Description

Gene expression data (20 genes for 62 samples) from the microarray experiments of colon tissue samples of Alon et al. (1999).

Usage

colon

Format

An object of class list of length 2.

Details

This data set contains 62 samples with 100 predictors (expanded from 20 genes using 5 basis B-splines, as described in Yang, Y. and Zou, H. (2015)): 40 tumor tissues, coded 1 and 22 normal tissues, coded -1.

Value

A list with the following elements:

x

a [62 x 100] matrix (expanded from a [62 x 20] matrix) giving the expression levels of 20 genes for the 62 colon tissue samples. Each row corresponds to a patient, each 5 consecutive columns to a grouped gene.

y

a numeric vector of length 62 giving the type of tissue sample (tumor or normal).

Source

The data are described in Alon et al. (1999) and can be freely downloaded from http://microarray.princeton.edu/oncology/affydata/index.html.

References

Alon, U. and Barkai, N. and Notterman, D.A. and Gish, K. and Ybarra, S. and Mack, D. and Levine, A.J. (1999). “Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays”, Proc. Natl. Acad. Sci. USA, 96(12), 6745–6750.

Yang, Y. and Zou, H. (2015), “A Fast Unified Algorithm for Computing Group-Lasso Penalized Learning Problems,” Statistics and Computing. 25(6), 1129-1141.
BugReport: https://github.com/emeryyi/gglasso

Examples


# load gglasso library
library(gglasso)

# load data set
data(colon)

# how many samples and how many predictors ?
dim(colon$x)

# how many samples of class -1 and 1 respectively ?
sum(colon$y==-1)
sum(colon$y==1)


gglasso documentation built on May 29, 2024, 2:38 a.m.