colon | R Documentation |
The colon data is a publicly available data set from a colon cancer study published in 1999. It consists of gene expression levels (a measure of gene activity) for 1988 genes in different tissues. NOTE: The original data has 2000 genes, but 12 were discarded here due to very strange names.
colon
A tibble with 62 observations (rows) and 1989 variables (columns).
The first column, Tissue, is a factor with two levels that codes the disease status (40 tumor, 22 normal).
Each remaining column corresponds to the expression level of a certain gene.
The rows correspond to different tissues.
The higher the value, the more active the gene is in a given tissue.
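The layout described above (class label in the first column, one expression column per gene after it) can be checked with a small helper. This is a minimal sketch, not part of the package: `describe_layout`, the toy data frame `toy`, its column names `gene_a` and `gene_b`, its values, and the level names `"tumor"` and `"normal"` are all assumptions made up for illustration so that the code runs on its own.

```r
# Summarise a data set laid out like `colon`: class label in column 1,
# one numeric expression column per gene after that.
# `describe_layout` is a hypothetical helper, not a function of any package.
describe_layout <- function(data) {
  list(
    n_samples = nrow(data),        # one row per tissue sample
    n_genes   = ncol(data) - 1,    # every column except the label
    classes   = table(data[[1]])   # number of samples per tissue class
  )
}

# Toy stand-in with made-up values (NOT the real data), used only so the
# sketch is runnable without the package that ships `colon`.
toy <- data.frame(
  Tissue = factor(c("tumor", "tumor", "normal")),
  gene_a = c(1.2, 0.8, 2.1),
  gene_b = c(3.4, 2.9, 1.0)
)

res <- describe_layout(toy)
res$n_samples
res$n_genes
res$classes
```

With the real data loaded, `describe_layout(colon)` should agree with the figures stated in this help page: 62 samples, 1988 genes, and a 40/22 split between the two tissue classes.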
Alon, U. et al. (1999) Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci U S A 96(12), 6745–6750.
# The dimensions of the tibble
dim(colon)
# Box plot of expression levels for one gene, split by tissue type
boxplot(Hsa.3004 ~ Tissue, data = colon)