VSTexprTCGA | R Documentation |
This is data from a study of human tumours by The Cancer Genome Atlas (TCGA) Research Network. The five tumour types in this data set are Breast, Colon, Kidney, Lung, and Prostate. There are normalized gene expression values for 4000 genes from 1000 samples, 200 samples per tumour type.
VSTexprTCGA
A data frame with 1000 observations (rows) and 4001 variables (columns).
Index | Column name | Data type | Description | Values |
[,1] | classes | factor | 5 different types of cancer | Breast ... Prostate |
[,2:4001] | ABCF1_23 ... LOC100271836_100271836 | numeric | Gene expression data | 8.109048 to 21.8406 |
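Because the class labels and the expression values share one data frame, a common first step is to separate them. The sketch below illustrates this on a small simulated data frame with the same layout (a factor column named classes followed by numeric gene columns); the gene names GENE_A_1, GENE_B_2 and GENE_C_3 and all values are made up for illustration and are not part of VSTexprTCGA.

```r
# Simulated stand-in with the same layout as VSTexprTCGA:
# column 1 is the factor 'classes', the remaining columns are numeric.
# Gene names and values are hypothetical.
set.seed(42)
tumour_types <- c("Breast", "Colon", "Kidney", "Lung", "Prostate")
mock <- data.frame(
  classes  = factor(rep(tumour_types, each = 4)),
  GENE_A_1 = runif(20, min = 8, max = 22),
  GENE_B_2 = runif(20, min = 8, max = 22),
  GENE_C_3 = runif(20, min = 8, max = 22)
)

# Response: tumour type of each sample
labels <- mock$classes

# Predictors: numeric matrix with samples in rows and genes in columns
expr <- as.matrix(mock[, -1])

# Mean expression of each gene within each tumour type (genes x classes)
class_means <- sapply(split(as.data.frame(expr), labels), colMeans)
class_means
```

With the real data, the same two lines (VSTexprTCGA$classes and as.matrix(VSTexprTCGA[, -1])) should give the label vector and a 1000 x 4000 expression matrix.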
The data has been used in exercises for supervised learning in BIN315. The gene expression values were normalized using the varianceStabilizingTransformation function from the DESeq2 package.
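For readers unfamiliar with this normalization, the following is a minimal sketch of how varianceStabilizingTransformation is typically called. It runs on simulated counts from makeExampleDESeqDataSet, not on the TCGA data, and the settings used here (1000 genes, 12 samples, blind = TRUE) are assumptions chosen for illustration; the settings used to produce VSTexprTCGA are not documented on this page.

```r
library(DESeq2)

set.seed(1)
# Simulated raw counts: 1000 genes x 12 samples (illustration only)
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# blind = TRUE estimates dispersions without using the experimental design
vsd <- varianceStabilizingTransformation(dds, blind = TRUE)

# DESeq2 stores genes in rows; transpose so that samples are rows,
# matching the sample-by-gene layout of VSTexprTCGA
vst_mat <- t(assay(vsd))
dim(vst_mat)
```

The transformed values are on a log2-like scale, which is consistent with the value range listed for the gene columns.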
This data is a subset of data provided by the National Cancer Institute in the US (specifically, RNA-seq data from The Cancer Genome Atlas Pan-Cancer analysis project). Data subsetting was first done by Torgeir Rhoden Hvidsten. Additionally, 4000 of the 13946 genes were selected using the splsda function from the mixOmics package.
The Cancer Genome Atlas Research Network, Weinstein, J., Collisson, E. et al. (2013) The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet, 45, 1113–1120.
# Summary of the first six variables
summary(VSTexprTCGA[, 1:6])
# Number of cases per tumour type
table(VSTexprTCGA$classes)