reference_set_extended | R Documentation |
A data set containing gene expression measurements from whole blood for a signature of 19 genes. This signature comprises the 7 SRS genes defined by Davenport et al. plus a further 12 genes identified by Cano-Gamez et al. from microarray and RNA-seq data using canonical correlation analysis.
reference_set_extended
A data frame with 3264 rows and 19 variables:
SLC25A38, cosine-scaled, batch-corrected gene expression level
DNAJA3, cosine-scaled, batch-corrected gene expression level
NAT10, cosine-scaled, batch-corrected gene expression level
THOC1, cosine-scaled, batch-corrected gene expression level
MRPS9, cosine-scaled, batch-corrected gene expression level
PGS1, cosine-scaled, batch-corrected gene expression level
UBAP1, cosine-scaled, batch-corrected gene expression level
USP5, cosine-scaled, batch-corrected gene expression level
TTC3, cosine-scaled, batch-corrected gene expression level
SH3GLB1, cosine-scaled, batch-corrected gene expression level
BMS1, cosine-scaled, batch-corrected gene expression level
FBXO31, cosine-scaled, batch-corrected gene expression level
ARL14EP, cosine-scaled, batch-corrected gene expression level
CCNB1IP1, cosine-scaled, batch-corrected gene expression level
DYRK2, cosine-scaled, batch-corrected gene expression level
ADGRE3, cosine-scaled, batch-corrected gene expression level
MDC1, cosine-scaled, batch-corrected gene expression level
TDRD9, cosine-scaled, batch-corrected gene expression level
ZAP70, cosine-scaled, batch-corrected gene expression level
...
This data set is formed of 1,609 samples from healthy individuals and 1,655 samples from sepsis patients.
Sepsis patients were recruited as part of the Genomic Advances in Sepsis (GAinS) study in Oxford, UK. Of these, 676 were profiled using the Illumina HumanHT microarray, 864 using polyA-based RNA-sequencing, and 115 using qPCR.
Data from healthy individuals were collected from several publicly available sources. In particular, 991 Illumina HumanHT microarray samples were obtained from the SHIP-TREND consortium, 518 Illumina HumanHT microarray samples were obtained from the DILGOM cohort (an extension of the FINRISK study), and 100 polyA-based RNA-sequencing samples were obtained from the Dutch 500FG cohort.
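The cohort sizes quoted above can be tallied to confirm that they account for every row of the data frame. The following is a minimal sketch in base R; the `samples` table is built by hand from the numbers in this documentation and is not an object shipped with the data set.

```r
# Sample counts as stated in this documentation, by cohort and platform.
# This table is illustrative only; it is not part of the data set itself.
samples <- data.frame(
  cohort   = c("GAinS", "GAinS", "GAinS", "SHIP-TREND", "DILGOM", "500FG"),
  status   = c("sepsis", "sepsis", "sepsis", "healthy", "healthy", "healthy"),
  platform = c("microarray", "RNA-seq", "qPCR",
               "microarray", "microarray", "RNA-seq"),
  n        = c(676, 864, 115, 991, 518, 100)
)

# Total number of samples per disease status
status_totals <- tapply(samples$n, samples$status, sum)
print(status_totals)

# Overall total, which should match the number of rows in the data frame
total_rows <- sum(samples$n)
print(total_rows)
```

The per-status totals should agree with the 1,609 healthy and 1,655 sepsis samples stated above, and the overall total with the 3264 rows of the data frame.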
RNA-seq data was log-transformed and any relevant batch effects were removed using the ComBat algorithm. Finally, the 19 signature genes were extracted from each cohort and the data was integrated using the mutual nearest neighbour (mNN) algorithm.
The values reported in this data set were obtained after mNN alignment, and thus they represent cosine-scaled, batch-corrected values.
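To give a sense of what cosine scaling does to expression values, here is a minimal sketch in base R. It assumes that cosine scaling means dividing each sample's expression vector by its Euclidean (L2) norm, which is the usual normalisation applied before mNN alignment; the exact procedure used to build this data set is not stated here. The toy matrix and the `cosine_scale` helper are hypothetical and are not part of the data set or of any package.

```r
# Toy matrix: 3 samples (rows) x 4 genes (columns); values are made up.
expr <- matrix(
  c(3, 4, 0, 0,
    1, 1, 1, 1,
    0, 2, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("sample", 1:3), paste0("gene", 1:4))
)

# Hypothetical helper: divide each row (sample) by its L2 norm.
# Assumption: samples are in rows, as in reference_set_extended.
cosine_scale <- function(x) {
  norms <- sqrt(rowSums(x^2))
  # A vector of length nrow(x) is recycled down the columns,
  # so element [i, j] is divided by norms[i].
  x / norms
}

scaled <- cosine_scale(expr)

# After scaling, every sample has unit length
row_norms <- sqrt(rowSums(scaled^2))
print(row_norms)
```

Under this assumption, only the direction of each sample's expression profile is retained, so the values are unitless and should not be read as raw expression levels.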
The main use of this data set is to serve as a reference to which new input samples can be aligned before prediction of SRS group using random forest models.