HNSCC: A data frame containing the TCGA head and neck squamous cell...
In GEInter: Robust Gene-Environment Interaction Analysis

HNSCC

R Documentation

A data frame containing the TCGA head and neck squamous cell carcinoma (HNSCC) data.

Description

A data frame containing 7 environmental (E) effects (columns 1 to 7), 2000 genetic (G) effects (columns 8 to 2007), the logarithm of survival time (column 2008), and the censoring indicator (column 2009). All variables can be downloaded from TCGA Provisional using the R package cgdsr. See Details.

Usage

data(HNSCC)

Format

A data frame with 484 rows and 2009 variables.
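
The positional layout (E columns first, then G columns, then the two outcome columns) can be illustrated on a small stand-in data frame. This is only a sketch: the sizes, the column names (E1, G1, log_time, status), and the injected missing values are hypothetical and are not taken from HNSCC itself.

```r
set.seed(2)
n <- 20; n_e <- 7; n_g <- 5   # hypothetical, scaled-down sizes

# Stand-in frame: E columns, then G columns, then log time and censoring.
toy <- data.frame(
  matrix(rnorm(n * n_e), n, n_e, dimnames = list(NULL, paste0("E", 1:n_e))),
  matrix(rnorm(n * n_g), n, n_g, dimnames = list(NULL, paste0("G", 1:n_g))),
  log_time = rnorm(n),
  status = rbinom(n, 1, 0.6)
)
toy$E1[c(2, 5, 9)] <- NA   # inject missing values into two E columns
toy$E2[c(5, 11)] <- NA

# Column index ranges derived from the block sizes.
e_idx <- seq_len(n_e)
g_idx <- n_e + seq_len(n_g)
y_idx <- n_e + n_g + 1:2

E <- as.matrix(toy[, e_idx])
G <- as.matrix(toy[, g_idx])
Y <- as.matrix(toy[, y_idx])

# Number of rows with a missing value in E1 and/or E2.
n_missing <- sum(is.na(toy$E1) | is.na(toy$E2))
```

For HNSCC, the same index arithmetic with 7 E columns and 2000 G columns gives the ranges 1:7, 8:2007, and 2008:2009.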

Details

There are seven E effects: alcohol consumption frequency (ACF), smoking pack years (SPY), age, gender, PN, PT, and ICD O3 site. Among the 484 subjects, 343 have missing values in ACF and/or SPY. The G effects are mRNA gene expressions. A total of 18,409 gene expression measurements are available. Prescreening is conducted using marginal Cox models, and the 2,000 genes with the smallest p-values are selected for downstream analysis.
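
The prescreening step can be sketched as follows. This is a minimal illustration on synthetic data, not the code used to build HNSCC: the sample size, gene count, and `top_k` are hypothetical, and using the Wald p-value from `coxph()` in the survival package is an assumption, since the source does not say which test was applied.

```r
library(survival)  # recommended package shipped with standard R installs

set.seed(1)
n <- 100      # hypothetical number of subjects
p <- 50       # hypothetical number of candidate genes
top_k <- 10   # hypothetical number of genes to keep

# Synthetic expression matrix and right-censored survival outcome.
expr <- matrix(rnorm(n * p), nrow = n,
               dimnames = list(NULL, paste0("gene", seq_len(p))))
time <- rexp(n, rate = exp(0.8 * expr[, 1]))
status <- rbinom(n, size = 1, prob = 0.7)  # 1 = event, 0 = censored

# Fit one Cox model per gene and record the Wald p-value of its coefficient.
marginal_p <- apply(expr, 2, function(g) {
  fit <- coxph(Surv(time, status) ~ g)
  summary(fit)$coefficients[1, "Pr(>|z|)"]
})

# Keep the top_k genes with the smallest marginal p-values.
selected <- names(sort(marginal_p))[seq_len(top_k)]
expr_screened <- expr[, selected, drop = FALSE]
```

Each gene is tested on its own, so the screening ignores the E effects and any G-E interactions. It only reduces the number of candidate genes before the joint analysis.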

Examples

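library(GEInter)
data(HNSCC)

# Split the data frame by column position (see Description)
E <- as.matrix(HNSCC[, 1:7])        # environmental effects
G <- as.matrix(HNSCC[, 8:2007])     # gene expressions
Y <- as.matrix(HNSCC[, 2008:2009])  # log survival time and censoring indicator

# The first 3 E columns are treated as continuous ("EC"), the last 4 as discrete ("ED")
fit <- Miss.boosting(G, E, Y, im_time = 10, loop_time = 1000, v = 0.25,
                     num.knots = 5, degree = 3, tau = 0.3,
                     family = "survival", E_type = c(rep("EC", 3), rep("ED", 4)))
plot(fit)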

GEInter documentation built on May 20, 2022, 1:17 a.m.