# Real Dataset #2: Genomic Dataset (p >> n case)

### Description

Publicly available lung cancer genomic data from the Chemores Cohort Study. These data are part of an integrated study of mRNA, miRNA
and clinical variables to characterize the molecular distinctions between squamous cell carcinoma (SCC) and adenocarcinoma (AC)
in Non Small Cell Lung Cancer (NSCLC), aside from large cell lung carcinoma (LCC). Tissue samples were analysed from a cohort of
123 patients who underwent complete surgical resection at the Institut Mutualiste Montsouris (Paris, France) between
30 January 2002 and 26 June 2006. This genomic dataset includes the expression levels of Agilent miRNA probes (*p=939*)
for the *n=123* samples of the Chemores cohort. The data contain normalized expression levels.
See the paper by Lazar et al. (2013) and the Array Express data repository (listed under See Also) for a complete description of the samples, tissue preparation,
Agilent array technology, data normalization, etc. This dataset represents a situation where the number of covariates far exceeds
the number of complete observations, i.e. the *p >> n* case.

### Usage


### Format

The dataset consists of a `numeric` `data.frame` containing *n=123* complete observations (samples)
by rows and *p=939* genomic covariates by columns, not including the censoring indicator and (censored) time-to-event variables.
It comes as a compressed Rda data file.
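
The format described above can be checked with a few lines of base R. The following is only a sketch on a simulated stand-in, not the Chemores data: the object name `toy`, the column names `time`, `status` and `probe_1`, ..., `probe_939`, and the temporary file path are all hypothetical, and it assumes that the two survival variables are stored as extra columns next to the covariates.

```r
# Simulated stand-in with the documented shape; NOT the Chemores data.
set.seed(1)
n <- 123   # samples (rows)
p <- 939   # miRNA probes (covariate columns)

# Hypothetical column names: time, status, probe_1 ... probe_939.
toy <- data.frame(
  time   = rexp(n),             # placeholder (censored) time-to-event
  status = rbinom(n, 1, 0.5),   # placeholder censoring indicator
  matrix(rnorm(n * p), nrow = n, ncol = p,
         dimnames = list(NULL, paste0("probe_", seq_len(p))))
)

# Round trip through a compressed Rda file.
rda_file <- tempfile(fileext = ".rda")
save(toy, file = rda_file, compress = "xz")
loaded_names <- load(rda_file)  # load() returns the names of the restored objects
dat <- get(loaded_names[1])

# Checks mirroring the documented format.
x <- dat[, setdiff(names(dat), c("time", "status"))]  # covariates only
dim(x)                                     # rows = samples, columns = covariates
all(vapply(dat, is.numeric, logical(1)))   # every column is numeric
anyNA(dat)                                 # FALSE when all observations are complete
ncol(x) > nrow(x)                          # TRUE in the p >> n setting
```

With the real file, only the `load()` call and the names of the survival columns would need to change; the dimension and completeness checks stay the same.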

### Author(s)

"Jean-Eudes Dazard, Ph.D." jxd101@case.edu

"Michael Choe, M.D." mjc206@case.edu

"Michael LeBlanc, Ph.D." mleblanc@fhcrc.org

"Alberto Santana, MBA." ahs4@case.edu

Maintainer: "Jean-Eudes Dazard, Ph.D." jxd101@case.edu

Acknowledgments: This project was partially funded by the National Institutes of Health NIH - National Cancer Institute (R01-CA160593) to J-E. Dazard and J.S. Rao.

### Source

See real data application in Dazard et al., 2015.

### References

Dazard J-E., Choe M., LeBlanc M. and Rao J.S. (2015). "*Cross-validation and Peeling Strategies for Survival Bump Hunting using Recursive Peeling Methods.*" Statistical Analysis and Data Mining (in press).

Dazard J-E., Choe M., LeBlanc M. and Rao J.S. (2014). "*Cross-Validation of Survival Bump Hunting by Recursive Peeling Methods.*" In JSM Proceedings, Survival Methods for Risk Estimation/Prediction Section. Boston, MA, USA. American Statistical Association IMS - JSM, p. 3366-3380.

Dazard J-E., Choe M., LeBlanc M. and Rao J.S. (2015). "*R package PRIMsrc: Bump Hunting by Patient Rule Induction Method for Survival, Regression and Classification.*" In JSM Proceedings, Statistical Programmers and Analysts Section. Seattle, WA, USA. American Statistical Association IMS - JSM, (in press).

Dazard J-E. and Rao J.S. (2010). "*Local Sparse Bump Hunting.*" J. Comp Graph. Statistics, 19(4):900-92.

### See Also

Array Express data repository at the European Bioinformatics Institute. Accession number: #E-MTAB-1134 (MIR). www.ebi.ac.uk/arrayexpress/

CHEMORES Consortium and website. www.chemores.ki.se/index.html