datasets: CML Cytogenetic Data

mercator-data R Documentation

CML Cytogenetic Data

Description

These data sets contain binary versions of subsets of cytogenetic karyotype data from patients with chronic myelogenous leukemia (CML).

Usage

data("lgfFeatures")
data("CML500")
data("CML1000")
data("fakedata") # includes "fakeclin"

Format

lgfFeatures

A data matrix with 2748 rows and 6 columns listing the cytogenetic bands produced as output of the CytoGPS algorithm that converts text-based karyotypes into a binary Loss-Gain-Fusion (LGF) model. The columns include the Label (the Type and Band, joined by an underscore), Type (Loss, Gain, or Fusion), Band (standard name of the cytogenetic band), Chr (chromosome), Arm (the chromosome arm, of the form #p or #q), and Index (an integer that can be used for sorting or indexing).
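The column layout described above can be illustrated with a small base R sketch. The three rows are made-up toy values, not rows copied from lgfFeatures, and the regular expressions used to derive Arm and Chr are only an assumption about how such columns could be computed from a band name; the actual column types in lgfFeatures may differ.

```r
# Toy rows for illustration only; not taken from lgfFeatures.
toy <- data.frame(
  Type = c("Loss", "Gain", "Fusion"),
  Band = c("1p36", "8q24", "22q11"),
  stringsAsFactors = FALSE
)

# Label is the Type and Band joined by an underscore.
toy$Label <- paste(toy$Type, toy$Band, sep = "_")

# Arm has the form #p or #q: the chromosome name followed by the arm letter.
toy$Arm <- sub("^([0-9XY]+[pq]).*$", "\\1", toy$Band)

# Chr is the arm with the trailing p or q removed.
toy$Chr <- sub("[pq]$", "", toy$Arm)

# Index is a running integer usable for sorting or indexing.
toy$Index <- seq_len(nrow(toy))

print(toy)
```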

CML500

A BinaryMatrix object with 770 rows (subset of LGF features) and 511 columns (patients). The patients were selected using the downsample function from the full set of more than 3000 CML karyotypes. The rows were selected by removing redundant and non-informative features when considering the full data set.

CML1000

A BinaryMatrix object with 770 rows (subset of LGF features) and 1057 columns (patients). The patients were selected using the downsample function from the full set of more than 3000 CML karyotypes. The rows were selected by removing redundant and non-informative features when considering the full data set.

fakedata

A matrix with 776 rows ("features") and 300 columns ("samples") containing synthetic continuous data.

fakeclin

A data frame with 300 rows ("samples") and 4 columns of synthetic clinical data related to the fakedata.
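As a rough sketch of how these objects might be loaded and checked against the formats described here, the following assumes the Mercator package is installed and that a BinaryMatrix stores its data in a slot named binmat (an assumption to verify against the BinaryMatrix class documentation). The block does nothing if the package is unavailable.

```r
if (requireNamespace("Mercator", quietly = TRUE)) {
  library(Mercator)

  data("lgfFeatures", package = "Mercator")
  data("CML500", package = "Mercator")
  data("fakedata", package = "Mercator")  # also loads fakeclin

  # Feature annotations: one row per LGF feature.
  print(dim(lgfFeatures))

  # CML500 is a BinaryMatrix; the binmat slot name is an assumption.
  print(class(CML500))
  print(dim(CML500@binmat))

  # Samples are the columns of fakedata and the rows of fakeclin.
  print(dim(fakedata))
  stopifnot(ncol(fakedata) == nrow(fakeclin))
}
```

CML1000 can be loaded and inspected the same way by substituting its name for CML500.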

Author(s)

Kevin R. Coombes <krc@silicovore.com>, Caitlin E. Coombes

Source

The cytogenetic data were obtained from the public Mitelman Database of Chromosomal Aberrations and Gene Fusions in Cancer on 4 April 2019. The database is currently located at https://cgap.nci.nih.gov/Chromosomes/Mitelman as part of the Cancer Genome Anatomy Project (CGAP). The CGAP web site is expected to close on 1 October 2019, at which point the Mitelman database will move to an as-yet-undisclosed location. The data were then converted from text-based karyotypes into binary vectors using CytoGPS (http://cytogps.org/).

References

Abrams ZB, Zhang L, Abruzzo LV, Heerema NA, Li S, Dillon T, Rodriguez R, Coombes KR, Payne PRO. CytoGPS: A Web-Enabled Karyotype Analysis Tool for Cytogenetics. Bioinformatics. 2019 Jul 2. pii: btz520. doi: 10.1093/bioinformatics/btz520. [Epub ahead of print]


Mercator documentation built on June 30, 2022, 5:06 p.m.