subtype.cluster: Function to fit the Subtype Clustering Model

Description

This function fits the Subtype Clustering Model as published in Desmedt et al. 2008 and Wirapati et al. 2008. The model is a mixture of three Gaussians with equal shape, volume and variance (see the EEI model in Mclust). It is adapted to breast cancer and uses the ESR1, ERBB2 and AURKA dimensions to identify the molecular subtypes, i.e. ER-/HER2-, HER2+ and ER+/HER2- (Low and High Prolif).

Usage

subtype.cluster(module.ESR1, module.ERBB2, module.AURKA, data, annot,
  do.mapping = FALSE, mapping, do.scale = TRUE, rescale.q = 0.05,
  model.name = "EEI", do.BIC = FALSE, plot = FALSE, filen, verbose = FALSE)

Arguments

module.ESR1

Matrix containing the ESR1-related gene(s) in rows and at least three columns: "probe", "EntrezGene.ID" and "coefficient" standing for the name of the probe, the NCBI Entrez Gene id and the coefficient giving the direction and the strength of the association of each gene in the gene list.
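
As an illustration of the layout described above, a module with the three required columns could be built by hand as follows. This is a minimal sketch: the probe name and coefficient are hypothetical placeholders, and a data frame is used so that character and numeric columns can coexist (whether a plain matrix is required should be checked against the package).

```r
# Hypothetical single-gene module; the probe name and the coefficient are
# placeholders, not values taken from the published gene modules.
module.example <- data.frame(
  probe = "probe_1",
  EntrezGene.ID = 2099,  # NCBI Entrez Gene id of ESR1
  coefficient = 1,       # sign gives the direction, magnitude the strength
  stringsAsFactors = FALSE
)

# Check that the three required columns are present
required <- c("probe", "EntrezGene.ID", "coefficient")
all(required %in% colnames(module.example))
```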

module.ERBB2

Same format as module.ESR1, for the ERBB2-related gene(s).

module.AURKA

Same format as module.ESR1, for the AURKA-related gene(s).

data

Matrix of gene expressions with samples in rows and probes in columns, dimnames being properly defined.

annot

Matrix of annotations with at least one column named "EntrezGene.ID", dimnames being properly defined.

do.mapping

TRUE if the mapping through Entrez Gene ids must be performed (in case of ambiguities, the most variant probe is kept for each gene), FALSE otherwise.
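
The "most variant probe per gene" rule can be sketched in base R as below. This is an illustration under stated assumptions, not the genefu implementation: the toy expression values, the probe names and the object names (expr, annot.example, best) are all hypothetical.

```r
# Hypothetical expression matrix: 5 samples in rows, 3 probes in columns
expr <- cbind(
  p1 = c(1.0, 1.1, 0.9, 1.0, 1.0),  # low variance
  p2 = c(0, 2, 4, 1, 3),            # high variance
  p3 = c(5, 6, 5, 6, 5)
)
rownames(expr) <- paste0("sample", 1:5)

# Hypothetical annotation: p1 and p2 both map to Entrez Gene id 2099
annot.example <- data.frame(EntrezGene.ID = c(2099, 2099, 2064),
                            row.names = c("p1", "p2", "p3"))

# Variance of each probe across samples
probe.var <- apply(expr, 2, var)

# For each gene, keep the probe with the largest variance
# (which.max keeps the first probe in case of ties)
gene.ids <- annot.example[names(probe.var), "EntrezGene.ID"]
best <- tapply(names(probe.var), gene.ids,
               function(p) p[which.max(probe.var[p])])

# Expression matrix with one column per gene, named by Entrez Gene id
expr.mapped <- expr[, as.vector(best), drop = FALSE]
colnames(expr.mapped) <- names(best)
best  # keeps p2 for gene 2099 and p3 for gene 2064
```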

mapping

**DEPRECATED** Matrix with columns "EntrezGene.ID" and "probe" used to force the mapping such that the probes are not selected based on their variance.

do.scale

TRUE if the ESR1, ERBB2 and AURKA (module) scores must be rescaled (see rescale), FALSE otherwise.

rescale.q

Proportion of expected outliers for rescaling the gene expressions.
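
To make the role of the outlier proportion concrete, here is a minimal sketch of quantile-based rescaling. It assumes the proportion q is split evenly between the two tails, so that the q/2 and 1 - q/2 quantiles are mapped to 0 and 1. The helper rescale.sketch is hypothetical and may differ from the rescale function shipped with the package.

```r
# Hypothetical helper, not the genefu implementation: map the q/2 and
# 1 - q/2 quantiles of x to 0 and 1, so extreme values do not set the range.
rescale.sketch <- function(x, q = 0.05) {
  lo <- quantile(x, probs = q / 2, na.rm = TRUE, names = FALSE)
  hi <- quantile(x, probs = 1 - q / 2, na.rm = TRUE, names = FALSE)
  (x - lo) / (hi - lo)
}

set.seed(42)
score <- c(rnorm(200), 15)  # simulated scores with one extreme outlier
scaled <- rescale.sketch(score, q = 0.05)

# Most values fall in [0, 1]; the outlier lands above 1 instead of
# compressing every other value towards 0.
mean(scaled >= 0 & scaled <= 1)
max(scaled)
```

With q = 0 the sketch reduces to ordinary min-max scaling, in which case the single outlier would determine the upper end of the range.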

do.BIC

TRUE if the Bayesian Information Criterion must be computed for number of clusters ranging from 1 to 10, FALSE otherwise.

model.name

Name of the model used to fit the mixture of Gaussians with Mclust from the mclust package; default is "EEI" for fitting a mixture of Gaussians with diagonal covariance, equal volume, equal shape and identical orientation.
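
For readers unfamiliar with mclust, the following sketch fits a three-component "EEI" mixture directly with Mclust on simulated data. The two simulated dimensions only stand in for module scores and are not real data. The sketch shows the kind of mixture being fitted, not the internals of subtype.cluster.

```r
library(mclust)

set.seed(123)
# Simulated two-dimensional scores forming three well-separated groups
centers <- rbind(c(-2, -2), c(2, -2), c(0, 2))
scores <- do.call(rbind, lapply(1:3, function(k) {
  cbind(rnorm(50, mean = centers[k, 1], sd = 0.4),
        rnorm(50, mean = centers[k, 2], sd = 0.4))
}))
colnames(scores) <- c("dim1", "dim2")

# Mixture of three Gaussians sharing one diagonal covariance matrix
fit <- Mclust(scores, G = 3, modelNames = "EEI")

table(fit$classification)  # hard cluster assignments
head(round(fit$z, 3))      # posterior membership probabilities
```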

plot

TRUE if the patients and their corresponding subtypes must be plotted, FALSE otherwise.

filen

Name of the csv file where the subtype clustering model must be stored.

verbose

TRUE to print informative messages, FALSE otherwise.

Value

model

Subtype Clustering Model (mixture of three Gaussians), like scmgene.robust, scmod1.robust and scmod2.robust when this function is used on the expO dataset (International Genomics Consortium) with the gene modules published in the two references cited below.

BIC

Bayesian Information Criterion for the Subtype Clustering Model with number of clusters ranging from 1 to 10.

subtype

Subtypes identified by the Subtype Clustering Model. Subtypes can be either "ER-/HER2-", "HER2+" or "ER+/HER2-".

subtype.proba

Probabilities to belong to each subtype estimated by the Subtype Clustering Model.
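
A probability matrix of this kind can be turned into hard labels by taking the most probable subtype per sample, as sketched below. The matrix values, the sample names and the column order are hypothetical. That the returned subtype always corresponds to the maximum probability is an assumption worth checking on real output.

```r
# Hypothetical probability matrix: samples in rows, subtypes in columns
proba <- rbind(
  s1 = c(0.93, 0.02, 0.05),
  s2 = c(0.10, 0.15, 0.75),
  s3 = c(0.40, 0.35, 0.25)
)
colnames(proba) <- c("ER-/HER2-", "HER2+", "ER+/HER2-")

# Most probable subtype per sample (first column wins in case of ties)
hard <- colnames(proba)[max.col(proba, ties.method = "first")]
names(hard) <- rownames(proba)

# Largest probability per sample, usable to flag ambiguous calls such as s3
confidence <- apply(proba, 1, max)

hard
confidence
```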

subtype2

Subtypes identified by the Subtype Clustering Model using AURKA to discriminate low and high proliferative tumors. Subtypes can be either "ER-/HER2-", "HER2+", "ER+/HER2- High Prolif" or "ER+/HER2- Low Prolif".

subtype.proba2

Probabilities to belong to each subtype (including discrimination between lowly and highly proliferative ER+/HER2- tumors, see subtype2) estimated by the Subtype Clustering Model.

module.scores

Matrix containing ESR1, ERBB2 and AURKA module scores.

Author(s)

Benjamin Haibe-Kains

References

Desmedt C, Haibe-Kains B, Wirapati P, Buyse M, Larsimont D, Bontempi G, Delorenzi M, Piccart M, and Sotiriou C (2008) "Biological processes associated with breast cancer clinical outcome depend on the molecular subtypes", Clinical Cancer Research, 14(16):5158–5165.

Wirapati P, Sotiriou C, Kunkel S, Farmer P, Pradervand S, Haibe-Kains B, Desmedt C, Ignatiadis M, Sengstag T, Schutz F, Goldstein DR, Piccart MJ and Delorenzi M (2008) "Meta-analysis of Gene-Expression Profiles in Breast Cancer: Toward a Unified Understanding of Breast Cancer Sub-typing and Prognosis Signatures", Breast Cancer Research, 10(4):R65.

See Also

subtype.cluster.predict, intrinsic.cluster, intrinsic.cluster.predict, scmod1.robust, scmod2.robust

Examples

## load the genefu package
library(genefu)
## example without gene mapping
## load expO data
data(expos)
## load gene modules
data(mod1)
## fit a Subtype Clustering Model
scmod1.expos <- subtype.cluster(module.ESR1=mod1$ESR1, module.ERBB2=mod1$ERBB2,
  module.AURKA=mod1$AURKA, data=data.expos, annot=annot.expos, do.mapping=FALSE,
  do.scale=TRUE, plot=TRUE, verbose=TRUE)
str(scmod1.expos, max.level=1)
table(scmod1.expos$subtype2)

## example with gene mapping
## load NKI data
data(nkis)
## load gene modules
data(mod1)
## fit a Subtype Clustering Model
scmod1.nkis <- subtype.cluster(module.ESR1=mod1$ESR1, module.ERBB2=mod1$ERBB2,
  module.AURKA=mod1$AURKA, data=data.nkis, annot=annot.nkis, do.mapping=TRUE,
  do.scale=TRUE, plot=TRUE, verbose=TRUE)
str(scmod1.nkis, max.level=1)
table(scmod1.nkis$subtype2)

Example output

Loading required package: survcomp
Loading required package: survival
Loading required package: prodlim
Loading required package: mclust
Package 'mclust' version 5.4.7
Type 'citation("mclust")' for citing this R package in publications.
Loading required package: limma
Loading required package: biomaRt
Loading required package: iC10
Loading required package: pamr
Loading required package: cluster
Loading required package: impute
Loading required package: iC10TrainingData
Loading required package: AIMS
Loading required package: e1071
Loading required package: Biobase
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following object is masked from 'package:limma':

    plotMA

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which.max, which.min

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

List of 7
 $ model         :List of 4
 $ BIC           : logi NA
 $ subtype       : Named chr [1:353] "ER+/HER2-" "ER+/HER2-" "ER-/HER2-" "ER-/HER2-" ...
  ..- attr(*, "names")= chr [1:353] "GSM38051" "GSM38054" "GSM353926" "GSM325845" ...
 $ subtype.proba : num [1:353, 1:3] 1.42e-05 3.27e-03 9.27e-01 1.00 3.29e-04 ...
  ..- attr(*, "dimnames")=List of 2
 $ subtype2      : Named chr [1:353] "ER+/HER2- Low Prolif" "ER+/HER2- High Prolif" "ER-/HER2-" "ER-/HER2-" ...
  ..- attr(*, "names")= chr [1:353] "GSM38051" "GSM38054" "GSM353926" "GSM325845" ...
 $ subtype.proba2: num [1:353, 1:4] 1.42e-05 3.27e-03 9.27e-01 1.00 3.29e-04 ...
  ..- attr(*, "dimnames")=List of 2
 $ module.scores : num [1:353, 1:3] 0.774 0.271 -0.284 -0.842 0.553 ...
  ..- attr(*, "dimnames")=List of 2

            ER-/HER2- ER+/HER2- High Prolif  ER+/HER2- Low Prolif 
                  112                   101                   106 
                HER2+ 
                   34 
List of 7
 $ model         :List of 4
 $ BIC           : logi NA
 $ subtype       : Named chr [1:150] "ER+/HER2-" "ER+/HER2-" "ER+/HER2-" "ER+/HER2-" ...
  ..- attr(*, "names")= chr [1:150] "NKI_123" "NKI_327" "NKI_291" "NKI_370" ...
 $ subtype.proba : num [1:150, 1:3] 8.60e-09 5.36e-11 1.10e-06 4.07e-08 2.04e-07 ...
  ..- attr(*, "dimnames")=List of 2
 $ subtype2      : Named chr [1:150] "ER+/HER2- Low Prolif" "ER+/HER2- High Prolif" "ER+/HER2- High Prolif" "ER+/HER2- Low Prolif" ...
  ..- attr(*, "names")= chr [1:150] "NKI_123" "NKI_327" "NKI_291" "NKI_370" ...
 $ subtype.proba2: num [1:150, 1:4] 8.60e-09 5.36e-11 1.10e-06 4.07e-08 2.04e-07 ...
  ..- attr(*, "dimnames")=List of 2
 $ module.scores : num [1:150, 1:3] 0.601 0.767 0.414 0.485 0.409 ...
  ..- attr(*, "dimnames")=List of 2

            ER-/HER2- ER+/HER2- High Prolif  ER+/HER2- Low Prolif 
                   24                    48                    55 
                HER2+ 
                   23 

genefu documentation built on Jan. 28, 2021, 2:01 a.m.