PAD: PAD

View source: R/app_PAD.R

PAD R Documentation

PAD

Description

Unsupervised identification of pan-immune activation/dysfunction (PAD) subtypes of gastric cancer (or other solid tumor) samples based on RNA-Seq or microarray data.

Usage

PAD(
  expr,
  PIAM = NULL,
  PIDG = NULL,
  cluster.method = c("ward.D2", "complete", "randomForest")[1],
  rF.para = list(seed = c(2020, 485, 4, 8), ntree = c(300, 300), k = c(2, 2)),
  extra.annot = NULL,
  plot.title = NULL,
  subtype = "PAD.train_20200110",
  verbose = T
)

Arguments

expr

RNA expression matrix, with samples in columns and ENSEMBL gene IDs in rows.

PIAM

IDs of pan-immune activation genes. If NULL, the default gene set is used.

PIDG

IDs of pan-immune dysfunction genes. If NULL, the default gene set is used.

cluster.method

One of 'ward.D2', 'complete' or another method accepted by hclust. If 'randomForest' is set, the randomForest function is used to classify the samples.

rF.para

A list of parameters (seed, ntree and k) for the randomForest function.

extra.annot

Extra top annotation for the heatmap. It must follow the same order as the column names of expr.

plot.title

The title of the heatmap report.

subtype

Default subtype model. Currently, only 'PAD' and 'ImmuneSubtype' are available. ATTENTION: if you use self-defined data (ens, geneAnnotation, geneSet or scaller), you MUST set subtype to NULL. For all available options, see GSClassifier_Data.

verbose

Whether to show the heatmap during the process.
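
The input layout described for expr and extra.annot can be illustrated with a minimal base-R sketch. Everything here is toy data: the ENSEMBL-style gene IDs, the sample names and the objects toy_expr and toy_annot are hypothetical and are not shipped with the package. The point is only to show genes in rows, samples in columns, and a sample annotation reordered to follow colnames(expr).

```r
# Hypothetical ENSEMBL-style gene IDs and sample names, for illustration only
genes   <- c("ENSG00000000001", "ENSG00000000002", "ENSG00000000003")
samples <- c("S1", "S2", "S3", "S4")

# Toy expression matrix: genes in rows, samples in columns
set.seed(42)
toy_expr <- matrix(
  rnorm(length(genes) * length(samples)),
  nrow = length(genes),
  dimnames = list(genes, samples)
)

# Toy sample annotation, deliberately stored in a different order
toy_annot <- data.frame(
  sample  = c("S3", "S1", "S4", "S2"),
  dataset = c("B", "A", "B", "A"),
  stringsAsFactors = FALSE
)

# Reorder the annotation so that it follows colnames(toy_expr)
toy_annot <- toy_annot[match(colnames(toy_expr), toy_annot$sample), ]
id.dataset <- toy_annot$dataset

# Sanity check before building any heatmap annotation
stopifnot(identical(toy_annot$sample, colnames(toy_expr)))
```

A vector such as id.dataset, once aligned this way, can then be used to build the top annotation shown in Examples.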

Details

This function performs unsupervised classification of raw data, which is pivotal for the subsequent supervised machine learning. Empirically, the 'ward.D2' method is useful and fast for simple gene signatures (like the PAD classifier). Random forest is a powerful strategy and may perform well on larger datasets or complex gene signatures.
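
To make the 'ward.D2' option more concrete, here is a minimal sketch of hierarchical clustering of samples with base R. It is not the internal implementation of PAD; it only shows what hclust with method = "ward.D2" does on a small simulated matrix. The gene IDs, sample names and the choice of two clusters are assumptions made for illustration.

```r
set.seed(1)
n_genes <- 20

# Two simulated sample groups with clearly different expression levels
low  <- matrix(rnorm(n_genes * 5, mean = 0), nrow = n_genes)
high <- matrix(rnorm(n_genes * 5, mean = 5), nrow = n_genes)

# Genes in rows, samples in columns (hypothetical IDs)
toy_expr <- cbind(low, high)
rownames(toy_expr) <- sprintf("ENSG%011d", seq_len(n_genes))
colnames(toy_expr) <- paste0("S", 1:10)

# Cluster samples: distances are computed between columns, hence t()
d  <- dist(t(toy_expr))
hc <- hclust(d, method = "ward.D2")

# Cut the tree into two groups (an assumed value of k for this toy case)
groups <- cutree(hc, k = 2)
table(groups)
```

Replacing "ward.D2" with "complete" in the hclust call gives the other linkage named in cluster.method.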

Examples

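# HeatmapAnnotation() comes from ComplexHeatmap; gpar() comes from grid
library(ComplexHeatmap)
library(grid)

# id.dataset (dataset label of each sample) and mycolor (a vector of colors)
# are assumed to be defined by the user
extra.annot <- HeatmapAnnotation(
  Dataset = id.dataset,
  col = list(
    # One named color per dataset
    Dataset = setNames(
      mycolor[seq_along(unique(id.dataset))],
      unique(id.dataset)
    )
  ),
  annotation_name_gp = gpar(fontsize = 13, fontface = "bold"),
  show_legend = TRUE
)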

# Hierarchical clustering with 'ward.D2'
res1 <- PAD(
  expr = dm.combat.tumor,
  PIAM = piam,
  PIDG = pidg,
  plot.title = 'PanSTAD',
  cluster.method = 'ward.D2',
  subtype = 'PAD.train_20200110',
  extra.annot = extra.annot,
  verbose = TRUE
)

# randomForest: time-consuming in large cohorts
res2 <- PAD(
  expr = dm.combat.tumor,
  PIAM = piam,
  PIDG = pidg,
  cluster.method = 'randomForest',
  rF.para = list(
    seed = c(2020, 485, 58, 152),
    ntree = c(1000, 1000),
    k = c(2, 2)
  ),
  subtype = 'PAD.train_20200110',
  extra.annot = extra.annot,
  plot.title = 'PanSTAD',
  verbose = TRUE
)

huangwb8/GSClassifier documentation built on July 12, 2024, 5:10 p.m.