
Perform subtyping using multiple types of data

```
SubtypingOmicsData(
  dataList,
  kMin = 2,
  kMax = 5,
  k = NULL,
  agreementCutoff = 0.5,
  ncore = 1,
  verbose = T,
  ...
)
```

| Argument | Description |
|---|---|
| `dataList` | A list of data matrices. Each matrix represents a data type where the rows are items and the columns are features. The matrices must have the same set of items. |
| `kMin` | The minimum number of clusters considered when the number of clusters is detected automatically. Default value is `2`. |
| `kMax` | The maximum number of clusters considered when the number of clusters is detected automatically. Default value is `5`. |
| `k` | The number of clusters. If `k` is set, `kMin` and `kMax` are ignored. |
| `agreementCutoff` | Agreement threshold above which partitionings are considered consistent. Default value is `0.5`. |
| `ncore` | Number of cores that the algorithm should use. Default value is `1`. |
| `verbose` | Logical flag controlling how much progress information is reported. Default value is `T`. |
| `...` | Additional arguments passed on to the perturbation clustering step (see `PerturbationClustering`). |
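The requirement that all matrices share the same set of items can be checked before calling the function. The following is a minimal sketch under stated assumptions: `checkSameItems` is a hypothetical helper, not part of the package, and it assumes that items are identified by row names.

```r
# Hypothetical helper (not part of the package): check that every matrix
# in a data list describes the same set of items (rows).
checkSameItems <- function(dataList) {
  stopifnot(is.list(dataList), length(dataList) >= 1)
  itemNames <- lapply(dataList, rownames)
  if (any(vapply(itemNames, is.null, logical(1)))) {
    stop("every matrix needs row names identifying its items")
  }
  # Compare each matrix's sorted item names against the first matrix
  reference <- sort(itemNames[[1]])
  all(vapply(itemNames, function(nm) identical(sort(nm), reference), logical(1)))
}

# Toy data: four items measured in two data types with different feature counts
set.seed(1)
items <- paste0("sample", 1:4)
toyList <- list(
  GE = matrix(rnorm(4 * 3), nrow = 4, dimnames = list(items, paste0("gene", 1:3))),
  ME = matrix(rnorm(4 * 2), nrow = 4, dimnames = list(items, paste0("probe", 1:2)))
)
checkSameItems(toyList)
```

If the check fails, the matrices would need to be subset to their common items before subtyping.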

`SubtypingOmicsData` implements multi-omics subtyping based on the perturbation clustering algorithm of Nguyen et al. (2017) and Nguyen et al. (2019).
The input is a list of data matrices where each matrix represents the molecular measurements of a data type. The input matrices must have the same number of rows.

`SubtypingOmicsData` aims to find the optimal number of subtypes and the cluster membership of each sample from the integrated input data `dataList` through two processing stages:

1. Stage I: The algorithm first partitions each data type using the function `PerturbationClustering`. It then merges the connectivities across data types into similarity matrices. Both k-means and a similarity-based clustering algorithm, partitioning around medoids (`pam`), are used to partition the resulting similarity matrices. The algorithm returns the partitioning that agrees the most with the individual data types.

2. Stage II: The algorithm attempts to split each discovered group if there is a strong agreement between data types,
or if the subtyping in Stage I is very unbalanced.
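The merging step of Stage I can be illustrated with a small sketch. This is not the package's implementation: the toy partitions are made up, and averaging the connectivity matrices is an assumption chosen for illustration. It uses `pam` from the `cluster` package on the dissimilarity `1 - similarity`.

```r
library(cluster)  # provides pam()

# Connectivity matrix of one partition: entry (i, j) is 1 when items i and j
# share a cluster label and 0 otherwise.
connectivity <- function(labels) {
  outer(labels, labels, FUN = "==") * 1
}

# Toy partitions of six items from three hypothetical data types
partitions <- list(
  GE = c(1, 1, 1, 2, 2, 2),
  ME = c(1, 1, 2, 2, 2, 2),
  MI = c(1, 1, 1, 2, 2, 2)
)

# Assumption: average the connectivity matrices to get a similarity in [0, 1]
similarity <- Reduce(`+`, lapply(partitions, connectivity)) / length(partitions)

# Partition the merged similarity with PAM, treating 1 - similarity as distance
fit <- pam(as.dist(1 - similarity), k = 2, diss = TRUE)
fit$clustering
```

Items that are grouped together in most data types end up close in the merged similarity, so a medoid-based method can recover a consensus partition even when one data type disagrees.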

`SubtypingOmicsData` returns a list with at least the following components:

| Component | Description |
|---|---|
| `cluster1` | A vector of labels indicating the cluster to which each sample is allocated in Stage I. |
| `cluster2` | A vector of labels indicating the cluster to which each sample is allocated in Stage II. |
| `dataTypeResult` | A list of results for the individual data types. Each element of the list is the clustering result for one data type. |
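A quick way to see how Stage II refines Stage I is to cross-tabulate the two label vectors. The sketch below uses a hypothetical stand-in for the returned list; the label values, including the `"1-1"` style for split groups, are assumptions made for illustration only.

```r
# Hypothetical stand-in for the returned list; real labels come from
# SubtypingOmicsData and may look different.
toyResult <- list(
  cluster1 = c(1, 1, 1, 2, 2, 2),
  cluster2 = c("1-1", "1-1", "1-2", "2", "2", "2")
)

# Rows are Stage I groups, columns are Stage II groups
refinement <- table(stageI = toyResult$cluster1, stageII = toyResult$cluster2)
print(refinement)
```

Each row with more than one non-zero column corresponds to a Stage I group that was split in Stage II.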

1. H Nguyen, S Shrestha, S Draghici, T Nguyen. PINSPlus: a tool for tumor subtype discovery in integrated genomic data. Bioinformatics, 35(16):2843-2846, 2019.

2. T Nguyen, R Tagett, D Diaz, S Draghici. A novel method for data integration and disease subtyping. Genome Research, 27(12):2025-2039, 2017.

3. T Nguyen. Horizontal and vertical integration of bio-molecular data. PhD thesis, Wayne State University, 2017.

```
library(PINSPlus)
library(survival)

# Load the kidney cancer carcinoma (KIRC) data
data(KIRC)

# Perform subtyping on the multi-omics data
dataList <- list(as.matrix(KIRC$GE), as.matrix(KIRC$ME), as.matrix(KIRC$MI))
names(dataList) <- c("GE", "ME", "MI")
result <- SubtypingOmicsData(dataList = dataList)

# Change the perturbation clustering algorithm's arguments
result <- SubtypingOmicsData(
  dataList = dataList,
  clusteringMethod = "kmeans",
  clusteringOptions = list(nstart = 50)
)

# Plot the Kaplan-Meier curves and calculate the Cox p-value
cluster1 <- result$cluster1
cluster2 <- result$cluster2

# Colour map: groups shared with Stage I keep their label as colour index;
# groups that only exist in Stage II get indices above max(cluster1)
shared <- intersect(unique(cluster2), unique(cluster1))
newGroups <- setdiff(unique(cluster2), unique(cluster1))
a <- shared
names(a) <- shared
a[newGroups] <- seq_along(newGroups) + max(cluster1)
colors <- a[levels(factor(cluster2))]

coxFit <- coxph(
  Surv(time = Survival, event = Death) ~ as.factor(cluster2),
  data = KIRC$survival,
  ties = "exact"
)
mfit <- survfit(Surv(Survival, Death == 1) ~ as.factor(cluster2), data = KIRC$survival)
plot(
  mfit, col = colors,
  main = "Survival curves for KIRC, level 2",
  xlab = "Days", ylab = "Survival", lwd = 2
)
legend(
  "bottomright",
  legend = paste(
    "Cox p-value: ",
    round(summary(coxFit)$sctest[3], digits = 5),
    sep = ""
  )
)
legend(
  "bottomleft",
  fill = colors,
  legend = paste(
    "Group ",
    levels(factor(cluster2)), ": ", table(cluster2)[levels(factor(cluster2))],
    sep = ""
  )
)
```

