getDist: Calculates weighted distance matrix of multiple genomic data...


getDist    R Documentation

Calculates weighted distance matrix of multiple genomic data types

Description

Given multiple genomic data types measured across samples (e.g., continuous types such as gene expression, copy number, DNA methylation and miRNA expression, and binary types such as mutation), getDist calculates the survival-weighted distance metric among samples. Missing values (NA) and missing samples are allowed. The result is used as input to combineDist().

Usage

getDist(datasets, survdat = NULL, cv = FALSE, train.snames = NULL, type = NULL)

Arguments

datasets

A list object containing m data matrices representing m different genomic data types, each measured in a set of N_m samples, OR a MultiAssayExperiment object holding the desired data types. For a list of matrices, the rows of each matrix represent samples and the columns represent genomic features. Each data matrix is allowed to have different samples.

survdat

A matrix with two columns: the 1st column contains time and the 2nd column contains event information. Alternatively, this information can be provided as part of the colData of a MultiAssayExperiment.

cv

logical. If TRUE, train.snames cannot be NULL. Cross-validation will be performed on the train.snames samples: the dataset is split into training and test sets, and the respective matrices are returned.

train.snames

Required if cv=TRUE. A vector of sample names to be treated as training samples.

type

Defaults to NULL. Specify type="mut" if datasets is of length 1 and contains binary data only. See Details.
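The shape of the two main inputs can be sketched in base R as follows. This is a minimal illustration, assuming toy data: the object names (expr, mut, all.samples) and the values are hypothetical, and naming the rows of survdat by sample is an assumption made here so that survival records can be matched to the data matrices.

```r
# Hypothetical toy inputs; names and values are illustrative only.
set.seed(1)

# Continuous data type: 5 samples x 4 features (rows are samples).
expr <- matrix(rnorm(20), nrow = 5,
               dimnames = list(paste0("s", 1:5), paste0("gene", 1:4)))

# Binary data type measured on a partly different set of samples.
mut <- matrix(rbinom(12, 1, 0.3), nrow = 4,
              dimnames = list(paste0("s", 3:6), paste0("mut", 1:3)))

# datasets: a list with one matrix per data type.
datasets <- list(expr = expr, mut = mut)

# survdat: 1st column is time, 2nd column is the event indicator.
# Assumption: rows are named by sample.
all.samples <- Reduce(union, lapply(datasets, rownames))
survdat <- cbind(time  = rexp(length(all.samples), rate = 0.1),
                 event = rbinom(length(all.samples), 1, 0.6))
rownames(survdat) <- all.samples

str(datasets)
head(survdat)
```

Objects built this way have the layout described for the datasets and survdat arguments: samples in rows, features in columns, and different sample sets per data type.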

Details

getDist accepts continuous and binary data types, each supplied as a matrix within a list. If the list contains only a single binary matrix, set type="mut". All data types are standardized internally. The data types are not required to share the same samples: samples missing from a data type are filled with NA, and the returned weighted matrix covers the union of all samples.
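The handling of non-overlapping samples can be pictured with a small base R sketch. This only mimics the behaviour described above and is not survClust's internal code; pad_to_union is a hypothetical helper introduced for illustration.

```r
# Illustrative only: mimics the described NA padding, not survClust internals.
a <- matrix(1:6, nrow = 3,
            dimnames = list(c("s1", "s2", "s3"), c("f1", "f2")))
b <- matrix(c(0, 1, 1, 0), nrow = 2,
            dimnames = list(c("s3", "s4"), c("m1", "m2")))

# Union of sample names across data types.
all.samples <- union(rownames(a), rownames(b))

# Hypothetical helper: expand a matrix to all samples, filling gaps with NA.
pad_to_union <- function(mat, samples) {
  out <- matrix(NA_real_, nrow = length(samples), ncol = ncol(mat),
                dimnames = list(samples, colnames(mat)))
  out[rownames(mat), ] <- mat
  out
}

a.full <- pad_to_union(a, all.samples)
b.full <- pad_to_union(b, all.samples)

a.full  # row s4 is NA because s4 was not measured in a
b.full  # rows s1 and s2 are NA because they were not measured in b
```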

Value

  • cv=FALSE: returns dist.dat, a list of weighted data matrix/matrices.

  • cv=TRUE: returns dist.dat=list(train, all), where train is the weighted data matrix for the training samples and all is the whole matrix weighted according to the weights computed on the training dataset.

Author(s)

Arshi Arora

Examples

library(survClust)
dd <- getDist(simdat, simsurvdat)
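A cross-validation call might look like the sketch below. It assumes that the rows of the matrices in simdat are named by sample, as the datasets argument describes, and the 70% training split is an arbitrary choice for illustration. The call is guarded so the snippet does nothing when survClust is not installed.

```r
# Sketch of a cv=TRUE call; the 70/30 split is an arbitrary illustration.
if (requireNamespace("survClust", quietly = TRUE)) {
  library(survClust)

  # Assumption: rows of each matrix in simdat are named by sample.
  snames <- rownames(simdat[[1]])

  set.seed(42)
  train.snames <- sample(snames, size = floor(0.7 * length(snames)))

  # With cv=TRUE, train.snames must be supplied.
  dd.cv <- getDist(simdat, simsurvdat, cv = TRUE, train.snames = train.snames)
  str(dd.cv, max.level = 1)
}
```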


arorarshi/survClust documentation built on April 21, 2024, 1:51 p.m.