bnclustOmics: Bayesian network based clustering of multi-omics data

bnclustOmics R Documentation

Bayesian network based clustering of multi-omics data

Description

This function implements Bayesian network-based clustering of multi-omics data. The mandatory input is a list of matrices consisting of binary, ordinal or continuous variables. Each matrix corresponds to one omics type. At least one matrix with continuous variables must be present. Optional input includes prior information about interactions between genes and gene products, which can be passed via the parameters blacklist and edgepmat. Interactions in blacklist are excluded from the search space. edgepmat imposes a graphical prior that penalizes the specified interactions by a given penalization factor. The output includes cluster assignments and maximum a posteriori (MAP) directed acyclic graphs (DAGs) representing the discovered clusters. Optionally, the output may include posterior probabilities of all edges in the discovered graphs.
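As a minimal sketch of how the two optional prior inputs might be assembled in base R, the following builds a blacklist and a penalization matrix for four made-up nodes. The node names, the orientation (entry [i, j] referring to the edge i -> j) and the convention that a factor of 1 means "no penalty" are assumptions made for illustration only; the row and column naming required for real data is not described on this page and should be checked before use.

```r
# Hypothetical node names: two mutation nodes and two transcript nodes.
nodes <- c("M_g1", "M_g2", "T_g1", "T_g2")
n <- length(nodes)

# Blacklist: 1 marks an edge excluded from the search space, 0 allows it.
# Assumption: entry [i, j] refers to the directed edge i -> j.
blacklist <- matrix(0, n, n, dimnames = list(nodes, nodes))
# Forbid all edges pointing from transcript nodes into mutation nodes.
blacklist[c("T_g1", "T_g2"), c("M_g1", "M_g2")] <- 1

# Penalization matrix. Assumption: 1 means "no penalty" and larger
# values penalize an edge more strongly.
edgepmat <- matrix(2, n, n, dimnames = list(nodes, nodes))
# A made-up interaction taken as known a priori stays unpenalized.
edgepmat["M_g1", "T_g1"] <- 1

print(blacklist)
print(edgepmat)
```

Both matrices would then be passed through the blacklist and edgepmat arguments shown under Usage.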

Usage

bnclustOmics(
  omicdata,
  bnnames,
  blacklist = NULL,
  edgepmat = NULL,
  kclust = 2,
  chixi = 0,
  seed = 100,
  err = 1e-06,
  maxEM = 10,
  hardlim = 6,
  deltahl = 5,
  nit = 5,
  epmatrix = TRUE,
  plus1it = 4,
  startpoint = "mclustPCA",
  baseprob = 0.4,
  commonspace = TRUE,
  verbose = TRUE
)

Arguments

omicdata

a list of matrices corresponding to omics types. For example, "M" (mutations), "CN" (copy numbers), "T" (transcriptome), "P" (proteome) and "PP" (phosphoproteome); at least one continuous type must be present

bnnames

object of class 'bnInfo'; see constructor function bnInfo

blacklist

adjacency matrix containing information about which edges will be blacklisted in structure search

edgepmat

penalization matrix of the edges in structure learning

kclust

the number of clusters (mixture components)

chixi

prior pseudocounts used for computing parameters for binary nodes

seed

integer number set for reproducibility

err

convergence criterion

maxEM

maximum number of outer EM iterations (structural search)

hardlim

maximum number of parents per node when learning networks

deltahl

additional number of parents when sampling from the common search space

nit

number of internal iterations (of parameter estimation) in the EM

epmatrix

(logical) indicates whether the matrices containing posterior probabilities of single edges should be returned

plus1it

maximum number of search space expansion iterations when performing structure search

startpoint

defines which algorithm is used to set the starting cluster memberships; possible values are "random", "mclustPCA" and "mclust"

baseprob

defines the base probability of cluster membership when "mclustPCA" or "mclust" is used as the starting point

commonspace

(logical) defines whether sampling is performed from the common search space

verbose

(logical) defines whether output messages should be printed
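
The requirements on omicdata can be sketched in base R as follows. This is an illustrative example under stated assumptions, not part of the package: samples are assumed to be in rows and variables in columns, the gene names are made up, the type codes "b" and "c" (as used in the Examples section) are assumed to stand for binary and continuous, and check_omicdata is a hypothetical helper.

```r
set.seed(1)
n_samples <- 20
samples <- paste0("s", seq_len(n_samples))

# Binary mutation matrix (0/1) and continuous transcriptome matrix.
M <- matrix(rbinom(n_samples * 3, size = 1, prob = 0.2), nrow = n_samples,
            dimnames = list(samples, paste0("g", 1:3)))
Tm <- matrix(rnorm(n_samples * 4), nrow = n_samples,
             dimnames = list(samples, paste0("g", 1:4)))

# One named matrix per omics type.
omicdata <- list(M = M, T = Tm)
types <- c("b", "c")  # one type code per matrix

# Hypothetical helper (not part of the package): basic sanity checks.
check_omicdata <- function(omicdata, types) {
  stopifnot(is.list(omicdata), length(omicdata) == length(types))
  stopifnot(all(vapply(omicdata, is.matrix, logical(1))))
  # All matrices must describe the same number of samples.
  stopifnot(length(unique(vapply(omicdata, nrow, integer(1)))) == 1)
  # At least one continuous omics type is required.
  stopifnot("c" %in% types)
  invisible(TRUE)
}

check_omicdata(omicdata, types)
```

Running such checks before the call can catch a missing continuous matrix early, rather than partway through a long structure search.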

Value

object of class 'bnclustOmics' containing the results of Bayesian network-based clustering: cluster assignments and the networks representing the clusters

Author(s)

Polina Suter, Jack Kuipers

Examples

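library(bnClustOmics)

# declare "M" (mutations) as type "b" and "T" (transcriptome) as type "c"
bnnames <- bnInfo(simdata, c("b", "c"), c("M", "T"))

# fit a two-cluster model, starting from "mclustPCA" memberships
fit <- bnclustOmics(simdata, bnnames, maxEM = 4, kclust = 2, startpoint = "mclustPCA")
clusters(fit)
# compare the estimated memberships with simclusters
checkmembership(clusters(fit), simclusters)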
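
To illustrate what comparing estimated and reference memberships involves, here is a base R sketch using a contingency table. It is not the implementation of checkmembership, and the label vectors are made up. Because cluster numbering is arbitrary, each estimated cluster is mapped to the reference cluster it overlaps with most before agreement is computed.

```r
# Made-up labels for eight samples; cluster numbering is arbitrary.
estimated <- c(1, 1, 1, 2, 2, 2, 2, 1)
truth     <- c(2, 2, 2, 1, 1, 1, 1, 1)

# Cross-tabulate estimated against reference cluster labels.
tab <- table(estimated = estimated, truth = truth)
print(tab)

# Map each estimated cluster to the reference cluster with the largest
# overlap (a simple majority rule; the mapping need not be one-to-one).
best_match <- apply(tab, 1, which.max)
relabelled <- as.numeric(colnames(tab))[best_match[as.character(estimated)]]

# Fraction of samples that agree under this mapping.
accuracy <- mean(relabelled == truth)
print(accuracy)
```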


bnClustOmics documentation built on Aug. 5, 2022, 5:11 p.m.