smoothMutations_LabelProp: This function applies the random walk with restart...


View source: R/smooMutationPropagation.R

Description

This function applies the random walk with restart propagation algorithm to a matrix of patient profiles.

Usage

smoothMutations_LabelProp(mat, net, numCores = 1L)

Arguments

mat

(data.frame) Sparse matrix of binarized patient profiles, with rows being unique genes and columns being unique patients. Entry [i,j] is set to 1 if patient j has a mutation in gene i.

net

(data.frame) Interaction network provided as an adjacency matrix (i.e. symmetric)

numCores

(integer) Number of cores for parallel processing

Details

A network is an undirected graph G defined by a set of nodes corresponding to genes, and edges connecting nodes for which there is experimental evidence of interaction. A priori nodes are genes for which some information is already known. A novel node is a candidate for association with the a priori nodes, based on their information. A node prediction task aims to detect novel nodes, and propagation techniques are widely applied for this purpose.

Network-based propagation algorithms for node prediction transfer information from the a priori nodes to every other node in the network. Each node receives an imputation value, which measures how much information it received. The prediction is based on the guilt-by-association principle: a node with a high imputation value has a high probability of being associated with the a priori nodes. For example, in a house where only room A has a heater, if room B is the second hottest room, then B is close to A and there is a high probability that they share a door or wall.

These algorithms exploit the global topology of the network. However, when they are applied to detect whether unknown nodes are functionally associated with known ones, they may suffer from a drawback, depending on the context. In biology, two functionally related fragments interact physically (direct interaction) or interact indirectly through one or very few mediators. Exploring similarities between nodes that are too far apart can therefore introduce noise into the prediction. We apply a random walk with restart propagation algorithm whose resolution is set to 0.2, so that only the close neighbours of the a priori nodes receive high values.
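The propagation idea described in this section can be illustrated with a minimal base-R sketch of a generic random walk with restart. This is not the package's internal implementation: the function name `rwr_sketch`, the column normalisation, the convergence tolerance, and the use of 0.2 as the restart probability are all assumptions made for illustration. In particular, whether the "resolution" of 0.2 mentioned above corresponds to this restart parameter is not confirmed by this page.

```r
# Generic random walk with restart (RWR) sketch in base R.
# ASSUMPTION: rwr_sketch, the column normalisation, and restart = 0.2
# are illustrative choices, not netDx's actual internals.
rwr_sketch <- function(adj, seeds, restart = 0.2, tol = 1e-8, max_iter = 1000L) {
  # Column-normalise the adjacency matrix into a transition matrix
  col_sums <- colSums(adj)
  col_sums[col_sums == 0] <- 1          # guard against isolated nodes
  W <- sweep(adj, 2, col_sums, "/")

  p0 <- seeds / sum(seeds)              # initial mass sits on the a priori nodes
  p <- p0
  for (i in seq_len(max_iter)) {
    # Walk one step with probability (1 - restart), else jump back to the seeds
    p_next <- (1 - restart) * as.vector(W %*% p) + restart * p0
    converged <- sum(abs(p_next - p)) < tol
    p <- p_next
    if (converged) break
  }
  names(p) <- rownames(adj)
  p
}

# Toy path network A - B - C - D, with A as the only a priori node
adj <- matrix(0, 4, 4, dimnames = list(LETTERS[1:4], LETTERS[1:4]))
adj["A", "B"] <- adj["B", "A"] <- 1
adj["B", "C"] <- adj["C", "B"] <- 1
adj["C", "D"] <- adj["D", "C"] <- 1

scores <- rwr_sketch(adj, seeds = c(1, 0, 0, 0))
print(round(scores, 3))
```

In this toy network, the scores of the non-seed nodes fall off with their distance from A: B, the direct neighbour, receives more than C, which receives more than D. A larger restart value would concentrate the scores even more tightly around the a priori node.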

Value

(data.frame) Continuous matrix of patient profiles in which each gene has the final propagation score

Examples

suppressWarnings(suppressMessages(require(MultiAssayExperiment)))
require(doParallel)

# load mutation and phenotype data
genoFile <- system.file("extdata","TGCT_mutSmooth_geno.txt",package="netDx")
geno <- read.delim(genoFile,sep="\t",header=TRUE,as.is=TRUE)
phenoFile <- system.file("extdata", "TGCT_mutSmooth_pheno.txt",
			package="netDx")
pheno <- read.delim(phenoFile,sep="\t",header=TRUE,as.is=TRUE)
rownames(pheno) <- pheno$ID

# load interaction nets to smooth over
require(BiocFileCache)
netFileURL <- paste("https://download.baderlab.org/netDx/",
	"supporting_data/CancerNets.txt",sep="")
cache <- rappdirs::user_cache_dir(appname = "netDx")
bfc <- BiocFileCache::BiocFileCache(cache,ask=FALSE)
netFile <- bfcrpath(bfc,netFileURL)
cancerNets <- read.delim(netFile,sep="\t",header=TRUE,as.is=TRUE)
# smooth mutations
prop_net <- smoothMutations_LabelProp(geno,cancerNets,numCores=1L)

BaderLab/netDx documentation built on Sept. 26, 2021, 9:13 a.m.