# Quantify deregulation of pathways in cancer

### Description

Pathifier is an algorithm that infers pathway deregulation scores for each tumor sample on the basis of expression data. This score is determined, in a context-specific manner, for every particular dataset and type of cancer that is being investigated. The algorithm transforms gene-level information into pathway-level information, generating a compact and biologically relevant representation of each sample.

### Usage

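`quantify_pathways_deregulation(data, allgenes, syms, pathwaynames, normals, ranks, attempts, maximize_stability, logfile, samplings, min_exp, min_std)`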

### Arguments

| Argument | Description |
| --- | --- |
| `data` | The n x m mRNA expression matrix, where n is the number of genes and m is the number of samples. |
| `allgenes` | A list of n gene identifiers. |
| `syms` | A list of p pathways. Each pathway is a list of the genes it contains (as they appear in `allgenes`). |
| `pathwaynames` | The names of the p pathways. |
| `normals` | A list of m logicals: true for a normal sample, false for a tumor sample. |
| `ranks` | External knowledge on the ranking of the m samples, if available (used as an initial guess). |
| `attempts` | Number of runs used to determine stability. |
| `maximize_stability` | If true, discard components that lead to low stability under sampling noise. |
| `logfile` | Name of the file the log should be written to (stdout is used if empty). |
| `samplings` | A matrix specifying the samples that should be chosen in each sampling attempt. If `samplings` is NULL, a random matrix is chosen. |
| `min_exp` | The minimal expression considered a real signal. Any values below it are thresholded to `min_exp`. |
| `min_std` | The minimal allowed standard deviation of each gene. Genes with a lower standard deviation are divided by `min_std` instead of their actual standard deviation. (Recommended: set `min_std` to the level of technical noise.) |
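
The arguments above can be sketched with toy data. The block below is a minimal base-R illustration under stated assumptions, not the package's own code: it builds inputs with the documented shapes and applies the thresholding and standard-deviation floor that `min_exp` and `min_std` describe. All values are made up, the names `thresholded`, `gene_mean`, `gene_sd_floored` and `z` are hypothetical, and standardizing against the normal samples is an assumption suggested by the `xm` and `xs` outputs listed under Value.

```r
# Toy inputs shaped like the documented arguments (all values are invented).
set.seed(1)
n_genes <- 6
m_samples <- 8

allgenes <- paste0("GENE", seq_len(n_genes))
data <- matrix(runif(n_genes * m_samples, min = 2, max = 10),
               nrow = n_genes, ncol = m_samples)

# p = 2 pathways, each a set of identifiers taken from allgenes.
syms <- list(allgenes[1:3], allgenes[3:6])
pathwaynames <- c("PATHWAY_A", "PATHWAY_B")

# First three samples are normal tissue, the remaining five are tumors.
normals <- c(rep(TRUE, 3), rep(FALSE, 5))

# Illustrative thresholds, chosen arbitrarily for this sketch.
min_exp <- 4
min_std <- 0.4

# Values below min_exp are raised to min_exp (pmax keeps the matrix shape).
thresholded <- pmax(data, min_exp)

# ASSUMPTION: per-gene mean and standard deviation are taken over normals.
gene_mean <- rowMeans(thresholded[, normals, drop = FALSE])
gene_sd <- apply(thresholded[, normals, drop = FALSE], 1, sd)

# Genes whose standard deviation is below min_std are divided by min_std.
gene_sd_floored <- pmax(gene_sd, min_std)

# Row-wise standardization: vectors recycle down columns, one value per gene.
z <- (thresholded - gene_mean) / gene_sd_floored
```

The floor on the standard deviation keeps nearly constant genes from being inflated into large z-scores, which is the motivation the `min_std` description gives for tying it to technical noise.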

### Value

| Value | Description |
| --- | --- |
| `scores` | The deregulation scores, the main output of pathifier. |
| `genesinpathway` | The genes of each pathway used to devise its deregulation score. |
| `newmeanstd` | Average standard deviation after omitting noisy components. |
| `origmeanstd` | Original average standard deviation, before omitting noisy components. |
| `pathwaysize` | The number of components used to devise the pathway score. |
| `curves` | The principal curve learned for every pathway. |
| `curves_order` | The order of the points of the principal curve learned for every pathway. |
| `z` | Z-scores of the expression matrix, used to learn the principal curve. |
| `compin` | The components not omitted due to noise. |
| `xm` | The average expression over all normal samples. |
| `xs` | The standard deviation of expression over all normal samples. |
| `center` | The centering used by the PCA. |
| `rot` | The matrix of variable loadings of the PCA. |
| `pctaken` | The number of principal components used. |
| `samplings` | A matrix specifying the samples that should be chosen in each sampling attempt. |
| `sucess` | Pathways for which a deregulation score was successfully computed. |
| `logfile` | Name of the file the log was written to. |

### Author(s)

Yotam Drier <drier.yotam@mgh.harvard.edu>

Maintainer: Assif Yitzhaky <assif.yitzhaky@weizmann.ac.il>

### References

Drier Y, Sheffer M, Domany E. Pathway-based personalized analysis of cancer. *Proceedings of the National Academy of Sciences*, 2013, 110(16):6388-6393. (www.pnas.org/cgi/doi/10.1073/pnas.1219651110)

More information: http://www.weizmann.ac.il/pathifier/

### Examples

```r
library(pathifier)
data(KEGG)     # Two pathways of the KEGG database
data(Sheffer)  # The colorectal data of Sheffer et al.
# Compute pathway deregulation scores for the Sheffer samples
PDS <- quantify_pathways_deregulation(
  sheffer$data, sheffer$allgenes, kegg$gs, kegg$pathwaynames,
  sheffer$normals, attempts = 100, logfile = "sheffer.kegg.log",
  min_exp = sheffer$minexp, min_std = sheffer$minstd
)
```