evaluate_coex_partners: Summarise whether the coexpression partners of each...
In russHyde/coxpresdbr: Parses and Summarises Data from coxpresdb.jp


For each source gene in the dataset, there is a set of target genes (the neighbourhood of that source gene). A two-tailed p-value should be defined for every source and target gene, with respect to some experimental comparison.

evaluate_coex_partners(x, coex_partners)

`x`	A data-frame containing at least the columns `gene_id`, `p_value` and `direction`. The `p_value` column should contain two-tailed p-values. There should be no duplicate gene identifiers in the data-frame.
`coex_partners`	A subset of the coxpresdb.jp database containing the coexpression partners of a set of source genes, as returned by `get_coex_partners`. Must contain columns `source_id` and `target_id`.
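
As an illustration of the input shapes described above, the following sketch builds toy versions of both arguments and checks the stated constraints. All gene identifiers and values are invented, the numeric +1 / -1 coding of `direction` is an assumption, and `check_inputs` is a hypothetical helper that is not part of coxpresdbr. A real `coex_partners` object comes from `get_coex_partners` and may carry extra columns or attributes that a plain data-frame lacks.

```r
# Toy inputs; all identifiers and values are invented for illustration.
x <- data.frame(
  gene_id   = c("g1", "g2", "g3", "g4"),
  p_value   = c(0.01, 0.20, 0.05, 0.80),  # two-tailed p-values
  direction = c(1, 1, -1, -1),            # assumed coding: +1 up, -1 down
  stringsAsFactors = FALSE
)

coex_partners <- data.frame(
  source_id = c("g1", "g1", "g2"),
  target_id = c("g2", "g3", "g4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of coxpresdbr): checks the documented constraints.
check_inputs <- function(x, coex_partners) {
  stopifnot(
    all(c("gene_id", "p_value", "direction") %in% colnames(x)),
    all(c("source_id", "target_id") %in% colnames(coex_partners)),
    !anyDuplicated(x$gene_id),             # no duplicate gene identifiers
    all(x$p_value >= 0 & x$p_value <= 1)   # p-values must be probabilities
  )
  invisible(TRUE)
}

check_inputs(x, coex_partners)
```

Running a check like this before the real call can surface duplicated identifiers early, which would otherwise make the neighbourhood of a source gene ambiguous.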

For each source gene, this function combines the p-values observed in its neighbourhood into a single summary score. We use the sum-of-z-scores method, as used in the 'metap' package (although we have reimplemented it for numerical stability). For a source gene with 'n' neighbouring genes, a z-score is computed for each neighbouring gene, and the combined score is sumz = sum(z_i : i = 1 .. n) / sqrt(n). A two-tailed p-value corresponding to 'sumz' is also returned. The z-score for a given neighbour is obtained by comparing its p-value against the standard normal distribution.
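
The combination step for a single neighbourhood can be sketched in base R as follows. This is not the package's implementation: `combine_z` is a hypothetical name, the way `direction` supplies the sign of each z-score is an assumption, and working on the log scale through `log.p = TRUE` is just one common way to keep very small p-values from underflowing.

```r
# Hypothetical helper (not part of coxpresdbr): sum-of-z combination for the
# neighbours of one source gene.
combine_z <- function(p_values, directions) {
  stopifnot(length(p_values) == length(directions), length(p_values) > 0)
  n <- length(p_values)

  # Magnitude: upper-tail normal quantile of p / 2, computed from log(p / 2)
  # so that tiny p-values are handled on the log scale.
  magnitude <- stats::qnorm(log(p_values) - log(2), lower.tail = FALSE, log.p = TRUE)

  # Sign: taken from `direction` (assumed numeric, positive = up, negative = down).
  z <- sign(directions) * magnitude

  # Combined score as given in the text: sum of z-scores over sqrt(n).
  sumz <- sum(z) / sqrt(n)

  # Two-tailed p-value corresponding to the combined score.
  p_combined <- 2 * stats::pnorm(abs(sumz), lower.tail = FALSE)

  list(n = n, sumz = sumz, p_value = p_combined)
}

# Three neighbours, two changing in one direction and one in the other.
res <- combine_z(p_values = c(0.01, 0.20, 0.05), directions = c(1, 1, -1))
res$sumz
res$p_value
```

Dividing by sqrt(n) keeps the combined score on the standard normal scale when the individual z-scores are independent standard normal values, so neighbourhoods of different sizes remain comparable.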

A data-frame containing a row for each source gene in the input, together with the number of partner genes, the combined z-score across all partner genes, and the two-tailed p-value equivalent to this z-score.
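
To make the shape of such a result concrete, here is a minimal base-R sketch that produces one row per source gene from toy inputs. It is an illustration only: `summarise_partners` and the output column names (`source_id`, `n_partners`, `z_score`, `p_value`) are hypothetical and may differ from what the package returns, and the signed z-score convention is an assumption.

```r
# Toy inputs; identifiers and values are invented for illustration.
x <- data.frame(
  gene_id   = c("g1", "g2", "g3", "g4"),
  p_value   = c(0.50, 0.05, 0.05, 0.05),
  direction = c(1, 1, -1, -1),
  stringsAsFactors = FALSE
)
partners <- data.frame(
  source_id = c("g1", "g1", "g2"),
  target_id = c("g2", "g3", "g4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of coxpresdbr): one summary row per source gene.
summarise_partners <- function(x, partners) {
  # Attach each target gene's p-value and direction to its partner pair.
  m <- merge(partners, x, by.x = "target_id", by.y = "gene_id")

  # Signed z-score per target gene (assumed convention).
  m$z <- sign(m$direction) * stats::qnorm(m$p_value / 2, lower.tail = FALSE)

  # Group the z-scores by source gene and combine them as sum(z) / sqrt(n).
  per_source <- split(m$z, m$source_id)
  z_score <- vapply(per_source, function(z) sum(z) / sqrt(length(z)), numeric(1))

  data.frame(
    source_id  = names(per_source),
    n_partners = unname(lengths(per_source)),
    z_score    = unname(z_score),
    p_value    = unname(2 * stats::pnorm(abs(z_score), lower.tail = FALSE)),
    stringsAsFactors = FALSE
  )
}

result <- summarise_partners(x, partners)
result
```

In this toy example the two partners of `g1` have equal p-values and opposite directions, so their z-scores cancel and the combined score for `g1` is zero.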

russHyde/coxpresdbr documentation built on Dec. 24, 2019, 11:59 a.m.