Perform an integrated analysis of gene expression (GE) and copy number alteration (CNA)

Description

The function finds CNA-driven differentially expressed genes and returns the corresponding p-values, false discovery rates, and associated statistics. The result includes three tables, which collect information for gain-driven genes, loss-driven genes, and genes driven by both.

Usage

find_cna_driven_gene(gene_cna, gene_exp, gain_prop = 0.2, loss_prop = 0.2,
  progress = TRUE, progress_width = 32, parallel = FALSE)

Arguments

gene_cna

Joint CNA table from create_gene_cna.

gene_exp

Joint gene expression table from create_gene_exp.

gain_prop

Minimum proportion of samples exhibiting gain for a gene to be considered CNA-gain. Default is 0.2.

loss_prop

Minimum proportion of samples exhibiting loss for a gene to be considered CNA-loss. Default is 0.2.

progress

Whether to display a progress bar. Default is TRUE.

progress_width

The text width of the progress bar. Default is 32 characters, as given in the usage above.

parallel

Whether to enable parallelism through plyr. A parallel backend must be registered beforehand. Default is FALSE. See example for more information.
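As a minimal sketch of registering a parallel backend before calling the function: plyr runs in parallel through the foreach framework, so any registered foreach backend should serve. The choice of doParallel and of two cores here is an assumption for illustration; this page does not name a specific engine.

```r
# Assumption: the doParallel package is installed. It is one of several
# foreach backends and is not prescribed by this page.
library(doParallel)

# Register a backend with two workers (the core count is illustrative).
registerDoParallel(cores = 2)

# With a backend registered, the analysis can be asked to run in parallel:
# cna_driven_genes <- find_cna_driven_gene(gene_cna, gene_exp, parallel = TRUE)
```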

Details

A gene is considered CNA-gain if the proportion of samples exhibiting gain (samples with gain_loss = 1) exceeds the threshold gain_prop. Likewise, a gene is considered CNA-loss if the proportion of samples exhibiting loss (samples with gain_loss = -1) exceeds the threshold loss_prop.
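The thresholding rule above can be sketched in base R as follows. This is an illustration, not the package's internal code: the toy matrix, the variable names, the strict comparison, and the handling of missing calls are all assumptions.

```r
# Toy CNA calls: rows are genes, columns are samples.
# 1 = gain, -1 = loss, 0 = neutral (hypothetical values).
cna <- rbind(
    BRCA2 = c(A = 1, B = 1, C = 1, D = -1, E = -1),
    TP53  = c(A = 0, B = -1, C = -1, D = -1, E = 0),
    GNPAT = c(A = 1, B = 1, C = 0, D = 0, E = 0)
)
gain_prop <- 0.2
loss_prop <- 0.2

# Proportion of samples called as gain or loss for each gene.
# Assumption: missing calls are ignored when computing the proportion.
prop_gain <- rowMeans(cna == 1, na.rm = TRUE)
prop_loss <- rowMeans(cna == -1, na.rm = TRUE)

# Assumption: a strict comparison against the threshold.
is_cna_gain <- prop_gain > gain_prop
is_cna_loss <- prop_loss > loss_prop
```

In this toy input BRCA2 passes both thresholds, TP53 passes only the loss threshold, and GNPAT passes only the gain threshold.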

When performing the t-test, sample grouping depends on whether the analysis scenario is CNA-gain driven or CNA-loss driven. In the CNA-gain driven scenario, samples are split into two groups: the CNA-gain samples and all other samples. In the CNA-loss driven scenario, the two groups are the CNA-loss samples and all other samples. Genes that appear in both scenarios are collected into a third table and excluded from their original tables.
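The grouping described above can be sketched for a single gene in base R. The data and variable names are hypothetical, and the use of the default (Welch) form of t.test is an assumption; the page does not state which variant the function uses.

```r
# Hypothetical expression values and CNA calls for one gene in eight samples.
expr     <- c(2.1, 1.8, 2.5, 2.2, 0.3, -0.1, 0.4, 0.0)
cna_call <- c(1, 1, 1, 1, 0, 0, -1, 0)

# CNA-gain driven scenario: gain samples versus all other samples.
gain_group  <- expr[cna_call == 1]
other_group <- expr[cna_call != 1]
gain_test   <- t.test(gain_group, other_group)
gain_test$p.value

# Genes found in both scenarios go to a third set and leave the other two.
gain_genes <- c("BRCA2", "GNPAT")   # hypothetical gain-driven hits
loss_genes <- c("BRCA2", "TP53")    # hypothetical loss-driven hits
both_genes <- intersect(gain_genes, loss_genes)
gain_only  <- setdiff(gain_genes, both_genes)
loss_only  <- setdiff(loss_genes, both_genes)
```

The CNA-loss driven scenario follows the same pattern with cna_call == -1 defining the first group.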

See the vignette for a thorough example of how to use this function.

Value

List of three data.table objects for CNA-driven scenarios: gain, loss, and both, which can be accessed by names: 'gain_driven', 'loss_driven' and 'both'.

Examples

library(data.table)

## Create gene_exp and gene_cna manually. The following shows an example
## consisting of 3 genes (BRCA2, TP53, and GNPAT) and 5 samples (A to E).

gene_exp <- data.table(
    GENE = c("BRCA2", "TP53", "GNPAT"),
    A = c(-0.95, 0.89, 0.21), B = c(1.72, -0.05, NA),
    C = c(-1.18, 1.15, 2.47), D = c(-1.24, -0.07, 1.2),
    E = c(1.01, 0.93, 1.54)
)
gene_cna <- data.table(
    GENE = c("BRCA2", "TP53", "GNPAT"),
    A = c(1, 1, NA), B = c(-1, -1, 1),
    C = c(1, -1, 1), D = c(1, -1, -1),
    E = c(0, 0, -1)
)


## Find CNA-driven genes

cna_driven_genes <- find_cna_driven_gene(
    gene_cna, gene_exp, progress=FALSE
)

# Gain driven genes
cna_driven_genes$gain_driven

# Loss driven genes
cna_driven_genes$loss_driven

# Genes that appear in both gain and loss records
cna_driven_genes$both