facopy: Feature-Based Copy Number Association Analysis


View source: R/facopy_internal_v45.r

Description

Main function in the facopy package. It tests for statistical association between copy number data and further variables at each genomic feature of interest.

Usage

facopy(fad, alteration, model, nullModel = NULL,
       modelPart = c("response","predictor","unknown","whole")[1],
       strata = NULL, toOrdered = NULL, toIntervals = NULL, 
       sel = NULL, plot = FALSE, pvalThr = 0.05, db = NULL, 
       link = c("logit", "probit")[1], parametric = FALSE,
       design = c("binary", "versus", "lvog")[1],
       FUN, ...)

Arguments

fad

facopyInfo object with a certain study's facopy data. Feature information should have been added beforehand on this object (see addFeatures).

alteration

A character, the name of the combination of alterations to be considered for the analysis.
It should be one of the following:
- amplifications: All amplifications (CN>2).
- deletions: All deletions (CN<2).
- loh: All loss of heterozygosity (LOH), regardless of copy number.
- cnas: All copy number alterations (CN<>2).
- any: Any kind of alteration.
- all: Any kind of alteration, same as any.
- onlygain: Only non-LOH amplifications.
- someloss: All deletions plus LOH alterations.
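
The categories above can be read as logical combinations of two per-feature calls: the copy number (CN) and the LOH status. The following base R sketch only illustrates those definitions; it is not facopy's internal code, and the helper name classify_alteration and its inputs cn and loh are hypothetical.

```r
# Hypothetical helper: maps copy number and LOH calls to the alteration
# categories listed above. Not part of the facopy package.
classify_alteration <- function(cn, loh) {
  amp <- cn > 2
  del <- cn < 2
  list(
    amplifications = amp,
    deletions      = del,
    loh            = loh,
    cnas           = amp | del,
    any            = amp | del | loh,
    all            = amp | del | loh,
    onlygain       = amp & !loh,
    someloss       = del | loh
  )
}

# Five toy features: copy number and LOH status per feature.
cn  <- c(2, 3, 1, 2, 4)
loh <- c(FALSE, FALSE, TRUE, TRUE, TRUE)
calls <- classify_alteration(cn, loh)
calls$onlygain
calls$someloss
```

Note that a copy-neutral feature with LOH (the fourth one here) counts under loh, any and someloss, but not under cnas.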

model

A character. Model, or part of it (response or predictor), whose association will be measured.
If modelPart="response", the name of a single variable, representing the response of the association model.
If modelPart="predictor", a linear predictor of the copy number, using a combination of the variables.
If modelPart="unknown", the association is measured as the strength of the relationship between copy number and the specified combination of variables. The model is limited to x1 + ... + xn | strata, see independence_test.
If modelPart="whole", a character representation of a formula in compact symbolic form. Use the @ (at) symbol to refer to the copy number variable.
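
To make the four modelPart options concrete, the sketch below lists one plausible model string for each. The variable names stage, age and sex are assumptions (only stage appears in the package example), and replacing @ with a column named cn merely illustrates the placeholder idea in base R; it is not necessarily how facopy processes the string.

```r
# Illustrative model strings, one per modelPart value.
# Variable names (stage, age, sex) are assumptions for this sketch.
models <- c(
  response  = "stage",            # single response variable
  predictor = "age + sex",        # linear predictor of the copy number
  unknown   = "age + sex",        # combination of variables (x1 + ... + xn)
  whole     = "stage ~ @ + age"   # full formula; @ marks the copy number
)

# Base R illustration of the @ placeholder: swap it for a column name
# (here the hypothetical name "cn") and parse the result as a formula.
whole_formula <- as.formula(gsub("@", "cn", models[["whole"]], fixed = TRUE))
all.vars(whole_formula)
```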

nullModel

A character. The null model against which to evaluate the fit of the association model. Only used if modelPart="predictor" or modelPart="whole".

modelPart

A character. Indicates what part of the association model is defined in the model parameter.

strata

A character. The name of a categorical variable that defines stratification blocks in the association model. Only used if modelPart="response" or modelPart="unknown".

toOrdered

Certain categorical variables can also be understood as ordered. This parameter takes a list of named vectors, where the name of each vector is a variable name and its contents reflect the quantification of the variable values, in the same order as defined in addVariables.

toIntervals

Quantitative variables can be broken down into intervals. This parameter takes a list of named vectors, where the name of each vector is a variable name and its contents reflect the breaks of the variable values (excluding bottom and top limits).
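
Both toOrdered and toIntervals take a list of named vectors. The sketch below shows the shape of such lists and uses base R's cut to illustrate what breaks that exclude the bottom and top limits mean. The variable names (stage, age) and their values are hypothetical, and cut stands in for whatever facopy does internally.

```r
# Shape of the two arguments; variable names and values are assumptions.
to_ordered   <- list(stage = c(1, 2, 3, 4))  # scores for the stage values, in their defined order
to_intervals <- list(age = c(40, 60))        # inner breaks only

# Base R illustration: inner breaks 40 and 60 yield three intervals once
# the open-ended bottom and top limits are added.
age <- c(35, 40, 52, 61, 78)
age_bins <- cut(age, breaks = c(-Inf, to_intervals$age, Inf))
table(age_bins)
```

With n inner breaks the variable ends up in n + 1 intervals, so two breaks give three groups here.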

sel

A vector of feature names on which to perform the association. Leave as NULL for genome-wide association over all the features specified in addFeatures. Plotting will only be done if sel=NULL.

plot

A logical indicating whether to output a composite plot with an arm-wise display of genome-wide alteration frequencies. If the model consists of a single variable, frequencies will be broken down by variable value. Features with significant associations and additional information pulled from external databases can be displayed as overlaid layers.

pvalThr

Significant associations under this threshold will be shaded in the output plot. Only used if plot=TRUE.

db

An optional string representing the name of a database whose data will be overlaid in the output plot. Typically, the format is "[database]_[dataset]". The total amplification plus deletion frequencies will be displayed unless alteration indicates either amplification or deletion. In such cases, only the matching alterations are displayed. Only used if plot=TRUE. See getFacopyInfo for a list of available data sets.

link

A character, link function to be used with the multinomial error distribution in logistic regression models. See glm.

parametric

A logical that indicates whether to perform one-way ANOVA instead of Kruskal-Wallis in the association of copy number with quantitative variables.
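
The two tests that parametric switches between can be compared in base R. This sketch uses simulated data with hypothetical names (cn_group, value) and calls aov and kruskal.test directly; it shows the statistical choice, not facopy's own code path.

```r
set.seed(1)
# Simulated data: a quantitative variable measured in three copy number groups.
cn_group <- factor(rep(c("deletion", "neutral", "amplification"), each = 10))
value    <- rnorm(30, mean = rep(c(0, 0.5, 1), each = 10))

# parametric = TRUE would correspond to a one-way ANOVA ...
p_anova <- summary(aov(value ~ cn_group))[[1]][["Pr(>F)"]][1]

# ... and parametric = FALSE to the rank-based Kruskal-Wallis test.
p_kw <- kruskal.test(value ~ cn_group)$p.value

c(anova = p_anova, kruskal_wallis = p_kw)
```

The Kruskal-Wallis test makes no normality assumption, which is a common reason to prefer it for skewed clinical variables.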

design

Depending on the chosen alteration, different designs are available. The simplest design is binary: an alteration exists or it does not. The versus design, for CNAs, assigns a value of -1, 0 or 1 depending on whether a deletion, no copy number change or an amplification exists for a given feature. The lvog design, for all (any) alterations, assigns a value of -1, 0 or 1 depending on whether a deletion or LOH, no copy number change or an amplification without LOH exists.
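
The three codings can be written out in a few lines of base R. This is a sketch of the description above, not the package's implementation; the helper code_design and its inputs cn and loh are hypothetical, and the third design is named "lvog" as in the Usage section.

```r
# Hypothetical helper: codes each feature under the three designs.
# cn = copy number, loh = LOH status; not part of the facopy package.
code_design <- function(cn, loh, design = c("binary", "versus", "lvog")) {
  design <- match.arg(design)
  amp <- cn > 2
  del <- cn < 2
  switch(design,
    # 1 if altered, 0 otherwise (shown here for any kind of alteration)
    binary = as.integer(amp | del | loh),
    # -1 deletion, 0 no copy number change, 1 amplification
    versus = as.integer(amp) - as.integer(del),
    # -1 deletion or LOH, 0 no change, 1 amplification without LOH
    lvog   = as.integer(amp & !loh) - as.integer(del | loh)
  )
}

cn  <- c(2, 3, 1, 2, 4)
loh <- c(FALSE, FALSE, TRUE, TRUE, TRUE)
code_design(cn, loh, "versus")
code_design(cn, loh, "lvog")
```

The last feature (CN 4 with LOH) is coded 1 under versus but -1 under lvog, which is the practical difference between the two designs.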

FUN

A function that tests a model. Only used if modelPart="whole". Functions from the coin package and those whose results inherit from the lm class (such as glm) are supported, as well as those that directly return a p-value. Thus, any other function of interest can be wrapped in a function that returns the p-value.
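
A wrapper of the kind described above might look like the following sketch. It assumes, without confirmation from this page, that the function is called with a formula and a data frame; the wrapper name p_from_glm and the toy columns altered and age are hypothetical.

```r
# Hypothetical wrapper: fits a logistic regression and returns only the
# p-value of the first predictor. The (formula, data, ...) signature is an
# assumption about how such a function would be called.
p_from_glm <- function(formula, data, ...) {
  fit <- glm(formula, data = data, family = binomial(), ...)
  coef(summary(fit))[2, "Pr(>|z|)"]
}

set.seed(1)
toy <- data.frame(
  altered = rbinom(40, 1, 0.5),  # stand-in for a binary copy number call
  age     = rnorm(40, 60, 10)
)
p_toy <- p_from_glm(altered ~ age, toy)
p_toy
```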

...

Further arguments for the FUN function. Only used if modelPart="whole".

Details

Only the facopyInfo object, the alteration type and a simple model (e.g. the name of a variable) are required. The remaining parameters tune the association model and control the graphical output.
Alterations in the selected external database, if selected, are depicted as grey overlaid bars. Significant regions are depicted in turn as overlaid rectangles that go from top to bottom.

Value

A data.frame with the following columns:

feature

Name of the genomic feature.

p_value

P-value from the association test under the given model at the genomic feature.

chr_q_arm

Chromosome and arm in which the genomic feature lies.

bp_st

Starting genomic position of the feature within the chromosome.

bp_en

Ending genomic position of the feature within the chromosome.
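
Because the result is a plain data.frame, it can be post-processed with base R. The sketch below builds a mock result with the documented columns (all values are invented) and applies a Benjamini-Hochberg adjustment before filtering; whether facopy already corrects its p-values is not stated here, so the adjustment step is an assumption about typical downstream use.

```r
# Mock result with the documented columns; all values are invented for
# illustration and do not come from a real analysis.
res <- data.frame(
  feature   = c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
  p_value   = c(0.001, 0.20, 0.03, 0.04),
  chr_q_arm = c("1p", "1q", "8q", "17p"),
  bp_st     = c(1000, 5000, 2000, 7000),
  bp_en     = c(1500, 5600, 2900, 7800),
  stringsAsFactors = FALSE
)

# Adjust for multiple testing (Benjamini-Hochberg) and keep significant hits.
res$p_adj <- p.adjust(res$p_value, method = "BH")
hits <- res[res$p_adj < 0.05, ]
hits[order(hits$p_value), c("feature", "chr_q_arm", "p_value", "p_adj")]
```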

Author(s)

David Mosen-Ansorena

Examples

data(myStudy) # load example study

genes = facopy(myStudy, "amp", "stage")
head(genes)

facopy documentation built on May 2, 2018, 2:30 a.m.