virtualArrayExpressionSets: Combine different ExpressionSets into one


Description

This function collects all ExpressionSets in the current environment and builds a single new ExpressionSet from the raw data they contain. The expression values are first annotated with the selected identifiers, which are pulled from Bioconductor annotation packages. Rows targeting the same gene are then collapsed by the specified function, and compatible rows of the expression matrices are merged. As a final step, batch effects resulting from different platforms or labs can be removed in a supervised or non-supervised mode.
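The collapse and merge steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the matrices, the probe-to-symbol maps and the helper collapse_rows are hypothetical, and "compatible rows" is taken here to mean identifiers present in both data sets.

```r
# Toy probe-level matrices from two hypothetical platforms (made-up values).
exprs_a <- matrix(c(1, 3, 10, 2, 6, 20), nrow = 3,
                  dimnames = list(c("p1", "p2", "p3"), c("a1", "a2")))
exprs_b <- matrix(c(5, 7, 9, 11), nrow = 2,
                  dimnames = list(c("q1", "q2"), c("b1", "b2")))

# Hypothetical probe-to-symbol annotation; the real function pulls this
# from Bioconductor annotation packages.
symbols_a <- c(p1 = "TP53", p2 = "TP53", p3 = "EGFR")
symbols_b <- c(q1 = "TP53", q2 = "MYC")

# Hypothetical helper: collapse rows sharing an identifier, per sample.
collapse_rows <- function(mat, ids, fun = median) {
  groups <- split(seq_len(nrow(mat)), ids[rownames(mat)])
  do.call(rbind, lapply(groups, function(rows) {
    apply(mat[rows, , drop = FALSE], 2, fun)
  }))
}

collapsed_a <- collapse_rows(exprs_a, symbols_a)
collapsed_b <- collapse_rows(exprs_b, symbols_b)

# Assumption: rows are "compatible" when the identifier occurs in both sets.
shared <- intersect(rownames(collapsed_a), rownames(collapsed_b))
combined <- cbind(collapsed_a[shared, , drop = FALSE],
                  collapsed_b[shared, , drop = FALSE])
print(combined)
```

In this toy input only TP53 is present on both platforms, so the combined matrix has a single row, and its first two columns hold the per-sample median of probes p1 and p2.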

Usage

virtualArrayExpressionSets(all_expression_sets=FALSE, identifier = "SYMBOL", covars = 3, collapse_fun = median, removeBatcheffect = "EB", sampleinfo = FALSE, supervised = FALSE, ...)

Arguments

all_expression_sets

Logical or character vector. If "FALSE" (the default), "virtualArray" collects all ExpressionSets in the current environment. If set to a character vector holding the names of ExpressionSets, only these are used instead of all available ones.

identifier

annotation identifier by which the expression values are combined

covars

Numeric or character vector of length "1" selecting the columns of "sample_info" used for batch effect removal. See Details for more information.

collapse_fun

The function used to collapse expression values targeting the same gene/identifier. Defaults to "median".

removeBatcheffect

Logical or character vector. "FALSE" yields just a combined ExpressionSet; you will then have to use other functions to remove the batch effects. Set it to "EB", "GQ", "MRS", "QD", "NORDI" or "MC" to remove batch effects using empirical Bayes methods, gene quantiles, median rank scores, quantile discretization, normal discretization or mean centering, respectively.
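To give an idea of what the simplest of these options does, the following base R sketch shows mean centering in general terms: within each batch, every gene is shifted so that its mean across that batch's samples is zero. It is a generic illustration and not necessarily how the package implements "MC"; the matrix and the batch vector are made up.

```r
# Toy combined matrix: 2 genes x 4 samples from two batches (made-up values).
combined <- matrix(c(1, 10, 3, 12, 6, 20, 8, 22), nrow = 2,
                   dimnames = list(c("g1", "g2"), c("a1", "a2", "b1", "b2")))
batch <- c("A", "A", "B", "B")

# Subtract each gene's within-batch mean from the samples of that batch.
centered <- combined
for (b in unique(batch)) {
  cols <- batch == b
  block <- combined[, cols, drop = FALSE]
  centered[, cols] <- block - rowMeans(block)
}
print(centered)
```

After centering, the offset between the two batches is gone, while the differences between samples within a batch are unchanged.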

sampleinfo

If you run in non-interactive mode, you can specify a data.frame to be used as the input "sample_info". Note that normally this data.frame is created on the fly, so you will need to set it up manually in this case.

supervised

Logical; selects supervised or non-supervised mode. In non-supervised mode only the assignment of samples to batches is relevant for batch effect removal, whereas in supervised mode the 4th column of "sample_info", and any further columns, are used to group samples according to the user's decision. Please see the vignette for more information on this option.

...

Can be used to pass on parameters to underlying functions.

Details

The "covars" argument determines the mode of batch removal. It refers to the columns of the sample_info data.frame, which contains information about all ExpressionSets, their samples and the relations between them. The default value of "3" uses only the different ExpressionSets for batch effect removal; this is referred to as the non-supervised mode. The supervised mode is selected by using a numerical value higher than "3" or the character vector "all". In this case the sample_info data.frame has to be modified manually to contain more information on the batches in additional columns. Please note that during computation you will be notified that "sample_info.txt" has been written to your current working directory for you to modify and save. If you do so, please select "y" to use the additional columns. Also note that you cannot provide a covariate that occurs in only one batch; the procedure will fail in that case.
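Before running in supervised mode it may help to check a hand-edited sample_info for the failure case mentioned above. The sketch below is an assumption-laden illustration: the column names are hypothetical (only the positions follow the text, with the batch in column 3 and the first covariate in column 4), and "occurs in only one batch" is interpreted as a covariate level that is found in a single batch.

```r
# Hypothetical sample_info layout; column names are assumptions.
sample_info <- data.frame(
  Array.name  = c("a1", "a2", "b1", "b2"),
  Sample.name = c("s1", "s2", "s3", "s4"),
  Batch       = c("A", "A", "B", "B"),
  Covariate.1 = c("tumor", "normal", "tumor", "tumor"),
  stringsAsFactors = FALSE
)

# Count, for each covariate level in column 4, how many batches contain it.
batches_per_level <- tapply(sample_info[[3]], sample_info[[4]],
                            function(b) length(unique(b)))

# Levels confined to a single batch are likely to cause trouble.
confined <- names(batches_per_level)[batches_per_level < 2]
if (length(confined) > 0) {
  message("Covariate levels found in only one batch: ",
          paste(confined, collapse = ", "))
}
```

In this toy table "normal" appears only in batch A, so it would be flagged, whereas "tumor" is present in both batches.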

Value

A new ExpressionSet is returned that combines all ExpressionSets from the current environment.

Author(s)

Andreas Heider (2011)

See Also

virtualArray-package, virtualArray.ExpressionSet, virtualArrayCompile, normalize.ExpressionSet.nordi, normalize.ExpressionSet.mrs, normalize.ExpressionSet.qd, normalize.ExpressionSet.gq

Examples

# Due to the flexibility of this function and the time 
# it takes to get meaningful results, please see the 
# vignette for a comprehensive example covering 
# several modes of usage. Thanks.

ShixiangWang/arrayConnector documentation built on May 14, 2019, 6:02 a.m.