preprocess_querydata: preprocess_querydata


View source: R/c3_functions.R

Description

This function preprocesses the query data for gene ID conversion and calculates the average value across replicates.

Usage

preprocess_querydata(cell.tissue.data, species = "hsapiens",
  data.format = "matrix", geneID = "ensembl_gene_id",
  experiment.descriptor = NULL, collapse.method = "max")

Arguments

cell.tissue.data

A matrix, list or vector containing cell type / tissue specific gene expression data. For a list, each element contains a named gene expression vector or a matrix with replicates. The list names (denoting cell types) should be unique; otherwise the experiment.descriptor parameter should be provided with unique names for each cell type or tissue. For a matrix, rows denote the genes and columns denote the cell types or tissues. Duplicate column names are expected in this case and denote replicate samples. All replicate samples of a specific cell type or tissue should have identical column names; otherwise the experiment.descriptor parameter should be used to identify the replicate samples of each cell type or tissue. A vector should contain expression values from a single cell type or tissue, and the names of its elements should be gene IDs.
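As a minimal sketch of the three accepted input shapes described above, the following base R code builds a matrix, a list and a vector. The gene symbols, values and object names (matrix.input, list.input, vector.input) are hypothetical and chosen only for illustration.

```r
# Hypothetical gene symbols and expression values, for illustration only.
genes <- c("PAX6", "SOX2", "PROX1")

# Matrix: genes in rows, samples in columns.
# Replicates of the same cell type share a column name.
matrix.input <- matrix(
  c(5, 6, 1, 2,
    3, 4, 7, 8,
    9, 9, 2, 3),
  nrow = 3, byrow = TRUE,
  dimnames = list(genes, c("cell1", "cell1", "cell2", "cell2"))
)

# List: one uniquely named element per cell type, each holding
# a matrix of replicates (a named vector would also fit the description).
list.input <- list(
  cell1 = matrix.input[, 1:2],
  cell2 = matrix.input[, 3:4]
)

# Vector: a single cell type or tissue, with gene IDs as element names.
vector.input <- c(PAX6 = 5, SOX2 = 3, PROX1 = 9)
```

The matching data.format value ('matrix', 'list' or 'vector') would then be passed alongside the object.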

species

The species abbreviation of the query data (cell.tissue.data). Default is "hsapiens".

data.format

Format of cell.tissue.data, either 'list', 'matrix' or 'vector'. Default is "matrix".

geneID

The code for the type of gene IDs used by cell.tissue.data, as used by the biomaRt database. To find the valid codes for gene IDs for a species, please see the find_valid_geneID() function of the C3 package. Default is "ensembl_gene_id".

experiment.descriptor

A vector corresponding to the column names (matrix) or elements (list) of cell.tissue.data, giving the cell type or tissue of each sample. All samples of a specific cell type or tissue should be given the same name. Defaults to NULL.
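To illustrate when a descriptor is needed, the sketch below uses a matrix whose column names are unique per sample, so replicates cannot be inferred from the names alone. The sample names, values and the objects expr and descriptor are hypothetical; the call to preprocess_querydata is left commented out because it requires the C3 package.

```r
# Hypothetical matrix whose column names are unique per sample.
expr <- matrix(
  1:8, nrow = 2,
  dimnames = list(
    c("PAX6", "SOX2"),
    c("lens_rep1", "lens_rep2", "retina_rep1", "retina_rep2")
  )
)

# One entry per column, in column order; replicates share a label.
descriptor <- c("lens", "lens", "retina", "retina")

# The descriptor groups the sample columns by cell type or tissue.
groups <- split(colnames(expr), descriptor)

# With C3 installed, the descriptor would be supplied like this:
# preprocess_querydata(cell.tissue.data = expr,
#                      geneID = "external_gene_name",
#                      experiment.descriptor = descriptor)
```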

collapse.method

How to summarise values when one ensembl_gene_id has more than one value (for example, multiple microarray probes or transcripts mapping to one gene). Currently two options are implemented, 'max' or 'mean'. Default is "max".
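The difference between the two options can be shown with a small base R sketch. This illustrates the idea of collapsing several probes onto one gene; it is not the package's internal implementation, and the probe names, gene IDs and values are hypothetical.

```r
# Hypothetical probe-level values and probe-to-gene mapping.
probe.values  <- c(probe1 = 2, probe2 = 8, probe3 = 5)
probe.to.gene <- c(probe1 = "geneA", probe2 = "geneA", probe3 = "geneB")

# 'max': keep the largest value among the probes of each gene.
collapsed.max <- tapply(probe.values, probe.to.gene, max)

# 'mean': average the values of the probes of each gene.
collapsed.mean <- tapply(probe.values, probe.to.gene, mean)
```

Here geneA has two probes, so the two methods give different results for it, while geneB has a single probe and is unaffected by the choice.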

Details

This function preprocesses the query data for gene ID conversion and calculates the average value across replicates.

Value

This function returns a list containing the replicate-averaged values, with Ensembl gene IDs, and the species name.
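The replicate-averaging step can be sketched in base R as follows. This only illustrates averaging columns that share a name; it does not reproduce the structure of the returned list or the package's own code, and the object names and values are hypothetical.

```r
# Hypothetical matrix with replicates marked by duplicate column names.
expr <- matrix(
  c(1, 3, 10, 20,
    2, 4, 30, 50),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"),
                  c("cell1", "cell1", "cell2", "cell2"))
)

# For each distinct cell type, average its replicate columns per gene.
cell.types <- unique(colnames(expr))
replicate.means <- sapply(cell.types, function(ct) {
  rowMeans(expr[, colnames(expr) == ct, drop = FALSE])
})
```

The result has one row per gene and one column per cell type.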

Examples

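library(C3)  # assumes the C3 package is installed
# Simulated expression matrix: 10 genes (rows) x 10 samples (columns)
query.data <- matrix(sample(1:10, 100, replace = TRUE), 10, 10)
rownames(query.data) <- c("CRYAA", "CRYAB", "CRYBB3", "PAX6", "SOX2", "PROX1", "SIX3", "CRIM1", "CRYBB2", "BMP7")
# Replicates of the same cell type share a column name
colnames(query.data) <- c("cell1", "cell1", "cell1", "cell2", "cell2", "cell2", "cell3", "cell3", "cell3", "cell3")
preprocess_querydata(cell.tissue.data = query.data, geneID = "external_gene_name")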

VCCRI/C3 documentation built on May 14, 2019, 8:41 a.m.