parseData: Parse data to be used in 'PsigA' functions


Description

The function converts the gene identifiers (row names) of a gene expression matrix to a desired format, supplied by the parameter geneIds. If any gene identifier corresponds to more than one row of the expression matrix (e.g. probes that map to the same gene), the median expression of each patient is taken across those rows.
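The collapsing step described above can be illustrated with a minimal base-R sketch. This is not the PSigA implementation: collapseByMedian is a hypothetical helper, and dropping rows whose identifier is NA is an assumption made only for this illustration.

```r
# Hypothetical helper (not part of PSigA): collapse rows that share a gene
# identifier by taking the per-sample (per-column) median.
collapseByMedian <- function(data, geneIds) {
  stopifnot(length(geneIds) == nrow(data))
  mat <- as.matrix(data)

  # Assumption for this sketch: rows without an identifier are dropped.
  keep <- !is.na(geneIds)
  mat <- mat[keep, , drop = FALSE]
  geneIds <- as.character(geneIds[keep])

  # Row indices grouped by identifier.
  groups <- split(seq_len(nrow(mat)), geneIds)

  # One row of column-wise medians per identifier.
  rows <- lapply(groups, function(idx) {
    apply(mat[idx, , drop = FALSE], 2, median)
  })
  out <- do.call(rbind, unname(rows))
  rownames(out) <- names(groups)
  colnames(out) <- colnames(mat)

  # Order the rows by first appearance, i.e. as in unique(geneIds).
  out[unique(geneIds), , drop = FALSE]
}

# Toy input: probes p1 and p2 both map to gene "A".
expr <- matrix(c(1, 3, 10,
                 2, 4, 20),
               nrow = 3,
               dimnames = list(c("p1", "p2", "p3"), c("s1", "s2")))
res <- collapseByMedian(expr, c("A", "A", "B"))
```

In the toy input, gene "A" ends up with one row holding the median of probes p1 and p2 in each sample, while gene "B" keeps its single probe's values.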

Usage

parseData(data, geneIds)

Arguments

data

a data frame or matrix with gene expression values where rows represent genes and columns represent samples.

geneIds

vector of length nrow(data) containing gene names in the desired format. The format must be the same as the format of the signatures that will be used in the PsigA functions.
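Before calling parseData, it can help to confirm that geneIds lines up with the rows of data. The sketch below uses a hypothetical helper, checkGeneIds, that is not part of PSigA. It stops on a length mismatch and reports how many identifiers are missing or repeated.

```r
# Hypothetical helper (not part of PSigA): sanity-check geneIds against data.
checkGeneIds <- function(data, geneIds) {
  if (length(geneIds) != nrow(data)) {
    stop("geneIds must have one entry per row of data")
  }
  present <- geneIds[!is.na(geneIds)]
  c(nMissing = sum(is.na(geneIds)),        # rows without an identifier
    nDuplicated = sum(duplicated(present))) # extra rows for an already-seen gene
}

# Toy input: three rows, one repeated identifier and one missing identifier.
chk <- checkGeneIds(matrix(0, nrow = 3, ncol = 2), c("A", "A", NA))
```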

Value

An object of the same class as data, in which the rows correspond to unique(geneIds).

Examples

require(Biobase)
require(breastCancerVDX)

data(vdx)

#use fData() to get the gene symbols for vdx.
VDXparsed <- parseData(data = exprs(vdx), geneIds = fData(vdx)$Gene.symbol)


require(ALL)
data(ALL)

ALLexprs <- exprs(ALL)

#featureData ("fData()") for ALL are missing so we need to get the gene
#symbols manually.

annotation(ALL)

require(hgu95av2.db)

keys <- AnnotationDbi::select(hgu95av2.db, rownames(ALLexprs),
                             "SYMBOL", "PROBEID")
#remove probe duplicates
geneIds <- keys$SYMBOL[ !duplicated(keys$PROBEID) ]

ALLparsed <- parseData(ALLexprs, geneIds)

sidiropoulos/PSigA documentation built on May 29, 2019, 9:58 p.m.