retrieveFromGEO: retrieveFromGEO

View source: R/DataFormatting.R

retrieveFromGEO    R Documentation

retrieveFromGEO

Description

This function retrieves the count matrix and the column metadata from GEO and formats them as inputs suitable for conclus.

Usage

retrieveFromGEO(matrixURL, countMatrixPath, species,
    seriesMatrixName=NA, metaDataPath=NA, colMetaDataURL=NA,
    convertToSymbols=TRUE, annoType="ENSEMBL")

Arguments

matrixURL

URL of the count matrix. The matrix must be un-normalized.

countMatrixPath

Path to the file to which the downloaded count matrix will be saved.

species

Either 'mouse' or 'human'. Other organisms can be added on request.

seriesMatrixName

Name of the column metadata file hosted on GEO. This name can usually be found in the 'Series Matrix File(s)' section. Should not be used if colMetaDataURL is defined. Default=NA.

metaDataPath

Path to the file to which the downloaded metadata will be saved. Only used if colMetaDataURL is defined. Default=NA.

colMetaDataURL

URL of the column metadata file hosted on GEO. This file can be found in the 'Supplementary file' section. Should not be used if seriesMatrixName is defined. Default=NA.

convertToSymbols

Boolean indicating whether the gene IDs in the row names of the matrix should be converted to official gene symbols. Default=TRUE. To specify the type of IDs contained in the count matrix, see the annoType parameter below.

annoType

Type of gene annotation used in the row names of the count matrix. Default="ENSEMBL". See Details.

Details

The conversion of the row gene IDs (ENSEMBL by default) to official gene symbols is enabled by default (convertToSymbols=TRUE) and is done with the function 'bitr' of the 'clusterProfiler' package. To list all possible values of the annoType parameter, use the 'keytypes' method on "org.Mm.eg.db" (for mouse) or "org.Hs.eg.db" (for human). For example, run the following in an R session: library(org.Mm.eg.db); keytypes(org.Mm.eg.db)
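The conversion step described above can be sketched as follows. This is only an illustration of how 'keytypes' and 'bitr' are typically called, not the internal code of retrieveFromGEO. It assumes that the Bioconductor packages clusterProfiler and org.Mm.eg.db are installed; the two ENSEMBL IDs and the variable names are chosen for the example.

```r
# Sketch only: assumes clusterProfiler and org.Mm.eg.db are installed.
library(clusterProfiler)
library(org.Mm.eg.db)

# List the ID types that can be passed to annoType for mouse
# (use org.Hs.eg.db instead for human).
keytypes(org.Mm.eg.db)

# Illustrative mouse ENSEMBL gene IDs (example values, not taken from GEO).
ensemblIDs <- c("ENSMUSG00000000001", "ENSMUSG00000000028")

# Map the IDs to official gene symbols. 'bitr' returns a data.frame with
# one column per ID type; IDs without a match are dropped by default.
converted <- bitr(ensemblIDs, fromType="ENSEMBL", toType="SYMBOL",
    OrgDb=org.Mm.eg.db)
head(converted)
```

Passing a different fromType (any value returned by keytypes) corresponds to choosing a different annoType.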

Value

A list of two elements. The first element contains the count matrix and the second element contains the column metadata.

Author(s)

Nicolas DESCOSTES & Ilyess RACHEDI

Examples

outputDirectory <- "./YourOutputDirectory"
dir.create(outputDirectory, showWarnings=FALSE)
species <- "mouse"

countMatrixPath <- file.path(outputDirectory, "countmatrix.txt")
matrixURL <- paste0("https://www.ncbi.nlm.nih.gov/geo/download/?acc=",
    "GSE96982&format=file&file=GSE96982%5FcountMatrix%2Etxt%2Egz")
seriesMatrix <- "GSE96982-GPL19057_series_matrix.txt.gz"

result <- retrieveFromGEO(matrixURL, countMatrixPath, species,
    seriesMatrixName=seriesMatrix)

countMatrix <- result[[1]]
columnsMetaData <- result[[2]]


ilyessr/conclus documentation built on April 8, 2022, 1:43 p.m.