readGEORawFile: Read in Unmethylated and Methylated signals from a GEO raw...

readGEORawFile R Documentation

Read in Unmethylated and Methylated signals from a GEO raw file.

Description

Read in Unmethylated and Methylated signals from a GEO raw file.

Usage

readGEORawFile(filename, sep = ",", Uname = "Unmethylated signal",
               Mname = "Methylated signal", row.names = 1, pData = NULL,
               array = "IlluminaHumanMethylation450k",
               annotation = .default.450k.annotation, mergeManifest = FALSE,
               showProgress = TRUE, ...)

Arguments

filename

The name of the file to be read from.

sep

The field separator character. Values on each line of the file are separated by this character.

Uname

A string that uniquely identifies the columns containing the unmethylated signals.

Mname

A string that uniquely identifies the columns containing the methylated signals.

row.names

The column containing the feature (CpG) IDs.

pData

A DataFrame or data.frame describing the samples represented by the columns of the signal matrix read from the file (mat). If the rownames of pData don't match the colnames of mat, these colnames will be changed. If pData is not supplied, a minimal DataFrame is created.

array

Array name.

annotation

The feature annotation to be used. This includes the location of features and thus depends on the genome build.

mergeManifest

Should the manifest be merged into the final object?

showProgress

If TRUE, progress is displayed on the console. The progress output is produced by fread's C code.

...

Additional arguments passed to data.table::fread().

Details

450K experiments uploaded to GEO typically include a raw data file as part of the supplementary materials. Unfortunately there does not appear to be a standard format. This function provides enough flexibility to read these files. Note that you will likely need to change the sep, Uname, and Mname arguments and make sure the first column includes the feature (CpG) IDs. You can use the readLines function to decipher how to set these arguments.
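As a rough illustration of the readLines approach described above, the following sketch inspects the header of a small, made-up file. The file contents, the sample names (S1, S2), and the column layout are assumptions for demonstration only; real GEO files vary.

```r
# Hypothetical miniature file standing in for a GEO raw signal file.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  paste("ID_REF", "S1.Signal_A", "S1.Signal_B", "S1.Detection Pval",
        "S2.Signal_A", "S2.Signal_B", "S2.Detection Pval", sep = "\t"),
  paste("cg00000029", 1200, 3400, 0, 900, 4100, 0, sep = "\t"),
  paste("cg00000108", 5100, 700, 0, 4800, 650, 0, sep = "\t")
), tmp)

# Look at the first lines to see the separator and the column naming scheme.
head_lines <- readLines(tmp, n = 2)
print(head_lines)

# Split the header on the candidate separator, then check which columns
# each candidate Uname / Mname string would match.
cols <- strsplit(head_lines[1], "\t", fixed = TRUE)[[1]]
u_cols <- grep("Signal_A", cols, value = TRUE)
m_cols <- grep("Signal_B", cols, value = TRUE)
print(cols[1])   # should be the feature (CpG) ID column
print(u_cols)
print(m_cols)
```

If the candidate strings match one column per sample and the first column holds the CpG IDs, those values are reasonable choices for sep, Uname, and Mname.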

Note that the function uses the fread function in the data.table package to read the data. To install data.table, type install.packages("data.table"). We use this package because the files are too large for read.table.

Value

A GenomicMethylSet object.

Author(s)

Rafael A. Irizarry rafa@jimmy.harvard.edu.

See Also

getGenomicRatioSetFromGEO

Examples

## Not run: 
library(GEOquery)
getGEOSuppFiles("GSE29290")
gunzip("GSE29290/GSE29290_Matrix_Signal.txt.gz")
# NOTE: This particular example file uses a comma as the decimal separator
#       (e.g., 0,00 instead of 0.00). We replace all such instances using the
#       command line tool 'sed' before reading in the modified file.
cmd <- paste0("sed 's/,/./g' GSE29290/GSE29290_Matrix_Signal.txt > ",
              "GSE29290/GSE29290_Matrix_Signal_mod.txt")
system(cmd)
gmset <- readGEORawFile(filename = "GSE29290/GSE29290_Matrix_Signal_mod.txt",
                        Uname = "Signal_A",
                        Mname = "Signal_B",
                        sep = "\t")

## End(Not run)
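
On systems without 'sed', the same decimal-comma replacement can be sketched in base R. This is an illustrative alternative, not part of the package; the file below is a made-up stand-in, and the approach reads the whole file into memory, so it may be impractical for very large files.

```r
# Hypothetical tab-separated input whose numbers use a decimal comma.
infile <- tempfile(fileext = ".txt")
outfile <- tempfile(fileext = ".txt")
writeLines(c(
  "ID_REF\tS1.Signal_A\tS1.Signal_B",
  "cg00000029\t1200,50\t3400,25"
), infile)

# Replace every comma with a period; fixed = TRUE treats "," literally.
# This is only safe because the field separator is a tab, not a comma.
fixed_lines <- gsub(",", ".", readLines(infile), fixed = TRUE)
writeLines(fixed_lines, outfile)

# Check that the signal columns now parse as numeric values.
vals <- read.delim(outfile, check.names = FALSE)
str(vals)
```

The modified file (outfile here) would then be passed as the filename argument, with sep = "\t".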

hansenlab/minfi documentation built on May 3, 2024, 3:49 p.m.