create_rse_manual: Internal function for creating a recount3...

View source: R/create_rse_manual.R

create_rse_manualR Documentation

Internal function for creating a recount3 RangedSummarizedExperiment object

Description

This function is used internally by create_rse() to construct a recount3 RangedSummarizedExperiment-class object that contains the base-pair coverage counts at the gene or exon feature level for a given annotation.

Usage

create_rse_manual(
  project,
  project_home = project_homes(organism = organism, recount3_url = recount3_url),
  type = c("gene", "exon", "jxn"),
  organism = c("human", "mouse"),
  annotation = annotation_options(organism),
  bfc = recount3_cache(),
  jxn_format = c("ALL", "UNIQUE"),
  recount3_url = getOption("recount3_url", "http://duffel.rail.bio/recount3"),
  verbose = getOption("recount3_verbose", TRUE)
)

Arguments

project

A character(1) with the ID for a given study.

project_home

A character(1) with the home directory for the project. You can find these using project_homes().

type

A character(1) specifying whether you want to access gene, exon, or exon-exon junction counts.

organism

A character(1) specifying which organism you want to download data from. Supported options are "human" or "mouse".

annotation

A character(1) specifying which annotation you want to download. Only used when type is either gene or exon.

bfc

A BiocFileCache-class object where the files will be cached to, typically created by recount3_cache().

jxn_format

A character(1) specifying whether the exon-exon junction files are derived from all the reads (ALL) or only the uniquely mapping read counts (UNIQUE). Note that UNIQUE is only available for some projects: GTEx and TCGA for human.

recount3_url

A character(1) specifying the home URL for recount3 or a local directory where you have mirrored recount3. Defaults to the load balancer http://duffel.rail.bio/recount3, but can also be https://recount-opendata.s3.amazonaws.com/recount3/release from https://registry.opendata.aws/recount/ or SciServer datascope from IDIES at JHU https://sciserver.org/public-data/recount3/data. You can set the R option recount3_url (for example in your .Rprofile) if you have a favorite mirror.

verbose

A logical(1) indicating whether to show messages with updates.

Value

A RangedSummarizedExperiment-class object.

References

https://doi.org/10.12688/f1000research.12223.1 for details on the base-pair coverage counts used in recount2 and recount3.

See Also

Other internal functions for accessing the recount3 data: annotation_ext(), file_retrieve(), locate_url_ann(), locate_url(), project_homes(), read_counts(), read_metadata()

Examples


## Unlike create_rse(), here we create an RSE object by
## fully specifying all the arguments for locating this study
rse_gene_SRP009615_manual <- create_rse_manual(
    "SRP009615",
    "data_sources/sra"
)
rse_gene_SRP009615_manual

## Check how much memory this RSE object uses
pryr::object_size(rse_gene_SRP009615_manual)

## Test with a collection that has a single sample
## NOTE: this requires loading the full data for this study when
## creating the RSE object
rse_gene_ERP110066_collection_manual <- create_rse_manual(
    "ERP110066",
    "collections/geuvadis_smartseq",
    recount3_url = "http://snaptron.cs.jhu.edu/data/temp/recount3"
)
rse_gene_ERP110066_collection_manual

## Check how much memory this RSE object uses
pryr::object_size(rse_gene_ERP110066_collection_manual)

## Mouse example
rse_gene_DRP002367_manual <- create_rse_manual(
    "DRP002367",
    "data_sources/sra",
    organism = "mouse"
)
rse_gene_DRP002367_manual

## Information about how this RSE was made
metadata(rse_gene_DRP002367_manual)

## Test with a collection that has one sample, at the exon level
## NOTE: this requires loading the full data for this study (nearly 6GB!)
## Not run: 
rse_exon_ERP110066_collection_manual <- create_rse_manual(
    "ERP110066",
    "collections/geuvadis_smartseq",
    type = "exon",
    recount3_url = "http://snaptron.cs.jhu.edu/data/temp/recount3"
)
rse_exon_ERP110066_collection_manual


## Check how much memory this RSE object uses
pryr::object_size(rse_exon_ERP110066_collection_manual)
# 409 MB

## Test with a collection that has one sample, at the junction level
## NOTE: this requires loading the full data for this study
system.time(rse_jxn_ERP110066_collection_manual <- create_rse_manual(
    "ERP110066",
    "collections/geuvadis_smartseq",
    type = "jxn",
    recount3_url = "http://snaptron.cs.jhu.edu/data/temp/recount3"
))
rse_jxn_ERP110066_collection_manual

## Check how much memory this RSE object uses
## NOTE: this doesn't run since 2 files are missing on the test site!
pryr::object_size(rse_jxn_ERP110066_collection_manual)

## End(Not run)

## Not run: 
## For testing and debugging
project <- "ERP110066"
project_home <- "collections/geuvadis_smartseq"

project <- "SRP009615"
project_home <- "data_sources/sra"
type <- "gene"
organism <- "human"
annotation <- "gencode_v26"
jxn_format <- "ALL"
bfc <- recount3_cache()
recount3_url <- "http://idies.jhu.edu/recount3/data"
verbose <- TRUE

## End(Not run)

LieberInstitute/recount3 documentation built on Dec. 11, 2024, 8:35 p.m.