read_counts: Read a counts file

View source: R/read_counts.R

Description

This function reads a recount3 gene or exon counts file into R. You can first locate the file using locate_url() and then download it to your computer using file_retrieve().

Usage

read_counts(counts_file, samples = NULL)

Arguments

counts_file

A character(1) with the local path to a recount3 counts file.

samples

A character() vector of external_id sample IDs to read in. When NULL (the default), all samples are read in. This argument is used by create_rse_manual().

Value

A data.frame() with sample IDs as the column names.

References

https://doi.org/10.12688/f1000research.12223.1 for details on the base-pair coverage counts used in recount2 and recount3.

See Also

Other internal functions for accessing the recount3 data: annotation_ext(), create_rse_manual(), file_retrieve(), locate_url_ann(), locate_url(), project_homes(), read_metadata()

Examples


## Download the gene counts file for project SRP009615
url_SRP009615_gene <- locate_url(
    "SRP009615",
    "data_sources/sra",
    type = "gene"
)
local_SRP009615_gene <- file_retrieve(url = url_SRP009615_gene)

## Read the gene counts, takes about 3 seconds
system.time(SRP009615_gene_counts <- read_counts(local_SRP009615_gene))
dim(SRP009615_gene_counts)

## Explore the top left corner
SRP009615_gene_counts[seq_len(6), seq_len(6)]

## Explore the first 6 samples.
summary(SRP009615_gene_counts[, seq_len(6)])

## Note that the counts are base-pair coverage counts,
## just like in the recount2 project.
## See https://doi.org/10.12688/f1000research.12223.1 for more details
## about this type of count.
## They can be converted to reads per 40 million reads, RPKM and other
## count types. This is more easily done once the data is assembled into a
## RangedSummarizedExperiment object.

## Locate and retrieve an exon counts file
local_SRP009615_exon <- file_retrieve(
    locate_url(
        "SRP009615",
        "data_sources/sra",
        type = "exon"
    )
)
local_SRP009615_exon

## Read the exon counts, takes about 50-60 seconds
system.time(
    SRP009615_exon_counts <- read_counts(
        local_SRP009615_exon
    )
)
dim(SRP009615_exon_counts)
pryr::object_size(SRP009615_exon_counts)

## Explore the top left corner
SRP009615_exon_counts[seq_len(6), seq_len(6)]

## Explore the first 6 samples.
summary(SRP009615_exon_counts[, seq_len(6)])
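
The comments in the gene counts example note that base-pair coverage counts can be converted to reads per 40 million reads. The sketch below illustrates the arithmetic behind one such conversion on made-up data, assuming each sample's total coverage (its area under the coverage curve, AUC) is known. The objects toy_counts and toy_auc and the helper scale_by_auc() are hypothetical and are not part of recount3. For real data, working from the assembled RangedSummarizedExperiment object with the package's own tools, such as transform_counts(), is likely the better route.

```r
## Toy base-pair coverage counts: 3 genes x 2 samples (made-up values).
toy_counts <- data.frame(
    sampleA = c(1000, 3000, 6000),
    sampleB = c(500, 500, 1000),
    row.names = c("gene1", "gene2", "gene3")
)

## Hypothetical per-sample AUC: total base-pair coverage across the genome.
toy_auc <- c(sampleA = 20000, sampleB = 4000)

## Hypothetical helper: divide each sample's counts by that sample's AUC,
## then rescale to a common library size (40 million reads by default).
scale_by_auc <- function(counts, auc, target_size = 4e7) {
    ## Samples must be in the same order in both inputs
    stopifnot(identical(colnames(counts), names(auc)))
    ## sweep() with MARGIN = 2 divides column j by auc[j]
    scaled <- sweep(as.matrix(counts), 2, auc, FUN = "/") * target_size
    round(scaled)
}

toy_scaled <- scale_by_auc(toy_counts, toy_auc)
toy_scaled
```

Dividing by the AUC rather than by a number of reads means the read length cancels out, since every read adds its length to both the feature's coverage and the sample's total coverage.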

LieberInstitute/recount3 documentation built on May 4, 2024, 4:16 a.m.