TENxH5: Represent H5 files from 10X

View source: R/TENxH5-class.R

TENxH5 R Documentation

Description

This constructor function was developed using the PBMC 3K dataset from 10X Genomics (version 3). Other versions are supported and input arguments version and group can be overridden.

Usage

TENxH5(resource, version, group, ranges, rowidx, colidx, ...)

Arguments

resource

character(1) The path to the file

version

character(1) There are currently two recognized versions associated with 10X data, either version "2" or "3". See details for more information.

group

character(1) The HDF5 group embedded within the file structure. This is usually either the "matrix" or "outs" group, but other groups are supported as well (e.g., "mm10").

ranges

character(1) The HDF5 internal folder location embedded within the file that points to the ranged data information, e.g., "/features/interval". Set to NA_character_ if range information is not present.

rowidx, colidx

numeric() A vector of indices corresponding to either rows or columns that will dictate the data imported from the file. The indices will be passed on to the [ method of the TENxMatrix representation.

...

Additional inputs to the low-level class generator functions.

Details

The various TENxH5 methods, including rowData and rowRanges, provide a snapshot of the data using a head and tail subset of length 12 for efficiency. In contrast, methods such as dimnames and dim give a full view of the dimensions of the data. The show method reports the dimensions of the data along with metadata such as the rowData and its "Type" column, if available. The term "projection" refers to the data class that will be provided once the data file is imported.
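As a rough illustration of the difference between the full-view and snapshot methods described above, the sketch below builds a TENxH5 object from an example file shipped with TENxIO and calls dim, dimnames and rowData on it. It assumes that TENxIO and SummarizedExperiment are installed and that the rowData generic comes from SummarizedExperiment; the variable name th5 is illustrative only.

```r
library(TENxIO)
library(SummarizedExperiment)  # assumed source of the rowData() generic

## Example file distributed with the TENxIO package
h5f <- system.file(
    "extdata", "pbmc_granulocyte_ff_bc_ex.h5",
    package = "TENxIO", mustWork = TRUE
)
th5 <- TENxH5(h5f)

## Full view: dimensions and dimension names of the data
dim(th5)
str(dimnames(th5))

## Snapshot: head and tail subset of the feature metadata
rowData(th5)
```

Printing th5 itself invokes the show method, which summarizes the same information in one place.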

An additional ref argument can be provided when the file contains multiple feature_type values, or multiple "Type" values in the rowData. By default, the first type reported in table() is set as the mainExpName in the SingleCellExperiment object.

For data that do not contain genomic coordinate information, TENxH5 will fail to read "/features/interval" and will set the ranges argument to NA_character_.

Version "3" data mainly includes a "matrix" group and "interval" information within the file. Version "2" data does not include range-based information and has a different directory structure compared to version "3". See the internal data.frame TENxIO:::h5.version.map for a map of fields and their corresponding file locations within the H5 file. This map is used to create the rowData structure from the file.
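To make the version mapping more concrete, the sketch below prints the internal map and then passes version and group explicitly rather than relying on the defaults. It assumes that the example file follows the version "3" layout with a "matrix" group; because h5.version.map is internal (not exported), its contents may change between releases. The variable names vmap and th5 are illustrative only.

```r
library(TENxIO)

## Internal, non-exported object, hence the triple colon
vmap <- TENxIO:::h5.version.map
vmap

h5f <- system.file(
    "extdata", "pbmc_granulocyte_ff_bc_ex.h5",
    package = "TENxIO", mustWork = TRUE
)

## Override the inputs explicitly (assumed to match this file's layout)
th5 <- TENxH5(h5f, version = "3", group = "matrix")
th5
```

For a file with a different layout, the same call would take the version and group values that apply to that file.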

Value

Usually, a SingleCellExperiment instance

See Also

import section in TENxH5

Examples


library(TENxIO)

h5f <- system.file(
    "extdata", "pbmc_granulocyte_ff_bc_ex.h5",
    package = "TENxIO", mustWork = TRUE
)

TENxH5(h5f)

import(TENxH5(h5f))

h5f <- system.file(
    "extdata", "10k_pbmc_ATACv2_f_bc_ex.h5",
    package = "TENxIO", mustWork = TRUE
)

## Optional ref input, most frequent Type used by default
th5 <- TENxH5(h5f, ranges = "/features/id", ref = "Peaks")
th5
TENxH5(h5f, ranges = "/features/id")
import(th5)


LiNk-NY/TENxIO documentation built on Nov. 16, 2024, 7:10 p.m.