TENxH5-class: TENxH5: The HDF5 file representation class for 10X Data

TENxH5-class    R Documentation

Description

This class is designed to work with 10X single-cell datasets. It was developed using the PBMC 3k 10X dataset produced by the CellRanger v2 pipeline.

Usage

## S4 method for signature 'TENxH5'
rowData(x, use.names = TRUE, ...)

## S4 method for signature 'TENxH5'
dim(x)

## S4 method for signature 'TENxH5'
dimnames(x)

## S4 method for signature 'TENxH5'
genome(x)

## S4 method for signature 'TENxH5'
rowRanges(x, ...)

## S4 method for signature 'TENxH5,ANY,ANY'
import(con, format, text, ...)

## S4 method for signature 'TENxH5'
show(object)

Arguments

x

A TENxH5 object

use.names

For rowData: Like mcols(x), by default rowData(x) propagates the rownames of x to the returned DataFrame object (note that for a SummarizedExperiment object or derivative, the rownames are also the names, i.e., rownames(x) is always the same as names(x)). Setting use.names=FALSE suppresses this propagation, i.e., it returns a DataFrame object with no rownames. Use this when rowData(x) fails, which can happen when the rownames contain NAs (the rownames of a SummarizedExperiment object or derivative can contain NAs, but the rownames of a DataFrame object cannot).

For combineRows and combineCols: see the Combining section of the SummarizedExperiment-class documentation, from which this argument description is inherited.

...

For assay, arguments in ... are forwarded to assays.

For rbind, cbind, ... contains SummarizedExperiment objects (or derivatives) to be combined.

For other accessors, ignored.

con

The connection from which data is loaded or to which data is saved. If this is a character vector, it is assumed to be a file name and a corresponding file connection is created and then closed after exporting the object. If it is a BiocFile derivative, the data is loaded from or saved to the underlying resource. If missing, the function will return the output as a character vector, rather than writing to a connection.

format

The format of the output. If missing and con is a file name, the format is derived from the file extension. This argument is unnecessary when con is a derivative of BiocFile.

text

If con is missing, this can be a character vector directly providing the string data to import.

object

A TENxH5 class object

Details

Version "3" data mainly includes a "matrix" group and "interval" information within the file. Version "2" data does not include range-based information and has a different directory structure from version "3". See the internal data.frame TENxIO:::h5.version.map for a map of fields and their corresponding locations within the H5 file. This map is used to create the rowData structure from the file.
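
The version map mentioned above can be inspected directly. The following is a minimal sketch, assuming the TENxIO package is installed; because h5.version.map is an internal object (hence the ::: operator), its columns and contents may differ between releases, so no particular layout is assumed here.

```r
# Check whether TENxIO is available before touching its namespace.
has_tenxio <- requireNamespace("TENxIO", quietly = TRUE)

if (has_tenxio) {
    # Internal object named in the Details text; accessed with `:::`
    # because it is not exported. Treat its structure as unstable.
    vmap <- TENxIO:::h5.version.map

    # Print a compact summary of the fields and their H5 file locations.
    str(vmap)
}
```

Using str() rather than indexing specific columns keeps the sketch valid even if the internal map is reorganized.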

Value

A TENxH5 class object

Methods (by generic)

  • rowData(TENxH5): Generate the rowData ad hoc from a TENxH5 file

  • dim(TENxH5): Get the dimensions of the data as stored in the file

  • dimnames(TENxH5): Get the dimension names from the file

  • genome(TENxH5): Read genome string from file

  • rowRanges(TENxH5): Read interval data and represent as GRanges

  • import(con = TENxH5, format = ANY, text = ANY): Import TENxH5 data as a SingleCellExperiment; see section below

  • show(TENxH5): Display a snapshot of the contents within a TENxH5 file before import
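
A few of the methods listed above could be exercised as follows. This is a sketch under stated assumptions, not a definitive workflow: it assumes TENxIO is installed and that the package ships an example file named "pbmc_granulocyte_ff_bc_ex.h5" under extdata (the file name is an assumption; substitute the path to your own 10X H5 file if it is absent).

```r
# Only run when TENxIO is installed.
ran_example <- FALSE

if (requireNamespace("TENxIO", quietly = TRUE)) {
    library(TENxIO)

    # Assumed example file; system.file() returns "" when it is not found.
    h5f <- system.file(
        "extdata", "pbmc_granulocyte_ff_bc_ex.h5",
        package = "TENxIO"
    )

    if (nzchar(h5f)) {
        con <- TENxH5(h5f)     # file representation; nothing is imported yet
        print(con)             # show(): snapshot of the file contents
        print(dim(con))        # dimensions as stored in the file
        print(rowData(con))    # feature data generated ad hoc from the file
        sce <- import(con)     # import as a SingleCellExperiment
        ran_example <- TRUE
    }
}
```

The accessors read from the file on demand, so inspecting a TENxH5 object with show, dim, or rowData before calling import is a cheap way to check what the file contains.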

Slots

version

character(1) There are currently two recognized versions associated with 10X data, either version "2" or "3". See the Details section for more information.

group

character(1) The HDF5 group embedded within the file structure. This is usually either the "matrix" or "outs" group, but other groups are supported as well.

ranges

character(1) The HDF5 internal folder location embedded within the file that points to the ranged data information, e.g., "/features/interval".

import

The import method uses HDF5Array::TENxMatrix to represent matrix data. Generally, version 3 datasets contain associated genomic coordinates. The associated feature data, as displayed by the rowData method, is checked for a "Type" column; when present, it indicates that a splitAltExps operation is appropriate. If a ref input is provided to the constructor function TENxH5, it will be used as the main experiment; otherwise, the most frequent category in the "Type" column will be used. For example, the Multiome ATAC + Gene Expression feature data contains both 'Gene Expression' and 'Peaks' labels in the "Type" column.
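
The rule for choosing the main experiment can be illustrated in base R. This is a sketch of the selection logic as described in the text, not the package's actual implementation; the type vector and the choose_main() helper are hypothetical and exist only for illustration.

```r
# Hypothetical "Type" labels, as might appear in Multiome
# ATAC + Gene Expression feature data.
type <- c("Gene Expression", "Peaks", "Peaks", "Gene Expression", "Peaks")

# Illustrative helper (not part of TENxIO): return the user-supplied
# reference if one is given, otherwise the most frequent label.
# On ties, which.max() keeps the first label in table() order.
choose_main <- function(type, ref = NULL) {
    if (!is.null(ref)) {
        return(ref)
    }
    names(which.max(table(type)))
}

main_default <- choose_main(type)                       # modal category
main_ref <- choose_main(type, ref = "Gene Expression")  # explicit ref wins

# Row indices per category, i.e. the grouping a split by "Type" would use.
groups <- split(seq_along(type), type)
```

Here 'Peaks' occurs three times and 'Gene Expression' twice, so the default choice is 'Peaks', while passing a ref overrides the frequency-based choice.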

See Also

TENxH5


LiNk-NY/TENxIO documentation built on Dec. 6, 2024, 8:38 a.m.