encodeSequences: Encode nucleotide sequences
In MarioniLab/DropletUtils: Utilities for Handling Single-Cell Droplet Data

Description

Encode short nucleotide sequences into integers with a 2-bit encoding.

Usage

encodeSequences(sequences)

Arguments

sequences

A character vector of short nucleotide sequences, e.g., UMIs or cell barcodes.

Details

Each pair of bits encodes a nucleotide: 00 is A, 01 is C, 10 is G and 11 is T. The least significant byte contains the 3'-most nucleotides, and any unused higher-order bits are set to zero. Thus, the sequence “CGGACT” is converted to the binary form:

    01 10 10 00 01 11

... which corresponds to the integer 1671.
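The arithmetic above can be reproduced in plain R. The following is a minimal sketch of the 2-bit scheme as described here, not the package's actual implementation; the function name encode_2bit and its input checks are assumptions made for illustration.

```r
# Hypothetical pure-R illustration of the 2-bit encoding described above.
# This is NOT the DropletUtils implementation; encode_2bit is an invented name.
encode_2bit <- function(sequences) {
    codes <- c(A = 0L, C = 1L, G = 2L, T = 3L)  # 00, 01, 10, 11
    vapply(sequences, function(s) {
        nt <- strsplit(s, "")[[1]]
        # Assumed checks: only A/C/G/T, and at most 15 nt to avoid overflow.
        stopifnot(all(nt %in% names(codes)), length(nt) <= 15L)
        value <- 0L
        for (x in nt) {
            # Shift the running value left by two bits, then append the new base,
            # so the 3'-most nucleotide ends up in the least significant bits.
            value <- value * 4L + codes[[x]]
        }
        value
    }, integer(1), USE.NAMES = FALSE)
}

encode_2bit("CGGACT")  # 1671
```

Stepping through "CGGACT" gives 1, 6, 26, 104, 417 and finally 1671, matching the binary form shown above.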

Because R uses 32-bit signed integers, no element of sequences can be more than 15 nt long (15 nt occupy 30 bits). Longer sequences will cause integer overflow.
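The 15 nt limit can be checked with a short calculation. This sketch compares the largest possible encoded value (a sequence of all Ts) against R's integer maximum; the variable names and the helper check_encodable are assumptions for illustration and are not part of the package.

```r
# Largest value an R integer can hold: 2^31 - 1.
max_int <- .Machine$integer.max

# Largest encoded value for an all-T sequence of n nucleotides is 4^n - 1.
# Computed as doubles so that the comparison itself cannot overflow.
largest_15 <- 4^15 - 1   # 1073741823, fits
largest_16 <- 4^16 - 1   # 4294967295, does not fit

largest_15 <= max_int    # TRUE
largest_16 <= max_int    # FALSE

# Hypothetical guard to run before encoding: TRUE if every sequence is short enough.
check_encodable <- function(sequences) {
    all(nchar(sequences) <= 15L)
}

check_encodable(c("CGGACT", "ACGTACGTACGTACG"))   # TRUE (6 nt and 15 nt)
check_encodable("ACGTACGTACGTACGT")               # FALSE (16 nt)
```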

Value

An integer vector containing the encoded sequences.

Author(s)

Aaron Lun

References

10X Genomics (2017). Molecule info. https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/output/molecule_info

Examples

encodeSequences("CGGACT")

MarioniLab/DropletUtils documentation built on July 16, 2025, 1:57 p.m.
