encodeGenome: Encode a Single Genome


View source: R/Yin-DFT-Distances-R.r

Description

Encodes a single genomic signal, supplied as a character string, into a matrix containing the encoded signal. The matrix is four by signal length when the user specifies 4D, or two by signal length when the user specifies 2D.

Usage

encodeGenome(stringGenome, dimension = "4D", strategy = "AC")

Arguments

stringGenome

The genomic signal to encode, given as a character string.

dimension

Either '2D' or '4D', given as a character vector; '4D' by default. This specifies whether to use the binary four-signal encoding originally described in 2014 by Yin and Yau, or the three-valued two-signal encoding.

strategy

Only used for the 2D encoding. It determines which of the possible 2D encodings is used, 'AC' (the default) or 'AG'. For ensembles, the choice is made by considering the concentration of nucleotides that are Adenine and Cytosine, or Adenine and Guanine.
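To make the concentration idea concrete, the following is a minimal sketch that measures the fraction of Adenine plus Cytosine and of Adenine plus Guanine in a sequence. The helper name nucleotideFraction is hypothetical and not part of this package, and the rule the package uses to turn these concentrations into a strategy choice is not stated here, so the sketch only computes the two quantities.

```r
# Hypothetical helper (not part of the package): fraction of positions in a
# sequence whose nucleotide belongs to the given set of bases.
nucleotideFraction <- function(stringGenome, bases) {
  # Split the string into single upper-case characters.
  symbols <- strsplit(toupper(stringGenome), "")[[1]]
  # Mean of a logical vector is the proportion of TRUE entries.
  mean(symbols %in% bases)
}

genome <- "ACCACTTGAAGAGACCCGGGAT"
fracAC <- nucleotideFraction(genome, c("A", "C"))  # Adenine + Cytosine
fracAG <- nucleotideFraction(genome, c("A", "G"))  # Adenine + Guanine
```

Comparing fracAC and fracAG across an ensemble of sequences is one way such concentrations could inform the choice between 'AC' and 'AG'.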

Value

Returns a matrix of dimension 2 x SignalLength or 4 x SignalLength, where each row is determined by the encoding strategy. For the 4D encoding, each row represents the presence or absence of a specific nucleotide. For the 2D encoding, the two signals take the values -1, 0, and 1, assigned according to the strategy chosen.
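The presence/absence layout of the 4D result can be illustrated with a small stand-alone sketch. This is not the package's implementation: the function name sketchEncode4D and the row order A, C, G, T are assumptions made for illustration, and encodeGenome itself may order or type its rows differently.

```r
# Illustrative sketch only (hypothetical helper, not the package's code):
# build a 4 x SignalLength indicator matrix, one row per nucleotide.
# The row order A, C, G, T is an assumption.
sketchEncode4D <- function(stringGenome, rowOrder = c("A", "C", "G", "T")) {
  # Split the string into single upper-case characters.
  symbols <- strsplit(toupper(stringGenome), "")[[1]]
  # outer() compares every row nucleotide with every position;
  # multiplying by 1L turns TRUE/FALSE into 1/0.
  encoded <- outer(rowOrder, symbols, "==") * 1L
  rownames(encoded) <- rowOrder
  encoded
}

encoded <- sketchEncode4D("ACCACTTGAAGAGAGCCCGGGAT")
dim(encoded)  # four rows, one column per position in the sequence
```

For a sequence containing only A, C, G, and T, each column of this matrix holds exactly one 1, marking which nucleotide occupies that position.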

Examples

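# 4D encoding: binary four-signal matrix
EncodedSignal4D <- encodeGenome('ACCACTTGAAGAGAGCCCGGGAT', '4D')
# 2D encoding using the 'AC' strategy
EncodedSignal2DAC <- encodeGenome('ACCACTTGAAGAGACCCGGGAT', '2D', 'AC')
# 2D encoding using the 'AG' strategy
EncodedSignal2DAG <- encodeGenome('ACCACTTGAAGAGACCCGGGAT', '2D', 'AG')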

mathornton01/Genomic-DFT-Yin-R documentation built on Dec. 21, 2021, 2:52 p.m.