sqconcatenate: Concatenate sq objects

sqconcatenate    R Documentation

Concatenate sq objects

Description

Merges multiple sq and possibly character objects into one larger sq object.

Arguments

...

[sq || character]
Objects to concatenate. See the Details section for the exact behavior. The first argument must be of class sq, because c() in R uses single dispatch on its first argument. If this is a problem, the recommended alternative is the vec_c method from the vctrs package.

Details

Whenever all passed objects are of the same standard type (that is, dna_bsc, dna_ext, rna_bsc, rna_ext, ami_bsc or ami_ext), the returned object is of that type too, as no changes to the alphabet are needed.

It is possible to mix basic and extended types within one call to c(), but all objects must belong to the same family (that is, either dna, rna or ami). In this case, the returned object is of the extended type.

Mixing dna, rna and ami types is prohibited, as the interpretation of letters differs depending on the type.
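
The rules above for standard types can be sketched in base R. The helper predict_std_type() is hypothetical and is not part of tidysq; it only mirrors the behavior described on this page and says nothing about how the package implements it.

```r
# Hypothetical helper (not a tidysq function): predicts the type of the
# result of c() when every input has one of the six standard types.
predict_std_type <- function(types) {
  std <- c("dna_bsc", "dna_ext", "rna_bsc", "rna_ext", "ami_bsc", "ami_ext")
  stopifnot(length(types) > 0, all(types %in% std))

  # The first three characters identify the family: dna, rna or ami.
  family <- unique(substr(types, 1, 3))
  if (length(family) > 1) {
    stop("cannot mix dna, rna and ami types")
  }

  # Identical types are kept; a basic/extended mix becomes extended.
  if (length(unique(types)) == 1) types[1] else paste0(family, "_ext")
}

predict_std_type(c("dna_bsc", "dna_bsc"))            # "dna_bsc"
predict_std_type(c("dna_bsc", "dna_ext", "dna_bsc")) # "dna_ext"
```

Calling the helper with c("dna_bsc", "rna_bsc") raises an error, which corresponds to the prohibited mix of families.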

Whenever all objects are of atp type, the returned object is also of this type and the resulting alphabet is equal to the set union of all input alphabets.

The unt type can be mixed with any other type, resulting in an unt object with an alphabet equal to the set union of all input alphabets. This makes it possible to concatenate dna and ami objects, for instance, by first concatenating one of them with an unt object. However, this is strongly discouraged, as it may result in unwanted concatenation of DNA and amino acid sequences.
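
As a small illustration of what "set union of all input alphabets" means, the sketch below uses plain character vectors as stand-ins for alphabets. The vectors alph_x and alph_y are made up for this example, and the order of letters in an alphabet returned by tidysq may differ from what base R union() produces.

```r
# Made-up alphabets of two atp-like objects (letters may be multi-character).
alph_x <- c("A", "C", "x")
alph_y <- c("C", "yy", "A")

# Each letter appears once in the union, regardless of how many inputs use it.
merged <- union(alph_x, alph_y)
merged
```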

Whenever a character vector appears, it does not influence the resulting sq type. Each element is treated as a separate sequence. If any letter in this vector does not appear in the resulting alphabet, it is silently replaced with NA.
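
The silent replacement can be pictured with a conceptual base R sketch. It assumes that the resulting alphabet consists of the letters A, C, G, T and the gap character, and it works on unpacked letters, which is not how tidysq stores sequences internally.

```r
# Assumed resulting alphabet (an assumption made for this illustration).
target_alphabet <- c("A", "C", "G", "T", "-")

# Split one input string into single letters.
chars <- strsplit("TGXA-GA", "")[[1]]

# Letters outside the alphabet become NA without any warning.
chars[!(chars %in% target_alphabet)] <- NA_character_
chars
```

Here only "X" is outside the assumed alphabet, so it is the only letter turned into NA.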

Due to the R dispatch mechanism, passing a character vector as the first argument returns a class-less list. This behavior is effectively impossible to fix, and trying to fix it is not recommended, as doing so would involve changing the c primitive. If a character vector must come first, vec_c is a better alternative.
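
The dispatch issue can be reproduced without tidysq. The sketch below defines a toy S3 class; the names toy_sq and new_toy_sq are hypothetical stand-ins, and the c() method is a simplified assumption rather than the method tidysq actually uses.

```r
# Toy list-based S3 class standing in for sq (hypothetical, not from tidysq).
new_toy_sq <- function(x) structure(as.list(x), class = "toy_sq")

# c() method for the toy class. It is reached only when the FIRST argument
# of c() is a toy_sq, because S3 dispatch looks at that argument alone.
c.toy_sq <- function(...) {
  parts <- lapply(list(...), function(el) {
    unclass(if (is.character(el)) as.list(el) else el)
  })
  new_toy_sq(do.call(c, parts))
}

a <- new_toy_sq(c("ACGT", "GG"))

res_ok  <- c(a, "TTT")   # toy_sq first: the method above is called
res_bad <- c("TTT", a)   # character first: default c(), class is dropped

class(res_ok)
class(res_bad)
```

Both results hold three elements, but only the first one keeps its class, which matches the class-less list described above.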

Value

An sq object with length equal to the sum of the lengths of the individual objects passed as arguments. Elements of sq objects are concatenated just as if they were normal lists (see c).

See Also

Functions from utility module: ==.sq(), get_sq_lengths(), is.sq(), sqextract

Examples

# Creating objects to work on:
sq_dna_1 <- sq(c("GGACTGCA", "CTAGTA", ""), alphabet = "dna_bsc")
sq_dna_2 <- sq(c("ATGACA", "AC-G", "-CCAT"), alphabet = "dna_bsc")
sq_dna_3 <- sq(character(), alphabet = "dna_bsc")
sq_dna_4 <- sq(c("BNACV", "GDBADHH"), alphabet = "dna_ext")
sq_rna_1 <- sq(c("UAUGCA", "UAGCCG"), alphabet = "rna_bsc")
sq_rna_2 <- sq(c("-AHVRYA", "G-U-HYR"), alphabet = "rna_ext")
sq_rna_3 <- sq("AUHUCHYRBNN--", alphabet = "rna_ext")
sq_ami <- sq("ACHNK-IFK-VYW", alphabet = "ami_bsc")
sq_unt <- sq("AF:gf;PPQ^&XN")

# Concatenating dna_bsc sequences:
c(sq_dna_1, sq_dna_2, sq_dna_3)
# Concatenating rna_ext sequences:
c(sq_rna_2, sq_rna_3)
# Mixing dna_bsc and dna_ext:
c(sq_dna_1, sq_dna_4, sq_dna_2)

# Mixing DNA and RNA sequences doesn't work:
## Not run: 
c(sq_dna_3, sq_rna_1)

## End(Not run)

# unt sq objects can be mixed with DNA, RNA and amino acids:
c(sq_ami, sq_unt)
c(sq_unt, sq_rna_1, sq_rna_2)
c(sq_dna_2, sq_unt, sq_dna_3)

# Character vectors are also acceptable:
c(sq_dna_2, "TGCA-GA")
c(sq_rna_2, c("UACUGGGACUG", "AUGUBNAABNRYYRAU"), sq_rna_3)
c(sq_unt, "&#JIA$O02t30,9ec", sq_ami)


michbur/tidysq documentation built on April 1, 2022, 5:18 p.m.