ReadAmplSeqs: Read a fasta file with haplotypes and frequencies
In QSutils: Quasispecies Diversity

Loads an alignment of haplotypes and their frequencies from a fasta file.

ReadAmplSeqs(flnm, type = "DNA")

`flnm`	File name of a fasta file containing haplotype sequences and their frequencies. The header of each haplotype consists of an ID, a vertical bar "|", and the read count, optionally followed by another vertical bar and additional information (e.g., Hpl.2.0001|15874|25.2).
`type`	Character string specifying the type of sequences in the fasta file: either "DNA" (the default) or "AA".
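
The header layout described for `flnm` can be illustrated with a few lines of base R. This is only a sketch of how such headers could be split into an ID and a read count; it is not taken from the ReadAmplSeqs source, and the header strings and the variable names `hdr`, `fields`, `ids` and `nr` are made up for illustration.

```r
# Hypothetical headers in the documented "ID|count|extra" layout;
# the last one omits the optional third field.
hdr <- c("Hpl_0_0001|464|46.4", "Hpl_1_0001|62|6.2", "Hpl_10_0001|20")

# Split on the literal vertical bar (fixed = TRUE avoids treating "|"
# as the regular-expression alternation operator).
fields <- strsplit(hdr, "|", fixed = TRUE)

# The first field is the haplotype ID, the second the read count.
ids <- vapply(fields, `[`, character(1), 1)
nr  <- as.integer(vapply(fields, `[`, character(1), 2))

print(data.frame(id = ids, count = nr))
```

Because only the first two fields are read, headers with or without the trailing information are handled the same way.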

Returns a list with two elements:

`nr`	Vector of the haplotype counts.
`hseqs`	DNAStringSet or AAStringSet with the haplotype DNA sequences or amino acid sequences.

Mercedes Guerrero-Murillo and Josep Gregori

Gregori J, Esteban JI, Cubero M, Garcia-Cehic D, Perales C, Casillas R, Alvarez-Tejado M, Rodríguez-Frías F, Guardia J, Domingo E, Quer J. Ultra-deep pyrosequencing (UDPS) data treatment to study amplicon HCV minor variants. PLoS One. 2013 Dec 31;8(12):e83361. doi: 10.1371/journal.pone.0083361. eCollection 2013. PubMed PMID: 24391758; PubMed Central PMCID: PMC3877031.

Ramírez C, Gregori J, Buti M, Tabernero D, Camós S, Casillas R, Quer J, Esteban R, Homs M, Rodriguez-Frías F. A comparative study of ultra-deep pyrosequencing and cloning to quantitatively analyze the viral quasispecies using hepatitis B virus infection as a model. Antiviral Res. 2013 May;98(2):273-83. doi: 10.1016/j.antiviral.2013.03.007. Epub 2013 Mar 20. PubMed PMID: 23523552.

See also: GetQSData

library(QSutils)
filepath <- system.file("extdata", "ToyData_10_50_1000.fna", package = "QSutils")
lst <- ReadAmplSeqs(filepath, type = "DNA")
lst

$nr
 [1] 464  62  39  27  37  16  33  54 248  20

$hseqs
DNAStringSet object of length 10:
     width seq                                              names               
 [1]    50 CACCCTGTGACCAGTGTGTGCGA...GTGCCCCGTTGGCATTGACTAT Hpl_0_0001|464|46.4
 [2]    50 CACTCTGTGACCAGTGTGTGCGA...GTGCCCCGTTGGCATTGACTAT Hpl_1_0001|62|6.2
 [3]    50 CACCCTGTGACCAGCGTGTGCGA...GTGCCCCGTTGGCATTGACTAT Hpl_1_0002|39|3.9
 [4]    50 CACCCTGTGACCAGTGTGTGCGA...GTGCCCCGTTGGCATTGACTAC Hpl_1_0003|27|2.7
 [5]    50 CACTCTGTGACCAGTGTGTGCGA...GTGCCCCGTTAGCATTGACTAT Hpl_2_0001|37|3.7
 [6]    50 CACTCGGTGACCAGTGTGCGTGA...GTGCCCCGTTGGCATTGACTAT Hpl_4_0001|16|1.6
 [7]    50 CACTCTGTGACCAGTGTGCGCGA...GTGCCCTGTCGGCATTGACTAT Hpl_5_0001|33|3.3
 [8]    50 CACTCTGTGATCAGTGTGCGCGA...GTGCCCTGCCGGCATCGACTAT Hpl_8_0001|54|5.4
 [9]    50 CACTCTGTGATCAGTGTGCGCGA...GTGCCCTGCCGGCATCGACTAC Hpl_9_0001|248|24.8
[10]    50 CACTCTGTGATCAGTGTGCGCGA...GTGCCCTGCCGGCACCGACTAC Hpl_10_0001|20|2
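
The counts in `$nr` are raw read counts, so a common next step is to convert them to relative frequencies. The sketch below does this in base R using the ten counts printed above, copied by hand so that it runs without QSutils installed; with the real object one would use `lst$nr` instead. The variable names `total` and `pct` are illustrative. In this toy data set the resulting percentages appear to match the third field of the haplotype names.

```r
# Read counts copied from the $nr element printed above
nr <- c(464, 62, 39, 27, 37, 16, 33, 54, 248, 20)

# Total number of reads across all haplotypes
total <- sum(nr)

# Relative abundance of each haplotype, as a percentage
pct <- round(100 * nr / total, 1)

print(total)
print(pct)
```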