read.cds: Read the CDS of a given organism
In HajkD/seqreadr: Read Biological Sequence File Formats


Read an organism-specific coding sequence (CDS) file stored in fasta or fastq format. If the input file includes corrupt sequences (sequences that do not fulfill the triplet criterion), users can specify `delete.corrupt = TRUE` to remove these sequences from the input.

read.cds(file, format, delete.corrupt = FALSE, ...)

`file`	a character string specifying the path to the file storing the CDS.
`format`	a character string specifying the file format used to store the CDS, e.g. "fasta", "fastq".
`delete.corrupt`	a logical value indicating whether corrupt sequences (sequences that do not fulfill the triplet criterion) should be removed from the input `file`.
`...`	additional arguments that are used by the `readDNAStringSet` function.
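
The check behind `delete.corrupt` can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes that the "triplet criterion" means a sequence length divisible by three, and the sequences and names (`gene1`, `gene2`) are made up for the example.

```r
# Hypothetical toy input: named character vector of coding sequences.
seqs <- c(gene1 = "ATGGCCTAA",  # 9 bases: three complete codons
          gene2 = "ATGGCCTA")   # 8 bases: last codon is incomplete

# Assumption: a sequence is "corrupt" if its length is not a multiple of 3.
is_corrupt <- nchar(seqs) %% 3 != 0

# Keep only the sequences that pass the check.
clean_seqs <- seqs[!is_corrupt]
names(clean_seqs)
```

With `delete.corrupt = FALSE` (the default), no such filtering would be expected, so downstream code that relies on complete codons may need its own check.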

The `read.cds` function takes a string specifying the path to the CDS file of interest as its first argument.

For example, CDS files fulfilling the fasta file format can be downloaded from http://www.ensembl.org/info/data/ftp/index.html.

Alternatively users

A `data.frame` storing the gene id in the first column, the corresponding sequence as a string in the second column, and the sequence length in the third column.
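
To make the shape of the return value concrete, the sketch below builds a `data.frame` with the same three-column layout by hand. The column names (`geneids`, `seqs`, `length`) and the sequences are assumptions chosen for illustration; the names actually returned by `read.cds` may differ.

```r
# Hypothetical toy input: gene ids as names, sequences as values.
toy <- c(AT1G01010 = "ATGGAGGATCAA",
         AT1G01020 = "ATGGCGGCGTAA")

# Assumed layout: gene id, sequence as string, sequence length.
cds_df <- data.frame(
  geneids = names(toy),
  seqs    = unname(toy),
  length  = unname(nchar(toy)),
  stringsAsFactors = FALSE
)

str(cds_df)
```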

Hajk-Georg Drost

### Example Non-Corrupt File
# read a CDS file stored in fasta format
Ath.cds <- read.cds(system.file('seqs/ortho_thal_cds.fasta', package = 'seqreadr'),
                    format = "fasta")

dplyr::glimpse(Ath.cds)

### Example Corrupt File
# read a CDS file stored in fasta format and remove corrupt sequences
Ath.cds <- read.cds(system.file('seqs/ortho_thal_cds_corrupt.fasta', package = 'seqreadr'),
                    format         = "fasta",
                    delete.corrupt = TRUE)
                    
dplyr::glimpse(Ath.cds)