translate_cds_to_protein_all: Translate coding sequences into amino acid sequences for...
In HajkD/orthologr: Comparative Genomics with R

View source: R/translate_cds_to_protein_all.R

translate_cds_to_protein_all

R Documentation

Translate coding sequences into amino acid sequences for multiple files

Description

A helper function that takes a folder of fasta files storing coding sequences as input, translates each coding sequence into an amino acid sequence, and stores the results as fasta output files.
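To illustrate what translating a single coding sequence involves, here is a minimal base R sketch. It is not the implementation used by orthologr. The names `genetic_code` and `translate_cds` are hypothetical helpers introduced only for this illustration, and the sketch assumes the standard genetic code, uppercase or lowercase DNA input, and a non-empty sequence whose length is a multiple of 3.

```r
# Build the standard genetic code table (assumption: standard code, DNA alphabet).
# Codons are enumerated in T, C, A, G order with the third position varying fastest.
bases <- c("T", "C", "A", "G")
grid <- expand.grid(third = bases, second = bases, first = bases,
                    stringsAsFactors = FALSE)
codons <- paste0(grid$first, grid$second, grid$third)
amino_acids <- strsplit(
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG", ""
)[[1]]
genetic_code <- setNames(amino_acids, codons)

# Hypothetical helper: translate one coding sequence given as a single string.
translate_cds <- function(cds) {
  cds <- toupper(cds)
  stopifnot(nchar(cds) > 0, nchar(cds) %% 3 == 0)
  starts <- seq(1, nchar(cds), by = 3)
  triplets <- substring(cds, starts, starts + 2)
  protein <- unname(genetic_code[triplets])
  protein[is.na(protein)] <- "X"  # unknown or ambiguous codons become "X"
  paste(protein, collapse = "")
}

translate_cds("ATGGCCTAA")
```

In this sketch stop codons are written as `*`, so the call above yields `"MA*"`.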

Usage

translate_cds_to_protein_all(
  input_folder,
  output_folder,
  delete_corrupt_cds = FALSE
)

Arguments

`input_folder`	file path to the folder storing the coding sequence `fasta` files.
`output_folder`	name of or file path to a folder that will be created to store the output `fasta` files.
`delete_corrupt_cds`	a logical value indicating whether sequences with corrupt base triplets should be removed from the input `file`. A sequence is considered corrupt when its length is not divisible by 3, which means it contains at least one incomplete base triplet.
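The divisibility check described for `delete_corrupt_cds` can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's own code: the example sequences and the variable names `cds`, `is_corrupt`, `corrupt_names`, and `clean_cds` are made up for this sketch.

```r
# Made-up example sequences (assumption: one string per coding sequence).
cds <- c(
  gene_a = "ATGGCCTAA",     # 9 bases, divisible by 3
  gene_b = "ATGGCCTA",      # 8 bases, last triplet is incomplete
  gene_c = "ATGAAATTTTGA"   # 12 bases, divisible by 3
)

# A sequence is flagged as corrupt when its length is not a multiple of 3.
is_corrupt <- nchar(cds) %% 3 != 0

# Names of the flagged sequences, and the set that would remain after removal.
corrupt_names <- names(cds)[is_corrupt]
clean_cds <- cds[!is_corrupt]

corrupt_names
names(clean_cds)
```

Running a check like this before translation shows which sequences would be affected by setting `delete_corrupt_cds = TRUE`.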

Author(s)

Hajk-Georg Drost

Examples

## Not run: 
# install.packages("biomartr")
# download coding sequences of Arabidopsis thaliana, Arabidopsis lyrata, and Capsella rubella
org_list <- c("Arabidopsis thaliana", "Arabidopsis lyrata", "Capsella rubella")
biomartr::getCDSSet(db = "refseq",
                    organism = org_list,
                    gunzip = TRUE,
                    path = "cds_examples")
# translate coding sequences into amino acid sequences
translate_cds_to_protein_all(input_folder = "cds_examples", 
                             output_folder = "translated_seqs",
                             delete_corrupt_cds = FALSE)

## End(Not run)

HajkD/orthologr documentation built on Oct. 13, 2023, 12:11 a.m.