Splits a FASTA-formatted file into smaller FASTA files, each holding a defined number of sequences, for further processing.
split_fasta(
  path_in,
  path_out,
  num_seq = 20000,
  trim = FALSE,
  trunc = NULL,
  id = FALSE
)
path_in
A path to the FASTA-formatted file that is to be processed.
path_out
A path where the resulting FASTA-formatted files should be stored. The path should also contain the prefix name of the output files, to which _n (an integer from 1 to the number of files generated) is appended along with the extension ".fa".
num_seq
Integer defining the number of sequences in each resulting file. Defaults to 20000.
trim
Logical, should the sequences be trimmed to 4000 amino acids to bypass the CBS server restrictions? Defaults to FALSE.
trunc
Integer, truncate the sequences to this length. The first 1:trunc amino acids are kept. Defaults to NULL.
id
Logical, should the protein IDs be returned? Defaults to FALSE.
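As a rough illustration of how num_seq, path_out and trunc interact, the base R sketch below predicts the output file names and truncates sequences. The helpers expected_split_paths() and truncate_seqs() are hypothetical and are not part of ragp. The "_n.fa" naming follows the path_out description above, while rounding the file count up when the last file is only partly filled is an assumption.

```r
# Hypothetical helpers for illustration only; they are not part of ragp.

# Predict output file names, assuming ceiling(n_total / num_seq) files
# named <path_out>_<n>.fa, as described for the path_out argument.
expected_split_paths <- function(n_total, path_out, num_seq = 20000) {
  n_files <- ceiling(n_total / num_seq)
  paste0(path_out, "_", seq_len(n_files), ".fa")
}

# Keep only the first `trunc` characters of each sequence.
# trunc = NULL leaves the sequences unchanged.
truncate_seqs <- function(seqs, trunc = NULL) {
  if (is.null(trunc)) return(seqs)
  substr(seqs, 1, trunc)
}

# 1200 sequences at 500 per file would need three files under this assumption
expected_split_paths(1200, "at_nsp_split", num_seq = 500)

# Sequences shorter than `trunc` are returned as they are
truncate_seqs(c("MKTAYIAKQR", "MSL"), trunc = 5)
```

The sketch only mirrors the documented behaviour of the arguments; the actual splitting, trimming and file writing are done by split_fasta() itself.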
If id = FALSE, a character vector of the paths to the resulting FASTA-formatted files.
If id = TRUE, a list with two elements:
Character, protein identifiers.
Character, paths to the resulting FASTA-formatted files.
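Because the return type depends on id, downstream code may want to extract the file paths in a uniform way. The sketch below uses a hypothetical helper, get_split_paths(), which is not part of ragp. It assumes that with id = TRUE the paths are the second list element, following the order in which the elements are listed above; the element names are not documented here, so positional access is used.

```r
# Hypothetical helper, not part of ragp: return the file paths whether
# split_fasta() was called with id = FALSE or id = TRUE.
# Assumption: with id = TRUE the paths are the second list element,
# following the order in which the elements are documented.
get_split_paths <- function(res) {
  if (is.list(res)) res[[2]] else res
}

# Mock return values standing in for real split_fasta() output
mock_plain <- c("at_nsp_split_1.fa", "at_nsp_split_2.fa")
mock_with_id <- list(c("protein_a", "protein_b"), mock_plain)

get_split_paths(mock_plain)
get_split_paths(mock_with_id)
```

Both calls return the same character vector of paths, so later steps do not need to branch on how split_fasta() was called.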
## Not run:
library(ragp)
library(seqinr)

# Create a FASTA file to be processed; not needed if the input file already exists
data(at_nsp)
write.fasta(sequences = strsplit(at_nsp$sequence, ""),
            names = at_nsp$Transcript.id,
            file.out = "at_nsp.fasta")

# Assumes the input and output files are in the working directory
file_paths <- split_fasta(path_in = "at_nsp.fasta",
                          path_out = "at_nsp_split",
                          num_seq = 500)
## End(Not run)