split_fasta: Split a fasta formatted file.
In missuse/ragp: Mining for Hydroxyproline rich glycoprotein sequences


The function splits a FASTA-formatted file into smaller .fasta files, each holding a defined number of sequences, for further processing.

split_fasta(
  path_in,
  path_out,
  num_seq = 20000,
  trim = FALSE,
  trunc = NULL,
  id = FALSE
)

`path_in`	A path to the FASTA-formatted file to be processed.
`path_out`	A path where the resulting FASTA-formatted files should be stored. The path should also contain the file name prefix, to which _n (an integer from 1 to the number of files generated) and the extension ".fa" will be appended.
`num_seq`	Integer defining the number of sequences in each resulting .fasta file. Defaults to 20000.
`trim`	Logical; should the sequences be trimmed to 4000 amino acids to bypass the CBS server restrictions? Defaults to FALSE.
`trunc`	Integer; truncate the sequences to this length, keeping the first 1:trunc amino acids. Defaults to NULL.
`id`	Logical; should the protein IDs be returned? Defaults to FALSE.
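
The naming rule for `path_out` and the effect of `trunc` can be illustrated with base R alone. This is only a sketch of the documented behaviour, not the package's implementation: the sequence count, the example sequences, and the variable names (`n_sequences`, `trunc_len`, `expected_files`) are hypothetical, and the file count assumes that the last file simply holds the remaining sequences.

```r
# Hypothetical inputs, chosen only for illustration
path_out <- "at_nsp_split"
n_sequences <- 1234  # assumed number of records in the input file
num_seq <- 500       # sequences per output file

# Assumption: the final file holds the remainder, so the count rounds up
n_files <- ceiling(n_sequences / num_seq)

# Names follow the documented pattern: prefix, then "_n", then ".fa"
expected_files <- paste0(path_out, "_", seq_len(n_files), ".fa")
print(expected_files)

# Truncation as described for `trunc`: keep positions 1:trunc
sequences <- c("MKTAYIAKQR", "MSE")
trunc_len <- 5
truncated <- substr(sequences, 1, trunc_len)
print(truncated)
```

Sequences shorter than the truncation length are left unchanged by `substr`, which is why the second example sequence keeps all three residues.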

If id = FALSE, a character vector of the paths to the resulting FASTA-formatted files.

If id = TRUE, a list with two elements:

id: Character, protein identifiers.
file_list: Character, paths to the resulting .FASTA formatted files.
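
Because the return type depends on `id`, downstream code may want to handle both shapes uniformly. The helper below is a hypothetical convenience, not part of ragp; it relies only on the element name `file_list` documented above, and the values passed to it here are mock stand-ins rather than real output.

```r
# Hypothetical helper: extract the file paths whatever the value of `id` was
get_split_paths <- function(res) {
  if (is.list(res)) {
    res$file_list  # id = TRUE: list with elements `id` and `file_list`
  } else {
    res            # id = FALSE: plain character vector of paths
  }
}

# Mock return values, for illustration only
res_vector <- c("at_nsp_split_1.fa", "at_nsp_split_2.fa")
res_list <- list(id = c("protein_a", "protein_b"),
                 file_list = c("at_nsp_split_1.fa", "at_nsp_split_2.fa"))

paths_from_vector <- get_split_paths(res_vector)
paths_from_list <- get_split_paths(res_list)
```

With the package installed, the same helper could be applied directly to the result of `split_fasta(..., id = TRUE)`.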

## Not run: 
library(ragp)
# create a fasta file to be processed; not needed if the input file is already present
data(at_nsp)
library(seqinr)
write.fasta(sequences = strsplit(at_nsp$sequence, ""),
            names = at_nsp$Transcript.id,
            file.out = "at_nsp.fasta")

# assumes the input and output files are in the working directory:
file_paths <- split_fasta(path_in = "at_nsp.fasta",
                          path_out = "at_nsp_split",
                          num_seq = 500)

## End(Not run)
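
One way to sanity-check a split is to count the records in each output file. FASTA headers begin with ">", so counting those lines gives the number of sequences. The helper `count_fasta_records` below is hypothetical and uses base R only; the demonstration runs on a small temporary file rather than on real `split_fasta` output.

```r
# Hypothetical helper: count FASTA records by counting header lines (">")
count_fasta_records <- function(path) {
  sum(startsWith(readLines(path), ">"))
}

# Self-contained demonstration on a small temporary file
demo_file <- tempfile(fileext = ".fa")
writeLines(c(">seq1", "MKTAYIAKQR", ">seq2", "MSE"), demo_file)
n_records <- count_fasta_records(demo_file)
print(n_records)
```

Applied to the character vector of paths returned with `id = FALSE`, for example via `vapply(file_paths, count_fasta_records, integer(1))`, this would show how many sequences ended up in each file.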