View source: R/fastq_helpers.R
collapse.fastq | R Documentation |
Collapses identical reads into a single record per unique sequence and states in the fasta header how many reads shared that sequence. This is usually done after trimming and works best for short reads (< 50 nt). It is less effective for longer reads, such as 150 bp mRNA-seq.
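The collapsing idea described above can be sketched in a few lines of base R. This is only an illustration on made-up reads, not the package's implementation; the ordering by abundance is an assumption.

```r
# Toy reads (hypothetical data, not taken from any real file)
reads <- c("ACGT", "TTGA", "ACGT", "ACGT", "TTGA", "GGCC")

# Count how many times each distinct sequence occurs
counts <- table(reads)

# Order from most to least abundant (assumption: the real output order may differ)
counts <- sort(counts, decreasing = TRUE)

# One row per unique sequence, with the number of reads collapsed into it
collapsed <- data.frame(sequence = names(counts),
                        n = as.integer(counts),
                        stringsAsFactors = FALSE)
print(collapsed)
```

Six input reads become three records, which is why the gain is largest when many reads are exact duplicates.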
collapse.fastq(
files,
outdir = file.path(dirname(files[1]), "collapsed"),
header.out.format = "ribotoolkit",
compress = FALSE,
prefix = "collapsed_"
)
files |
paths to fasta / fastq files to collapse. The format is detected per file: if a file does not have a .fastq, .fastq.gz, .fq or .fq.gz extension, it is treated as fasta format. |
outdir |
directory in which to save the output files, default: file.path(dirname(files[1]), "collapsed") |
header.out.format |
character, default "ribotoolkit", else must be "fastx". How the read header of the output fasta should be formatted: ribotoolkit: ">seq1_x55", sequence 1 has 55 duplicated reads collapsed. fastx: ">1-55", sequence 1 has 55 duplicated reads collapsed. |
compress |
logical, default FALSE |
prefix |
character, default "collapsed_". Prefix added to the name of each output file. |
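The per-file format rule described for the files argument can be approximated as below. guess_format is a hypothetical helper written for illustration and is not part of the package; whether the real check is case-sensitive is not stated, so this sketch assumes it is.

```r
# Hypothetical helper: classify paths by extension, following the rule that
# anything not ending in .fastq, .fastq.gz, .fq or .fq.gz is treated as fasta.
guess_format <- function(files) {
  is_fastq <- grepl("\\.(fastq|fq)(\\.gz)?$", files)
  ifelse(is_fastq, "fastq", "fasta")
}

formats <- guess_format(c("a.fastq", "b.fq.gz", "c.fasta", "d.txt"))
print(formats)
```

Under this rule a file such as d.txt would be read as fasta, so it is worth checking extensions before passing paths in.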
invisible(NULL), files saved to disk in fasta format.
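To make the two header.out.format styles concrete, the sketch below builds headers from a vector of read counts. make_headers is a hypothetical function for illustration only; it mirrors the ">seq1_x55" and ">1-55" patterns given in the arguments and is not the package's own code.

```r
# Hypothetical helper: build fasta headers from per-sequence read counts.
make_headers <- function(counts, header.out.format = c("ribotoolkit", "fastx")) {
  header.out.format <- match.arg(header.out.format)
  idx <- seq_along(counts)  # running sequence number: 1, 2, 3, ...
  if (header.out.format == "ribotoolkit") {
    paste0(">seq", idx, "_x", counts)
  } else {
    paste0(">", idx, "-", counts)
  }
}

ribo  <- make_headers(c(55L, 12L))            # ">seq1_x55" ">seq2_x12"
fastx <- make_headers(c(55L, 12L), "fastx")   # ">1-55" ">2-12"
```

In both styles the first number identifies the sequence and the second is the number of reads collapsed into it.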
fastq.folder <- tempdir() # <- Your fastq files
# 'pattern' is a regular expression, not a glob, so anchor on the extension
infiles <- dir(fastq.folder, "\\.(fastq|fq)(\\.gz)?$", full.names = TRUE)
# collapse.fastq(infiles)