readFastq | R Documentation |
Reads and writes files in the FASTQ format.
readFastq(in.file)
writeFastq(fdta, out.file)
in.file | url/directory/name of (gzipped) FASTQ file to read. |
fdta | FASTQ object to write. |
out.file | url/directory/name of (gzipped) FASTQ file to write. |
These functions handle input/output of sequences in the commonly used FASTQ format,
typically used for storing DNA sequences (reads) after sequencing. If the
filenames (in.file or out.file) have the extension .gz, the files are automatically
uncompressed when read and compressed when written.
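The extension-based handling of compression can be illustrated with base R connections. This is only a minimal sketch and not the way microseq is implemented internally, which this page does not describe. The helper open_by_extension is a hypothetical name introduced here for illustration.

```r
# Hypothetical helper (not part of microseq): pick a gzip connection when the
# file name ends in .gz, and a plain file connection otherwise.
open_by_extension <- function(path, mode) {
  if (grepl("\\.gz$", path)) {
    gzfile(path, open = mode)
  } else {
    file(path, open = mode)
  }
}

# Write one FASTQ entry to a temporary gzipped file
tmp <- tempfile(fileext = ".fq.gz")
con <- open_by_extension(tmp, "w")
writeLines(c("@read1", "ACGT", "+", "IIII"), con)
close(con)

# Read it back; decompression happens inside the connection
con <- open_by_extension(tmp, "r")
back <- readLines(con)
close(con)
```

With readFastq and writeFastq none of this is needed, since the choice is made from the filename alone.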
The sequences are stored in a tibble, opening up all the possibilities in R for
fast and easy manipulations. The content of the file is stored as three columns, ‘Header’,
‘Sequence’ and ‘Quality’. If other columns are added, these will be ignored by
writeFastq.
readFastq returns a tibble with the contents of the (gzipped) FASTQ
file stored in three columns of text. The first, named ‘Header’, contains
the header lines, the second, named ‘Sequence’, contains the sequences, and the third, named
‘Quality’, contains the base quality scores.
writeFastq produces a (gzipped) FASTQ file.
These functions will only handle files where each entry spans one single line, i.e. not the (uncommon) multiline FASTQ format.
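To make the single-line restriction concrete: in that layout every entry occupies exactly four lines (header, sequence, a separator line starting with +, and quality), so entries can be located by position alone. The base R sketch below shows the idea. The function parse_fastq_lines is a hypothetical name, not a microseq function, and stripping the leading @ from the header is an assumption made here for illustration.

```r
# Hypothetical helper (not part of microseq): split single-line FASTQ text
# into the three columns Header, Sequence and Quality.
parse_fastq_lines <- function(lines) {
  # Each entry must occupy exactly four lines in this layout
  stopifnot(length(lines) %% 4 == 0)
  first <- seq_len(length(lines) / 4) * 4 - 3   # index of each header line
  data.frame(
    Header   = sub("^@", "", lines[first]),     # assumption: drop the leading @
    Sequence = lines[first + 1],
    Quality  = lines[first + 3],                # line first + 2 is the + separator
    stringsAsFactors = FALSE
  )
}

example_lines <- c(
  "@read1", "ACGT",   "+", "IIII",
  "@read2", "GGCATT", "+", "FFFFFF"
)
fq <- parse_fastq_lines(example_lines)
```

A multiline FASTQ file breaks the four-lines-per-entry assumption, which is why such files cannot be handled by position in this way.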
Lars Snipen and Kristian Hovde Liland.
readFasta.
## Not run:
# Load microseq plus the packages providing %>%, mutate, filter and str_length
library(microseq)
library(dplyr)
library(stringr)

# We need a FASTQ-file to read, here is one example file:
fq.file <- file.path(path.package("microseq"), "extdata", "small.fastq.gz")

# Read and write
fdta <- readFastq(fq.file)
ok <- writeFastq(fdta[1:3,], out.file = "delete_me.fq")

# Make use of dplyr to copy parts of the file to another file
readFastq(fq.file) %>%
  mutate(Length = str_length(Sequence)) %>%
  filter(Length > 200) %>%
  writeFasta(out.file = "long_reads.fa") # writing to FASTA file
## End(Not run)