View source: R/merge_with_metadata.R
| merge_fastq_with_metadata | R Documentation |
Merge a dataframe of sequence and quality data (as produced by
read_fastq() from an unmodified FASTQ file) with a dataframe of
metadata, reverse-complementing sequences where required so that all
reads are in the forward direction.
merge_methylation_with_metadata() is the equivalent function for
working with FASTQs that contain DNA modification information.
The FASTQ dataframe must contain the columns "read" (unique read ID),
"sequence" (DNA sequence), and "quality" (FASTQ quality score).
Other columns are allowed but not required, and are preserved unaltered
in the merged data.
The metadata dataframe must contain the columns "read" (unique read ID) and
"direction" (read direction, either "forward" or "reverse" for each read),
and can contain any other columns with arbitrary information for each read.
Columns that might be useful include participant ID and family designations
so that each read can be associated with its participant and family.
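As a minimal sketch of the two input shapes described above, the following builds toy dataframes by hand and checks the required columns. The read IDs, sequences, qualities, and the "participant" and "family" values are invented for illustration and are not taken from the package's example files.

```r
# Hypothetical toy inputs; all values are invented for illustration
fastq_data <- data.frame(
  read     = c("read_1", "read_2"),
  sequence = c("GGCGGC", "GCCGCC"),
  quality  = c("IIHHGG", "FFGGHH"),
  stringsAsFactors = FALSE
)

metadata <- data.frame(
  read        = c("read_1", "read_2"),
  direction   = c("forward", "reverse"),
  participant = c("P1", "P2"),  # optional extra column
  family      = c("F1", "F1"),  # optional extra column
  stringsAsFactors = FALSE
)

# Check that the required columns are present in each input
fastq_ok    <- all(c("read", "sequence", "quality") %in% colnames(fastq_data))
metadata_ok <- all(c("read", "direction") %in% colnames(metadata))

# Check that every direction value is one of the two allowed strings
direction_ok <- all(metadata$direction %in% c("forward", "reverse"))
```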
Important: A key feature of this function is that it uses the "direction"
column from the metadata to identify which rows are reverse reads. These reverse
reads are then reverse-complemented and have their quality scores reversed
so that all reads are in the forward direction, which is ideal for consistent
analysis or visualisation. The output columns are "forward_sequence" and
"forward_quality".
The reversing is implemented by reverse_sequence_if_needed() and
reverse_quality_if_needed(); see the documentation for these functions for more details.
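The idea behind the reversing can be sketched in base R as follows. This is only an illustration of the concept, not the package's implementation: reverse_string_sketch() and reverse_complement_sketch() are hypothetical helpers, and the sketch assumes plain uppercase A/C/G/T DNA.

```r
# Hypothetical helper: reverse the characters of a single string
reverse_string_sketch <- function(x) {
  paste(rev(strsplit(x, "")[[1]]), collapse = "")
}

# Hypothetical helper: reverse, then swap each base for its complement
# (assumes uppercase A/C/G/T only)
reverse_complement_sketch <- function(sequence) {
  chartr("ACGT", "TGCA", reverse_string_sketch(sequence))
}

# One invented reverse read
sequence  <- "GGCGGA"
quality   <- "ABCDEF"
direction <- "reverse"

# Reverse reads are reverse-complemented; forward reads are left unchanged
forward_sequence <- if (direction == "reverse") {
  reverse_complement_sketch(sequence)
} else {
  sequence
}

# Quality strings are only reversed (not complemented), so that each score
# stays aligned with its base
forward_quality <- if (direction == "reverse") {
  reverse_string_sketch(quality)
} else {
  quality
}

forward_sequence  # "TCCGCC"
forward_quality   # "FEDCBA"
```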
merge_fastq_with_metadata(
fastq_data,
metadata,
reverse_complement_mode = "DNA"
)
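| fastq_data | dataframe. Sequence and quality data, as produced by read_fastq(). Must contain "read", "sequence", and "quality" columns. |
| metadata | dataframe. Per-read metadata. Must contain "read" and "direction" columns. |
| reverse_complement_mode | Mode used when reverse-complementing sequences. Defaults to "DNA"; see reverse_sequence_if_needed() for details. |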
dataframe. A merged dataframe containing all columns from the input dataframes, as well as forward versions of sequences and qualities in the "forward_sequence" and "forward_quality" columns.
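To show the shape of the returned dataframe, here is a rough base-R sketch that joins two toy dataframes on "read" and appends the forward columns. It assumes an inner join via merge() and uppercase A/C/G/T sequences. The join type the package actually uses is not stated here, and reverse_strings_sketch() is a hypothetical helper.

```r
# Hypothetical toy inputs; all values are invented for illustration
fastq_data <- data.frame(
  read     = c("read_1", "read_2"),
  sequence = c("GGCGGC", "GCCGCC"),
  quality  = c("IIHHGG", "FFGGHH"),
  stringsAsFactors = FALSE
)
metadata <- data.frame(
  read        = c("read_1", "read_2"),
  direction   = c("forward", "reverse"),
  participant = c("P1", "P2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: reverse each string in a character vector
reverse_strings_sketch <- function(x) {
  vapply(strsplit(x, ""),
         function(chars) paste(rev(chars), collapse = ""),
         character(1))
}

# Join on the shared read ID (assumption: inner join)
merged <- merge(fastq_data, metadata, by = "read")

# Append forward versions; only rows marked "reverse" are changed
is_reverse <- merged$direction == "reverse"
merged$forward_sequence <- ifelse(
  is_reverse,
  chartr("ACGT", "TGCA", reverse_strings_sketch(merged$sequence)),
  merged$sequence
)
merged$forward_quality <- ifelse(
  is_reverse,
  reverse_strings_sketch(merged$quality),
  merged$quality
)

colnames(merged)
```

In this sketch every input column survives the join, and the two forward columns are added at the end.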
## Locate files
fastq_file <- system.file("extdata",
"example_many_sequences_raw.fastq",
package = "ggDNAvis")
metadata_file <- system.file("extdata",
"example_many_sequences_metadata.csv",
package = "ggDNAvis")
## Read files
fastq_data <- read_fastq(fastq_file)
metadata <- read.csv(metadata_file)
## Merge data (including reversing if needed)
merge_fastq_with_metadata(fastq_data, metadata)