epi2me2r vignette

library(epi2me2r)

The epi2me2r package includes fully automated methods that take the raw CSVs of CARD ARMA and WIMP output from Oxford Nanopore's EPI2ME pipeline and quickly convert them into common R package formats, namely phyloseq and metagenomeSeq.

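There are three main types of functions in epi2me2r: fully automated data import, step-by-step import, and other functions. Each is covered in its own section below.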

Preparation of metadata file

Prior to starting, making sure the metadata file is formatted appropriately will ensure your data is imported correctly. You can use one combined metadata file for both your AMR and WIMP samples or a separate file for each. Both options are described below.

Combo metadata file

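This file has 4 required columns that must be named as follows: arma_filename, arma_barcode, wimp_filename, and wimp_barcode. Other metadata, such as treatment and sample names, can be included as additional columns.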

An example of a combo metadata file is included with this package.

epi2me.metadata <- read.csv(system.file("extdata", "example_metadata.csv", package = "epi2me2r"))
head(epi2me.metadata)

Individual metadata file

If you are importing only WIMP or only CARD ARMA files, you do not need the metadata associated with the other workflow.

If you are processing only CARD ARMA data, the required columns are arma_filename and arma_barcode. Other metadata, such as treatment and sample names, can be included as additional columns.

On the other hand, if you are processing only WIMP data, the required columns are wimp_filename and wimp_barcode. Other metadata, such as treatment and sample names, can again be included as additional columns.

Even if you are just processing one type of data, both ARMA and WIMP information can be included in the metadata (as seen in the section on combo metadata above).
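The column requirements above can be checked before import. The following is a minimal base R sketch, not part of epi2me2r: the helper name missing_metadata_columns() and the toy values are hypothetical, and only the four column names come from this vignette.

```r
# Hypothetical helper (not an epi2me2r function): return the required
# column names that are absent from a metadata data frame.
missing_metadata_columns <- function(metadata,
                                     workflow = c("both", "arma", "wimp")) {
  workflow <- match.arg(workflow)
  required <- switch(workflow,
    arma = c("arma_filename", "arma_barcode"),
    wimp = c("wimp_filename", "wimp_barcode"),
    both = c("arma_filename", "arma_barcode",
             "wimp_filename", "wimp_barcode")
  )
  setdiff(required, colnames(metadata))
}

# Toy metadata with invented values, for illustration only
toy.metadata <- data.frame(
  arma_filename = c("run1_arma.csv", "run1_arma.csv"),
  arma_barcode  = c("barcode01", "barcode02"),
  treatment     = c("control", "treated"),
  stringsAsFactors = FALSE
)

# Nothing is missing for an ARMA-only import
missing_metadata_columns(toy.metadata, workflow = "arma")

# The two WIMP columns are reported as missing for a combined import
missing_metadata_columns(toy.metadata, workflow = "both")
```

An empty result means the metadata has every column required for the chosen workflow.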

Fully automated data import

For both AMR and WIMP data, the raw CSVs downloaded from the epi2me website need to be in their own directory (without any other files). Note that if you are processing both WIMP and ARMA data you will need two directories, one for each set of data.
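Because each directory must hold nothing but the raw CSVs, it can help to check for stray files first. This is a small base R sketch under that assumption: non_csv_files() is a hypothetical helper, not an epi2me2r function, and the demonstration directory is a throwaway created under tempdir().

```r
# Hypothetical helper (not an epi2me2r function): list entries in a
# directory whose names do not end in .csv.
non_csv_files <- function(dir.path) {
  all.files <- list.files(dir.path)
  all.files[!grepl("\\.csv$", all.files, ignore.case = TRUE)]
}

# Demonstrate on a throwaway directory with one CSV and one stray file
demo.dir <- file.path(tempdir(), "demo_amr_data")
dir.create(demo.dir, showWarnings = FALSE)
writeLines("a,b", file.path(demo.dir, "run1.csv"))
writeLines("notes", file.path(demo.dir, "README.txt"))

# Anything returned here should be moved out before importing
non_csv_files(demo.dir)
```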

AMR data

amr_raw_to_phyloseq

Reading the AMR data requires a directory and a metadata file. The directory should contain only the CSV files generated by EPI2ME. An example of the metadata file is shown above. The data we will be using is from an example run on the EPI2ME pipeline. There are four options, which appear as the arguments of the call below: path.to.amr.files, metadata, coveragenumber, and keepSNP.

In the following code example, we use the amr_raw_to_phyloseq() function with the example metadata file read in above and a directory of example AMR files that is also included with the epi2me2r package. This code creates a phyloseq object from the example AMR files and metadata.

example.amr.dir <- system.file("extdata", "example_amr_data", package = "epi2me2r")

ps.amr.object <- amr_raw_to_phyloseq(path.to.amr.files = example.amr.dir,
                                     metadata = epi2me.metadata,
                                     coveragenumber = 80, 
                                     keepSNP = FALSE)

amr_raw_to_metagenomeseq

The amr_raw_to_metagenomeseq() function uses the same arguments as above for importing to metagenomeSeq:

mgs.amr.object <- amr_raw_to_metagenomeseq(path.to.amr.files = example.amr.dir,
                                           metadata = epi2me.metadata,
                                           coveragenumber = 80, 
                                           keepSNP = FALSE)
mgs.amr.object

WIMP

wimp_raw_to_phyloseq

WIMP files are handled much like the AMR files, but the taxonomizr package is used to add taxonomic hierarchical information.

Reading in the WIMP data requires a directory and a metadata file. The directory should contain only the CSV files generated by EPI2ME. An example of the metadata file is shown above. The data we will be using is from an example run on the EPI2ME pipeline. There are four options, which appear as the arguments of the call below: path.to.wimp.files, metadata, keep.unclassified, and keep.human.

The following code uses the wimp_raw_to_phyloseq() function with the example metadata we read in above and a directory of example WIMP files included with the package to convert the raw WIMP files to a phyloseq object:

example.wimp.dir <- system.file("extdata", "example_wimp_data", package = "epi2me2r")

ps.wimp.object <- wimp_raw_to_phyloseq(path.to.wimp.files = example.wimp.dir,
                                       metadata = epi2me.metadata,
                                       keep.unclassified = FALSE, 
                                       keep.human = FALSE)

wimp_raw_to_metagenomeseq

As with the AMR functions, the wimp_raw_to_metagenomeseq() function takes the same arguments as wimp_raw_to_phyloseq() above, but imports to metagenomeSeq:

mgs.wimp.object <- wimp_raw_to_metagenomeseq(path.to.wimp.files = example.wimp.dir,
                                             metadata = epi2me.metadata,
                                             keep.unclassified = FALSE, 
                                             keep.human = FALSE)

Step-by-step import

In some cases you might not want a phyloseq or metagenomeSeq object, but just a count matrix or taxonomic list. In these cases you can use the functions below.

AMR data

read_in_amr_files

This takes the directory that the AMR CSV files are in and creates a count matrix that can be used in downstream analysis. The inputs are similar to those in the previous examples (but metadata is not required):

amr.count.table <- read_in_amr_files(path.to.amr.files = example.amr.dir,
                                     coveragenumber = 80, 
                                     keepSNP = FALSE)
head(amr.count.table)
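As one example of downstream use, a count table can be turned into per-sample relative abundances with base R. The sketch below runs on a toy table with invented IDs and counts. It assumes the layout described in this vignette for the ID column ("CVTERMID" first) and further assumes that every remaining column holds the counts for one sample, which may not match the real output exactly.

```r
# Toy count table; IDs and counts are invented for illustration
toy.counts <- data.frame(
  CVTERMID = c("1001", "1002", "1003"),
  sample_A = c(10, 0, 30),
  sample_B = c(5, 5, 0),
  stringsAsFactors = FALSE
)

# Move the ID column into row names to get a purely numeric matrix
count.matrix <- as.matrix(toy.counts[, -1])
rownames(count.matrix) <- toy.counts$CVTERMID

# Total counts per sample, then divide each column by its total
sample.totals <- colSums(count.matrix)
rel.abundance <- sweep(count.matrix, 2, sample.totals, "/")
rel.abundance
```

Each column of rel.abundance sums to 1, so samples with different sequencing depths can be compared on the same scale.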

generate_amr_taxonomy

This function assigns AMR taxonomic hierarchical information from CARD using a count table with CV term IDs as the first column ("CVTERMID"). Only one input, the count table, is needed:

amr.taxonomy <- generate_amr_taxonomy(amr.count.table = amr.count.table,
                                      verbose = FALSE)
head(amr.taxonomy)

WIMP data

read_in_wimp_files

This takes the directory the WIMP CSV files are in and creates a count matrix that can be used in downstream analysis. The inputs are similar to those in the previous examples (but metadata is not required):

example.wimp.dir <- system.file("extdata", "example_wimp_data", package = "epi2me2r")

wimp.count.table <- read_in_wimp_files(path.to.wimp.files = example.wimp.dir)
head(wimp.count.table)

generate_wimp_taxonomy

This function assigns phylogenetic taxonomic hierarchical information with the help of taxonomizr. A count table with NCBI taxonomic IDs ("taxID") as the first column is required.

wimp.taxonomy <- generate_wimp_taxonomy(wimp.count.table = wimp.count.table)
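Once counts and taxonomy are both available, they can be combined on the shared ID to summarise at a higher rank. The following base R sketch uses toy tables only. It assumes that the taxonomy table carries a "taxID" column alongside rank columns such as genus, which is an assumption about the output layout rather than something stated in this vignette.

```r
# Toy WIMP-style count table; counts are invented
toy.wimp.counts <- data.frame(
  taxID    = c(562, 1280),
  sample_A = c(120, 8),
  sample_B = c(95, 40)
)

# Toy taxonomy table (assumed layout: taxID plus rank columns)
toy.taxonomy <- data.frame(
  taxID   = c(562, 1280),
  genus   = c("Escherichia", "Staphylococcus"),
  species = c("Escherichia coli", "Staphylococcus aureus"),
  stringsAsFactors = FALSE
)

# Attach taxonomy to the counts by the shared taxID column
annotated <- merge(toy.taxonomy, toy.wimp.counts, by = "taxID")

# Sum the counts within each genus
genus.counts <- aggregate(cbind(sample_A, sample_B) ~ genus,
                          data = annotated, FUN = sum)
genus.counts
```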

Other functions

Another useful function is amr_read_taxonomy, which matches any classified AMR read with the phylogenetic taxonomy (if it is assigned) using read_id(). This function takes the following arguments:

amr.read.classification <- amr_read_taxonomy(path.to.amr.files = example.amr.dir,
                                             path.to.wimp.files = example.wimp.dir)
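The idea of matching AMR reads to a taxonomic assignment by read identifier can be illustrated with a left join in base R. This is a conceptual sketch on invented data, not the package's implementation: the column names amr_gene and taxID in the toy tables are hypothetical, and only the use of a read identifier as the key comes from the description above.

```r
# Toy AMR hits, one row per classified read (invented values)
toy.amr.reads <- data.frame(
  read_id  = c("read_1", "read_2", "read_3"),
  amr_gene = c("geneA", "geneB", "geneA"),
  stringsAsFactors = FALSE
)

# Toy taxonomic assignments; read_2 has none
toy.wimp.reads <- data.frame(
  read_id = c("read_1", "read_3"),
  taxID   = c(562, 1280),
  stringsAsFactors = FALSE
)

# Left join: keep every AMR read, fill in taxonomy where it exists
matched <- merge(toy.amr.reads, toy.wimp.reads,
                 by = "read_id", all.x = TRUE)
matched
```

Reads without a taxonomic assignment are kept, with NA in the taxID column.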


epi2me2r documentation built on June 3, 2022, 9:07 a.m.