Preprocessing of raw data

For filtering and trimming of the raw reads we usually use the DADA2 functions but wrap them in a reproducible workflow step.

library(mbtools)

Finding your files

We will again use our helper function to get a list of sequencing files.

path <- system.file("extdata/16S", package = "mbtools")
files <- find_read_files(path)
print(files)

Configuration

All mbtools workflow step come with corresponding config_* that returns an example/default configuration. Changes can be done a-posteriori or by directly passing in the parameters. We will specify a temporary directory as storage point for the preprocessed data and truncate the forward reads to 240 bp and the reverse reads to 200 bp (based on our previous quality assessment).

config <- config_preprocess(out_dir = tempdir(), truncLen = c(240, 200))
config

We can see that there are some more parameters that we could specify.

Running the preprocessing step

We can now run our preprocessing step.

filtered <- preprocess(files, config)

This will report the percentage of passed reads on the logging interface but you can also inspect that in detail by

filtered$passed


Gibbons-Lab/mbtools documentation built on Jan. 28, 2024, 11:08 a.m.