run_fastp: Run FastP
In GrahamHamilton/pipelineTools: Pipeline Tools for NGS Sequence Analysis Pipelines

run_fastp

R Documentation

Run FastP

Description

Run the FastP tool to remove contaminating sequencing adapters and low-quality bases.

Usage

run_fastp(
  input1 = NULL,
  input2 = NULL,
  output1 = NULL,
  output2 = NULL,
  adapter1 = NULL,
  adapter2 = NULL,
  sample.name = NULL,
  out.dir = NULL,
  phred.quality = 15,
  min.length = NULL,
  trim.front.1 = NULL,
  trim.tail.1 = NULL,
  trim.front.2 = NULL,
  trim.tail.2 = NULL,
  threads = 10,
  parallel = FALSE,
  cores = 4,
  execute = TRUE,
  fastp = NULL,
  version = FALSE
)

Arguments

`input1`	List of the paths to the files containing the forward reads
`input2`	List of the paths to the files containing the reverse reads
`output1`	List of paths to the files to write the trimmed forward reads
`output2`	List of paths to the files to write the trimmed reverse reads
`adapter1`	Sequence for the adapter for the forward read
`adapter2`	Sequence for the adapter for the reverse read
`sample.name`	List of the sample names
`out.dir`	Name of the directory to write quality control results files. If NULL, which is the default, a directory named "fastP" is created in the current working directory.
`phred.quality`	The lower limit for the phred score, default set to 15
`min.length`	The minimum read length; trimmed reads shorter than this are discarded
`trim.front.1`	Trim 'n' bases from front of read1, default is 0
`trim.tail.1`	Trim 'n' bases from tail of read1, default is 0
`trim.front.2`	Trim 'n' bases from front of read2, default is 0
`trim.tail.2`	Trim 'n' bases from tail of read2, default is 0
`threads`	Number of threads for FastP to use, default set to 10
`parallel`	Run in parallel, default set to FALSE
`cores`	Number of cores/threads to use for parallel processing, default set to 4
`execute`	Whether to execute the commands or not, default set to TRUE
`fastp`	Path to the FastP program, required
`version`	Whether to return the FastP version number, default set to FALSE
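
Most of the arguments above have a direct counterpart among FastP's command-line options. The sketch below illustrates that correspondence for a single paired-end sample. It is not the package's implementation: `build_fastp_cmd` is a hypothetical helper written for this illustration, the file names are placeholders, and only a subset of the arguments is covered.

```r
# Hypothetical helper, for illustration only: shows how run_fastp-style
# arguments could map onto FastP command-line options for one sample.
build_fastp_cmd <- function(fastp, input1, input2, output1, output2,
                            adapter1 = NULL, adapter2 = NULL,
                            phred.quality = 15, min.length = NULL,
                            threads = 10) {
  args <- c(
    fastp,
    "--in1", input1, "--in2", input2,
    "--out1", output1, "--out2", output2,
    # Optional arguments are only added when they are not NULL
    if (!is.null(adapter1)) c("--adapter_sequence", adapter1),
    if (!is.null(adapter2)) c("--adapter_sequence_r2", adapter2),
    "--qualified_quality_phred", phred.quality,
    if (!is.null(min.length)) c("--length_required", min.length),
    "--thread", threads
  )
  paste(args, collapse = " ")
}

# Placeholder paths; no files are read and nothing is executed
cmd <- build_fastp_cmd(
  fastp = "fastp",
  input1 = "raw_reads/S1_R1_001.fastq.gz",
  input2 = "raw_reads/S1_R2_001.fastq.gz",
  output1 = "trimmed_reads/S1_R1_001.fastq.gz",
  output2 = "trimmed_reads/S1_R2_001.fastq.gz",
  adapter1 = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
)
print(cmd)
```

Building the command as a string before running it mirrors the purpose of the `execute` argument: the commands can be inspected or saved without FastP being invoked.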

Value

A file containing the FastP commands. A directory of adapter- and quality-trimmed reads is also created.

Examples

 ## Not run: 
# Set the directory containing the raw fastq files
reads_path <- "raw_reads"
# The pattern argument is a regular expression, so the dots are escaped
mate1 <- list.files(path = reads_path, pattern = "_R1_001\\.fastq\\.gz$", full.names = TRUE)
mate2 <- list.files(path = reads_path, pattern = "_R2_001\\.fastq\\.gz$", full.names = TRUE)

# Set the directory for writing the trimmed reads to
trimmed_reads_dir <- "trimmed_reads"
# Keep the input file names, but place them in the trimmed reads directory
mate1.out <- file.path(trimmed_reads_dir, basename(mate1))
mate2.out <- file.path(trimmed_reads_dir, basename(mate2))

# Get the sample names from the first reads
# (the sample name is the part of the file name before the first underscore)
sample_names <- unlist(lapply(strsplit(basename(mate1), "_"), `[[`, 1))

# Set the adapter sequences, these are for Illumina
adapter1 <- "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
adapter2 <- "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT"

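# Set the directory for the FastP quality control results
# (the directory name here is only an example)
fastp.results.dir <- "fastp_results"

fastp.cmds <- run_fastp(input1 = mate1,
                        input2 = mate2,
                        output1 = mate1.out,
                        output2 = mate2.out,
                        adapter1 = adapter1,
                        adapter2 = adapter2,
                        sample.name = sample_names,
                        out.dir = fastp.results.dir,
                        fastp = "/software/bin/fastp")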

## End(Not run)
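
The path handling in the example can be checked without any FASTQ files on disk. The following is a minimal sketch that assumes Illumina-style file names in which the sample name precedes the first underscore; the file names themselves are hypothetical.

```r
# Hypothetical forward-read file names in the Illumina naming style
mate1 <- c("raw_reads/sampleA_S1_R1_001.fastq.gz",
           "raw_reads/sampleB_S2_R1_001.fastq.gz")

# Output paths keep the file name but swap the directory
mate1.out <- file.path("trimmed_reads", basename(mate1))

# The sample name is the text before the first underscore
sample_names <- vapply(strsplit(basename(mate1), "_"), `[[`, character(1), 1)

print(mate1.out)
print(sample_names)
```

If sample names themselves contain underscores, splitting on the first underscore will truncate them, and a different rule would be needed.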

GrahamHamilton/pipelineTools documentation built on Jan. 14, 2025, 10:13 p.m.