run_fastp: Run FastP

View source: R/run_fastp.R

run_fastp    R Documentation

Run FastP

Description

Run the FastP tool to remove contaminating sequencing adapters and low quality bases.

Usage

run_fastp(
  input1 = NULL,
  input2 = NULL,
  output1 = NULL,
  output2 = NULL,
  adapter1 = NULL,
  adapter2 = NULL,
  sample.name = NULL,
  out.dir = NULL,
  phred.quality = 15,
  min.length = NULL,
  trim.front.1 = NULL,
  trim.tail.1 = NULL,
  trim.front.2 = NULL,
  trim.tail.2 = NULL,
  threads = 10,
  parallel = FALSE,
  cores = 4,
  execute = TRUE,
  fastp = NULL,
  version = FALSE
)

Arguments

input1

List of paths to the files containing the forward reads

input2

List of paths to the files containing the reverse reads

output1

List of paths to the files to write the trimmed forward reads

output2

List of paths to the files to write the trimmed reverse reads

adapter1

Sequence for the adapter for the forward read

adapter2

Sequence for the adapter for the reverse read

sample.name

List of the sample names

out.dir

Name of the directory to write quality control results files. If NULL, which is the default, a directory named "fastP" is created in the current working directory.

phred.quality

The minimum acceptable Phred quality score for a base, default set to 15

min.length

The minimum read length; trimmed reads shorter than this are discarded

trim.front.1

Trim 'n' bases from front of read1, default is 0

trim.tail.1

Trim 'n' bases from tail of read1, default is 0

trim.front.2

Trim 'n' bases from front of read2, default is 0

trim.tail.2

Trim 'n' bases from tail of read2, default is 0

threads

Number of threads for FastP to use, default set to 10

parallel

Run in parallel, default set to FALSE

cores

Number of cores/threads to use for parallel processing, default set to 4

execute

Whether to execute the commands or not, default set to TRUE

fastp

Path to the FastP program, required

version

If TRUE, returns the FastP version number, default set to FALSE
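
As a rough illustration of how these arguments might translate into a FastP command line, the sketch below assembles one command for a single paired-end sample. build_fastp_cmd and its report.prefix argument are hypothetical and not part of this package, and the mapping of each argument to a flag is an assumption rather than the package's actual code. The flags themselves are documented fastp options.

```r
# Hypothetical helper: assemble a fastp command string for one paired-end sample.
# The argument-to-flag mapping is an assumption, not the package's implementation.
build_fastp_cmd <- function(fastp, input1, input2, output1, output2,
                            adapter1 = NULL, adapter2 = NULL,
                            phred.quality = 15, min.length = NULL,
                            trim.front.1 = NULL, trim.tail.1 = NULL,
                            trim.front.2 = NULL, trim.tail.2 = NULL,
                            threads = 10, report.prefix = "sample") {
  # Arguments left as NULL are omitted so that fastp falls back to its own defaults
  opt <- function(flag, value) if (is.null(value)) NULL else c(flag, value)
  args <- c(
    "--in1", shQuote(input1), "--in2", shQuote(input2),
    "--out1", shQuote(output1), "--out2", shQuote(output2),
    opt("--adapter_sequence", adapter1),
    opt("--adapter_sequence_r2", adapter2),
    opt("--qualified_quality_phred", phred.quality),
    opt("--length_required", min.length),
    opt("--trim_front1", trim.front.1),
    opt("--trim_tail1", trim.tail.1),
    opt("--trim_front2", trim.front.2),
    opt("--trim_tail2", trim.tail.2),
    opt("--thread", threads),
    "--json", shQuote(paste0(report.prefix, ".json")),
    "--html", shQuote(paste0(report.prefix, ".html"))
  )
  paste(c(shQuote(fastp), args), collapse = " ")
}

# Illustrative file names only; nothing is executed here
cmd <- build_fastp_cmd(
  fastp = "/software/bin/fastp",
  input1 = "raw_reads/A_R1_001.fastq.gz",
  input2 = "raw_reads/A_R2_001.fastq.gz",
  output1 = "trimmed_reads/A_R1_001.fastq.gz",
  output2 = "trimmed_reads/A_R2_001.fastq.gz",
  trim.front.1 = 3,
  report.prefix = "A"
)
cat(cmd, "\n")
```

Building the command as a string and printing it first is similar in spirit to calling run_fastp with execute = FALSE: the command can be inspected before anything is run.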

Value

Writes a file containing the FastP commands and creates a directory of adapter- and quality-trimmed reads

Examples

 ## Not run: 
# Set the directory containing the raw fastq files
reads_path <- "raw_reads"
mate1 <- list.files(path = reads_path, pattern = "_R1_001\\.fastq\\.gz$", full.names = TRUE)
mate2 <- list.files(path = reads_path, pattern = "_R2_001\\.fastq\\.gz$", full.names = TRUE)

# Set the directory for writing the trimmed reads to
trimmed_reads_dir <- "trimmed_reads"
# Reuse the input file names for the trimmed output files
mate1.out <- file.path(trimmed_reads_dir, basename(mate1))
mate2.out <- file.path(trimmed_reads_dir, basename(mate2))

# Get the sample names from the first reads (text before the first underscore)
sample_names <- unlist(lapply(strsplit(basename(mate1), "_"), `[[`, 1))

# Set the adapter sequences, these are for Illumina
adapter1 <- "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC"
adapter2 <- "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT"

# Set the directory for the quality control results files
fastp.results.dir <- "fastP"

fastp.cmds <- run_fastp(input1 = mate1,
                        input2 = mate2,
                        output1 = mate1.out,
                        output2 = mate2.out,
                        adapter1 = adapter1,
                        adapter2 = adapter2,
                        sample.name = sample_names,
                        out.dir = fastp.results.dir,
                        fastp = "/software/bin/fastp")

## End(Not run)
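
The example passes the forward and reverse files as two separate lists, so it may be worth confirming that they pair up before calling run_fastp. The sketch below is one way to do that. check_read_pairs is a hypothetical helper, not part of this package. It assumes the _R1_001/_R2_001 file naming used above and that mates are matched by position in the two lists.

```r
# Hypothetical helper: TRUE when every forward-read file has its reverse-read
# mate at the same position in the second vector.
check_read_pairs <- function(mate1, mate2) {
  # Derive the expected reverse-read file name from each forward-read file name
  expected <- sub("_R1_001\\.fastq\\.gz$", "_R2_001.fastq.gz", basename(mate1))
  length(mate1) == length(mate2) && all(expected == basename(mate2))
}

# Illustrative file names only
mate1 <- c("raw_reads/A_S1_R1_001.fastq.gz", "raw_reads/B_S2_R1_001.fastq.gz")
mate2 <- c("raw_reads/A_S1_R2_001.fastq.gz", "raw_reads/B_S2_R2_001.fastq.gz")

paired <- check_read_pairs(mate1, mate2)
if (!paired) stop("Forward and reverse read files do not pair up")
```

list.files returns paths in sorted order, so consistently named files will normally line up. A check like this mainly guards against a missing or misnamed mate file.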


GrahamHamilton/pipelineTools documentation built on Dec. 8, 2024, 3:53 p.m.