qcMetadata: QC Sequence Metadata

View source: R/get_neon_data.R

qcMetadata    R Documentation

QC Sequence Metadata

Description

Performs basic QA/QC checks on sequence metadata prior to downloading sequence data and performing bioinformatics processing. Running this function removes metadata records for samples that do not meet user specifications. This limits the sequence files that are downloaded to those that will be used for analysis, thereby saving disk space and reducing download times.

Usage

qcMetadata(
  metadata,
  outDir = NULL,
  pairedReads = "Y",
  rmDupes = TRUE,
  rmFlagged = "N",
  verbose = FALSE
)

Arguments

metadata

The output of downloadSequenceMetadata. Must be provided as either the data.frame returned by downloadSequenceMetadata or as a filepath to the csv file produced by downloadSequenceMetadata.

outDir

Directory where QC'd metadata will be saved. By default (NULL), QC'd metadata will be saved to file.path(NEONMICROBE_DIR_SEQMETA(), "qc_metadata").

pairedReads

"Y" (default) or "N". Should the forward reads for a sample be removed if the corresponding reverse read is missing? If "Y", then only samples that have both the forward (R1) and reverse (R2) reads will be retained.

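The paired-read filter described above can be pictured with a small base R sketch. This is an illustration of the idea only, not the package's implementation: the toy data frame and the readDirection column are hypothetical, and the real metadata may encode read direction differently.

```r
# Toy metadata. `readDirection` is a hypothetical column used only for
# illustration; it is not taken from the neonMicrobe metadata schema.
meta <- data.frame(
  dnaSampleID   = c("S1", "S1", "S2", "S3", "S3"),
  readDirection = c("R1", "R2", "R1", "R1", "R2"),
  stringsAsFactors = FALSE
)

# Samples that have a forward read, and samples that have a reverse read
has_r1 <- unique(meta$dnaSampleID[meta$readDirection == "R1"])
has_r2 <- unique(meta$dnaSampleID[meta$readDirection == "R2"])

# Keep only samples present in both sets (the behaviour described for "Y")
paired_ids  <- intersect(has_r1, has_r2)
meta_paired <- meta[meta$dnaSampleID %in% paired_ids, , drop = FALSE]
```

Here sample S2 has a forward read but no reverse read, so its record is dropped from meta_paired.
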
rmDupes

TRUE (default) or FALSE. Should records with duplicated dnaSampleIDs be removed? If TRUE, then only the first record encountered for a particular dnaSampleID will be retained.
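
"Keep the first record encountered" can be sketched in base R with duplicated(). This is a simplified illustration under assumed inputs, not the package's code: the toy data frame and the recordOrder column are hypothetical, and the actual function may account for read direction or other fields when deciding what counts as a duplicate.

```r
# Toy metadata. `recordOrder` is a hypothetical column that only tracks
# the original row order for this illustration.
meta <- data.frame(
  dnaSampleID = c("A", "B", "A", "C", "B"),
  recordOrder = 1:5,
  stringsAsFactors = FALSE
)

# duplicated() flags every occurrence after the first, so negating it
# keeps the first record seen for each dnaSampleID
meta_unique <- meta[!duplicated(meta$dnaSampleID), , drop = FALSE]
```

In this toy input, rows 3 and 5 repeat the IDs "A" and "B" and are removed, leaving rows 1, 2, and 4.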

Value

The QC'd data frame is returned as an object and saved as a csv file.


claraqin/neonMicrobe documentation built on April 11, 2024, 11:47 a.m.