pre_genomad: Preprocess geNomad and CheckV Output Results

pre_genomad    R Documentation

Preprocess geNomad and CheckV Output Results

Description

This function processes geNomad output files, detecting sample names automatically from the directory structure, and optionally integrates CheckV quality assessment results.

Usage

pre_genomad(
  genomad_out_dir = "",
  checkV_out_dir = NULL,
  provirus = TRUE,
  filter = TRUE,
  checkV_out_prefix = NULL,
  min_length = 1000,
  min_completeness = 50
)

Arguments

genomad_out_dir

Character. Path to the geNomad output directory. This directory should contain sample-specific subdirectories with the pattern "*.contigs_summary".

checkV_out_dir

Character. Optional path to the CheckV output directory. If provided, the CheckV quality summary is integrated into the results. Default is NULL.

provirus

Logical. Whether to identify and separate provirus sequences. Default is TRUE.

filter

Logical. Whether to apply quality filtering to viral sequences. Default is TRUE.

checkV_out_prefix

Character. Optional prefix to remove from CheckV contig IDs. Default is NULL.

min_length

Numeric. Minimum sequence length for filtering. Default is 1000.

min_completeness

Numeric. Minimum completeness score for CheckV filtering. Default is 50.
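
As a rough illustration of how checkV_out_prefix, min_length and min_completeness might act on a summary table, here is a minimal sketch in base R. It is not the package's implementation: the helpers strip_prefix() and filter_toy(), the toy table, and its column names (contig_id, length, completeness) are all assumptions made for this example, as are the inclusive thresholds and the choice to drop rows with missing completeness.

```r
# Toy stand-in for an integrated summary table.
# Column names are assumed for illustration, not taken from pctax.
toy <- data.frame(
  contig_id = c("pre_c1", "pre_c2", "pre_c3", "pre_c4"),
  length = c(800, 5000, 12000, 3000),
  completeness = c(90, 30, 75, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: remove a leading prefix from IDs, if one is given.
strip_prefix <- function(ids, prefix = NULL) {
  if (is.null(prefix)) return(ids)
  has_prefix <- startsWith(ids, prefix)
  ids[has_prefix] <- substring(ids[has_prefix], nchar(prefix) + 1)
  ids
}

# Hypothetical helper: keep rows that meet both thresholds.
# Rows with missing completeness are dropped in this sketch.
filter_toy <- function(df, min_length = 1000, min_completeness = 50) {
  keep <- df$length >= min_length &
    !is.na(df$completeness) &
    df$completeness >= min_completeness
  df[keep, , drop = FALSE]
}

toy$contig_id <- strip_prefix(toy$contig_id, "pre_")
valid <- filter_toy(toy)
print(valid)
```

With the default thresholds, only the 12000-length contig with completeness 75 survives in this toy table; how pre_genomad itself treats boundary values or missing CheckV estimates is not stated here.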

Details

The function automatically detects sample names by searching for directories with the pattern "*.contigs_summary" within the genomad_out_dir. It then extracts the sample name by removing the ".contigs_summary" suffix.
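
The detection step described above can be sketched in base R as follows. This is an approximation under stated assumptions, not the code used by pre_genomad: detect_samples() is a hypothetical helper, and it only looks at immediate subdirectories of the given path.

```r
# Hypothetical helper, not part of pctax: mimic the described detection.
detect_samples <- function(genomad_out_dir) {
  # List immediate subdirectories (names only, no recursion)
  dirs <- list.dirs(genomad_out_dir, full.names = FALSE, recursive = FALSE)
  # Keep directories whose names end in ".contigs_summary"
  hits <- dirs[grepl("\\.contigs_summary$", dirs)]
  # Remove the suffix to obtain the sample name
  sub("\\.contigs_summary$", "", hits)
}

# Demonstration on a throwaway directory tree
demo_dir <- file.path(tempdir(), "genomad_demo")
dir.create(file.path(demo_dir, "sampleA.contigs_summary"),
           recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(demo_dir, "sampleA.contigs_annotate"),
           showWarnings = FALSE)

detected <- detect_samples(demo_dir)
print(detected)
```

Only the directory ending in ".contigs_summary" contributes a sample name; the sibling directory is ignored.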

Value

An object of class "virus_res" containing four components:

sample

Detected sample name

virus_summary

Integrated data frame with geNomad and optional CheckV results

virus_genes

Gene-level annotations from geNomad

valid_virus

Filtered high-quality viral sequences
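
To make the shape of the return value concrete, the sketch below builds a mock object with the four documented components and the "virus_res" class. The contents are invented placeholders (the column names contig_id and gene_id are assumptions); a real object is only produced by pre_genomad().

```r
# Mock object for illustration only; real objects come from pre_genomad().
mock_res <- structure(
  list(
    sample = "sampleA",
    virus_summary = data.frame(contig_id = c("c1", "c2")),
    virus_genes = data.frame(gene_id = c("c1_1", "c1_2", "c2_1")),
    valid_virus = data.frame(contig_id = "c2")
  ),
  class = "virus_res"
)

# Check the class and list the component names
is_virus_res <- inherits(mock_res, "virus_res")
component_names <- names(mock_res)

# Components are accessed with `$`, as for any named list
n_valid <- nrow(mock_res$valid_virus)
```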

Examples

## Not run: 
# Basic usage - sample name will be automatically detected
virus_results <- pre_genomad(genomad_out_dir = "~/Documents/R/Lung_virome/data/genomad_out2/")

# Access the detected sample name
sample_name <- virus_results$sample
print(paste("Detected sample:", sample_name))

## End(Not run)


pctax documentation built on Feb. 9, 2026, 9:06 a.m.