track_reference_genome: Track Reference Genome

View source: R/reference_tracking.R

track_reference_genome    R Documentation

Track Reference Genome

Description

Track reference genome files, annotations, and indices for reproducibility. This is critical for genomics/transcriptomics pipelines where the exact reference version affects results.

Usage

track_reference_genome(
  fasta_path,
  gtf_path = NULL,
  gff_path = NULL,
  genome_build = NULL,
  species = NULL,
  source_url = NULL,
  indices = list(),
  metadata = list(),
  registry_file,
  data_registry_file
)

Arguments

fasta_path

Character. Path to the reference genome FASTA file (required).

gtf_path

Character. Path to GTF annotation file. Optional.

gff_path

Character. Path to GFF annotation file. Optional.

genome_build

Character. Genome build identifier (e.g., "GRCh38", "mm10"). Optional.

species

Character. Species name (e.g., "Homo sapiens", "Mus musculus"). Optional.

source_url

Character. URL from which the reference was downloaded. Optional.

indices

Named list. Paths to aligner indices (e.g., STAR, BWA). Optional; defaults to an empty list.

metadata

List. Additional metadata about the reference. Optional; defaults to an empty list.

registry_file

Character. Path to reference registry (required).

data_registry_file

Character. Path to data registry for tracking files (required).
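
Because several arguments are file paths, it can help to confirm they exist before tracking a reference. The following is a minimal sketch in base R; check_reference_paths() is a hypothetical helper written for illustration and is not part of this package, nor a description of what track_reference_genome() checks internally.

```r
# Hypothetical helper (not part of the package): report which of the
# supplied reference paths exist on disk.
check_reference_paths <- function(fasta_path, gtf_path = NULL, gff_path = NULL,
                                  indices = list()) {
  # Collect every supplied path into one named character vector.
  # NULL entries (omitted optional arguments) are dropped by c().
  paths <- c(
    fasta = fasta_path,
    gtf = gtf_path,
    gff = gff_path,
    unlist(indices)
  )
  # file.exists() is TRUE for both files and directories,
  # so index directories are handled as well.
  found <- file.exists(paths)
  names(found) <- names(paths)
  found
}

# Usage: a small FASTA file that exists and an index path that does not.
fa <- tempfile(fileext = ".fa")
writeLines(c(">chr1", "ACGT"), fa)
status <- check_reference_paths(fa, indices = list(star = tempfile("missing_index")))
print(status)
```

The result is a named logical vector, one element per supplied path, which makes it straightforward to stop early with a message naming the missing files.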

Value

A list containing the reference genome information.

Examples

## Not run: 
track_reference_genome(
  fasta_path = "ref/GRCh38.fa",
  gtf_path = "ref/gencode.v38.annotation.gtf",
  genome_build = "GRCh38",
  species = "Homo sapiens",
  source_url = "https://www.gencodegenes.org/",
  indices = list(
    star = "ref/STAR_index/",
    bwa = "ref/bwa_index/GRCh38"
  ),
  registry_file = tempfile(fileext = ".json"),
  data_registry_file = tempfile(fileext = ".json")
)

## End(Not run)
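
To illustrate why the exact reference file matters for reproducibility, the sketch below fingerprints a file with a checksum using base R (tools::md5sum). fingerprint_file() is a hypothetical helper; whether track_reference_genome() records checksums, and in what form, is an assumption not confirmed by this page.

```r
# Hypothetical helper (not part of the package): summarise a file's
# identity so it can be compared against a later copy.
fingerprint_file <- function(path) {
  stopifnot(file.exists(path))
  list(
    path = normalizePath(path),
    size_bytes = file.size(path),
    # md5sum() returns a named character vector; drop the name.
    md5 = unname(tools::md5sum(path))
  )
}

# A toy FASTA file stands in for a real reference genome.
fa <- tempfile(fileext = ".fa")
writeLines(c(">chr1", "ACGT"), fa)
before <- fingerprint_file(fa)

# Changing a single base leaves the file size unchanged
# but alters the checksum.
writeLines(c(">chr1", "ACGA"), fa)
after <- fingerprint_file(fa)

same_size <- before$size_bytes == after$size_bytes
same_md5 <- identical(before$md5, after$md5)
```

Comparing a stored fingerprint with a freshly computed one is a simple way to detect that a reference was replaced or modified between pipeline runs, even when the file name and size are the same.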

Capsule documentation built on Nov. 11, 2025, 5:14 p.m.