track_data: Track Data Provenance
In Capsule: Comprehensive Reproducibility Framework for R and Bioinformatics Analysis

Description

Records comprehensive provenance information for data files including checksums, sources, timestamps, and metadata. Supports fast hashing for large files.

Usage

track_data(
  data_path,
  source = c("downloaded", "generated", "manual", "reference", "other"),
  source_url = NULL,
  description = NULL,
  metadata = NULL,
  fast_hash = TRUE,
  size_threshold_gb = 1,
  registry_file
)

Arguments

`data_path`	Character. Path to data file or directory.
`source`	Character. Source of the data. One of "downloaded", "generated", "manual", "reference", or "other".
`source_url`	Character. URL if data was downloaded. Optional.
`description`	Character. Description of the data. Optional.
`metadata`	List. Additional metadata. Optional.
`fast_hash`	Logical. Use the faster xxHash for large files, i.e. files larger than `size_threshold_gb` (1 GB by default). Default TRUE.
`size_threshold_gb`	Numeric. Size threshold (GB) for using fast hash. Default 1.
`registry_file`	Character. Path to provenance registry (required).
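
How `fast_hash` and `size_threshold_gb` interact can be sketched in base R. This is a minimal illustration, not Capsule's implementation: the helper name `choose_hash_algo`, the algorithm labels "xxhash64" and "sha256", and the use of 1024^3 bytes per GB are all assumptions made for the example.

```r
# Hypothetical helper (not part of Capsule): decide which hash to use for a file.
# Assumptions: "sha256" as the regular algorithm, "xxhash64" as the fast one,
# and 1 GB = 1024^3 bytes.
choose_hash_algo <- function(data_path, fast_hash = TRUE, size_threshold_gb = 1) {
  stopifnot(file.exists(data_path))
  size_gb <- file.size(data_path) / 1024^3
  # The fast hash applies only when it is enabled AND the file exceeds the threshold
  if (isTRUE(fast_hash) && size_gb > size_threshold_gb) "xxhash64" else "sha256"
}

# A small temporary file stays below the default 1 GB threshold
small_file <- tempfile()
writeLines("a few bytes", small_file)

algo_default <- choose_hash_algo(small_file)                        # "sha256"
algo_forced  <- choose_hash_algo(small_file, size_threshold_gb = 0) # "xxhash64"
algo_off     <- choose_hash_algo(small_file, fast_hash = FALSE,
                                 size_threshold_gb = 0)             # "sha256"
```

Under these assumptions, lowering `size_threshold_gb` makes more files eligible for the fast hash, while `fast_hash = FALSE` disables it regardless of file size.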

Value

A list containing the provenance information recorded for the data (checksum, source, timestamp, and metadata).
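
The exact fields of the returned list are not documented here. As a rough sketch of what such a provenance record could look like, the following base R code builds one by hand. The helper `build_provenance_record` and every field name in it are hypothetical, and `tools::md5sum` stands in for whatever checksum Capsule actually computes.

```r
# Hypothetical helper (not part of Capsule): assemble a provenance record.
# All field names are illustrative assumptions, not the package's real output.
build_provenance_record <- function(data_path,
                                    source = c("downloaded", "generated", "manual",
                                               "reference", "other"),
                                    source_url = NULL,
                                    description = NULL,
                                    metadata = NULL) {
  source <- match.arg(source)  # restrict to the documented choices
  stopifnot(file.exists(data_path))
  list(
    path          = normalizePath(data_path),
    source        = source,
    source_url    = source_url,
    description   = description,
    metadata      = metadata,
    size_bytes    = file.size(data_path),
    checksum      = unname(tools::md5sum(data_path)),  # stand-in checksum
    checksum_algo = "md5",
    tracked_at    = format(Sys.time(), "%Y-%m-%dT%H:%M:%S%z")
  )
}

# Build a record for a small temporary file
demo_file <- tempfile(fileext = ".csv")
writeLines(c("id,value", "1,3.14"), demo_file)
record <- build_provenance_record(demo_file,
                                  source = "generated",
                                  description = "Toy table")
str(record)
```

Storing the checksum next to the algorithm name lets a later check recompute the hash with the same method and compare the two values.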

Examples

## Not run: 
# Track a downloaded dataset
track_data("data/mydata.csv",
  source = "downloaded",
  source_url = "https://example.com/data.csv",
  description = "Customer data from API",
  registry_file = tempfile(fileext = ".json")
)

# Track generated data
track_data("results/simulation.rds",
  source = "generated",
  description = "Monte Carlo simulation results",
  registry_file = tempfile(fileext = ".json")
)

# Track a large file with fast hashing
track_data("data/large_file.bam",
  source = "generated",
  fast_hash = TRUE,
  registry_file = tempfile(fileext = ".json")
)

## End(Not run)

Capsule documentation built on Nov. 11, 2025, 5:14 p.m.
