subsetbd: Subset a binary dosage file

View source: R/subset.R

subsetbd R Documentation

Subset a binary dosage file

Description

Creates a new Format 5 binary dosage file containing a subset of the SNPs and/or subjects from an existing binary dosage file. The input file may be in any format (1-5). At least one filtering criterion must be supplied, and all supplied criteria must be met for a SNP or subject to be retained.

Usage

subsetbd(
  bdfiles,
  bdose_file,
  minmaf = NULL,
  locations = NULL,
  startloc = NULL,
  endloc = NULL,
  subjectids = NULL
)

Arguments

bdfiles

Vector of file names for the input binary dosage file. Formats 1, 2, and 3 require three file names: the binary dosage file, the family file, and the map file. Format 4 files require one file name. Format 5 files require two file names: the .bdose file and the .bdi file.

bdose_file

Path for the output .bdose file. The companion .bdi metadata file is written to paste0(bdose_file, ".bdi").

minmaf

Minimum minor allele frequency. SNPs whose MAF (computed over the retained subjects) is below this value are excluded. Must be a single numeric value between 0 and 0.5.
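
As a rough illustration of the minmaf filter, the sketch below computes minor allele frequencies from a small made-up dosage matrix in base R. It assumes dosages on a 0 to 2 scale and the usual definition MAF = min(p, 1 - p), where p is the mean dosage divided by 2. The matrix and the helper maf_from_dosage are hypothetical and not part of BinaryDosage, and the package's internal calculation may differ.

```r
# Hypothetical helper, not part of BinaryDosage: MAF from 0-2 dosages.
maf_from_dosage <- function(dosage) {
  # Alternate allele frequency per SNP (columns), ignoring missing values
  p <- colMeans(dosage, na.rm = TRUE) / 2
  pmin(p, 1 - p)
}

# Made-up dosages: 4 subjects (rows) by 3 SNPs (columns), filled by column
dosage <- matrix(c(0, 0, 0, 1,
                   1, 2, 1, 2,
                   2, 2, 2, 2),
                 nrow = 4,
                 dimnames = list(paste0("S", 1:4), paste0("snp", 1:3)))

# MAF over all subjects, and the SNPs that would pass minmaf = 0.1
maf  <- maf_from_dosage(dosage)
keep <- maf >= 0.1

# MAF recomputed over a retained subset of subjects only
maf_sub <- maf_from_dosage(dosage[c("S1", "S2", "S3"), , drop = FALSE])
```

In this toy data snp1 passes a threshold of 0.1 when all four subjects are used, but becomes monomorphic once S4 is dropped, which is why it matters that the MAF is computed over the retained subjects.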

locations

Integer or numeric vector of SNP base-pair locations to retain. Cannot be used together with startloc and endloc.

startloc

Start of the location range to retain (inclusive). Must be used together with endloc. Cannot be used together with locations.

endloc

End of the location range to retain (inclusive). Must be used together with startloc. Cannot be used together with locations.
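
The rules for the three location arguments can be sketched in base R as follows. This is only an illustration of the documented constraints, not the package's code: select_locations and the snp_loc vector are hypothetical names, and the error messages are made up.

```r
# Hypothetical helper illustrating the documented rules; not BinaryDosage code.
select_locations <- function(snp_loc, locations = NULL,
                             startloc = NULL, endloc = NULL) {
  has_range <- !is.null(startloc) || !is.null(endloc)
  # locations and startloc/endloc are mutually exclusive
  if (!is.null(locations) && has_range)
    stop("locations cannot be combined with startloc/endloc")
  # startloc and endloc must be supplied as a pair
  if (is.null(startloc) != is.null(endloc))
    stop("startloc and endloc must be supplied together")
  if (!is.null(locations)) return(snp_loc %in% locations)
  # Both ends of the range are inclusive
  if (has_range) return(snp_loc >= startloc & snp_loc <= endloc)
  rep(TRUE, length(snp_loc))  # no location filter supplied
}

# Made-up base-pair locations for four SNPs
snp_loc  <- c(10000, 11000, 12000, 13000)
by_list  <- select_locations(snp_loc, locations = c(11000, 13000))
by_range <- select_locations(snp_loc, startloc = 11000, endloc = 12000)
```

Here by_list selects the second and fourth SNPs, and by_range selects the second and third because both endpoints are inclusive.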

subjectids

Character vector of subject IDs to retain.

Value

NULL, returned invisibly. The function is called for its side effect of writing the new binary dosage file.

Examples

bdfile     <- system.file("extdata", "vcf1a.bdose", package = "BinaryDosage")
bdinfo     <- getbdinfo(bdfile)
bdose_file <- tempfile(fileext = ".bdose")
subsetbd(bdfiles    = bdfile,
         bdose_file = bdose_file,
         subjectids = bdinfo$samples$sid[1:30])

BinaryDosage documentation built on April 30, 2026, 1:09 a.m.