setCallFilter: Filter out each genotype call meeting criteria

setCallFilter R Documentation

Filter out each genotype call meeting criteria

Description

Filter individual genotype calls, rather than whole markers or samples. Each genotype call of a sample at a marker is supported by the read counts for the reference allele and the alternative allele. setCallFilter() sets genotype calls that are not reliable enough to missing and sets the reference and alternative read counts of those calls to zero.
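The effect described above can be sketched in base R on toy matrices. This is an illustration only, not the package's implementation: the sample-by-marker matrix layout, the object names (ref, alt, gt, fail), and the inclusive upper bound are assumptions made for the example.

```r
# Toy data: rows are samples, columns are markers (layout is an assumption).
ref <- matrix(c(3, 0, 12,  1,
                8, 2,  0, 40), nrow = 2, byrow = TRUE)  # reference read counts
alt <- matrix(c(1, 0,  5,  0,
                0, 3,  9, 25), nrow = 2, byrow = TRUE)  # alternative read counts
gt  <- matrix(c(1, NA, 1,  0,
                0, 2,  2,  1), nrow = 2, byrow = TRUE)  # toy genotype calls

# Same meaning as the dp_count argument: lower and upper limit of total reads.
dp_count <- c(5, Inf)

# Total read count per call (reference reads + alternative reads).
dp <- ref + alt

# A call fails when its total read count falls outside the limits.
fail <- dp < dp_count[1] | dp > dp_count[2]

# Failing calls become missing and their read counts become zero.
gt_filtered  <- gt
ref_filtered <- ref
alt_filtered <- alt
gt_filtered[fail]  <- NA
ref_filtered[fail] <- 0
alt_filtered[fail] <- 0
```

In this sketch the originals (ref, alt, gt) are left untouched and the filtered copies are kept separately, which mirrors the idea that a filter can later be reset.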

Usage

setCallFilter(
  object,
  dp_count = c(0, Inf),
  ref_count = c(0, Inf),
  alt_count = c(0, Inf),
  dp_qtile = c(0, 1),
  ref_qtile = c(0, 1),
  alt_qtile = c(0, 1),
  ...
)

## S4 method for signature 'GbsrGenotypeData'
setCallFilter(
  object,
  dp_count,
  ref_count,
  alt_count,
  dp_qtile,
  ref_qtile,
  alt_qtile
)

Arguments

object

A GbsrGenotypeData object.

dp_count

A numeric vector with length two specifying lower and upper limit of total read counts (reference reads + alternative reads).

ref_count

A numeric vector with length two specifying lower and upper limit of reference read counts.

alt_count

A numeric vector with length two specifying lower and upper limit of alternative read counts.

dp_qtile

A numeric vector with length two specifying lower and upper limit of quantile of total read counts in each sample.

ref_qtile

A numeric vector with length two specifying lower and upper limit of quantile of reference read counts in each sample.

alt_qtile

A numeric vector with length two specifying lower and upper limit of quantile of alternative read counts in each sample.

...

Unused.

Details

dp_qtile, ref_qtile, and alt_qtile use quantile values of the read counts of each sample to decide the lower and upper limits of read counts. This function generates two new nodes in the GDS file linked with the given GbsrGenotypeData object. The filtered read counts and genotype calls are stored in the data node in the "FAD" folder and the data node in the "FGT" folder, respectively, while the data node in the "CFT" folder stores call filtering information. To reset the filter applied by setCallFilter(), run resetCallFilter().
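As a rough illustration of how a quantile range turns into per-sample read count limits, the following base R sketch computes the limits for each sample separately. It is not the package's implementation: the matrix layout, the object names (dp, limits, fail), the default quantile type of stats::quantile(), and the treatment of zero counts are all assumptions made for the example.

```r
# Toy total read counts: rows are samples, columns are markers.
dp <- matrix(c(1,  2,  3,  4,  5,  6,  7,  8,  9, 10,  11,
               0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100),
             nrow = 2, byrow = TRUE)

# Same meaning as the dp_qtile argument: lower and upper quantile limits.
dp_qtile <- c(0.2, 1)

# One pair of count limits per sample, derived from that sample's own counts.
# apply() returns one column per sample, so transpose to one row per sample.
limits <- t(apply(dp, 1, quantile, probs = dp_qtile, names = FALSE))

# Flag calls outside their sample's limits. The limit vectors have one value
# per row, so they recycle down the rows of the column-major matrix.
fail <- dp < limits[, 1] | dp > limits[, 2]
```

Because the limits are computed per sample, the same quantile range removes low-depth calls in a deeply sequenced sample and in a shallowly sequenced one without requiring a single absolute read count threshold for both.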

Value

A GbsrGenotypeData object with filters on genotype calls.

Examples

# Create a GDS file from a sample VCF file.
vcf_fn <- system.file("extdata", "sample.vcf", package = "GBScleanR")
gds_fn <- tempfile("sample", fileext = ".gds")
gbsrVCF2GDS(vcf_fn = vcf_fn, out_fn = gds_fn, force = TRUE)

# Load data in the GDS file and instantiate a GbsrGenotypeData object.
gds <- loadGDS(gds_fn)

# Filter out genotype calls supported by fewer than 5 reads.
gds <- setCallFilter(gds, dp_count = c(5, Inf))

# Filter out genotype calls supported by fewer reads than
# the 20th percentile of read counts per marker in each sample.
gds <- setCallFilter(gds, dp_qtile = c(0.2, 1))

# Reset the filter
gds <- resetCallFilter(gds)

# Close the connection to the GDS file.
closeGDS(gds)


tomoyukif/GBScleanR documentation built on Oct. 31, 2024, 2:43 a.m.