seqMissing: Missing genotype percentage

View source: R/Methods.R

seqMissing    R Documentation

Missing genotype percentage

Description

Calculates the genotype missing rate per variant, per sample, or both.

Usage

seqMissing(gdsfile, per.variant=TRUE, parallel=seqGetParallel(), verbose=FALSE)

Arguments

gdsfile

a SeqVarGDSClass object

per.variant

if TRUE, calculate the missing rate per variant; if FALSE, calculate the missing rate per sample; if NA, calculate the missing rates for both variants and samples

parallel

FALSE (serial processing), TRUE (multicore processing), a numeric value or another value. parallel is passed to the argument cl in seqParallel; see seqParallel for more details.

verbose

if TRUE, show progress information

Details

If the GDS node 'genotype/data' (integer genotypes) is not available, the node 'annotation/format/DS' (numeric genotype dosages for alternative alleles) is used to calculate the missing rates. At each site, 'annotation/format/DS' is assumed to store the dosage of the 1st alternative allele in the 1st column, the dosage of the 2nd alternative allele in the 2nd column if the site is multi-allelic, and so on.

Value

A numeric vector of missing rates if per.variant is TRUE or FALSE; if per.variant is NA, a list(variant, sample) holding the missing rates for both variants and samples.
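
To make the three return shapes concrete, here is a minimal base-R sketch on a toy genotype array. It is an illustration only and not the SeqArray implementation: the helper toy_missing, the toy array geno, and the assumption that each allele call counts as one observation are all hypothetical choices made for this example.

```r
# Toy genotype array (assumed layout): ploidy (2) x samples (3) x variants (4).
# NA marks a missing allele call.
geno <- array(0L, dim = c(2, 3, 4),
              dimnames = list(NULL, paste0("S", 1:3), paste0("V", 1:4)))
geno[, "S1", "V1"] <- NA    # S1 is fully missing at V1
geno[1, "S2", "V3"] <- NA   # one allele of S2 is missing at V3
geno[, "S3", "V3"] <- NA    # S3 is fully missing at V3

# Hypothetical helper (not part of SeqArray) that mimics the meaning of
# per.variant = TRUE / FALSE / NA on the toy array.
toy_missing <- function(g, per.variant = TRUE) {
  na <- is.na(g)
  by_variant <- apply(na, 3, mean)  # fraction of missing calls per variant
  by_sample  <- apply(na, 2, mean)  # fraction of missing calls per sample
  if (is.na(per.variant)) {
    list(variant = by_variant, sample = by_sample)
  } else if (per.variant) {
    by_variant
  } else {
    by_sample
  }
}

toy_missing(geno, TRUE)     # one rate per variant
toy_missing(geno, FALSE)    # one rate per sample
str(toy_missing(geno, NA))  # list with components variant and sample
```

With per.variant set to TRUE or FALSE the sketch returns a single vector. With NA it returns both vectors in one list, which mirrors the list(variant, sample) shape described above.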

Author(s)

Xiuwen Zheng

See Also

seqAlleleFreq, seqNumAllele, seqParallel, seqGetParallel

Examples

library(SeqArray)

# the GDS file
(gds.fn <- seqExampleFileName("gds"))

# open the GDS file and display its contents
(f <- seqOpen(gds.fn))

summary(m1 <- seqMissing(f, TRUE, verbose=TRUE))
summary(m2 <- seqMissing(f, FALSE, verbose=TRUE))

str(m <- seqMissing(f, NA, verbose=TRUE))
identical(m1, m$variant)  # should be TRUE
identical(m2, m$sample)   # should be TRUE

# close the GDS file
seqClose(f)

zhengxwen/SeqArray documentation built on Dec. 14, 2024, 8:36 p.m.