Subsetting and Filtering Haplotypes

Description

This function selects haplotypes based on their (absolute) frequencies and/or proportions of missing nucleotides.

Usage

## S3 method for class 'haplotype'
subset(x, minfreq = 1, maxfreq = Inf, maxna = Inf, na = c("N", "?"), ...)

Arguments

x

an object of class c("haplotype", "DNAbin").

minfreq, maxfreq

the lower and upper limits of (absolute) haplotype frequencies. By default, all haplotypes are selected regardless of their frequency.

maxna

the maximum frequency (absolute or relative; see details) of missing nucleotides within a given haplotype.

na

a character vector specifying which nucleotide symbols are treated as missing data; by default, unknown nucleotide (N) and completely unknown site (?). Symbols may be given in lower- or uppercase. Two shortcuts are available: see details.

...

unused.

Details

The value of maxna can be either less than one, or greater than or equal to one. If it is less than one, it is taken as the maximum proportion (relative frequency) of missing data within a given haplotype. If it is greater than or equal to one, it is taken as the maximum number (absolute frequency).
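
The rule above can be illustrated with a small base R sketch. The helper na_limit() and the sequence length of 200 are hypothetical and are not part of pegas; the sketch also assumes that a haplotype is kept when its count of missing nucleotides is less than or equal to the limit, which is one reading of "maximum".

```r
# Hypothetical helper (not part of pegas): convert maxna into an
# absolute number of missing nucleotides for sequences of length seqlen.
na_limit <- function(maxna, seqlen) {
  if (maxna < 1) maxna * seqlen else maxna
}

seqlen <- 200                    # assumed alignment length
n_missing <- c(0, 12, 50, 75)    # made-up counts for four haplotypes

# A proportion of 0.25 and a count of 50 give the same limit here
lim_rel <- na_limit(0.25, seqlen)
lim_abs <- na_limit(50, seqlen)

# Haplotypes retained under the assumed "less than or equal" comparison
keep <- n_missing <= lim_rel
```

Under this rule, maxna = 1 is read as an absolute count of one missing nucleotide, not as a proportion of 100%.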

na = "all" is a shortcut for all ambiguous nucleotides (including N) plus alignment gaps and completely unknown site (?).

na = "ambiguous" is a shortcut for only ambiguous nucleotides (including N).
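
As a rough sketch of what the two shortcuts cover, the base R code below expands na into a set of symbols and counts matches in a toy sequence. The helpers expand_na() and count_missing() are hypothetical and do not reflect how pegas works internally; the ambiguity codes listed are the standard IUPAC ones, and "-" is assumed to be the alignment gap symbol.

```r
# IUPAC ambiguity codes: symbols standing for more than one base
ambiguous <- c("R", "Y", "S", "W", "K", "M", "B", "D", "H", "V", "N")

# Hypothetical helper: expand the na argument into a set of symbols
expand_na <- function(na = c("N", "?")) {
  if (identical(na, "all")) return(c(ambiguous, "-", "?"))
  if (identical(na, "ambiguous")) return(ambiguous)
  toupper(na)   # symbols are matched regardless of case
}

# Hypothetical helper: count the symbols of one sequence treated as missing
count_missing <- function(seq, na = c("N", "?")) {
  sum(toupper(seq) %in% expand_na(na))
}

s <- c("a", "c", "n", "-", "r", "?", "g", "t")   # toy sequence
n_default   <- count_missing(s)                   # counts n and ?
n_ambiguous <- count_missing(s, na = "ambiguous") # counts n and r
n_all       <- count_missing(s, na = "all")       # counts n, r, - and ?
```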

Value

an object of class c("haplotype", "DNAbin").

Author(s)

Emmanuel Paradis

See Also

haplotype

Examples

library(pegas)
data(woodmouse)
h <- haplotype(woodmouse)
subset(h, maxna = 20) # at most 20 missing nucleotides per haplotype
subset(h, maxna = 20/ncol(h)) # same as above, expressed as a proportion