pickMarkerSubset: Identify the largest subset of markers that are some distance...


pickMarkerSubset R Documentation

Identify the largest subset of markers that are some distance apart

Description

Identify the largest subset of markers for which no two adjacent markers are separated by less than some specified distance; if weights are provided, find the marker subset for which the sum of the weights is maximized.

Usage

pickMarkerSubset(locations, min.distance, weights)

Arguments

locations

A vector of marker locations.

min.distance

Minimum distance between adjacent markers in the chosen subset.

weights

(Optional) vector of weights for the markers, one per marker. If missing, every marker is given weight 1.

Details

Let d_i be the location of marker i, for i \in \{1, \dots, M\}, and let w_i be its weight. We use the dynamic programming algorithm of Broman and Weber (1999) to identify the subset of markers i_1, \dots, i_k for which d_{i_{j+1}} - d_{i_j} \ge min.distance for every pair of adjacent markers and \sum_j w_{i_j} is maximized.

If there are multiple optimal subsets, we pick one at random.
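
The selection problem described above can be illustrated with a small base R sketch. This is not the package's implementation: pick_subset_sketch is a hypothetical helper written for illustration, it uses a simple O(M^2) dynamic program, it assumes locations is a named numeric vector, and it breaks ties deterministically (first optimum found) rather than at random.

```r
# Hypothetical helper (not part of R/qtl): choose the maximum-weight subset
# of markers such that adjacent chosen markers are >= min.distance apart.
# Assumes `locations` is a named numeric vector.
pick_subset_sketch <- function(locations, min.distance,
                               weights = rep(1, length(locations))) {
  stopifnot(length(locations) > 0,
            length(weights) == length(locations),
            !is.null(names(locations)))

  ord <- order(locations)   # work in increasing map order
  d <- locations[ord]
  w <- weights[ord]
  M <- length(d)

  best <- numeric(M)   # best[i]: max total weight of a valid subset ending at marker i
  prev <- integer(M)   # prev[i]: predecessor of marker i in that subset (0 = none)
  for (i in seq_len(M)) {
    best[i] <- w[i]
    for (j in seq_len(i - 1)) {
      # marker j may precede marker i only if the two are far enough apart
      if (d[i] - d[j] >= min.distance && best[j] + w[i] > best[i]) {
        best[i] <- best[j] + w[i]
        prev[i] <- j
      }
    }
  }

  # trace back from the best end point (ties: first maximum, not random)
  i <- which.max(best)
  chosen <- integer(0)
  while (i > 0) {
    chosen <- c(i, chosen)
    i <- prev[i]
  }
  names(d)[chosen]
}

# made-up toy positions (in cM), for illustration only
loc <- c(m1 = 0, m2 = 3, m3 = 5, m4 = 8, m5 = 10)
pick_subset_sketch(loc, 5)                      # "m1" "m3" "m5"
pick_subset_sketch(loc, 5, c(1, 5, 1, 2, 1))    # "m2" "m4"
```

With equal weights the sketch keeps as many markers as possible; with a large weight on m2, it prefers the two-marker subset containing m2 because its total weight (7) exceeds that of any three-marker subset (3).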

Value

A vector containing the names of the markers in the chosen subset.

Author(s)

Karl W Broman, broman@wisc.edu

References

Broman, K. W. and Weber, J. L. (1999) Method for constructing confidently ordered linkage maps. Genet. Epidemiol., 16, 337–343.

See Also

drop.markers, pull.markers, findDupMarkers

Examples

library(qtl)
data(hyper)

# subset of markers on chr 4 spaced >= 5 cM
pickMarkerSubset(pull.map(hyper)[[4]], 5)

# no. missing genotypes at each chr 4 marker
n.missing <- nmissing(subset(hyper, chr=4), what="mar")

# weight by -log(prop'n missing), but don't let 0 missing go to +Inf
wts <- -log( (n.missing+1) / (nind(hyper)+1) )

# subset of markers on chr 4 spaced >= 5 cM, with weights = -log(prop'n missing)
pickMarkerSubset(pull.map(hyper)[[4]], 5, wts)

qtl documentation built on Sept. 11, 2024, 5:43 p.m.