calc.maps.pc: Estimate marker positions using Principal Curves


Description

Reads a text file of pairwise recombination fractions and LOD scores, reduces the data to 2 or 3 dimensions using weighted multidimensional scaling (wMDS), and projects the result onto a single dimension using a principal curve to estimate marker positions.

Usage

calc.maps.pc(fname, spar = NULL, n = NULL, ndim = 2,
  weightfn = "lod2", mapfn = "haldane")

Arguments

fname

Character string giving the name of the file of recombination fractions and LOD scores. It should not contain any suffix (the file should be a .txt file as described below).

spar

Integer - the smoothing parameter for the principal curve. If NULL, the value is chosen by leave-one-out cross-validation.

n

Vector of integers or character strings containing markers to be omitted from the analysis.

ndim

Number of dimensions in which to perform the wMDS and fit the curve - can be 2 or 3.

weightfn

Character string specifying the values to use for the weight matrix in the MDS: 'lod2' or 'lod'.

mapfn

Character string specifying the map function to apply to the recombination fractions: 'haldane' (the default), 'kosambi' or 'none'.

Details

Reads a file of the form described below and casts the data into two matrices: pairwise recombination fractions, and weights determined by the weightfn parameter (LOD or LOD squared). It then calculates a distance matrix by applying the map function to the recombination fractions. Haldane is the default map function, 'none' uses the recombination fractions unchanged, and the other alternative is Kosambi (see dmap for details).
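The conversion described above can be sketched in base R. This is an illustration only, not the package's dmap: the helper names rf_to_dist and lod_to_weight are hypothetical, and the sketch assumes the textbook Haldane and Kosambi formulas with distances in Morgans, which may differ from the units dmap returns.

```r
# Hypothetical helper (not MDSMap's dmap): convert recombination fractions
# to map distances, assumed here to be in Morgans.
rf_to_dist <- function(rf, mapfn = c("haldane", "kosambi", "none")) {
  mapfn <- match.arg(mapfn)
  # Cap at 0.499999, as the Details section describes, so the logs stay finite.
  rf <- pmin(rf, 0.499999)
  switch(mapfn,
    haldane = -0.5 * log(1 - 2 * rf),
    kosambi = 0.25 * log((1 + 2 * rf) / (1 - 2 * rf)),
    none    = rf
  )
}

# Hypothetical helper: weights are the LOD scores or their squares.
lod_to_weight <- function(lod, weightfn = c("lod2", "lod")) {
  weightfn <- match.arg(weightfn)
  if (weightfn == "lod2") lod^2 else lod
}

# Made-up values for three marker pairs.
rf  <- c(0.05, 0.20, 0.60)
lod <- c(12.1, 4.3, 0.1)

d_haldane <- rf_to_dist(rf, "haldane")
d_kosambi <- rf_to_dist(rf, "kosambi")
w         <- lod_to_weight(lod, "lod2")
```

Squaring the LOD scores gives well-supported pairs more influence on the MDS fit than weakly supported ones, compared with using the LOD scores directly.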

Performs a weighted MDS on the distance matrix using smacofSym and smacofSphere (de Leeuw & Mair 2009), then fits a principal curve to map the resulting configuration to an interval (see principal_curve for details).

File names should be of the form fname.txt, and the file is assumed to be tab- or space-separated in the format shown below. The first entry on the first row is the number of markers to be analysed. Below this is a table in which the first two columns contain marker names, the third column contains the pairwise recombination fraction between the two markers and the fourth column the associated LOD score. Marker names in the first column vary more slowly than those in the second column. Missing recombination pairs are acceptable. Recombination fractions greater than 0.499999 are set to that value.

nmarkers
marker_1 marker_2 recombination fraction LOD
1 2 . .
1 3 . .
1 4 . .
. . . .
. . . .
. . . .
2 3 . .
2 4 . .
. . . .
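As a rough illustration of the layout above, the following base R sketch writes a tiny three-marker file and reads it back into symmetric matrices. It is not the package's own reader: the marker names, values and column names (marker_a, marker_b, rf, lod) are made up for the example.

```r
# Write a small file in the layout described above (made-up data).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "3",
  "m1 m2 0.10 8.5",
  "m1 m3 0.25 3.2",
  "m2 m3 0.15 6.0"
), path)

# First entry on the first row: the number of markers.
nmarkers <- scan(path, what = integer(), n = 1, quiet = TRUE)

# Remaining rows: marker pair, recombination fraction, LOD score.
pairs <- read.table(path, skip = 1, stringsAsFactors = FALSE,
                    col.names = c("marker_a", "marker_b", "rf", "lod"))

markers <- unique(c(pairs$marker_a, pairs$marker_b))
stopifnot(length(markers) == nmarkers)

# Cast into symmetric matrices; pairs missing from the file stay NA.
rfmat  <- matrix(NA_real_, nmarkers, nmarkers,
                 dimnames = list(markers, markers))
lodmat <- rfmat
idx <- cbind(pairs$marker_a, pairs$marker_b)
rfmat[idx]          <- pairs$rf
rfmat[idx[, 2:1]]   <- pairs$rf
lodmat[idx]         <- pairs$lod
lodmat[idx[, 2:1]]  <- pairs$lod
diag(rfmat) <- 0
```

Leaving absent pairs as NA mirrors the note that missing recombination pairs are acceptable; how the package itself represents them is not shown here.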

Value

A list (S3 class pcmap or pcmap3d depending on ndim) with the following elements:

smacofsym

The unconstrained wMDS results.

pc

The results from the principal curve fit.

distmap

A symmetric matrix of pairwise distances between markers where the columns are in the estimated order.

lodmap

A symmetric matrix of LOD scores associated with the distances in distmap.

locimap

A data frame of the markers containing the name of each marker, the number in the configuration plot if that is being used, the position of each marker in order of increasing distance and the nearest neighbour fit of the marker.

length

Integer giving the total length of the segment.

removed

A vector of the names of markers removed from the analysis.

locikey

A data frame showing the number associated with each marker name for interpreting the wMDS configuration plots.

meannnfit

The mean across all markers of the nearest neighbour fits.

References

de Leeuw J, Mair P (2009) Multidimensional scaling using majorization: SMACOF in R. J Stat Softw 31: 1-30 http://www.jstatsoft.org/v31/i03/

Hastie T, Weingessel A, Bengtsson H, Cannoodt R (1999) princurve: Fits a Principal Curve in Arbitrary Dimension. R package version 2.1.2. https://CRAN.R-project.org/package=princurve

See Also

calc.maps.sphere, calc.pair.rf.lod, smacofSym, smacofSphere, map.to.interval, dmap

Examples

library(MDSMap)

# Estimate a 2-dimensional map for the bundled lgV example data.
map <- calc.maps.pc(system.file("extdata", "lgV.txt", package = "MDSMap"),
                    ndim = 2, weightfn = "lod2", mapfn = "kosambi")
plot(map)

MDSMap documentation built on May 1, 2019, 6:51 p.m.