ld.snp: Function to calculate pairwise D', $r^2$

Description Usage Arguments Details Value Note Author(s) References See Also Examples

View source: R/ld.snp.R

Description

ld.snp takes an object of snp.matrix class and suitable range and depth and calculation the pairwise D', $r^2$, LOD and return the result as a snp.dprime object.

Usage

1
ld.snp(snpdata, depth = 100, start = 1, end = dim(snpdata)[2], signed.r=FALSE)

Arguments

snpdata

An object of snp.matrix class with M samples of N snps

depth

The depth or lag of pair-wise calculation. Should be between 1 and N-1; default 100. Using 0 (an invalid value) is the same as picking the maximum

start

The index of the start of the range of interest. Should be between 1 and (N-1); default 1

end

The index of the end of the range of interest. Should be between 2 and N. default N.

signed.r

Boolean for whether to returned signed $r$ values instead of $r^2$

Details

The cubic equation and quadratic equation solver code is borrowed from GSL (GNU Scientific Library).

Value

return a snp.dprime object, which is a list of 3 named matrices dprime, rsq2 (or r depending on the input), lod, and an attribute snp.names for the list of snps involved. (Note that if $x$ snps are involved, the row numbers of the 3 matrices are $(x-1)$). Only one of rsq2 or r is present.

dprime

D'

rsq2

$r^2$

r

signed $r$

lod

Log of Odd's

All the matrices are defined such that the ($n, m$)th entry is the pair-wise value between the ($n$)th snp and the $(n+m)$th snp. Hence the lower right triangles are always filled with zeros. (See example section for the actual layout)

Invalid values are represented by an out-of-range value - currently we use -1 for D', $r^2$ (both of which are between 0 and 1), and -2 for $r$ (valid values are between -1 and +1). lod is set to zero in most of these invalid cases. (lod can be any value so it is not indicative).

Note

The output snp.dprime object is suitable for input to plot.snp.dprime for drawing.

The speed of “ld.snp” LD calculation, on a single-processor opteron 2.2GHz box:

unsigned $r^2$, 13191 snps, depth 100 = 36.4 s (~ 1.3 mil pairs)

signed r , 13191 snps, depth 100 = 40.94s (~ 1.3 mil pairs)

signed r , 13191 snps, depth 1500 = 582s (~ 18.5 mil pairs)

For depth=1500, it uses 500MB just for the three matrices. So I actually cannot do the full depth at ~13,000; full depth should be under 50 minutes for 87 mil pairs, even in the signed-r version.

The LD code can be ran outside of R - mainly for debugging:

1
2
3
    gcc -DWITHOUT_R -o /tmp/hello pairwise_linkage.c solve_cubic.c \
       solve_quadratic.c -lm
  

When used in this form, it takes 9 numbers:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
  $/tmp/hello 4 0 0 0 30 0 0 0 23
  case 3               <- internal code for which cases it falls in
  root count 1         <- how many roots
  trying 1.000000
  p = 1.000000
  4      0       0       6.333333        0.000000        0.000000
  0      30      0       0.000000        25.333333       0.000000
  0      0       23      0.000000        0.000000        25.333333
  57 8 38.000000 38 38
  8 0 0 46 30, 38 38 76 76
  0.333333 0.000000 0.000000 0.666667
  d' = 1.000000 , r2 = 1.000000, lod= 22.482643
  

Author(s)

Hin-Tak Leung htl10@users.sourceforge.net

References

Clayton, D.G. and Leung, Hin-Tak (2007) An R package for analysis of whole-genome association studies. Human Heredity 64:45-51.
GSL (GNU Scientific Library) http://www.gnu.org/software/gsl/

See Also

snp.dprime-class, plot.snp.dprime, ld.with

Examples

1
2
3
# LD stats between 500 SNPs at a depth of 50
data(testdata)
ldinfo <- ld.snp(Autosomes, start=1, end=500, depth=50)

Example output

Loading required package: survival
Warning message:
In methods:::bind_activation(TRUE) :
  methods:::bind_activation(TRUE) is deprecated;
 you should rather provide methods for cbind2() / rbind2()
Information: The input contains 400 samples with 9445 snps
... Done

chopsticks documentation built on Nov. 8, 2020, 7:51 p.m.