ibs.pairwise.db: Pairwise comparison of all database profiles on IBS alleles


Description

Compares every database profile with every other database profile and keeps track of the number of pairs that match fully and partially on all numbers of loci.

Usage

ibs.pairwise.db(db, hit = 0, showprogress = TRUE, multicore = FALSE,
  ncores = 0)

Arguments

db

An integer matrix which is the database of profiles.

hit

Integer; when > 0, the function also keeps track of the pairs that match on at least this number of loci.

showprogress

Logical; show progress bar? (not available when multicore=TRUE)

multicore

Logical; use multicore implementation?

ncores

Integer; when multicore=TRUE, the number of cores to use, or 0 to auto-detect.

Details

Makes all pairwise comparisons of the profiles in db and counts, for each number of loci, the number of profile pairs that match fully or partially on that many loci.
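The counting described above can be illustrated with a naive reference sketch in base R. This is not the package's implementation. The helper name `naive_ibs_pairwise` is hypothetical, and the sketch assumes that db has one row per profile, two allele columns per locus, and no missing alleles. It also assumes that the result is laid out with rows for the number of fully matching loci and columns for the number of partially matching loci.

```r
# Hypothetical helper, not part of DNAprofiles: a naive O(N^2) reference
# implementation intended only for small toy databases.
# Assumes one row per profile, two allele columns per locus, no NAs.
naive_ibs_pairwise <- function(db) {
  n_loci <- ncol(db) / 2
  first  <- seq(1, ncol(db), by = 2)  # column of the first allele of each locus
  second <- first + 1                 # column of the second allele of each locus
  # Assumed layout: rows = loci matching fully, columns = loci matching partially
  M <- matrix(0L, nrow = n_loci + 1, ncol = n_loci + 1,
              dimnames = list(full = 0:n_loci, partial = 0:n_loci))
  for (i in seq_len(nrow(db) - 1)) {
    for (j in (i + 1):nrow(db)) {
      a1 <- db[i, first]; a2 <- db[i, second]
      b1 <- db[j, first]; b2 <- db[j, second]
      # Alleles shared per locus (0, 1 or 2), ignoring allele order
      ibs <- pmax((a1 == b1) + (a2 == b2), (a1 == b2) + (a2 == b1))
      n_full    <- sum(ibs == 2)
      n_partial <- sum(ibs == 1)
      M[n_full + 1, n_partial + 1] <- M[n_full + 1, n_partial + 1] + 1L
    }
  }
  M
}

# Toy database: 3 profiles, 2 loci
toy <- matrix(c(10L, 11L, 7L, 7L,
                11L, 10L, 7L, 8L,
                12L, 13L, 7L, 7L), nrow = 3, byrow = TRUE)
M <- naive_ibs_pairwise(toy)
M
```

The entries of the resulting matrix sum to the number of pairs, here 3. The double loop is far too slow for realistic database sizes and only serves to make the bookkeeping explicit.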

The number of pairwise comparisons equals N*(N-1)/2, where N is the number of database profiles, so the computation time grows quadratically in N. On a single core, the procedure takes a few minutes for a database of 100,000 profiles (Intel i5 @ 2.5 GHz), and the time roughly quadruples each time the database doubles in size.
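As a quick check of this scaling, the pair count can be computed directly in base R. The helper `n_pairs` is a name introduced here for illustration only.

```r
# Number of unordered profile pairs in a database of N profiles
n_pairs <- function(N) N * (N - 1) / 2

N <- c(1e3, 1e4, 1e5, 2e5)
data.frame(N = N, pairs = n_pairs(N))

# Doubling N multiplies the number of comparisons by about 4
ratio <- n_pairs(2e5) / n_pairs(1e5)
ratio
```

For N = 100,000 this gives just under 5 billion comparisons, which explains why the run time is dominated by the database size rather than by the number of loci.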

A similar function with additional functionality is available in the DNAtools package. That function, however, does not handle large databases (about 70k profiles is the maximum) and is a few times slower than the implementation used here. The DNAtools package comes with a specialized plotting function that can be used with the output of ibs.pairwise.db after converting it with as.dbcompare.

Value

Matrix with the number of full/partial matches on 0,1,2,... loci.

See Also

as.dbcompare

Examples

data(freqsNLsgmplus)

# sample small db and make all pairwise comparisons
db <- sample.profiles(N=10^3,freqs=freqsNLsgmplus)
ibs.pairwise.db(db)

## Not run: 
# the multicore function has some overhead and is not faster when applied to small databases
db.small <- sample.profiles(N=10^4,freqs=freqsNLsgmplus)

system.time(Msingle <- ibs.pairwise.db(db.small))
system.time(Mmulti <- ibs.pairwise.db(db.small,multicore=TRUE))

all.equal(Msingle,Mmulti)

# but significant speed gains are seen for large databases (46 vs 23 secs on my system)

db.large <- sample.profiles(N=5*10^4,freqs=freqsNLsgmplus)

system.time(Msingle <- ibs.pairwise.db(db.large))
system.time(Mmulti <- ibs.pairwise.db(db.large,multicore=TRUE))

all.equal(Msingle,Mmulti)

## End(Not run)

DNAprofiles documentation built on Jan. 15, 2017, 9:27 p.m.