CLQD: partitioning into cliques

Description

CLQD partitions the given SNPs into subgroups (cliques) of highly correlated SNPs.

Usage

CLQD(geno, SNPinfo, CLQcut=0.5, clstgap=40000,
hrstType=c("near-nonhrst", "fast", "nonhrst"), hrstParam=200,
CLQmode=c("density", "maximal"), LD=c("r2", "Dprime"))

Arguments

geno

Data frame or matrix of additive genotype data; each column holds the additive genotypes of one SNP. (Use only non-monomorphic SNPs.)

SNPinfo

Data frame or matrix of SNP information. The 1st column is the rsID and the 2nd column is the bp position.

CLQcut

Numeric constant; a threshold for the LD measure value |r|, between 0 and 1. Default 0.5.
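
As a rough illustration of what such a threshold does (not the package's internal code), the sketch below computes pairwise |r| on a tiny additive genotype matrix with base R and flags the SNP pairs that reach the cutoff. The data and the names toy_geno, abs_r and adj are hypothetical, and comparing with >= rather than > is an assumption.

```r
# Hypothetical additive genotypes (0/1/2) for 10 individuals and 3 SNPs
snp1 <- c(0, 0, 1, 1, 2, 2, 0, 1, 2, 1)
snp2 <- c(0, 0, 1, 1, 2, 2, 0, 1, 2, 0)   # nearly identical to snp1
snp3 <- c(2, 0, 1, 0, 1, 2, 1, 0, 0, 1)   # essentially unrelated to snp1
toy_geno <- cbind(snp1, snp2, snp3)

CLQcut <- 0.5
abs_r <- abs(cor(toy_geno))   # pairwise |r| between SNP columns
adj <- abs_r >= CLQcut        # TRUE where a pair counts as highly correlated (assumed >=)
diag(adj) <- FALSE            # a SNP is not paired with itself
```

Here only the snp1/snp2 pair exceeds the cutoff, so those two SNPs are the only candidates to share a clique.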

clstgap

Numeric constant; a threshold on the physical distance (bp) between two consecutive SNPs in the same clique. If the distance between two consecutive SNPs in a clique is greater than clstgap, the algorithm splits the clique so that no resulting clique contains such a pair of consecutive SNPs. Default 40000.
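
The gap rule can be pictured with a minimal base R sketch. It is an illustration under stated assumptions, not the package's implementation: the bp positions are hypothetical, and the package may split cliques differently in detail.

```r
clstgap <- 40000
# Hypothetical bp positions of the SNPs in one detected clique, sorted by position
bp <- c(1000, 12000, 30000, 95000, 101000, 180000)

# Start a new piece whenever the gap to the previous SNP exceeds clstgap
piece <- cumsum(c(TRUE, diff(bp) > clstgap))
pieces <- split(bp, piece)    # list of bp positions, one element per piece
```

With these positions the gaps of 65000 bp and 79000 bp exceed clstgap, so the clique is cut into three pieces.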

hrstType

Character constant; the heuristic method. To disable the heuristic algorithm, set hrstType = "nonhrst". To use the heuristic algorithm suggested in Kim et al. (2017), set hrstType = "fast"; it is the fastest heuristic and is suitable when your memory capacity is not greater than 8GB. To obtain results similar to those of the non-heuristic algorithm, set hrstType = "near-nonhrst".

hrstParam

Numeric constant; parameter for the heuristic algorithm "near-nonhrst". Default is 200. A value greater than 150 is recommended.

CLQmode

Character constant; the way to give priority among detected cliques. If CLQmode = "density", the algorithm gives priority to the clique with the largest value of (Number of SNPs)/(range of clique); if CLQmode = "maximal", the algorithm gives priority to the largest clique. The default is "density".
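
The difference between the two modes can be seen in a small base R sketch. The candidate cliques are hypothetical, and taking "range of clique" to be the bp span (largest minus smallest position) is an assumption made for illustration.

```r
# Hypothetical candidate cliques, each given by the bp positions of its SNPs
cliques <- list(
  A = c(1000, 2000, 3000, 4000),                     # 4 SNPs spanning 3000 bp
  B = c(10000, 20000, 30000, 40000, 50000, 60000)    # 6 SNPs spanning 50000 bp
)

n_snps <- lengths(cliques)
bp_range <- vapply(cliques, function(bp) max(bp) - min(bp), numeric(1))
density <- n_snps / bp_range   # (Number of SNPs)/(range of clique), range assumed to be bp span

pick_density <- names(which.max(density))   # clique favoured under "density"
pick_maximal <- names(which.max(n_snps))    # clique favoured under "maximal"
```

In this toy case "density" favours the compact clique A, while "maximal" favours the larger but more spread-out clique B.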

LD

Character constant; LD measure to use, "r2" or "Dprime". Default "r2".

Value

A vector of cluster numbers, one per SNP (NA marks a singleton cluster).
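
A result of this shape can be summarised with base R. The vector below is a hypothetical stand-in, not actual CLQD output, and the assumption that clusters are numbered from 1 is for illustration only.

```r
# Hypothetical return value for 8 SNPs (NA = singleton)
clusters <- c(1, 1, NA, 2, 2, 2, NA, 1)

sizes <- table(clusters)                          # SNPs per cluster; NA is dropped by default
n_singletons <- sum(is.na(clusters))              # number of singleton SNPs
members <- split(seq_along(clusters), clusters)   # SNP indices per cluster; NA entries dropped
```

The indices in members refer to positions in the vector, which correspond to the columns of geno and the rows of SNPinfo.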

Author(s)

Sun-Ah Kim <sunny03@snu.ac.kr>, Yun Joo Yoo <yyoo@snu.ac.kr>

See Also

BigLD

Examples

data(geno)
data(SNPinfo)
CLQD(geno=geno[,1:100],SNPinfo=SNPinfo[1:100,])
## Not run: 
CLQD(geno=geno[,1:100],SNPinfo=SNPinfo[1:100,], CLQmode = 'maximal')
CLQD(geno=geno[,1:100],SNPinfo=SNPinfo[1:100,], LD='Dprime')

## End(Not run)

sunnyeesl/gpart documentation built on May 9, 2019, 7:40 a.m.