convertBlockList: Function to convert haplotype block list from PLINK to...

View source: R/other_useful_functions.R

convertBlockList    R Documentation

Function to convert haplotype block list from PLINK to RAINBOWR format

Description

Function to convert haplotype block list from PLINK to RAINBOWR format

Usage

convertBlockList(
  fileNameBlocksDetPlink,
  map,
  blockNamesHead = "haploblock_",
  imputeOneSNP = FALSE,
  insertZeros = FALSE,
  n.core = 1,
  parallel.method = "mclapply",
  count = FALSE
)

Arguments

fileNameBlocksDetPlink

File name of the haplotype block list generated by PLINK (see the PLINK references). The file name must end with ".blocks.det".

map

Data frame with the marker names in the first column. The second and third columns contain the chromosome and map position.

blockNamesHead

Prefix for the block names in the returned data.frame.

imputeOneSNP

By default, blocks that include only one SNP are discarded from the returned data. To include them when creating the haplotype-block list for RAINBOWR, set 'imputeOneSNP = TRUE'.

insertZeros

Whether to pad block numbers with leading zeros when naming blocks. For example, if there are 1,000 blocks in total, the function names block 1 "block_1" when 'insertZeros = FALSE' and "block_0001" when 'insertZeros = TRUE'.
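The padding rule described above can be sketched in base R. This is only an illustration: makeBlockNames is a hypothetical helper, not a RAINBOWR function, and it assumes the padding width equals the number of digits in the total block count, as the 1,000-block example suggests.

```r
# Hypothetical helper (not part of RAINBOWR): build block names with or
# without zero-padding, following the description of 'insertZeros'.
makeBlockNames <- function(nBlocks, head = "block_", insertZeros = FALSE) {
  ids <- seq_len(nBlocks)
  if (insertZeros) {
    # Assumption: pad to the number of digits of the total block count.
    width <- nchar(as.character(as.integer(nBlocks)))
    ids <- formatC(ids, width = width, flag = "0", format = "d")
  }
  paste0(head, ids)
}

plainNames <- makeBlockNames(1000)
paddedNames <- makeBlockNames(1000, insertZeros = TRUE)

plainNames[1]   # "block_1"
paddedNames[1]  # "block_0001"
```

Zero-padded names sort correctly as character strings, which may be convenient when blocks are later ordered by name.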

n.core

Setting n.core > 1 will enable parallel execution on a machine with multiple cores. This argument is not valid when 'parallel.method = "furrr"'.

parallel.method

Method for parallel computation. We offer three methods, "mclapply", "furrr", and "foreach".

When 'parallel.method = "mclapply"', we use the pbmclapply function in the 'pbmcapply' package if 'count = TRUE' and the mclapply function in the 'parallel' package if 'count = FALSE'.

When 'parallel.method = "furrr"', we use the future_map function in the 'furrr' package. With 'count = TRUE', we also use the progressor function in the 'progressr' package to show the progress bar, so please install the 'progressr' package from GitHub (https://github.com/HenrikBengtsson/progressr). With 'parallel.method = "furrr"', you can perform multi-thread parallelization with shared memory, which saves memory but is considerably slower than 'parallel.method = "mclapply"'.

When 'parallel.method = "foreach"', we use the foreach function in the 'foreach' package together with the makeCluster function in the 'parallel' package and the registerDoParallel function in the 'doParallel' package. With 'count = TRUE', we also use the setTxtProgressBar and txtProgressBar functions in the 'utils' package to show the progress bar.

We recommend 'parallel.method = "mclapply"', but this parallelization method is not supported on Windows. If you are a Windows user, we recommend 'parallel.method = "foreach"' instead.
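The platform recommendation above could be applied programmatically. The following is a minimal sketch; chooseParallelMethod is a hypothetical helper, not part of RAINBOWR, and the file name in the commented call is a placeholder.

```r
# Hypothetical helper (not part of RAINBOWR): choose a value for
# 'parallel.method' following the recommendation in this section,
# i.e. "foreach" on Windows and "mclapply" elsewhere.
chooseParallelMethod <- function(osType = .Platform$OS.type) {
  if (identical(osType, "windows")) "foreach" else "mclapply"
}

parallelMethod <- chooseParallelMethod()

# The result could then be passed on, for example (placeholder file name,
# 'map' assumed to be prepared as described under Arguments):
# convertBlockList(
#   fileNameBlocksDetPlink = "example.blocks.det",
#   map = map,
#   n.core = 2,
#   parallel.method = parallelMethod
# )
```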

count

When count is TRUE, the progress of the computation is displayed as a percentage.

Value

A data.frame with the following columns:

$block

Block names for SNP-set methods in RAINBOWR

$marker

Marker names in each block for SNP-set methods in RAINBOWR
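To illustrate the shape of the returned data.frame, here is a rough base-R sketch that reshapes a toy file in the PLINK ".blocks.det" layout (columns CHR, BP1, BP2, KB, NSNPS, SNPS, with SNPS separated by "|") into block/marker pairs. It is not the RAINBOWR implementation: the toy data, the block numbering in file order, and the omission of the 'map' and 'imputeOneSNP' handling are all assumptions made for illustration.

```r
# Toy input in the PLINK .blocks.det layout (made-up values).
detFile <- tempfile(fileext = ".blocks.det")
writeLines(c(
  " CHR      BP1      BP2      KB  NSNPS SNPS",
  "   1     1000     3000   2.001      3 snp1|snp2|snp3",
  "   2     2000     2500   0.501      2 snp4|snp5"
), detFile)

blocksDet <- read.table(detFile, header = TRUE, stringsAsFactors = FALSE)

# Split the "|"-separated marker names of each block.
markersByBlock <- strsplit(blocksDet$SNPS, split = "|", fixed = TRUE)

# Assumption: blocks are numbered in file order with the default prefix.
blockNames <- paste0("haploblock_", seq_along(markersByBlock))

# One row per (block, marker) pair, matching the $block and $marker columns.
blockList <- data.frame(
  block = rep(blockNames, times = lengths(markersByBlock)),
  marker = unlist(markersByBlock),
  stringsAsFactors = FALSE
)
print(blockList)
```

In this long format each marker appears on its own row next to the name of the block it belongs to.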

References

Purcell, S. and Chang, C. (2018). PLINK 1.9, www.cog-genomics.org/plink/1.9/.

Chang CC, Chow CC, Tellier LCAM, Vattikuti S, Purcell SM, Lee JJ (2015) Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience, 4.

Gaunt T, Rodríguez S, Day I (2007) Cubic exact solutions for the estimation of pairwise haplotype frequencies: implications for linkage disequilibrium analyses and a web tool 'CubeX'. BMC Bioinformatics, 8.

Taliun D, Gamper J, Pattaro C (2014) Efficient haplotype block recognition of very long and dense genetic sequences. BMC Bioinformatics, 15.


RAINBOWR documentation built on July 4, 2024, 1:11 a.m.