read_plink: Reads PLINK tped and bed files

View source: R/plink.R

read_plink    R Documentation

Description

The function reads PLINK tped and bed files; radiator prefers the BED format. These files are converted to a SeqArray GDS object/file connection of class SeqVarGDSClass (Zheng et al. 2017). The Genomic Data Structure (GDS) file format is detailed in gdsfmt.

Used internally in radiator, but may also be of interest to users.

Usage

read_plink(
  data,
  filename = NULL,
  parallel.core = parallel::detectCores() - 1,
  verbose = TRUE,
  ...
)

Arguments

data

The PLINK file.

  • bi-allelic data only. For haplotypes use VCF.

  • tped file format: the corresponding tfam file must be in the directory.

  • bed file format: the preferred format; the corresponding fam and bim files must be in the same directory.
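
The companion-file requirement above can be checked before calling read_plink. The following is a minimal sketch in base R, not radiator's own validation: the helper name check_plink_companions is hypothetical, and it assumes the companion files share the prefix and directory of the data file.

```r
# Hypothetical helper (not part of radiator): lists the companion files
# expected next to a PLINK file and reports whether each one exists.
check_plink_companions <- function(data) {
  ext <- tolower(tools::file_ext(data))
  prefix <- tools::file_path_sans_ext(data)
  # bed needs fam + bim, tped needs tfam; anything else is rejected
  companions <- switch(
    ext,
    bed = paste0(prefix, c(".fam", ".bim")),
    tped = paste0(prefix, ".tfam"),
    stop("Expected a .bed or .tped file, got: ", data)
  )
  # Named logical vector: TRUE where the companion file is present
  stats::setNames(file.exists(companions), companions)
}
```

Calling check_plink_companions("my_plink_file.bed") returns a named logical vector with one entry per expected companion file, so all() on the result tells you whether the import can proceed.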

filename

(optional) The file name of the Genomic Data Structure (GDS) file. radiator will append .gds.rad to the filename. If the chosen filename already exists in the working directory, the default radiator_datetime.gds is used instead. Default: filename = NULL.

parallel.core

(optional) The number of cores used for parallel execution during import. Default: parallel.core = parallel::detectCores() - 1.
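
The default expression can evaluate to 0 on a single-core machine, and parallel::detectCores() returns NA when the core count cannot be determined. Whether read_plink guards against this internally is not stated here, so the sketch below shows one way to sanitise the value before passing it in. The helper name safe_cores is hypothetical and not part of radiator.

```r
# Hypothetical helper (not part of radiator): falls back to a single core
# when the requested value is NA or smaller than 1.
safe_cores <- function(requested = parallel::detectCores() - 1L) {
  if (is.na(requested) || requested < 1L) {
    return(1L)
  }
  as.integer(requested)
}
```

The result can then be supplied as parallel.core = safe_cores().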

verbose

(optional, logical) When verbose = TRUE the function is a little more chatty during execution. Default: verbose = TRUE.

...

(optional) To pass further arguments for fine-tuning the function.

Details

Large PLINK files require the PLINK BED format. See the Examples section for conversion with PLINK.

Large PLINK bed files take longer to import and convert to GDS, but once the file is generated you can shut down your computer, come back a month later, and open a connection in a matter of seconds.

Value

For tped the function returns a list object with the non-modified tped and the strata corresponding to the tfam. With bed, the function returns a GDS object.

Author(s)

Thierry Gosselin thierrygosselin@icloud.com

References

Zheng X, Gogarten S, Lawrence M, Stilp A, Conomos M, Weir BS, Laurie C, Levine D (2017). SeqArray – A storage-efficient high-performance data format for WGS variant calls. Bioinformatics.

Purcell S, Neale B, Todd-Brown K, et al. (2007). PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics, 81: 559–575. doi:10.1086/519795

See Also

PLINK

Examples

## Not run: 
data <- radiator::read_plink(data = "my_plink_file.bed")
# when conversion is required from TPED to BED, in Terminal:
# plink --tfile my_plink_file --make-bed --allow-no-sex --allow-extra-chr --chr-set 95

## End(Not run)
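
The TPED to BED conversion shown in the example can also be launched from R instead of the Terminal. This is a sketch only: the helper name plink_tped_to_bed_args is hypothetical and not part of radiator, the flags are the ones from the example above, and it assumes a plink executable is available on the PATH.

```r
# Hypothetical helper (not part of radiator): assembles the arguments of the
# TPED-to-BED conversion command shown in the example above.
plink_tped_to_bed_args <- function(prefix, chr.set = 95) {
  c(
    "--tfile", prefix,
    "--make-bed",
    "--allow-no-sex",
    "--allow-extra-chr",
    "--chr-set", as.character(chr.set)
  )
}

args <- plink_tped_to_bed_args("my_plink_file")

# To run the conversion (assumes plink is installed and on the PATH):
# system2("plink", args)
```

Keeping the arguments in a character vector makes it easy to adjust the chromosome set for other species before calling system2.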

thierrygosselin/radiator documentation built on May 5, 2024, 5:12 a.m.