read_plink: Read genotype and sample data in a Plink BED/BIM/FAM file...


View source: R/read_plink.R

Description

This function reads a genotype matrix (X, encoded as reference allele dosages) and its associated locus (bim) and individual (fam) data tables from three Plink files in BED, BIM, and FAM formats, respectively. All three input files must exist, or an error is thrown. This function is a wrapper around the more basic functions read_bed(), read_bim(), and read_fam(); it simplifies data parsing and better guarantees data integrity. Below, suppose there are m loci and n individuals.
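To make the wrapper relationship concrete, the following is a rough sketch of reading the same three files with the basic functions directly. It is not the package's actual implementation. It assumes that read_bed() accepts names_loci and names_ind arguments and that read_bim() and read_fam() accept a path without extension, as in recent genio releases; check the individual help pages before relying on this.

```r
library(genio)

# path to the packaged sample data, with the .bed extension removed
file <- system.file("extdata", "sample.bed", package = "genio", mustWork = TRUE)
file <- sub("\\.bed$", "", file)

# read the two annotation tables first
bim <- read_bim(file, verbose = FALSE)
fam <- read_fam(file, verbose = FALSE)

# a BED file does not store its own dimensions or names, so the ids
# from bim and fam are passed in (assumed arguments: names_loci, names_ind)
X <- read_bed(file, names_loci = bim$id, names_ind = fam$id, verbose = FALSE)

# assemble a list shaped like the one described under Value
plink_like <- list(X = X, bim = bim, fam = fam)
```

Calling read_plink() instead avoids having to keep the three tables consistent by hand.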

Usage

read_plink(file, verbose = TRUE)

Arguments

file

Input file path, without extensions (the .bed, .bim, and .fam extensions are added automatically as needed). Alternatively, the path may include the .bed extension (but not .bim, .fam, or other extensions).

verbose

If TRUE (default), the function reports the paths of the files being read (after autocompleting the extensions).

Value

A named list with items in this order: X (genotype matrix, see description in return value of read_bed()), bim (tibble, see read_bim()), fam (tibble, see read_fam()). The row and column names of X correspond to the id values of the bim and fam tibbles, respectively.
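The structure of the return value can be checked with a few assertions. This is a minimal sketch using the sample data shipped with the package. It assumes that X has loci along rows and individuals along columns, so that its dimensions are m by n, which is how the read_bed() documentation describes the genotype matrix.

```r
library(genio)

# sample BED file bundled with the package
file <- system.file("extdata", "sample.bed", package = "genio", mustWork = TRUE)
plink_data <- read_plink(file, verbose = FALSE)

# the list items, in the documented order
names(plink_data)

# dimensions and names of X should agree with the annotation tables
stopifnot(
    nrow(plink_data$X) == nrow(plink_data$bim),            # m loci
    ncol(plink_data$X) == nrow(plink_data$fam),            # n individuals
    all(rownames(plink_data$X) == plink_data$bim$id),
    all(colnames(plink_data$X) == plink_data$fam$id)
)
```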

See Also

read_bed(), read_bim(), and read_fam() for individual parsers of each input table, including a description of each object returned.

geno_to_char() for translating numerical genotypes into more human-readable character encodings.

Plink BED/BIM/FAM format reference: https://www.cog-genomics.org/plink/1.9/formats

Examples

# to read "data.bed" etc, run like this:
# obj <- read_plink("data")
# this also works
# obj <- read_plink("data.bed")
#
# you get a list with these three items:
# genotypes
# obj$X
# locus annotations
# obj$bim
# individual annotations
# obj$fam

# The following example uses the sample data bundled with the package,
# whose path has to be looked up with system.file():

# first get path to BED file
file <- system.file("extdata", 'sample.bed', package = "genio", mustWork = TRUE)

# read genotypes and annotation tables
plink_data <- read_plink(file)
# genotypes
plink_data$X
# locus annotations
plink_data$bim
# individual annotations
plink_data$fam

# the same works without .bed extension
file <- sub('\\.bed$', '', file) # remove extension
# it works!
plink_data <- read_plink(file)

genio documentation built on June 11, 2021, 5:12 p.m.