read_eigenvec: Read Plink eigenvec file


read_eigenvec    R Documentation

Read Plink eigenvec file

Description

This function reads a Plink eigenvec file, parsing columns strictly. The first two columns must be 'fam' and 'id', which are read as strings, and all remaining columns (the eigenvectors) must be numeric.

Usage

read_eigenvec(
  file,
  ext = "eigenvec",
  plink2 = FALSE,
  comment = if (plink2) "" else "#",
  verbose = TRUE
)

Arguments

file

The input file path, potentially excluding extension.

ext

The file extension (default "eigenvec"), which is appended to file if the path does not already exist as given. Set to NA to require that file exists exactly as given, with no extension added.

plink2

If TRUE, the header is parsed and preserved in the returned data. The first two columns, FID and IID, are mandatory.

comment

A string used to identify comments. Any text after the comment characters will be silently ignored. Passed to readr::read_table(). '#' (the default when plink2 = FALSE) works for Plink 2 eigenvec files, which have a header line that starts with this character (the header is therefore ignored). However, plink2 = TRUE sets the default to "" so that the header is parsed instead.

verbose

If TRUE (default), the function reports the path of the file being read (after autocompleting the extension).
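
As a minimal sketch of how these arguments combine, the call below reads one of the sample files shipped with genio (the same file used in the Examples section) with extension completion turned off and the progress message silenced. It assumes the genio package is installed; it only illustrates the arguments documented above.

```r
library(genio)

# full path to a sample eigenvec file bundled with genio
file <- system.file("extdata", "sample-gcta.eigenvec",
                    package = "genio", mustWork = TRUE)

# ext = NA: the path is used exactly as given, no extension is appended
# verbose = FALSE: do not report the path being read
data <- read_eigenvec(file, ext = NA, verbose = FALSE)

# the result is a list; show its element names
names(data)
```

With the default ext = "eigenvec", the same file can also be given without its extension, as shown in the Examples section.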

Value

A list with two elements:

  • eigenvec: A numeric R matrix containing the parsed eigenvectors. If plink2 = TRUE, the original column names will be preserved in this matrix.

  • fam: A tibble with two columns, fam and id, which are the first two columns of the parsed file. These column names are always the same even if plink2 = TRUE (i.e. they won't be FID or IID).
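
The structure of the return value can be checked directly. The following is a sketch, assuming genio is installed and using the Plink 2 sample file from the Examples section. The combined data frame at the end is only one possible way to rejoin the two elements and is not something the function does itself; the name `df` is chosen here for illustration.

```r
library(genio)

# Plink 2 sample file bundled with genio
file <- system.file("extdata", "sample-plink2.eigenvec",
                    package = "genio", mustWork = TRUE)
data <- read_eigenvec(file, plink2 = TRUE, verbose = FALSE)

# eigenvec is a numeric matrix; with plink2 = TRUE it keeps the header names
is.matrix(data$eigenvec)
is.numeric(data$eigenvec)
colnames(data$eigenvec)

# fam always uses the column names fam and id
names(data$fam)

# both elements have one row per individual, in the same order
stopifnot(nrow(data$eigenvec) == nrow(data$fam))

# optional: rejoin IDs and eigenvectors into one data frame
df <- cbind(as.data.frame(data$fam), as.data.frame(data$eigenvec))
head(df)
```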

See Also

write_eigenvec() for writing an eigenvec file.

Plink 1 eigenvec format reference: https://www.cog-genomics.org/plink/1.9/formats#eigenvec

Plink 2 eigenvec format reference: https://www.cog-genomics.org/plink/2.0/formats#eigenvec

GCTA eigenvec format reference: https://cnsgenomics.com/software/gcta/#PCA

Examples

# to read "data.eigenvec", run like this:
# data <- read_eigenvec("data")
# this also works
# data <- read_eigenvec("data.eigenvec")
#
# either way you get a list with these two items:
# numeric eigenvector matrix
# data$eigenvec
# fam/id tibble
# data$fam

# The following example is more awkward
# because package sample data has to be specified in this weird way:

# read an existing *.eigenvec file created by GCTA
file <- system.file("extdata", 'sample-gcta.eigenvec', package = "genio", mustWork = TRUE)
data <- read_eigenvec(file)
# numeric eigenvector matrix
data$eigenvec
# fam/id tibble
data$fam

# can specify without extension
file <- sub('\\.eigenvec$', '', file) # remove extension from this path on purpose
file # verify .eigenvec is missing
data <- read_eigenvec(file) # load it anyway!
data$eigenvec

# read an existing *.eigenvec file created by Plink 2
file <- system.file("extdata", 'sample-plink2.eigenvec', package = "genio", mustWork = TRUE)
# this version ignores header
data <- read_eigenvec(file)
# numeric eigenvector matrix
data$eigenvec
# fam/id tibble
data$fam

# this version uses header
data <- read_eigenvec(file, plink2 = TRUE)
# numeric eigenvector matrix
data$eigenvec
# fam/id tibble
data$fam


genio documentation built on Jan. 7, 2023, 1:12 a.m.