write_eigenvec: Write eigenvectors table into a Plink-format file

View source: R/write_eigenvec.R

write_eigenvecR Documentation

Write eigenvectors table into a Plink-format file

Description

This function writes eigenvectors in Plink 1 (same as GCTA) format (table with no header, with first two columns being fam and id), which is a subset of Plink 2 format (which optionally allows column names and does not require fam column). Main expected case is eigenvec passed as a numeric matrix and fam provided to complete first two missing columns. However, input eigenvec may also be a data.frame already containing the fam and id columns, and other reasonable intermediate cases are also handled. If both eigenvec and fam are provided and contain overlapping columns, those in eigenvec get overwritten with a warning.

Usage

write_eigenvec(
  file,
  eigenvec,
  fam = NULL,
  ext = "eigenvec",
  plink2 = FALSE,
  verbose = TRUE
)

Arguments

file

The output file name (possibly without extension)

eigenvec

A matrix or tibble containing the eigenvectors to include in the file. Column names other than fam and id can be anything and are all treated as eigenvectors (not written to file).

fam

An optional fam table, which is used to add the fam and id columns to eigenvec (which overwrite columns of the same name in eigenvec if present, after a warning is produced). Individuals in fam and eigenvec are assumed to be the same and in the same order.

ext

Output file extension. Since the general "covariates" file format in GCTA and Plink are the same as this, this function may be used to write more general covariates files if desired, in which case users may wish to change this extension for clarity.

plink2

If TRUE, prints a header in the style of plink2 (starts with hash, fam -> FID, id -> IID, and the default PCs are named PC1, PC2, etc. Returned data.frame will also have these names.

verbose

If TRUE (default), function reports the path of the file being written (after autocompleting the extension).

Value

Invisibly, the final eigenvec data.frame or tibble written to file, starting with columns fam and id (merged from the fam input, if it was passed) followed by the rest of columns in the input eigenvec. Column names are instead #FID, IID, etc if plink2 = TRUE.

See Also

read_eigenvec() for reading an eigenvec file.

Plink 1 eigenvec format reference: https://www.cog-genomics.org/plink/1.9/formats#eigenvec

Plink 2 eigenvec format reference: https://www.cog-genomics.org/plink/2.0/formats#eigenvec

GCTA eigenvec format reference: https://cnsgenomics.com/software/gcta/#PCA

Examples

# to write an existing matrix `eigenvec` and optional `fam` tibble into file "data.eigenvec",
# run like this:
# write_eigenvec("data", eigenvec, fam = fam)
# this also works
# write_eigenvec("data.eigenvec", eigenvec, fam = fam)

# The following example is more detailed but also more awkward
# because (only for these examples) the package must create the file in a *temporary* location

# create dummy eigenvectors matrix, in this case from a small identity matrix
# number of individuals
n <- 10
eigenvec <- eigen( diag( n ) )$vectors
# subset columns to use top 3 eigenvectors only
eigenvec <- eigenvec[ , 1:3 ]
# dummy fam data
library(tibble)
fam <- tibble( fam = 1:n, id = 1:n )

# write this data to .eigenvec file
# output path without extension
file <- tempfile('delete-me-example')
eigenvec_final <- write_eigenvec( file, eigenvec, fam = fam )
# inspect the tibble that was written to file (returned invisibly)
eigenvec_final

# remove temporary file (add extension before deletion)
file.remove( paste0( file, '.eigenvec' ) )


genio documentation built on Jan. 7, 2023, 1:12 a.m.