Write files for analysis in the PLINK toolset

Description

Given a SnpMatrix object, together with associated subject and SNP support dataframes, this function writes the .bed, .bim, and .fam files used by the PLINK toolset.

Usage

write.plink(file.base, snp.major = TRUE, snps,
   subject.data, pedigree, id, father, mother, sex, phenotype,
   snp.data, chromosome, genetic.distance, position, allele.1, allele.2,
   na.code = 0)

Arguments

file.base

A character string giving the base filename. The extensions .bed, .bim, and .fam are appended to this string to give the filenames of the three output files.

snp.major

Logical variable controlling whether the .bed file is written in SNP-major (TRUE, the default) or subject-major (FALSE) order.

snps

The SnpMatrix or XSnpMatrix object to be written out

subject.data

(Optional) A subject support dataframe. If supplied, the next six arguments (which define the fields of the PLINK .fam file) will be evaluated in this environment, after matching row names with the row names of snps. Otherwise they will be evaluated in the calling environment; they then must be of the right length and in the correct order.

pedigree

A pedigree (family) identifier. Default is the row names of snps.

id

An identifier of an individual within family. Default is a vector of na.code.

father

The within-family identifier of the subject's father. Default is a vector of na.code.

mother

The within-family identifier of the subject's mother. Default is a vector of na.code.

sex

Sex of the individual. Default is a vector of na.code. This will be coerced to type numeric.

phenotype

The primary phenotype value. Default is a vector of na.code. This will be coerced to type numeric.

snp.data

(Optional) A SNP support dataframe. If supplied, the next five arguments (which define the columns of the PLINK .bim file) will be evaluated in this environment, after matching row names with the column names of snps. Otherwise they will be evaluated in the calling environment; they then must be of the right length and in the correct order.

chromosome

The chromosome on which the SNP is located. This should either be numeric, as specified by PLINK, or character, with "X", "Y", "XY", and "MT" for the non-numeric values. Default is a vector of na.code, or a vector of 23's if snps is an XSnpMatrix.

genetic.distance

The location of the SNP, expressed as a genetic distance. Default is a vector of na.code. This will be coerced to type numeric.

position

The physical location of the SNP, expressed in base pairs. Default is a vector of na.code. This will be coerced to type numeric.

allele.1

A character vector giving the first allele. Default is a vector of "A"s.

allele.2

A character vector giving the second allele. Default is a vector of "B"s.

na.code

The code to be written for NA in the .fam and .bim output files. It should be numeric (or capable of coercion to numeric).
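
The evaluation rule described for the .fam and .bim field arguments can be illustrated in base R. The sketch below is not the package's implementation: eval_field is a hypothetical helper, and it assumes the common eval(substitute(...)) idiom, in which an expression is looked up in the support dataframe first and otherwise in the calling environment. The matching of row names against snps is not shown, and the data values are made up.

```r
# Hypothetical helper, not part of snpStats: evaluate an argument expression
# inside a support data frame, falling back to the caller's environment.
eval_field <- function(expr, data = NULL) {
  e <- substitute(expr)
  if (is.null(data)) eval(e, parent.frame()) else eval(e, data, parent.frame())
}

# Toy subject support data frame (made-up values).
subject.data <- data.frame(cc = c(0, 1, 1),
                           sex = factor(c("M", "F", "M")),
                           row.names = c("s1", "s2", "s3"))

# `sex` is found as a column of subject.data, not in the global environment.
sex.codes <- eval_field(as.numeric(sex), subject.data)

# Without a data frame, the expression is evaluated where the call was made,
# so the vector must already have the right length and order.
pheno <- c(2, 1, 2)
pheno.codes <- eval_field(pheno + 0)
```

This is why the call in the Examples section can refer to cc and sex directly: they are columns of the dataframe passed as subject.data.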

Details

For more details of required codings in .fam and .bim files, see the PLINK documentation.
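
As a quick orientation, the six whitespace-delimited columns of a .fam file correspond to the pedigree, id, father, mother, sex, and phenotype arguments; the six columns of a .bim file hold the chromosome, SNP name, genetic distance, position, and the two alleles. The sketch below writes a made-up two-line .fam file to a temporary location and reads it back with base R. The column names are illustrative labels chosen to match the argument names above, not names defined by PLINK or by this package.

```r
# Made-up .fam content: whitespace-delimited, no header, one subject per line.
fam.file <- tempfile(fileext = ".fam")
writeLines(c("fam1 1 0 0 1 2",
             "fam1 2 0 0 2 1"), fam.file)

# Read it back, labelling the columns after the write.plink arguments.
fam <- read.table(fam.file, header = FALSE,
                  col.names = c("pedigree", "id", "father", "mother",
                                "sex", "phenotype"),
                  stringsAsFactors = FALSE)
print(fam)
```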

Value

Returns NULL.

Author(s)

David Clayton dc208@cam.ac.uk

References

PLINK: Whole genome association analysis toolset. http://pngu.mgh.harvard.edu/~purcell/plink/

See Also

read.plink, SnpMatrix-class, XSnpMatrix-class

Examples

data(testdata)
## The use of as.numeric() below is not necessary, since this is done
## automatically. It just illustrates the use of expressions for these arguments.
## Note that cc and sex are variables within the subject.data frame.
## This command writes the files test.bed, test.fam, and test.bim
write.plink(file.base="test", snps=Autosomes,
    subject.data=subject.data, phenotype = as.numeric(cc), sex=as.numeric(sex),
    snp.major=FALSE)
