View source: R/het_reencode_bed.R
het_reencode_bed | R Documentation |
Given an existing Plink-formatted BED (binary) file, this function reads it, transforms genotypes on the fly, and writes a new BED file such that heterozygotes are encoded as 2 and homozygotes as 0.
In other words, it transforms the numerical genotype values c( 0, 1, 2, NA ) into c( 0, 2, 0, NA ).
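The mapping above can be illustrated on a small in-memory matrix. This is only a sketch of the value transformation: `het_reencode` is a hypothetical helper written for this illustration, it is not part of the package, and it does not reflect how the function processes BED files on disk.

```r
# Hypothetical in-memory helper (not part of genio), for illustration only.
# Heterozygotes (1) become 2, both homozygotes (0 and 2) become 0, NA stays NA.
het_reencode <- function(X) {
  # X == 1 is NA wherever X is NA, and ifelse() propagates that NA
  ifelse(X == 1, 2, 0)
}

# toy genotype matrix, filled column by column
X <- matrix(c(0, 1, 2, NA, 1, 1), nrow = 2)
X_het <- het_reencode(X)
print(X_het)
```

Because `ifelse()` keeps the shape of its test argument, the result has the same dimensions as the input matrix.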
Heterozygotes are encoded as 2, rather than 1, so existing code for calculating allele frequencies and related quantities, such as kinship estimates, works on this data as intended.
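A short arithmetic check may make the choice of 2 clearer. The sketch below assumes the usual allele frequency estimator, the mean genotype divided by 2, and uses made-up toy genotypes for a single locus. Applied to the re-encoded values, that estimator returns the proportion of heterozygous individuals.

```r
# toy genotypes at one locus (made-up values, one missing)
x <- c(0, 1, 2, 1, NA, 0, 1, 2)

# re-encode: heterozygotes -> 2, homozygotes -> 0, NA stays NA
x_het <- ifelse(x == 1, 2, 0)

# usual allele frequency estimator, applied to the re-encoded data
p_hat <- mean(x_het, na.rm = TRUE) / 2

# direct proportion of heterozygotes among non-missing individuals
h_obs <- mean(x == 1, na.rm = TRUE)

# the two quantities agree
stopifnot(isTRUE(all.equal(p_hat, h_obs)))
```

Had heterozygotes been encoded as 1 instead, the same estimator would return half the heterozygote proportion.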
This function is intended for extremely large files that should not be loaded entirely into memory at once.
het_reencode_bed(
file_in,
file_out,
m_loci = NA,
n_ind = NA,
make_bim_fam = TRUE,
verbose = TRUE
)
file_in |
Input file path.
*.bed extension may be omitted (will be added automatically if |
file_out |
Output file path. *.bed extension may be omitted (will be added automatically if it is missing). |
m_loci |
Number of loci in the input genotype table.
If |
n_ind |
Number of individuals in the input genotype table.
If |
make_bim_fam |
If |
verbose |
If |
read_bed() and write_bed(), from which much of the code of this function is derived, and whose documentation explains additional BED format requirements.
# define input and output, both of which will also work without extension
# read an existing Plink *.bed file
file_in <- system.file("extdata", 'sample.bed', package = "genio", mustWork = TRUE)
# write to a *temporary* location for this example
file_out <- tempfile('delete-me-example')
# in default mode, deduces dimensions from paired *.bim and *.fam tables
het_reencode_bed( file_in, file_out )
# delete output when done
delete_files_plink( file_out )