View source: R/sim_and_write_plink.R
sim_and_write_plink | R Documentation |
To save memory, simulate small chunks of variants at the time and write them to file as you go.
This is a wrapper around write_plink()
and readr::write_lines()
(for ancestral allele frequencies, optional) with append = TRUE
that simplifies looping somewhat.
The function always appends to the output, so it can be called several times if convenient, for example to simulate separate chromosomes.
sim_and_write_plink(
sim_chunk,
m_loci,
fam,
file,
file_p_anc = NA,
n_data_cut = 10^6
)
sim_chunk |
A function that generates a small number of loci at the time, to be called iteratively until the whole genome is complete. It should accept a single parameter, the number of loci to simulate at one time, and returns a list with these named elements:
|
m_loci |
Total number of loci to simulate. |
fam |
Sample table of simulation to write. |
file |
Output file path, without extensions (each of .bed, .bim, .fam extensions will be added automatically as needed). |
file_p_anc |
Complete file path (with extensions) of vector of ancestral allele frequencies, if |
n_data_cut |
Number of cells (individuals times loci) to aim to simulate at the time. Actual number may be smaller to ensure that the number of loci is an integer, except if the number of individuals is greater than |
write_plink()
# some global constants that will be accessed by simulator function
n <- 10
# and a global variable updated as we go
m_last <- 0
# define a trivial but complete genotype simulator function
my_sim_chunk <- function( m_chunk ) {
# construct ancestral allele frequencies
p_anc <- runif( m_chunk )
# simulate genotypes from HWE
X <- matrix( rbinom( m_chunk * n, 2, p_anc ), m_chunk, n )
# construct a trivial BIM table
bim <- make_bim( n = m_chunk )
# but make sure count continues across chunks without repeats
# (so IDs and positions don't clash!)
bim$id <- m_last + ( 1 : m_chunk )
# update global value (use <<-) for next round
m_last <<- m_last + m_chunk
# return all of these elements in a named list!
return( list( X = X, bim = bim, p_anc = p_anc ) )
}
# the fam table is created fully now
fam <- make_fam( n = n )
# set other parameters
m_loci <- 100
# this is only necessary for example files to be in a *temporary* location
# (don't use `tempfile` in real cases)
# plink files path without extension
file <- tempfile('test')
# p_anc file should have extension
filep <- tempfile('test-p-anc.txt.gz')
# simulate and write as we go!
sim_and_write_plink( my_sim_chunk, m_loci, fam, file, filep )
# clean up: delete sample outputs
delete_files_plink( file )
file.remove( filep )
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.