View source: R/process_plink.R
| process_plink | R Documentation | 
Preprocess PLINK files using the bigsnpr package
process_plink(
  data_dir,
  data_prefix,
  rds_dir = data_dir,
  rds_prefix,
  logfile = NULL,
  impute = TRUE,
  impute_method = "mode",
  id_var = "IID",
  parallel = TRUE,
  quiet = FALSE,
  overwrite = FALSE,
  ...
)
| data_dir | The path to the bed/bim/fam data files, without a trailing "/" (e.g., use  | 
| data_prefix | The prefix (as a character string) of the bed/fam data files (e.g.,  | 
| rds_dir | The path to the directory in which you want to create the new '.rds' and '.bk' files. Defaults to  | 
| rds_prefix | String specifying the user's preferred filename for the to-be-created .rds file (will be created inside  | 
| logfile | Optional: the prefix (character string) of the name of the log file to be written in 'rds_dir'. Defaults to NULL (no log file written). Note: supply only the bare name, not a file path; a file path causes a "file not found" error. For example, to get my_log.log, supply 'my_log'; the my_log.log file will then appear in rds_dir. | 
| impute | Logical: should data be imputed? Defaults to TRUE. | 
| impute_method | If impute = TRUE, this argument specifies the kind of imputation desired. Options are:
* mode (default): Imputes the most frequent call. See  | 
| id_var | String specifying which column of the PLINK  | 
| parallel | Logical: should the computations within this function be run in parallel? Defaults to TRUE. See  | 
| quiet | Logical: should messages that would otherwise be printed to the console be silenced? Defaults to FALSE. | 
| overwrite | Logical: if existing  | 
| ... | Optional: additional arguments to  | 
Three files are created in the location specified by rds_dir:
* 'rds_prefix.rds': This is a list with three items:
(1) X: the filebacked bigmemory::big.matrix object pointing to the imputed genotype data.
This matrix has type 'double', which is important for downstream operations in create_design()
(2) map: a data.frame with the PLINK 'bim' data (i.e., the variant information)
(3) fam: a data.frame with the PLINK 'fam' data (i.e., the pedigree information)
* 'rds_prefix.bk': This is the backingfile that stores the numeric data of the genotype matrix
* 'rds_prefix.desc': This is the description file, as needed by the
Note that process_plink() need only be run once for a given set of PLINK
files; in subsequent data analysis/scripts, get_data() will access the '.rds' file.
For an example, see the vignette on processing PLINK files.
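The "run once" advice above can be sketched as a guard that skips preprocessing when the '.rds' file is already present. This is only an illustration: the directories, prefixes, and log name are hypothetical, it assumes the '.bk' file shares rds_prefix with the other two files, and it assumes the package that exports process_plink() has already been attached.

```r
# Hypothetical locations; substitute your own directories and prefixes.
data_dir    <- file.path(tempdir(), "plink_raw")
data_prefix <- "my_study"
rds_dir     <- file.path(tempdir(), "plink_rds")
rds_prefix  <- "my_study_processed"

# File names as described in the details (assumes all three share rds_prefix).
expected_files <- file.path(rds_dir, paste0(rds_prefix, c(".rds", ".bk", ".desc")))
rds_file <- expected_files[1]

have_input  <- file.exists(file.path(data_dir, paste0(data_prefix, ".bed")))
have_output <- file.exists(rds_file)

if (have_output) {
  # Preprocessing was done in an earlier session; reuse the existing file.
  message("Found ", rds_file, "; skipping preprocessing.")
} else if (have_input && exists("process_plink", mode = "function")) {
  rds_file <- process_plink(
    data_dir    = data_dir,
    data_prefix = data_prefix,
    rds_dir     = rds_dir,
    rds_prefix  = rds_prefix,
    logfile     = "my_log"  # bare name only, not a file path
  )
} else {
  message("No PLINK input found or process_plink() not available; nothing to do.")
}
```

With the placeholder paths above nothing is processed; the final branch simply reports that there is nothing to do.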
The filepath to the '.rds' object created; see details for explanation.
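As a rough sketch of how the returned filepath might be used, the helper below loads the saved list and checks that the three documented items (X, map, fam) are present. The helper name read_processed() is hypothetical, and the file it reads here is a stand-in built from an ordinary matrix purely for illustration. In a real file, X is a filebacked object, so the package's own accessor (get_data(), mentioned above) may be the more appropriate way to load it.

```r
# Hypothetical helper: load the list saved by process_plink() and
# confirm it contains the three documented items.
read_processed <- function(rds_file) {
  obj <- readRDS(rds_file)
  missing_items <- setdiff(c("X", "map", "fam"), names(obj))
  if (length(missing_items) > 0) {
    stop("Missing expected item(s): ", paste(missing_items, collapse = ", "))
  }
  obj
}

# Stand-in file for illustration only; a real file comes from process_plink().
mock_file <- tempfile(fileext = ".rds")
saveRDS(
  list(X = matrix(0, nrow = 2, ncol = 3), map = data.frame(), fam = data.frame()),
  mock_file
)

obj <- read_processed(mock_file)
names(obj)
```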