knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
In this vignette, we'll use GCAE on the PLINK tutorial's example data, to train and use a GCAE neural network.
Here are the steps:
First we determine if GCAE is installed and if the PLINK tutorial data is installed.
To determine if GCAE is installed, we'll first load gcaer
:
library(gcaer)
Now, gcaer
detects if GCAE is installed:
if (!is_gcae_installed()) { message("GCAE is not installed") message("Tip: use 'gcaer::install_gcae()'") }
Here we use plinkr
to detect if the PLINK tutorial data is installed:
if (!plinkr::is_plink_tutorial_data_installed()) { message("PLINK tutorial data is not installed") message("Tip: use 'plinkr::install_plink_tutorial_data'") }
If both requirements are met, we are good to go:
is_gcae_installed <- has_cloned_gcae_repo() # 'is_gcae_installed' is too slow is_plink_tutorial_installed <- plinkr::is_plink_tutorial_data_installed() good_to_go <- is_gcae_installed && is_plink_tutorial_installed
Without these requirements met, this vignette will be rather dull :-)
These are the PLINK tutorial files:
if (good_to_go) { plink_tutorial_data_filenames <- plinkr::get_plink_tutorial_data_filenames() plink_tutorial_data_filenames }
We will be using hapmap1
files,
that are in the non-binary/plain-text PLINK file format:
if (good_to_go) { plain_text_plink_filenames <- stringr::str_subset( string = plink_tutorial_data_filenames, pattern = "hapmap1" ) plain_text_plink_filenames }
As GCAE needs these files to be in binary PLINK format, we convert the text files to binary files and store these in a temporary folder:
if (good_to_go) { temp_folder <- plinkr::get_plinkr_tempfilename() binary_plink_filenames <- plinkr::make_bed( base_input_filename = tools::file_path_sans_ext( plain_text_plink_filenames[1] ), base_output_filename = file.path(temp_folder, "hapmap1_binary") ) binary_plink_filenames }
The three PLINK binary files are the .fam
and .bim
, .bed
files.
This is the .fam
file:
n_individuals <- "[unknown]" if (good_to_go) { fam_filename <- stringr::str_subset( binary_plink_filenames, pattern = "\\.fam$" ) fam_table <- plinkr::read_plink_fam_file(fam_filename) n_individuals <- nrow(fam_table) knitr::kable(utils::head(fam_table)) }
This table shows us the r n_individuals
individuals.
This is the .bim
file:
n_snps <- "[unknown]" if (good_to_go) { bim_filename <- stringr::str_subset( binary_plink_filenames, pattern = "\\.bim$" ) bim_table <- plinkr::read_plink_bim_file(bim_filename) n_snps <- nrow(bim_table) knitr::kable(utils::head(bim_table)) }
The .bim
file shows the information of the r n_snps
SNPs.
The .bed
file connects the individuals and the SNPs:
if (good_to_go) { bed_filename <- stringr::str_subset( binary_plink_filenames, pattern = "\\.bed$" ) bed_table <- plinkr::read_plink_bed_file( bed_filename, names_loci = bim_table$id, names_ind = fam_table$fam ) testthat::expect_equal(ncol(bed_table), n_individuals) testthat::expect_equal(nrow(bed_table), n_snps) knitr::kable(utils::head(bed_table[, 1:10])) }
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.