View source: R/wga_stream.GxE.R
Creates job files for running GxE.scan on a parallel processing system.
snp.list | See
pheno.list | See
op | See details for this list of options. The default is NULL.
This function will create files needed for running a GWAS scan on a computing cluster.
The user must know how to submit jobs and know how to use their particular cluster.
On many clusters, the command for submitting a job is "qsub".
The scan is partitioned into smaller jobs either by setting the values of snp.list$start.vec and
snp.list$stop.vec or by setting the value of snp.list$include.snps. The partitioning is done
so that each job will process an equal number of SNPs.
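As a rough illustration of equal-sized partitioning, the sketch below computes first and last SNP indices for 50 SNPs split over 5 jobs. The helper partition_snps is hypothetical and is not part of CGEN; the package may compute its partitions differently.

```r
# Hypothetical helper (not part of CGEN): split n.snps SNPs into at most
# n.jobs contiguous blocks of (nearly) equal size.
partition_snps <- function(n.snps, n.jobs) {
  n.jobs <- min(n.jobs, n.snps)          # never more jobs than SNPs
  size   <- ceiling(n.snps / n.jobs)     # SNPs handled by each job
  first  <- seq(1, n.snps, by = size)    # first SNP index of each block
  last   <- pmin(first + size - 1, n.snps)
  list(start.vec = first, stop.vec = last)
}

parts <- partition_snps(50, 5)
print(parts$start.vec)
print(parts$stop.vec)
```

With 50 SNPs and 5 jobs, each block covers 10 consecutive SNPs, which matches the comment in the example at the end of this page.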
In the output directory (see option out.dir), three types of files will be created. One type of file
will be the R program file containing R statements defining snp.list, pheno.list and op
for the GxE.scan function. These files have the ".R" file extension.
Another type of file will be the job file which calls the R program file. These files are named
paste(op$out.dir, "job_", op$id.str, 1:op$n.jobs, sep="")
The third type of file is a single file containing the names of all the job files. This file has the prefix "Rjobs_".
This function will automatically set the name of the output file created by GxE.scan
to a file in the op$out.dir directory with the prefix "GxEout_".
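To make the job file naming concrete, the following sketch evaluates the naming expression given above with made-up option values. The directory and id.str shown here are assumptions chosen for illustration only.

```r
# Hypothetical option values, for illustration only.
op <- list(out.dir = "/data/scan/", id.str = "chr1_", n.jobs = 3)

# Same expression as in the text. The pieces are pasted with sep = "",
# so if the expression is applied literally, out.dir must end with a
# path separator.
job.files <- paste(op$out.dir, "job_", op$id.str, 1:op$n.jobs, sep = "")
print(job.files)
```

This yields one name per job, each ending in the job number.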
Options list op:
Below are the names in the options list op. Each option takes its default value
if it is not specified.
n.jobs
The (maximum) number of jobs to run.
The default is 100.
out.dir
Directory to save all files. If NULL, then the files will be
created in the working directory getwd().
GxE.scan.op
List of options for the GxE.scan function.
The default is NULL.
R.cmd
Character string for calling R.
The default is "R --vanilla".
begin.commands.R
Character vector of R statements to be placed at the top of each R program file.
For example,
begin.commands.R=c("rm(list=ls(all=TRUE))", "gc()", 'library(CGEN, lib.loc="/home/Rlibs/")')
The default is "library(CGEN)".
qsub.cmd
Character string for the command to submit a single job.
The default is "qsub".
begin.commands.qsub
Character vector of statements to be placed at the top of each job file.
For example, begin.commands.qsub="module load R".
The default is NULL.
id.str
A character string to be appended to the file names.
The default is "".
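A minimal sketch of an options list that sets every name described above. The values are illustrative assumptions (in particular the output directory, the id string and the "module load R" line depend on your cluster), and model=1 is borrowed from the example at the end of this page.

```r
# Illustrative values only; adapt them to your own cluster and data.
op <- list(
  n.jobs              = 20,                  # (maximum) number of jobs
  out.dir             = tempdir(),           # where all files are written
  GxE.scan.op         = list(model = 1),     # options passed on to GxE.scan
  R.cmd               = "R --vanilla",       # how R is called in each job
  begin.commands.R    = c("rm(list=ls(all=TRUE))", "gc()", "library(CGEN)"),
  qsub.cmd            = "qsub",              # command that submits one job
  begin.commands.qsub = "module load R",     # placed at the top of each job file
  id.str              = "run1"               # appended to the file names
)
str(op)
```

Any name left out of the list falls back to the default stated in its description.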
snp.list
The objects start.vec and stop.vec in snp.list are set automatically, so they do
not need to be set by the user.
In general, it is more efficient in terms of memory usage and speed to have the genotype data
partitioned into many files. Thus, when calling this function, snp.list$file can be set either to
a single file or to a character vector of the partitioned files. In the latter case, the number of jobs
to create (op$n.jobs) must be greater than or equal to the number of partitioned files.
An object in snp.list that is unique to the GxE.scan.partition function is
nsnps.vec. Each element of snp.list$nsnps.vec is the number of SNPs in the corresponding file of
snp.list$file.
If nsnps.vec is not specified and snp.list$file contains more than one file,
then each job will process an entire file in snp.list$file.
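The sketch below shows one way a snp.list for several partitioned genotype files might be set up. The file names and SNP counts are hypothetical, and other fields a real call would need (such as subject.list for TPED data) are omitted for brevity.

```r
# Hypothetical file names and SNP counts; replace with your own.
geno.files <- c("chr1.tped.gz", "chr2.tped.gz", "chr3.tped.gz")

snp.list <- list(
  file      = geno.files,          # one entry per partitioned file
  format    = "tped",
  nsnps.vec = c(1200, 950, 800)    # number of SNPs in each file, same order
)

op <- list(n.jobs = 6)

# Sanity checks implied by the text: one count per file, and at least
# as many jobs as files.
stopifnot(length(snp.list$nsnps.vec) == length(snp.list$file))
stopifnot(op$n.jobs >= length(snp.list$file))
```

Running these checks before creating the job files can catch a mismatch between the file list and the SNP counts early.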
For scenarios in which the genotype data must be transformed and the data are contained in a single file,
snp.list$include.snps should also be set. This will create a separate list of SNPs for each job to
process.
The name of the file containing names of the job files to be submitted. See details.
# Define the list for the genotype data. There are 50 SNPs in the TPED file.
snp.list <- list(nsnps.vec=50, format="tped")
snp.list$file <- system.file("sampleData", "geno_data.tped.gz", package="CGEN")
snp.list$subject.list <- system.file("sampleData", "geno_data.tfam", package="CGEN")
# Define pheno.list
pheno.list <- list(id.var=c("Family", "Subject"), delimiter="\t", header=1,
response.var="CaseControl")
pheno.list$file <- system.file("sampleData", "pheno.txt", package="CGEN")
pheno.list$main.vars <- ~Gender + Exposure
pheno.list$int.vars <- ~Exposure
pheno.list$strata.var <- "Study"
# Define the list of options.
# Specifying n.jobs=5 will let each job process 10 SNPs.
op <- list(n.jobs=5, GxE.scan.op=list(model=1))
# GxE.scan.partition(snp.list, pheno.list, op=op)