snptest: Run a snptest GWAS on multiple cores

Description

Run a snptest GWAS on multiple cores.

Usage

snptest(indir, sample_file, outdir, pheno, covs = NULL,
        exclusion_file = NULL, add_args = NULL, ncore = 1L,
        pattern = "\\.gz$", chr_chunk = ".*chr([^_]+)_(\\d+)",
        executable = "snptest")

Arguments

indir

Directory with gz-compressed chunk files.

sample_file

Phenotype file.

outdir

Directory where snptest output files should go.

pheno

Name of phenotype variable as used in ‘sample_file’.

covs

Character vector with names of covariates as used in ‘sample_file’.

exclusion_file

File with ids of individuals to be excluded from analysis.

add_args

Additional command-line arguments to snptest.

ncore

Number of cores to use in parallel.

pattern

Regex pattern used to match input chunk files in ‘indir’.

chr_chunk

Extended regular expression with two parenthesized subexpressions matching chromosome and chunk number in input chunk file names.

executable

Path to snptest executable.

Details

Function parameters without default values are mandatory. All other parameters are optional.

Sometimes you do not want all chunk files in ‘indir’ to be analyzed, typically because different chunk files need to be run with different snptest options. If you can specify a regular expression that matches exactly the chunk files you want to analyze, supply it as the ‘pattern’ argument to snptest. snptest will then be run only on files that are located in ‘indir’ and whose path matches ‘pattern’. Note that the path includes ‘indir’ as a prefix, so the pattern is matched against the directory part as well. Make sure that your pattern a) matches only chunk files and b) matches only chunk files that can all be analyzed with the same set of snptest options.
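
To preview which files a given ‘pattern’ would select before starting a run, a sketch along the following lines may help. It assumes that matching behaves like grepl() applied to the full paths returned by list.files(); this is an assumption, not a documented detail of genFun. The directory and file names are invented for illustration.

```r
## Hypothetical preview of `pattern' matching; not part of genFun.
## Assumption: files are selected as if by grepl() on their full paths.
indir <- tempfile("chunks_demo_")
dir.create(indir)
demo_files <- c("males_chr23_nonPAR_1.gz",
                "females_chr23_nonPAR_1.gz",
                "males_chr23_nonPAR_1.log")
invisible(file.create(file.path(indir, demo_files)))

paths <- list.files(indir, full.names = TRUE)

## An unanchored "males" also matches "females".
loose_pattern <- "males.*\\.gz$"
matched_loose <- paths[grepl(loose_pattern, paths)]

## Anchoring at the path separator restricts the match to file names
## that start with "males".
strict_pattern <- "/males[^/]*\\.gz$"
matched_strict <- paths[grepl(strict_pattern, paths)]

print(basename(matched_loose))
print(basename(matched_strict))
```

Because the whole path is matched, a directory name that happens to contain the searched text can also cause unintended matches, so anchoring the pattern to the file name is usually the safer choice.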

If your chunk files follow a naming scheme where chromosome and chunk number cannot be matched by the default regular expression for ‘chr_chunk’, then you will have to specify your own regular expression (see regex). This (extended) regular expression must contain two parenthesized subexpressions. The first must match the chromosome, the second the chunk part of your chunk file names.
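
The following sketch illustrates how such a regular expression splits file names into a chromosome part and a chunk part. The helper extract_chr_chunk() is hypothetical and is not necessarily how snptest does this internally; the file names are invented.

```r
## Hypothetical helper, not part of genFun: extract the two
## parenthesized subexpressions from each file name.
extract_chr_chunk <- function(files, chr_chunk) {
    m <- regmatches(files, regexec(chr_chunk, files))
    ## Element 1 is the full match, elements 2 and 3 are the groups.
    ## Files that do not match yield NA in both columns.
    data.frame(chr   = vapply(m, `[`, character(1), 2),
               chunk = vapply(m, `[`, character(1), 3),
               stringsAsFactors = FALSE)
}

## Default naming scheme, as in "chr3_7".
default_files <- c("data/chr3_7.gz", "data/chrX_12.gz")
res_default <- extract_chr_chunk(default_files, ".*chr([^_]+)_(\\d+)")

## Non-default naming scheme: the default expression fails here,
## a custom one succeeds.
nonpar_file <- "data/males_chrX_nonPAR_4.gz"
res_fail   <- extract_chr_chunk(nonpar_file, ".*chr([^_]+)_(\\d+)")
res_custom <- extract_chr_chunk(nonpar_file, ".*chr([^_]+)_(nonPAR_\\d+)")

print(res_default)
print(res_custom)
```

Trying a candidate expression on a few representative file names in this way can reveal a mismatch before any snptest job is started.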

Value

Returns a data frame with columns: ‘chr’ (chromosome number), ‘chunk’ (chunk number), ‘input’ (gz-compressed input file), ‘output’ (snptest output file), ‘log’ (snptest log file), ‘done’ (TRUE for successfully snptest'ed chunks, otherwise FALSE).

After an entirely successful run of snptest, all values in the ‘done’ column will be TRUE. Otherwise, the rows with ‘done == FALSE’ should give you a hint where to look for possible problems: check the ‘log’ files listed in those rows. Also take a look at any warnings that snptest emits while running.
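
A minimal sketch of inspecting the returned data frame for failed chunks. The data frame below is a hand-built stand-in for a real return value, with invented file names; as.logical() is used so that the filter works whether ‘done’ is stored as a logical or as the strings "TRUE" and "FALSE".

```r
## Stand-in for the value returned by snptest(); all entries are invented.
res <- data.frame(
    chr    = c("3", "3", "4"),
    chunk  = c("1", "2", "1"),
    input  = c("in/chr3_1.gz", "in/chr3_2.gz", "in/chr4_1.gz"),
    output = c("out/chr3_1.out", "out/chr3_2.out", "out/chr4_1.out"),
    log    = c("out/chr3_1.log", "out/chr3_2.log", "out/chr4_1.log"),
    done   = c(TRUE, FALSE, TRUE),
    stringsAsFactors = FALSE)

## Keep only the chunks that did not finish successfully.
failed <- res[!as.logical(res$done), ]

if (nrow(failed) > 0) {
    message(nrow(failed), " chunk(s) failed; log files to inspect:")
    print(failed$log)
}
```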

Examples

library(genFun)

## Not run: 
## Run snptest on men-only nonPAR chunks on chromosome X.
##
## Note: - The sample file must contain the same individuals as the
##         imputed chunk files.
##       - You have to specify "-assume_chromosome X".
##       - You have to craft your regular expression for `chr_chunk'
##         because the nonPAR chunk files don't follow the default
##         naming convention where chromosome and chunk are specified as
##         in "chr3_7".
##       - You have to specify a custom `pattern' because you need to
##         restrict the set of chunk files to those that contain only
##         male subjects.
snptest(
    indir          = "path/to/directory/with/gz-compressed/input/files",
    sample_file    = "path/to/MEN_ONLY_phenotype_file",
    outdir         = "path/to/directory/for/snptest/output/files",
    pheno          = "blood_sugar",
    covs           = c("age", "bmi"),
    exclusion_file = "path/to/file/with/exclusion/ids",
    add_args       = c("-missing_code NA",
                       "-frequentist 1",
                       "-method expected",
                       "-hwe",
                       "-printids",
                       "-use_raw_covariates",
                       "-use_raw_phenotypes",
                       "-assume_chromosome X"),
    ncore          = 9L,
    pattern        = "males.*\\.gz$",
    chr_chunk      = ".*chr([^_]+)_(nonPAR_\\d+)",
    executable     = "/opt/snptest_v2.5_linux_x86_64_static/snptest_v2.5")

## End(Not run)

cbaumbach/genFun documentation built on May 13, 2019, 1:47 p.m.