View source: R/SAIGE_SPATest.R
Run single variant or gene- or region-based score tests with SPA based on the linear/logistic mixed model.
SPAGMMATtest(
bgenFile = "",
bgenFileIndex = "",
vcfFile = "",
vcfFileIndex = "",
vcfField = "DS",
savFile = "",
savFileIndex = "",
sampleFile = "",
idstoExcludeFile = "",
idstoIncludeFile = "",
rangestoExcludeFile = "",
rangestoIncludeFile = "",
chrom = "",
start = 1,
end = 2.5e+08,
IsDropMissingDosages = FALSE,
minMAC = 0.5,
minMAF = 0,
maxMAFforGroupTest = 0.5,
minInfo = 0,
GMMATmodelFile = "",
varianceRatioFile = "",
SPAcutoff = 2,
SAIGEOutputFile = "",
numLinesOutput = 10000,
IsSparse = TRUE,
IsOutputAFinCaseCtrl = FALSE,
IsOutputNinCaseCtrl = FALSE,
LOCO = FALSE,
condition = "",
sparseSigmaFile = "",
groupFile = "",
kernel = "linear.weighted",
method = "optimal.adj",
weights.beta.rare = c(1, 25),
weights.beta.common = c(1, 25),
weightMAFcutoff = 0.01,
weightsIncludeinGroupFile = FALSE,
weights_for_G2_cond = NULL,
r.corr = 0,
IsSingleVarinGroupTest = TRUE,
cateVarRatioMinMACVecExclude = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 10.5, 20.5),
cateVarRatioMaxMACVecInclude = c(1.5, 2.5, 3.5, 4.5, 5.5, 10.5, 20.5),
dosageZerodCutoff = 0.2,
IsOutputPvalueNAinGroupTestforBinary = FALSE,
IsAccountforCasecontrolImbalanceinGroupTest = TRUE,
IsOutputBETASEinBurdenTest = FALSE
)
|
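A minimal sketch of a single-variant call, assuming the SAIGE package is installed and that a bgen file and the outputs of the previous step exist. Every file name below is a hypothetical placeholder, and `minMAC = 20` is only an illustrative choice. The call is guarded, so the snippet does nothing when SAIGE or any input file is missing.

```r
# Hypothetical paths; replace them with your own files.
args <- list(
  bgenFile          = "genotypes.bgen",           # assumed file name
  bgenFileIndex     = "genotypes.bgen.bgi",       # assumed file name
  sampleFile        = "samples.txt",              # one sample ID per line, no header
  GMMATmodelFile    = "step1.rda",                # model file from the previous step
  varianceRatioFile = "step1.varianceRatio.txt",  # variance ratio from the previous step
  SAIGEOutputFile   = "step2.single_variant.txt", # where the results are written
  minMAC            = 20,                         # illustrative threshold
  LOCO              = FALSE
)

inputs <- unlist(args[c("bgenFile", "bgenFileIndex", "sampleFile",
                        "GMMATmodelFile", "varianceRatioFile")])

# Run only when the package and every input file are available.
can_run <- requireNamespace("SAIGE", quietly = TRUE) && all(file.exists(inputs))
if (can_run) {
  do.call(SAIGE::SPAGMMATtest, args)
} else {
  message("Skipping: SAIGE or some input files are not available.")
}
```

Collecting the arguments in a list keeps the file paths in one place and makes it easy to reuse the same settings across chromosomes by changing only a few entries.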
bgenFile |
character. Path to bgen file. Currently, version 1.2 with 8-bit compression is supported |
bgenFileIndex |
character. Path to the .bgi file (index of the bgen file) |
vcfFile |
character. Path to vcf file |
vcfFileIndex |
character. Path to the tabix index of the vcf file (".tbi"), created with "tabix -p vcf file.vcf.gz" |
vcfField |
character. genotype field in vcf file to use. "DS" for dosages or "GT" for genotypes. By default, "DS". |
savFile |
character. Path to sav file |
savFileIndex |
character. Path to the index of the sav file (".s1r") |
sampleFile |
character. Path to the file that contains one column for IDs of samples in the dosage, vcf, sav, or bgen file with NO header |
idstoExcludeFile |
character. Path to the file containing variant ids to be excluded from the bgen file. The file does not have a header and each line is for a marker ID. |
idstoIncludeFile |
character. Path to the file containing variant ids to be included from the bgen file. The file does not have a header and each line is for a marker ID. |
rangestoExcludeFile |
character. Path to the file containing genome regions to be excluded from the bgen file. The file contains three columns for chromosome, start, and end respectively with no header |
rangestoIncludeFile |
character. Path to the file containing genome regions to be included from the bgen file. The file contains three columns for chromosome, start, and end respectively with no header |
chrom |
character. string for the chromosome to include from vcf file. Required for vcf file. Note: the string needs to exactly match the chromosome string in the vcf/sav file. For example, "1" does not match "chr1". If LOCO is specified, providing chrom will save computation cost |
start |
numeric. start genome position to include from vcf file. By default, 1 |
end |
numeric. end genome position to include from vcf file. By default, 250000000 |
IsDropMissingDosages |
logical. whether to drop missing dosages (TRUE) or to mean impute missing dosages (FALSE). By default, FALSE. This option only works for bgen, vcf, and sav input. |
minMAC |
numeric. Minimum minor allele count of markers to test. By default, 0.5. The higher threshold between minMAC and minMAF will be used |
minMAF |
numeric. Minimum minor allele frequency of markers to test. By default 0. The higher threshold between minMAC and minMAF will be used |
maxMAFforGroupTest |
numeric. Maximum minor allele frequency of markers to test in group test. By default 0.5. |
minInfo |
numeric. Minimum imputation info of markers to test. By default, 0. This option only works for bgen, vcf, and sav input |
GMMATmodelFile |
character. Path to the input file containing the glmm model, which is output from the previous step. Will be used by load() |
varianceRatioFile |
character. Path to the input file containing the variance ratio, which is output from the previous step |
SPAcutoff |
numeric. By default, 2 (the SPA test is used when the p-value is < 0.05 under the normal approximation) |
SAIGEOutputFile |
character. Path to the output file containing assoc test results |
numLinesOutput |
numeric. Number of markers to be output each time. By default, 10000 |
IsSparse |
logical. Whether to exploit the sparsity of the genotype vector for less frequent variants to speed up the SPA tests or not for dichotomous traits. By default, TRUE |
IsOutputAFinCaseCtrl |
logical. Whether to output allele frequency in cases and controls. By default, FALSE |
IsOutputNinCaseCtrl |
logical. Whether to output sample sizes in cases and controls. By default, FALSE |
LOCO |
logical. Whether to apply the leave-one-chromosome-out option. By default, FALSE |
condition |
character. For conditional analysis. Genetic marker ids (chr:pos_ref/alt for sav/vcf dosage input, marker id for bgen input) separated by commas, e.g. chr3:101651171_C/T,chr3:101651186_G/A. Note that conditional analysis is currently only available for bgen, vcf, and sav input. |
sparseSigmaFile |
character. Path to the file containing the sparseSigma from step 1. The suffix of this file is ".mtx". |
groupFile |
character. Path to the file containing the group information for gene-based tests. Each line is for one gene/set of variants. The first element is the gene/set name. The rest of the line lists the variant ids included in this gene/set. For vcf/sav, the genetic marker ids are in the format chr:pos_ref/alt. For bgen, the genetic marker ids should match the ids in the bgen file. Elements in a line are separated by tabs. |
kernel |
character. For gene-based test. By default, "linear.weighted". More options can be seen in the SKAT library |
method |
character. method for gene-based test p-values. By default, "optimal.adj". More options can be seen in the SKAT library |
weights.beta.rare |
vector of numeric. Parameters for the beta distribution used to weight genetic markers with MAF <= weightMAFcutoff in gene-based tests. By default, c(1,25). More options can be seen in the SKAT library |
weights.beta.common |
vector of numeric. Parameters for the beta distribution used to weight genetic markers with MAF > weightMAFcutoff in gene-based tests. By default, c(1,25). More options can be seen in the SKAT library. NOTE: this argument is not fully developed. Currently, weights.beta.common is equal to weights.beta.rare |
weightMAFcutoff |
numeric. Between 0 and 0.5. See the descriptions of weights.beta.rare and weights.beta.common above. By default, 0.01 |
weightsIncludeinGroupFile |
logical. Whether to specify customized weights for markers in gene- or region-based tests. If TRUE, weights are included in the group file. For vcf/sav, the genetic marker ids and weights are in the format chr:pos_ref/alt;weight. For bgen, the genetic marker ids should match the ids in the bgen file, e.g. SNPID;weight. Elements in a line are separated by tabs. By default, FALSE |
weights_for_G2_cond |
vector of float. Weights for the conditioning markers in gene- or region-based tests, delimited by commas. The length equals the number of conditioning markers. By default, NULL |
r.corr |
numeric. Between 0 and 1. Parameter for gene-based tests. By default, 0. More options can be seen in the SKAT library |
IsSingleVarinGroupTest |
logical. Whether to perform single-variant assoc tests for genetic markers included in the gene-based tests. By default, TRUE |
cateVarRatioMinMACVecExclude |
vector of float. Lower bounds of MAC for the MAC categories. The length equals the number of MAC categories for variance ratio estimation. By default, c(0.5,1.5,2.5,3.5,4.5,5.5,10.5,20.5). If groupFile="", only one variance ratio corresponding to MAC >= 20 is used |
cateVarRatioMaxMACVecInclude |
vector of float. Upper bounds of MAC for the MAC categories. The length equals the number of MAC categories for variance ratio estimation minus 1. By default, c(1.5,2.5,3.5,4.5,5.5,10.5,20.5). If groupFile="", only one variance ratio corresponding to MAC >= 20 is used |
dosageZerodCutoff |
numeric. In gene- or region-based tests, for each variant with MAC <= 10, dosages <= dosageZerodCutoff will be set to 0. By default, 0.2. |
IsOutputPvalueNAinGroupTestforBinary |
logical. In gene- or region-based tests for binary traits. If IsOutputPvalueNAinGroupTestforBinary is TRUE, p-values without accounting for case-control imbalance will be output. By default, FALSE |
IsAccountforCasecontrolImbalanceinGroupTest |
logical. In gene- or region-based tests for binary traits. If IsAccountforCasecontrolImbalanceinGroupTest is TRUE, p-values after accounting for case-control imbalance will be output. By default, TRUE |
IsOutputBETASEinBurdenTest |
logical. Output effect size (BETA and SE) for burden tests. By default, FALSE |
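The group file layout and the beta weights described above can be sketched in base R. The gene names, variant ids, and MAFs below are made up for illustration, and the weight formula (the beta density evaluated at the MAF, following the SKAT convention) is an assumption rather than something this page states.

```r
# Hypothetical genes and variant ids in chr:pos_ref/alt format (vcf/sav input).
groups <- list(
  GENE1 = c("chr1:1000_A/G", "chr1:1050_C/T", "chr1:1100_G/A"),
  GENE2 = c("chr1:5000_T/C", "chr1:5020_G/C")
)

# One line per gene: set name first, then variant ids, all tab-separated.
group_lines <- vapply(names(groups), function(g) {
  paste(c(g, groups[[g]]), collapse = "\t")
}, character(1))

# Write to a temporary file; this path would be passed as groupFile.
group_file <- tempfile(fileext = ".txt")
writeLines(group_lines, group_file)

# Beta-density weights for made-up MAFs, using the default parameters c(1, 25).
# Assumption: weight = dbeta(MAF, a, b), following the SKAT convention.
maf <- c(0.001, 0.005, 0.01)
weights.beta.rare <- c(1, 25)
w <- dbeta(maf, weights.beta.rare[1], weights.beta.rare[2])

# The same gene with custom weights appended as id;weight, the layout
# described for weightsIncludeinGroupFile = TRUE.
weighted_line <- paste(
  c("GENE1", paste(groups$GENE1, round(w, 3), sep = ";")),
  collapse = "\t"
)
```

With parameters c(1, 25) the beta density decreases as MAF grows, so rarer variants receive larger weights.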
SAIGEOutputFile