sqtl.seeker.p: Permuted sQTL seeker


View source: R/sqtl.seeker.p.R

Description

sqtl.seeker.p performs the same test as sqtl.seeker between SNPs and relative transcript expression values, but after permuting the individual labels (see Details).

Usage

sqtl.seeker.p(
  tre.df,
  genotype.f,
  gene.loc,
  covariates = NULL,
  genic.window = 5000,
  nb.perm.min = 100,
  nb.perm.max = 1000,
  min.nb.ext.scores = 100,
  min.nb.ind.geno = 10,
  verbose = FALSE
)

Arguments

tre.df

a data.frame with transcript relative expression produced by prepare.trans.exp. Same as in sqtl.seeker.

genotype.f

the name of the genotype file. This file needs to be ordered by position, compressed and indexed using index.genotype or externally using tabix (samtools). Must have column 'snpId'. Same as in sqtl.seeker.

gene.loc

a data.frame with the gene locations. Columns 'chr', 'start', 'end' and 'geneId' are required. Same as in sqtl.seeker.

covariates

a data.frame with covariate information per sample (samples x covariates). Rownames should be the sample ids. Covariates can be either numeric or factor. When provided, they are regressed out before testing the genotype effect. Default is NULL.

genic.window

the window (in bp) around the gene in which the SNPs are tested. Default is 5000 (i.e. 5 kb).

nb.perm.min

the minimum number of permutations. Default is 100.

nb.perm.max

the maximum number of permutations. Default is 1000.

min.nb.ext.scores

the minimum number of permuted nominal P-values lower than the lowest observed nominal P-value that is required before the permutations can stop. Default is 100.

min.nb.ind.geno

SNPs with fewer samples than min.nb.ind.geno in any genotype group are filtered out. Default is 10.

verbose

Default is FALSE.
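
The min.nb.ind.geno rule described above can be illustrated with a small base R sketch. The helper passes.geno.filter is hypothetical and is not part of sQTLseekeR2; it assumes genotypes are coded 0/1/2 and that only observed genotype groups are counted, which may differ from the package's internal filtering.

```r
# Hypothetical helper (not a package function): TRUE if every observed
# genotype group contains at least min.nb.ind.geno samples.
passes.geno.filter <- function(geno, min.nb.ind.geno = 10) {
  counts <- table(geno[!is.na(geno)])   # samples per observed genotype group
  all(counts >= min.nb.ind.geno)
}

# Toy genotype vectors, coded as the number of alternative alleles.
geno.ok  <- rep(c(0, 1, 2), times = c(40, 30, 12))  # smallest group: 12
geno.bad <- rep(c(0, 1, 2), times = c(60, 20, 3))   # smallest group: 3

keep.ok  <- passes.geno.filter(geno.ok)
keep.bad <- passes.geno.filter(geno.bad)
```

With the default threshold of 10, the first SNP would be kept and the second filtered out, because one of its genotype groups has only 3 samples.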

Details

sqtl.seeker.p implements an adaptive permutation procedure to control for multiple testing within each gene (multiple genetic variants are tested per gene; see also sqtls.p). The outcome of the permutations is then modeled using beta distributions, as in FastQTL (Ongen et al., 2015), which allows an adjusted empirical P-value to be computed per gene.
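
The following base R sketch illustrates the general idea of an adaptive permutation scheme followed by a beta approximation. It is not the package's implementation: the functions min.nominal.pv and adaptive.perm are hypothetical, the phenotype is a toy univariate vector tested with cor.test rather than the multivariate test on relative transcript expression, the stopping rule is a plausible reading of nb.perm.min, nb.perm.max and min.nb.ext.scores, and the beta parameters are fitted by the method of moments as a simplification.

```r
set.seed(1)

n     <- 100                                  # number of samples
n.snp <- 5                                    # variants in the cis window
geno  <- matrix(rbinom(n * n.snp, 2, 0.3), nrow = n)
pheno <- rnorm(n)                             # toy phenotype with no true effect

# Smallest nominal P-value across all variants for a phenotype vector.
min.nominal.pv <- function(y, g) {
  min(apply(g, 2, function(x) cor.test(x, y)$p.value))
}

adaptive.perm <- function(y, g, nb.perm.min = 100, nb.perm.max = 1000,
                          min.nb.ext.scores = 100) {
  obs     <- min.nominal.pv(y, g)             # best observed nominal P-value
  perm.pv <- numeric(0)
  repeat {
    # Permute the individual labels and record the best permuted P-value.
    perm.pv  <- c(perm.pv, min.nominal.pv(sample(y), g))
    nb.perms <- length(perm.pv)
    nb.ext   <- sum(perm.pv <= obs)           # permuted scores at least as extreme
    if (nb.perms >= nb.perm.max) break
    if (nb.perms >= nb.perm.min && nb.ext >= min.nb.ext.scores) break
  }
  # Method-of-moments beta fit to the permuted minimum P-values
  # (an assumption of this sketch, chosen for brevity).
  m <- mean(perm.pv)
  v <- var(perm.pv)
  k <- m * (1 - m) / v - 1
  data.frame(best.nominal.pv = obs,
             shape1      = m * k,
             shape2      = (1 - m) * k,
             nb.perms    = nb.perms,
             pv.emp      = (nb.ext + 1) / (nb.perms + 1),
             pv.emp.beta = pbeta(obs, m * k, (1 - m) * k))
}

res <- adaptive.perm(pheno, geno, nb.perm.min = 50, nb.perm.max = 200,
                     min.nb.ext.scores = 20)
```

In this sketch a gene with no association accumulates extreme permuted scores quickly and stops near nb.perm.min, whereas a strongly associated gene runs up to nb.perm.max, where the beta approximation gives a finer-grained P-value than the raw permutation count.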

Value

A data.frame with columns:

geneId

the gene name.

variants.cis

the number of variants tested in cis.

LD

a linkage disequilibrium estimate for the genomic window (median r²).

best.snp

ID of the SNP with the smallest observed nominal P-value.

best.nominal.pv

P-value corresponding to the best SNP.

shape1

Beta distribution parameter shape1.

shape2

Beta distribution parameter shape2.

nb.perms

the number of permutations used for the empirical P-value computation.

pv.emp

empirical P-value based on permutations.

pv.emp.beta

empirical P-value based on the beta approximation.

runtime

approximate computation time per gene.
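
Because the returned table has one adjusted P-value per gene, a downstream step usually corrects across genes. The sketch below applies a plain Benjamini-Hochberg correction with base R's p.adjust to a mock data.frame that mimics two of the columns listed above. The values are invented for illustration, the fdr column is a hypothetical name, and this is not a reproduction of sqtls.p, whose interface is not described here.

```r
# Mock output with the documented column names; values are made up.
res <- data.frame(
  geneId      = paste0("gene", 1:5),
  pv.emp.beta = c(0.0004, 0.02, 0.2, 0.008, 0.6),
  stringsAsFactors = FALSE
)

# Benjamini-Hochberg correction across genes.
res$fdr <- p.adjust(res$pv.emp.beta, method = "BH")

# Genes passing a 5% FDR threshold (threshold chosen for illustration).
sig <- res$geneId[res$fdr <= 0.05]
```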

Author(s)

Diego Garrido-Martín


guigolab/sQTLseekeR2 documentation built on Nov. 20, 2021, 3:21 a.m.