Run the VC-C2 method on a genomic region defined by a start and a stop base pair coordinate

Description

Runs the VC-C2 method on a given genomic region

Usage

VCC2.region(y = NULL, X = NULL, Phi = NULL, type = "bed",
  filename = NULL, map = NULL, chr = 0, startpos = 0, endpos = 0,
  regionname = NULL, U = NULL, S = NULL, RH.Null = NULL,
  weights = NULL, Nperm = 100, Ncores = 1)

Arguments

y

vector of phenotype data (one entry per individual), of length n.

X

matrix of covariates including intercept (dimension: n \times p, with p the number of covariates)

Phi

Relationship matrix (i.e. twice the kinship matrix); an n \times n square symmetric positive-definite matrix.
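As an illustration of what a valid Phi looks like, the following minimal sketch builds the relationship matrix for a toy trio (father, mother, child) and checks the stated properties. The kinship values assume unrelated, non-inbred parents; the pedigree and the object names are made up for this example and are not part of the package.

```r
# Toy kinship matrix for a father, a mother and their child
# (assumption: parents are unrelated and not inbred).
ids <- c("father", "mother", "child")
K <- matrix(c(0.50, 0.00, 0.25,
              0.00, 0.50, 0.25,
              0.25, 0.25, 0.50),
            nrow = 3, byrow = TRUE,
            dimnames = list(ids, ids))

# The relationship matrix is twice the kinship matrix.
Phi <- 2 * K

# Check symmetry and positive-definiteness (all eigenvalues > 0).
is_symmetric <- isSymmetric(Phi)
min_eigenvalue <- min(eigen(Phi, symmetric = TRUE, only.values = TRUE)$values)
is_positive_definite <- min_eigenvalue > 0
```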

type

character, format of the input file containing haplotype data: 'ped', 'bed' (default) or 'shapeit-haps'

filename

character, path to input file containing haplotype data

map

data.frame with 3 columns: rsID, chromosome, and position in bp, as output by e.g. readMapFile.

chr

character, chromosome number (typically from 1 to 22, as used by Plink) on which the region of interest is located

startpos

numeric, start position (in bp, base pairs) of the region of interest (default: 0)

endpos

numeric, end position (in bp, base pairs) of the region of interest (default: 0)
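To make the role of chr, startpos and endpos concrete, here is a rough sketch of how a region could be selected from a map data frame in base R. The marker names, positions and column names are invented for illustration, and treating both boundaries as inclusive is an assumption; this page does not state how the function handles the boundaries internally.

```r
# Hypothetical map with the three columns described under 'map'.
map <- data.frame(
  rsID       = c("rs1", "rs2", "rs3", "rs4", "rs5"),
  chromosome = c(1, 1, 1, 2, 2),
  position   = c(1000, 2500, 4000, 1500, 3000),
  stringsAsFactors = FALSE
)

chr      <- 1
startpos <- 2000
endpos   <- 5000

# Keep markers on the chosen chromosome whose position lies in
# [startpos, endpos] (inclusive boundaries are an assumption).
in_region <- map$chromosome == chr &
  map$position >= startpos &
  map$position <= endpos
region_markers <- map[in_region, ]
```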

regionname

(character) Name of the region/gene on which you are running the association test. This name is used in the output of this function and can be used to distinguish different regions if this function is run multiple times.

U

(optional) Matrix of eigenvectors of the relationship matrix, obtained from its spectral decomposition: Φ = U S U^T. If this parameter is not given, it will be computed. When running this function for many regions, time can be saved by specifying not only Phi, but also S and U.

S

(optional) Matrix of eigenvalues of the relationship matrix, obtained from its spectral decomposition: Φ = U S U^T. If this parameter is not given, it will be computed. When running this function for many regions, time can be saved by specifying not only Phi, but also S and U.
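One way to precompute U and S once and reuse them across regions is base R's eigen(). The sketch below uses a small made-up relationship matrix and assumes that S is expected as a diagonal matrix of eigenvalues, following the wording "matrix of eigenvalues" above; check the package itself if a different form is required.

```r
# Toy relationship matrix (illustrative values only).
Phi <- matrix(c(1.0, 0.0, 0.5,
                0.0, 1.0, 0.5,
                0.5, 0.5, 1.0),
              nrow = 3, byrow = TRUE)

# Spectral decomposition of a symmetric matrix.
eig <- eigen(Phi, symmetric = TRUE)
U <- eig$vectors        # columns are eigenvectors
S <- diag(eig$values)   # assumption: eigenvalues on the diagonal of a matrix

# Sanity check: Phi should be recovered as U S U^T up to rounding error.
Phi_rebuilt <- U %*% S %*% t(U)
max_abs_error <- max(abs(Phi - Phi_rebuilt))
```

The resulting U and S could then be passed to every call alongside Phi, so the decomposition is not repeated for each region.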

RH.Null

(optional) output of the Estim.H0.VCC function. In practice, the null model does not need to be estimated for every region; one estimation per trait is enough.

weights

optional numeric vector of genotype weights. If this option is not specified, the beta distribution is used for weighting the variants, with each weight given by w_i = dbeta(f_i, 1, 25)^2, with f_i the minor allele frequency (MAF) of variant i. This default is the same as the one used by the SKAT package. The vector is used as the diagonal of the m \times m matrix W, with m the number of variants.
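The default weighting formula can be reproduced directly in base R. The sketch below uses made-up minor allele frequencies to show that rarer variants receive larger weights; the object names are illustrative only.

```r
# Illustrative minor allele frequencies for m = 4 variants.
maf <- c(0.001, 0.01, 0.05, 0.2)

# Default weights as described above: w_i = dbeta(f_i, 1, 25)^2.
weights <- dbeta(maf, 1, 25)^2

# The weights form the diagonal of the m x m matrix W.
W <- diag(weights)
```

A vector computed this way (or with different beta parameters) could be supplied through the weights argument to override the default.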

Nperm

Integer, number of permutations to use for empirical p-value estimation (default: 100).
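The number of permutations limits how small an empirical p-value can get. The sketch below uses a common add-one estimator to show this; the exact estimator used by this function is not stated on this page, so the formula, the helper name empirical_p and the simulated statistics are assumptions for illustration.

```r
# Hypothetical helper: add-one empirical p-value from permutation statistics.
empirical_p <- function(observed, permuted) {
  (1 + sum(permuted >= observed)) / (1 + length(permuted))
}

set.seed(1)
Nperm <- 100
# Stand-in for the test statistics obtained under permutation.
permuted_scores <- rchisq(Nperm, df = 1)

# Even an observed statistic larger than every permuted one cannot
# yield a p-value below 1 / (Nperm + 1) with this estimator.
smallest_possible_p <- empirical_p(Inf, permuted_scores)
```

With this estimator and the default Nperm = 100, no p-value can fall below 1/101, so a larger Nperm is needed to resolve smaller p-values.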

Ncores

(integer) Number of processor (CPU) cores to be used in parallel when doing the permutations to determine the p-value (default: 1).

Value

A data frame containing the results of the association test. The data frame contains the following columns:

  • Score.Test: the score of the given association test

  • P.value: the p-value of the association test

  • N.Markers: the number of markers in the region

  • regionname: Name of the region/gene on which you are running the association test

Author(s)

Lennart C. Karssen
