forward_test: Forward inclusion tests with latent factor mixed models

View source: R/lfmm.R

Description

This function tests for association between each column of the response matrix, Y, and the explanatory variables, X, by recursively conditioning on the top hits in the set of explanatory variables. The conditional tests are based on LFMMs with ridge penalty.

Usage

forward_test(Y, X, K, niter = 5, scale = FALSE, candidate.list = NULL,
  rev.confounder = TRUE, lambda = 1e-05)

Arguments

Y

a response variable matrix with n rows and p columns. Each column is a response variable (numeric).

X

an explanatory variable matrix with n rows and d = 1 column (e.g., a phenotype).

K

an integer for the number of latent factors in the regression model.

niter

an integer value for the number of forward inclusion tests.

scale

a boolean value, TRUE if the explanatory variable, X, is scaled (recommended option).

candidate.list

a vector of integers corresponding to response variables (columns in Y) that are known candidates for association. If NULL, a list of candidates is built during the algorithm run.

rev.confounder

a boolean value. If TRUE, confounders are re-evaluated in each conditional test, which may take some time (default = TRUE).

lambda

a numeric value for the regularization parameter.

Details

The response variable matrix Y and the explanatory variable are centered.
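
The forward inclusion idea can be illustrated with a minimal base R sketch. It is a simplified stand-in, not the package's implementation: it uses ordinary least squares via lm() instead of an LFMM with ridge penalty, it has no latent factors, and the simulated data, the helper name forward_sketch, and the use of log10 p-values are all assumptions made for illustration. At each iteration, every remaining column of Y is tested against X while conditioning on the columns already selected as top hits, and the column with the smallest p-value joins the candidate list.

```r
## Simplified stand-in for forward inclusion tests:
## ordinary least squares, no latent factors, no ridge penalty.
set.seed(1)
n <- 100
p <- 30
X <- matrix(rnorm(n), n, 1)
Y <- matrix(rnorm(n * p), n, p)
Y[, 5]  <- Y[, 5]  + 2.0 * X[, 1]   # simulated strong association
Y[, 12] <- Y[, 12] + 1.5 * X[, 1]   # simulated weaker association

## Center the response matrix and the explanatory variable
Yc <- scale(Y, center = TRUE, scale = FALSE)
Xc <- scale(X, center = TRUE, scale = FALSE)

## forward_sketch is a hypothetical helper, not part of the lfmm package
forward_sketch <- function(Y, X, niter = 2) {
  candidates <- integer(0)
  log10.p <- numeric(0)
  for (it in seq_len(niter)) {
    remaining <- setdiff(seq_len(ncol(Y)), candidates)
    pv <- sapply(remaining, function(j) {
      ## covariates: X first, then the columns of Y selected so far
      covar <- cbind(x = X[, 1], Y[, candidates, drop = FALSE])
      fit <- lm(Y[, j] ~ covar)
      ## row 1 is the intercept, row 2 is the coefficient of X
      summary(fit)$coefficients[2, 4]
    })
    best <- which.min(pv)
    candidates <- c(candidates, remaining[best])
    log10.p <- c(log10.p, log10(pv[best]))
  }
  list(candidates = candidates, log10.p = log10.p)
}

res <- forward_sketch(Yc, Xc, niter = 2)
res$candidates
res$log10.p
```

In this sketch the first top hit is the strongly associated column, and the second iteration tests the remaining columns conditionally on it. The real function additionally accounts for K latent factors, which this illustration leaves out.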

Value

a list with the following attributes:

Author(s)

cayek, francoio

Examples

library(lfmm)
data("example.data")
Y <- example.data$genotype
X <- example.data$phenotype #scaled variable

## fits an LFMM, i.e., computes B, U, V:
mod.lfmm <- lfmm_ridge(Y = Y,
                       X = X, 
                       K = 6)
                       
## performs initial association testing using the fitted model:
pv <- lfmm_test(Y = Y, 
                X = X,
                lfmm = mod.lfmm,
                calibrate = "gif")
## Manhattan plot 
plot(-log10(pv$calibrated.pvalue), 
      pch = 19, 
      cex = .2,
      col = "grey")
      
## Start forward tests (3 iterations)       
obj <- forward_test(Y, 
                    X, 
                    K = 6, 
                    niter = 3, 
                    scale = TRUE)

## Record log p-values for the 3 top hits
log.p <- obj$log.p
log.p

## Check which top hits coincide exactly with a causal SNP (labelled from 1 to 20)
obj$candidate %in% example.data$causal.set

## Check for candidates within a distance of 20 SNPs (about 10kb) of a causal SNP
theta <- 20
## For each top hit, index (1-20) of the causal SNP closer than theta
hit.3 <- as.numeric(
  apply(sapply(obj$candidate,
               function(x) abs(x - example.data$causal.set) < theta),
        2,
        which))
## Number of hits for each causal SNP (1-20)
table(hit.3)


## Continue forward tests (2 additional iterations)       
obj <- forward_test(Y, 
                    X, 
                    K = 6, 
                    niter = 2,
                    candidate.list = obj$candidates,
                    scale = TRUE)

## Record log p-values for all 5 top hits
log.p <- c(log.p, obj$log.p)
log.p

## Check which top hits coincide exactly with a causal SNP (labelled from 1 to 20)
obj$candidate %in% example.data$causal.set

## Check for candidates within a distance of 5 SNPs (about 2.5kb) of a causal SNP
theta <- 5
## For each top hit, index (1-20) of the causal SNP closer than theta
hit.5 <- as.numeric(
  apply(sapply(obj$candidate,
               function(x) abs(x - example.data$causal.set) < theta),
        2,
        which))
## Number of hits for each causal SNP (1-20)
table(hit.5)

## Plot log P
plot(log.p, xlab = "Conditional test iteration", ylab="Top hit log(p)")
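
The hit.3 and hit.5 computations above rely on every top hit lying within theta of exactly one causal SNP; when a top hit has no match or several matches, apply() may not return a plain vector and the conversion with as.numeric() can fail. A possible sketch that tolerates those cases is given below. The helper nearest_causal and the toy positions are hypothetical and only serve as an illustration; they are not part of the package or of example.data.

```r
## nearest_causal is a hypothetical helper, not part of the lfmm package.
## For each candidate, return the index of the nearest causal SNP if it is
## closer than theta, and NA otherwise.
nearest_causal <- function(candidates, causal.set, theta) {
  sapply(candidates, function(x) {
    d <- abs(x - causal.set)
    i <- which.min(d)            # first nearest causal SNP
    if (d[i] < theta) i else NA_integer_
  })
}

## Toy data (assumed values, for illustration only)
causal.toy <- c(100, 200, 300)
cand.toy <- c(103, 250, 298)

hit <- nearest_causal(cand.toy, causal.toy, theta = 5)
hit
## Number of hits for each causal SNP, counting unmatched candidates as NA
table(hit, useNA = "ifany")
```

With real results, the same call could be written as nearest_causal(obj$candidate, example.data$causal.set, theta), assuming the candidate and causal SNP indices refer to columns of Y as in the examples above.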

cayek/MatrixFactorizationR documentation built on June 17, 2020, 4:39 p.m.