lfmm_ridge: LFMM least-squares estimates with ridge penalty


View source: R/lfmm.R

Description

This function computes regularized least squares estimates for latent factor mixed models using a ridge penalty.

Usage

lfmm_ridge(Y, X, K, lambda = 1e-05, algorithm = "analytical",
  it.max = 100, relative.err.min = 1e-06)

Arguments

Y

a response variable matrix with n rows and p columns. Each column corresponds to a distinct response variable (e.g., SNP genotype, gene expression level, beta-normalized methylation profile, etc.). Response variables must be encoded as numeric.

X

an explanatory variable matrix with n rows and d columns. Each column corresponds to a distinct explanatory variable (e.g., phenotype). Explanatory variables must be encoded as numeric.

K

an integer for the number of latent factors in the regression model.

lambda

a numeric value for the regularization parameter.

algorithm

the algorithm used to compute the estimates: either the exact (analytical) algorithm or a numerical algorithm. The exact algorithm is based on the global minimum of the loss function and is very fast. The numerical algorithm converges toward a local minimum of the loss function. The exact method should be preferred; the numerical method is intended for very large n.

it.max

an integer value for the maximum number of iterations of the numerical algorithm.

relative.err.min

a numeric value for the relative convergence error, used to test whether the numerical algorithm has converged (numerical algorithm only).

Details

The algorithm minimizes the following penalized least-squares criterion

L(U, V, B) = \frac{1}{2} \| Y - U V^{T} - X B^{T} \|_{F}^{2} + \frac{\lambda}{2} \| B \|_{2}^{2} ,

where Y is a response data matrix, X contains all explanatory variables, U denotes the score matrix, V is the loading matrix, B is the (direct) effect size matrix, and lambda is a regularization parameter.

The response variable matrix Y and the explanatory variable matrix X are centered.
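
To make the criterion concrete, the following base R sketch evaluates L(U, V, B) and runs a simple alternating minimization with an iteration cap and a relative-error stopping rule. It is an illustration only, not the implementation used by lfmm_ridge: the helper names ridge_loss and alt_min_sketch are hypothetical, the penalty on B is assumed to be the sum of squared entries, and the stopping rule (relative change in the loss) is an assumption about how it.max and relative.err.min might be used.

```r
# Hypothetical helper (not part of lfmm): value of the penalized criterion.
# Assumes the penalty on B is the sum of its squared entries.
ridge_loss <- function(Y, X, U, V, B, lambda) {
  R <- Y - U %*% t(V) - X %*% t(B)          # n x p residual matrix
  0.5 * sum(R^2) + 0.5 * lambda * sum(B^2)
}

# Hypothetical sketch: alternate between a ridge update of B and a
# rank-K SVD update of U V^T. Not the algorithm shipped in lfmm.
alt_min_sketch <- function(Y, X, K, lambda = 1e-5,
                           it.max = 100, relative.err.min = 1e-6) {
  # Center both matrices, as stated in Details.
  Y <- scale(Y, center = TRUE, scale = FALSE)
  X <- scale(X, center = TRUE, scale = FALSE)
  n <- nrow(Y); p <- ncol(Y); d <- ncol(X)

  U <- matrix(0, n, K)
  V <- matrix(0, p, K)
  B <- matrix(0, p, d)

  # (X^T X + lambda I)^{-1} X^T, a d x n matrix reused at every iteration.
  ridge <- solve(crossprod(X) + lambda * diag(d), t(X))

  losses <- numeric(0)
  prev <- ridge_loss(Y, X, U, V, B, lambda)
  for (it in seq_len(it.max)) {
    # B step: ridge regression of (Y - U V^T) on X.
    B <- t(ridge %*% (Y - U %*% t(V)))       # p x d

    # (U, V) step: best rank-K approximation of Y - X B^T.
    s <- svd(Y - X %*% t(B), nu = K, nv = K)
    U <- s$u %*% diag(s$d[1:K], K, K)        # n x K scores
    V <- s$v                                 # p x K loadings

    cur <- ridge_loss(Y, X, U, V, B, lambda)
    losses <- c(losses, cur)

    # Stop when the relative change in the loss is small enough.
    if (abs(prev - cur) / max(prev, .Machine$double.eps) < relative.err.min) break
    prev <- cur
  }
  list(U = U, V = V, B = B, losses = losses, iterations = it)
}

# Usage on simulated data (n = 20 individuals, p = 30 responses, d = 1).
set.seed(1)
Y <- matrix(rnorm(20 * 30), 20, 30)
X <- matrix(rnorm(20), 20, 1)
fit <- alt_min_sketch(Y, X, K = 2)
dim(fit$B)   # p x d
```

Each step minimizes the criterion over one block of parameters with the other held fixed, so the recorded losses never increase; this kind of scheme reaches a stationary point rather than a guaranteed global minimum, which is consistent with the recommendation to prefer the exact algorithm.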

Value

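an object of class lfmm with the following attributes:

U: the latent variable score matrix, with n rows and K columns.

V: the latent variable loading matrix, with p rows and K columns.

B: the effect size matrix, with p rows and d columns.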

Author(s)

cayek, francoio

Examples

library(lfmm)

## a GWAS example with Y = SNPs and X = phenotype
data(example.data)
Y <- example.data$genotype
X <- example.data$phenotype

## Fit an LFMM with K = 6 factors
mod.lfmm <- lfmm_ridge(Y = Y, 
                       X = X, 
                       K = 6)

## Perform association testing using the fitted model:
pv <- lfmm_test(Y = Y, 
                X = X, 
                lfmm = mod.lfmm, 
                calibrate = "gif")

## Manhattan plot with causal loci shown

pvalues <- pv$calibrated.pvalue
plot(-log10(pvalues), pch = 19, 
     cex = .2, col = "grey", xlab = "SNP")
points(example.data$causal.set, 
       -log10(pvalues)[example.data$causal.set], 
       type = "h", col = "blue")


## An EWAS example with Y = methylation data and X = exposure
data(skin.exposure)
Y <- scale(skin.exposure$beta.value)
X <- scale(as.numeric(skin.exposure$exposure))

## Fit an LFMM with 2 latent factors
mod.lfmm <- lfmm_ridge(Y = Y,
                       X = X, 
                       K = 2)
                       
## Perform association testing using the fitted model:
pv <- lfmm_test(Y = Y, 
                X = X,
                lfmm = mod.lfmm, 
                calibrate = "gif")
                
## Manhattan plot with true associations shown
pvalues <- pv$calibrated.pvalue
plot(-log10(pvalues), 
     pch = 19, 
     cex = .3,
     xlab = "Probe",
     col = "grey")
     
causal.set <- seq(11, 1496, by = 80)
points(causal.set, 
       -log10(pvalues)[causal.set], 
       col = "blue")

cayek/MatrixFactorizationR documentation built on Feb. 19, 2018, 2:04 p.m.