lfmm_ridge: LFMM least-squares estimates with ridge penalty
In cayek/MatrixFactorizationR: Latent Factor Mixed Models

Description Usage Arguments Details Value Author(s) Examples

This function computes regularized least squares estimates for latent factor mixed models using a ridge penalty.

1 2	lfmm_ridge(Y, X, K, lambda = 1e-05, algorithm = "analytical", it.max = 100, relative.err.min = 1e-06)

`Y`	a response variable matrix with n rows and p columns. Each column corresponds to a distinct response variable (e.g., SNP genotype, gene expression level, beta-normalized methylation profile, etc). Response variables must be encoded as numeric.
`X`	an explanatory variable matrix with n rows and d columns. Each column corresponds to a distinct explanatory variable (eg. phenotype). Explanatory variables must be encoded as numeric variables.
`K`	an integer for the number of latent factors in the regression model.
`lambda`	a numeric value for the regularization parameter.
`algorithm`	exact (analytical) algorithm or numerical algorithm. The exact algorithm is based on the global minimum of the loss function and computation is very fast. The numerical algorithm converges toward a local minimum of the loss function. The exact method should preferred. The numerical method is for very large n.
`it.max`	an integer value for the number of iterations for the numerical algorithm.
`relative.err.epsilon`	a numeric value for a relative convergence error. Test whether the numerical algorithm converges or not (numerical algorithm only).

The algorithm minimizes the following penalized least-squares criterion

L(U, V, B) = \frac{1}{2} ||Y - U V^{T} - X B^T||_{F}^2 + \frac{λ}{2} ||B||^{2}_{2} ,

where Y is a response data matrix, X contains all explanatory variables, U denotes the score matrix, V is the loading matrix, B is the (direct) effect size matrix, and lambda is a regularization parameter.

The response variable matrix Y and the explanatory variable are centered.

an object of class lfmm with the following attributes:

U the latent variable score matrix with dimensions n x K,
V the latent variable axis matrix with dimensions p x K,
B the effect size matrix with dimensions p x d.

cayek, francoio

library(lfmm)

## a GWAS example with Y = SNPs and X = phenotype
data(example.data)
Y <- example.data$genotype
X <- example.data$phenotype

## Fit an LFMM with K = 6 factors
mod.lfmm <- lfmm_ridge(Y = Y, 
                       X = X, 
                       K = 6)

## Perform association testing using the fitted model:
pv <- lfmm_test(Y = Y, 
                X = X, 
                lfmm = mod.lfmm, 
                calibrate = "gif")

## Manhattan plot with causal loci shown

pvalues <- pv$calibrated.pvalue
plot(-log10(pvalues), pch = 19, 
     cex = .2, col = "grey", xlab = "SNP")
points(example.data$causal.set, 
      -log10(pvalues)[example.data$causal.set], 
       type = "h", col = "blue")


## An EWAS example with Y = methylation data and X = exposure
Y <- scale(skin.exposure$beta.value)
X <- scale(as.numeric(skin.exposure$exposure))

## Fit an LFMM with 2 latent factors
mod.lfmm <- lfmm_ridge(Y = Y,
                       X = X, 
                       K = 2)
                       
## Perform association testing using the fitted model:
pv <- lfmm_test(Y = Y, 
                X = X,
                lfmm = mod.lfmm, 
                calibrate = "gif")
                
## Manhattan plot with true associations shown
pvalues <- pv$calibrated.pvalue
plot(-log10(pvalues), 
     pch = 19, 
     cex = .3,
     xlab = "Probe",
     col = "grey")
     
causal.set <- seq(11, 1496, by = 80)
points(causal.set, 
      -log10(pvalues)[causal.set], 
       col = "blue")