main_LFMM: Fitting Latent Factor Mixed Models (MCMC algorithm)


Description

Latent Factor Mixed Models (LFMMs) are factor regression models in which the response variable is a genotypic matrix, and the explanatory variables are environmental measures of ecological interest or trait values. The lfmm function estimates latent factors and effect sizes based on an MCMC algorithm. The resulting object can be used in the function lfmm.pvalues to identify genetic polymorphisms exhibiting association with ecological gradients or phenotypes, while correcting for unobserved confounders. An exact and computationally efficient least-squares method is implemented in the function lfmm2.
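For orientation, the factor regression model described above can be written out as a single equation, following the formulation in Frichot et al. (2013). The symbols are notation assumed here for illustration: G is the genotypic matrix, X_i the vector of environmental variables for individual i, and K the number of latent factors.

```latex
% Sketch of the LFMM regression for individual i and locus l.
% G_{i\ell}        : genotype of individual i at locus \ell
% \mu_\ell         : locus-specific mean
% \beta_\ell       : effect sizes of the environmental variables at locus \ell
% U_i, V_\ell      : K-dimensional latent factors and loadings
% \epsilon_{i\ell} : residual error
G_{i\ell} = \mu_\ell + \beta_\ell^{T} X_i + U_i^{T} V_\ell + \epsilon_{i\ell}
```

The latent term U_i^T V_l is what absorbs unobserved confounders, so that the effect sizes can be tested for association after correction.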

Usage

lfmm(input.file, environment.file, K, 
    project = "continue", 
    d = 0, all = FALSE, 
    missing.data = FALSE, CPU = 1, 
    iterations = 10000, burnin = 5000, 
    seed = -1, repetitions = 1, 
    epsilon.noise = 1e-3, epsilon.b = 1000, 
    random.init = TRUE)

Arguments

input.file

A character string containing a path to the input file, a genotypic matrix in the lfmm format. The matrix must not contain missing values. See impute for completion based on nonnegative matrix factorization.

environment.file

A character string containing a path to the environmental file, an environmental data matrix in the env format.

K

An integer corresponding to the number of latent factors.

project

A character string among "continue", "new", and "force". If "continue", the results are stored in the current project. If "new", the current project is removed and a new project is created. If "force", the results are stored in the current project even if the input file has been modified since the creation of the project.

d

An integer selecting a single variable: lfmm fits the model with the d-th variable from environment.file only. By default (when d is not specified and all is FALSE), lfmm fits each variable from environment.file sequentially and independently.

all

A Boolean option. If TRUE, lfmm fits all variables from the environment.file at the same time. This option is not compatible with the d option.

missing.data

A Boolean option. If TRUE, the input.file contains missing genotypes. Caution: lfmm requires imputed genotype matrices. See impute.

CPU

The number of CPUs used to run the parallel version of the algorithm. By default, the number of CPUs is 1.

iterations

The total number of cycles for the Gibbs Sampling algorithm.

burnin

The number of burn-in cycles for the Gibbs Sampling algorithm.

seed

A seed to initialize the random number generator. By default, the seed is randomly chosen. The seed is initialized in each run of the program. To override the default setting, provide one seed per run.

repetitions

The number of replicate runs of the Gibbs Sampler algorithm.

epsilon.noise

A prior parameter for variances.

epsilon.b

A prior parameter for the variance of correlation coefficients.

random.init

A Boolean option. If TRUE, the Gibbs Sampler is initialized randomly. Otherwise, it is initialized with zero values.

Value

lfmm returns an object of class lfmmProject.

The following methods can be applied to an object of class lfmmProject:

show

Display information about all analyses.

summary

Summarize analyses.

z.scores

Return the lfmm output vector of z-scores for the selected runs.

lfmm.pvalues

Return the vector of adjusted p-values for a combination of runs with K latent factors, and for the d-th predictor.

load.lfmmProject (file = "character")

Load the file containing an lfmmProject object and show the object.

remove.lfmmProject (file = "character")

Erase an lfmmProject object. Caution: All the files associated with the object will be removed.

export.lfmmProject(file.lfmmProject)

Create a zip file containing the full lfmmProject object. It allows users to move the project to a new directory or a new computer (using import). If you want to overwrite an existing export, use the option force = TRUE.

import.lfmmProject(file.lfmmProject)

Import and load an lfmmProject object from a zip file (made with the export function) into the chosen directory. If you want to overwrite an existing project, use the option force = TRUE.

combine.lfmmProject(file.lfmmProject, toCombine.lfmmProject)

Combine toCombine.lfmmProject into file.lfmmProject. Caution: Only projects with runs coming from the same input file can be combined. If the same input file has different names in the two projects, use the option force = TRUE.
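To illustrate what "adjusted p-values for a combination of runs" can mean, the sketch below combines z-scores from several runs by taking the per-locus median and recalibrating with a genomic inflation factor. This is an illustration in base R on simulated z-scores, not the internal code of lfmm.pvalues; the names zs, combine_runs, n_loci, and n_runs are hypothetical.

```r
set.seed(1)
n_loci <- 400
n_runs <- 5

# Hypothetical z-score matrix: one row per locus, one column per run.
# The standard deviation of 1.2 mimics mildly inflated test statistics.
zs <- matrix(rnorm(n_loci * n_runs, sd = 1.2), nrow = n_loci, ncol = n_runs)

combine_runs <- function(zs) {
  # Median z-score per locus across runs
  z_median <- apply(zs, 1, median)
  # Genomic inflation factor: observed median of the squared statistics
  # divided by the median of a chi-squared distribution with 1 df
  gif <- median(z_median^2) / qchisq(0.5, df = 1)
  # Recalibrated p-values from the chi-squared upper tail
  pvalues <- pchisq(z_median^2 / gif, df = 1, lower.tail = FALSE)
  list(pvalues = pvalues, GIF = gif)
}

res <- combine_runs(zs)
hist(res$pvalues, main = "Recalibrated p-values", xlab = "p-value")
```

After recalibration the median p-value sits close to 0.5, which is the usual visual check (a roughly flat histogram with a peak near zero) before applying an FDR procedure.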

Author(s)

Eric Frichot, Olivier Francois

References

Frichot E, Schoville SD, Bouchard G, Francois O. (2013). Testing for associations between loci and environmental gradients using latent factor mixed models. Molecular biology and evolution, 30(7), 1687-1699.

See Also

lfmm.data, z.scores, lfmm.pvalues, pca, lfmm, tutorial

Examples

### Example of analysis using lfmm ###

data("tutorial")
# Creation of a genotype file: genotypes.lfmm.
# The file contains 400 SNPs for 50 individuals.
write.lfmm(tutorial.R, "genotypes.lfmm")

# Creation of a phenotype/environment file: gradients.env.
# One environmental predictor for the same 50 individuals.
write.env(tutorial.C, "gradients.env")

################
# running lfmm #
################

# main options, K: (the number of latent factors), 
#           CPU: the number of CPUs.

# Runs with K = 6 and 5 repetitions.
# Each run uses 6000 iterations,
# including 3000 iterations for burnin.
# Around 30 seconds per run.
project = lfmm( "genotypes.lfmm", 
                "gradients.env", 
                 K = 6, 
                 repetitions = 5, 
                 iterations = 6000, 
                 burnin = 3000, 
                 project = "new")

# get adjusted p-values using all runs
pv = lfmm.pvalues(project, K = 6)

# Evaluate FDR and POWER (TPR)
for (alpha in c(.05,.1,.15,.2)) {
    # expected FDR
    print(paste("expected FDR:", alpha))
    L = length(pv$pvalues)
    # Benjamini-Hochberg's method for an expected FDR = alpha.
    w = which(sort(pv$pvalues) < alpha * (1:L)/L)
    candidates = order(pv$pvalues)[w]

    # estimated FDR and True Positive Rate
    # The target SNPs are loci 351 to 400
    Lc = length(candidates)
    estimated.FDR = length(which(candidates <= 350))/Lc
    estimated.TPR = length(which(candidates > 350))/50
    print(paste("FDR:", estimated.FDR, "True Positive Rate:", estimated.TPR))
}

###################
# Post-treatments #
###################

# show the project
show(project)

# summary of the project
summary(project)

# get the z-scores for the 2nd run for K = 6
z = z.scores(project, K = 6, run = 2)

# get the p-values for K = 6 and run 2
p = lfmm.pvalues(project, K = 6, run = 2)

##########################
# Manage an lfmm project #
##########################

# All the runs of lfmm for a given file are 
# automatically saved into an lfmm project directory and a file.
# The name of the lfmmProject file is the concatenation of 
# the name of the input file and the environment file 
# with a .lfmmProject extension ("genotypes_gradients.lfmmProject").
# The name of the lfmmProject directory is the same name as
# the lfmmProject file with a .lfmm extension ("genotypes_gradients.lfmm/")
# There is a unique lfmmProject for each input file.

# An lfmmProject can be loaded in an R session as follows
project = load.lfmmProject("genotypes_gradients.lfmmProject")

# An lfmmProject can be exported to be imported in another directory
# or on another computer as follows
export.lfmmProject("genotypes_gradients.lfmmProject")

dir.create("test", showWarnings = TRUE)
#import
newProject = import.lfmmProject("genotypes_gradients_lfmmProject.zip", "test")

# combine projects
combinedProject <- combine.lfmmProject(
                  "genotypes_gradients.lfmmProject", 
                  "test/genotypes_gradients.lfmmProject"
                  )

# remove
remove.lfmmProject("test/genotypes_gradients.lfmmProject")



# An lfmmProject can be removed as follows.
# Caution: All the files associated with the project will be removed.
remove.lfmmProject("genotypes_gradients.lfmmProject")
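The Benjamini-Hochberg step in the example above can also be cross-checked with base R's p.adjust. The sketch below runs on simulated p-values (the vector pvalues is hypothetical, standing in for pv$pvalues) and implements the standard step-up rule, which keeps every locus ranked at or below the largest rank passing the threshold. That differs slightly from the loop above, which keeps only the ranks that individually pass.

```r
set.seed(42)

# Hypothetical p-values: 350 null loci followed by 50 loci with small p-values
pvalues <- c(runif(350), rbeta(50, 0.1, 5))
alpha <- 0.1
L <- length(pvalues)

# Manual Benjamini-Hochberg step-up: find the largest rank k such that
# the k-th smallest p-value is at most alpha * k / L, then keep ranks 1..k
sorted <- sort(pvalues)
below <- which(sorted <= alpha * (1:L) / L)
k <- if (length(below) > 0) max(below) else 0
candidates_manual <- sort(order(pvalues)[seq_len(k)])

# Same selection through BH-adjusted p-values
candidates_padjust <- which(p.adjust(pvalues, method = "BH") <= alpha)

# Both approaches should select the same loci
print(setequal(candidates_manual, candidates_padjust))
```

Using p.adjust avoids the manual bookkeeping with sort and order, and it returns adjusted p-values that can be thresholded at several values of alpha without recomputing.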

LEA documentation built on Nov. 8, 2020, 8:19 p.m.