expokit_dgexpv_Qmat: EXPOKIT dgexpv matrix exponentiation on Q matrix

View source: R/rexpokit.R

expokit_dgexpv_QmatR Documentation

EXPOKIT dgexpv matrix exponentiation on Q matrix


This function converts a matrix to COO format and exponentiates it via the EXPOKIT dgexpv function (designed for sparse matrices) and wrapper functions wrapalldgexpv_ around dgexpv.


  expokit_dgexpv_Qmat(Qmat = NULL, t = 2.1,
    inputprobs_for_fast = NULL, transpose_needed = TRUE,
    transform_to_coo_TF = TRUE, coo_n = NULL, anorm = NULL,
    check_for_0_rows = TRUE)



an input Q transition matrix


a time value to exponentiate by


If NULL (default), the full probability matrix (Pmat) is returned. However, the full speed of EXPOKIT on sparse matrices will be exploited if inputprobs_for_fast=c(starting probabilities). In this case these starting probabilities are input to myDMEXPV directly, as v, and w, the output probabilities, are returned.


If TRUE (default), matrix will be transposed (apparently EXPOKIT needs the input matrix to be transposed compared to normal)


Should the matrix be tranposed to COO? COO format is required for EXPOKIT's sparse-matrix functions (like dmexpv and unlike the padm-related functions. Default TRUE; if FALSE, user must put a COO-formated matrix in Qmat. Supplying the coo matrix is probably faster for repeated calculations on large matrices.


If a COO matrix is input, coo_n specified the order (# rows, equals # columns) of the matrix.


dgexpv requires an initial guess at the norm of the matrix. Using the R function norm might get slow with large matrices. If so, the user can input a guess manually (Lagrange seems to just use 1 or 0, if I recall correctly).


If TRUE or a numeric value, the input Qmat is checked for all-zero rows, since these will crash the FORTRAN wrapalldmexpv function. A small nonzero value set to check_for_0_rows or the default (0.0000000000001) is input to off-diagonal cells in the row (and the diagonal value is normalized), which should fix the problem.


NOTE: DGEXPV vs. DMEXPV. According to the EXPOKIT documentation, DGEXPV should be faster than DMEXPV, however DMEXPV runs an accuracy check appropriate for Markov chains, which is not done in DGEXPV.


* The method used is based on Krylov subspace projection
* techniques and the matrix under consideration interacts only
* via the external routine 'matvec' performing the matrix-vector
* product (matrix-free method).
* This [DMEXPV, not DGEXPV -- NJM] is a customised version for Markov Chains. This means that a
* check is done within this code to ensure that the resulting vector
* w is a probability vector, i.e., w must have all its components
* in [0,1], with sum equal to 1. This check is done at some expense
* and the user may try DGEXPV which is cheaper since it ignores
* probability constraints.

I (NJM) have not noticed a difference between the outputs of these two functions, but it might occur with large matrices.

COO (coordinated list) format is a compressed format that is required for EXPOKIT's sparse-matrix functions (like dgexpv and unlike EXPOKIT's padm-related functions. COO format is described here:


If Qmat is NULL (default), a default matrix is input.


tmpoutmat the output matrix. wrapalldgexpv_ produces additional output relating to accuracy of the output matrix etc.; these can be by a direct call of dgexpv.


Nicholas J. Matzke nickmatzke.ncse@gmail.com and Drew Schmidt schmidt@math.utk.edu

See Also




# Example:
# Make a square instantaneous rate matrix (Q matrix)
# This matrix is taken from Peter Foster's (2001) "The Idiot's Guide
# to the Zen of Likelihood in a Nutshell in Seven Days for Dummies,
# Unleashed" at:
# \url{http://www.bioinf.org/molsys/data/idiots.pdf}
# The Q matrix includes the stationary base freqencies, which Pmat
# converges to as t becomes large.
Qmat = matrix(c(-1.218, 0.504, 0.336, 0.378, 0.126, -0.882, 0.252, 0.504, 0.168,
0.504, -1.05, 0.378, 0.126, 0.672, 0.252, -1.05), nrow=4, byrow=TRUE)

# Make a series of t values
tvals = c(0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 2, 5, 14)

# Exponentiate each with EXPOKIT's dgexpv (should be fast for large sparse matrices)
for (t in tvals)
	Pmat = expokit_dgexpv_Qmat(Qmat=Qmat, t=t, transpose_needed=TRUE)
	cat("\n\nTime=", t, "\n", sep="")

# DMEXPV and DGEXPV are designed for large, sparse Q matrices (sparse = lots of zeros).
# DMEXPV is specifically designed for Markov chains and so may be slower, but more accurate.

# DGEXPV, single t-value
expokit_wrapalldgexpv_tvals(Qmat=Qmat, tvals=tvals[1], transpose_needed=TRUE)
expokit_wrapalldgexpv_tvals(Qmat=Qmat, tvals=2)

# This function runs the for-loop itself (sadly, we could not get mapply() to work
# on a function that calls dmexpv/dgexpv), returning a list of probability matrices.

# DGEXPV functions
list_of_P_matrices_dgexpv = expokit_wrapalldgexpv_tvals(Qmat=Qmat,
tvals=tvals, transpose_needed=TRUE)

nmatzke/rexpokit documentation built on Nov. 28, 2023, 9:35 p.m.