RCM_NB: Fit the RC(M) model with the negative binomial distribution.

View source: R/F_RCM_NB.R

RCM_NB    R Documentation

Description

Fit the RC(M) model with the negative binomial distribution.

Usage

RCM_NB(
  X,
  k,
  rowWeights = "uniform",
  colWeights = "marginal",
  tol = 0.001,
  maxItOut = 1000L,
  Psitol = 0.001,
  verbose = FALSE,
  global = "dbldog",
  nleqslv.control = list(maxit = 500L, cndtol = 1e-16),
  jacMethod = "Broyden",
  dispFreq = 10L,
  convNorm = 2,
  prior.df = 10,
  marginEst = "MLE",
  confModelMat = NULL,
  confTrimMat = NULL,
  prevCutOff,
  minFraction = 0.1,
  covModelMat = NULL,
  centMat = NULL,
  responseFun = c("linear", "quadratic", "dynamic", "nonparametric"),
  record = FALSE,
  control.outer = list(trace = FALSE),
  control.optim = list(),
  envGradEst = "LR",
  dfSpline = 3,
  vgamMaxit = 100L,
  degree = switch(responseFun[1], nonparametric = 3, NULL),
  rowExp = if (is.null(covModelMat)) 1 else 0.5,
  colExp = rowExp,
  allowMissingness = FALSE
)

Arguments

X

an n x p data matrix

k

a scalar, the number of dimensions in the RC(M) model

rowWeights

a character string, either 'uniform' or 'marginal' row weights.

colWeights

a character string, either 'uniform' or 'marginal' column weights.
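As a rough illustration of the two weighting options, the sketch below computes uniform and marginal column weights for a small count matrix. It assumes that 'marginal' means weights proportional to the margin totals and that the weights are normalised to sum to one. The helper makeWeights() is hypothetical and not part of the package, whose internal definition may differ.

```r
# Hypothetical helper, not part of RCM: builds weights along one margin of X.
# Assumption: 'marginal' weights are proportional to the margin totals.
makeWeights <- function(X, margin = 1, type = c("uniform", "marginal")) {
    type <- match.arg(type)
    totals <- if (margin == 1) rowSums(X) else colSums(X)
    w <- switch(type,
        uniform = rep(1, length(totals)),
        marginal = totals)
    w / sum(w)  # normalise so the weights sum to one
}

# Toy matrix: 2 samples (rows) by 3 taxa (columns)
X <- matrix(c(1, 3, 0, 4, 2, 6), nrow = 2)
colW_uniform <- makeWeights(X, margin = 2, type = "uniform")
colW_marginal <- makeWeights(X, margin = 2, type = "marginal")
```

With uniform weights every taxon counts equally, whereas marginal weights give more influence to abundant taxa.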

tol

a scalar, the relative convergence tolerance for the row score and column score parameters.

maxItOut

an integer, the maximum number of iterations in the outer loop.

Psitol

a scalar, the relative convergence tolerance for the psi parameters.

verbose

a boolean, should information on iterations be printed?

global

global strategy for solving non-linear systems, see ?nleqslv

nleqslv.control

a list with control options, see nleqslv

jacMethod

Method for solving non-linear equations, see ?nleqslv. Defaults to 'Broyden'. Unlike the Newton method, it does not recalculate the Jacobian at every iteration, which speeds up the algorithm.

dispFreq

an integer, how many iterations the algorithm should wait before re-estimating the dispersions.

convNorm

a scalar, the norm to use to determine convergence
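To make the interplay of tol and convNorm concrete, here is a minimal sketch of a relative convergence check. It assumes that convergence is judged by an L_q norm (q = convNorm) of the element-wise relative change between successive estimates, averaged over the number of entries. The helper relChange() is hypothetical, and the exact criterion used inside the package may differ.

```r
# Hypothetical helper, not part of RCM: norm of the element-wise relative
# change between two successive parameter estimates.
relChange <- function(new, old, convNorm = 2) {
    (sum(abs(1 - new / old)^convNorm) / length(new))^(1 / convNorm)
}

old <- c(1.00, 2.00, -0.50)        # estimates at the previous iteration
new <- c(1.0005, 1.9990, -0.5002)  # estimates at the current iteration
tol <- 0.001

# Declare convergence once the relative change drops below the tolerance
converged <- relChange(new, old, convNorm = 2) < tol
```

A larger convNorm puts more emphasis on the single largest relative change, while convNorm = 1 averages the absolute relative changes.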

prior.df

an integer, see estDisp()

marginEst

a character string, either 'MLE' or 'marginSums', indicating how the independence model should be estimated
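The independence model describes the expected counts as the product of a row effect (library size) and a column effect (relative abundance). The sketch below shows a margin-based estimate on a toy matrix, assuming this is what the 'marginSums' option corresponds to. The variable names are chosen for illustration only.

```r
# Toy count matrix: 3 samples (rows) by 4 taxa (columns)
X <- matrix(c(10, 0, 5, 5,
              20, 4, 8, 8,
               2, 1, 0, 7), nrow = 3, byrow = TRUE)

libSizes <- rowSums(X)          # library size per sample
abunds <- colSums(X) / sum(X)   # mean relative abundance per taxon

# Expected counts under independence of rows and columns
E <- outer(libSizes, abunds)
```

By construction the expected counts reproduce both margins of X; the 'MLE' option instead estimates these margins by maximum likelihood under the negative binomial model.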

confModelMat

an nxg matrix with confounders, with no reference levels and with intercept

confTrimMat

an nxh matrix with confounders for filtering, with all levels and without intercept

prevCutOff

a scalar, the minimum prevalence needed to retain a taxon before the confounder filtering

minFraction

a scalar, the total abundance of a taxon should be at least minFraction*n for it to be retained before the confounder filtering
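The two thresholds prevCutOff and minFraction can be pictured with the following sketch, which filters taxa on prevalence and total abundance. It ignores confounders entirely and assumes simple cut-offs on the whole matrix, so the rule applied by the package (for instance within confounder levels) may differ. The value of prevCutOff is made up for the example.

```r
# Toy count matrix: 4 samples (rows) by 4 taxa (columns)
X <- matrix(c(0, 0, 0, 9,
              5, 1, 0, 0,
              3, 2, 0, 0,
              4, 0, 1, 0), nrow = 4, byrow = TRUE,
            dimnames = list(NULL, paste0("taxon", 1:4)))
n <- nrow(X)
prevCutOff <- 0.25   # assumed value, the function has no default for this
minFraction <- 0.1

prevalence <- colMeans(X > 0)   # fraction of samples in which a taxon occurs
abundance <- colSums(X)         # total count per taxon

# Keep taxa that are both prevalent and abundant enough
keep <- prevalence > prevCutOff & abundance >= n * minFraction
Xtrimmed <- X[, keep, drop = FALSE]
```

Here taxon3 and taxon4 occur in only one of four samples and are dropped on prevalence, even though taxon4 has a high total count.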

covModelMat

an n x d matrix with covariates. If NULL, an unconstrained analysis is carried out, otherwise a constrained one. Factors must already have been converted to dummy variables.

centMat

an f x d matrix containing the contrasts to center the categorical variables. f equals the number of continuous variables + the total number of levels of the categorical variables.

responseFun

a character string indicating the shape of the response function

record

A boolean, should intermediate parameter estimates be stored?

control.outer

a list of control options for the outer loop constrOptim.nl function

control.optim

a list of control options for the optim() function

envGradEst

a character string indicating how the environmental gradient should be fitted: either 'LR', using the likelihood-ratio criterion, or 'ML', a full maximum likelihood solution

dfSpline

a scalar, the number of degrees of freedom for the splines of the non-parametric response function, see VGAM::s()

vgamMaxit

an integer, the maximum number of iterations in the vgam() function

degree

an integer, the degree of the polynomial fit if the spline fit fails

rowExp, colExp

exponents for the row and column weights of the singular value decomposition used to calculate starting values. Can be adjusted in case of numerical problems.
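One way to picture the role of rowExp and colExp is the sketch below, which derives starting values from a singular value decomposition of the residuals from independence, scaled by powers of the row and column margins. This is an assumed form of the weighting, written for illustration; the package's own starting-value code may scale the matrix differently.

```r
# Toy count matrix: 4 samples (rows) by 4 taxa (columns)
X <- matrix(c(10, 0, 5, 5,
              20, 4, 8, 8,
               2, 1, 0, 7,
               6, 3, 9, 1), nrow = 4, byrow = TRUE)
rowExp <- 1
colExp <- 1
k <- 2

libSizes <- rowSums(X)
E <- outer(libSizes, colSums(X) / sum(X))  # expected counts under independence

# Assumed weighting: divide residuals by powers of the row and column totals
scaled <- diag(1 / libSizes^rowExp) %*% (X - E) %*% diag(1 / colSums(X)^colExp)
svdX <- svd(scaled)

rMatStart <- svdX$u[, seq_len(k), drop = FALSE]      # n x k row scores
cMatStart <- t(svdX$v[, seq_len(k), drop = FALSE])   # k x p column scores
psiStart <- svdX$d[seq_len(k)]                       # k importance parameters
```

Smaller exponents downweight the margins less strongly, which changes the starting point of the iterations and may help when the default leads to numerical trouble.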

allowMissingness

See RCM()

Details

Includes fitting the independence model, filtering out the effect of confounders and fitting the RC(M) components in a constrained or an unconstrained way for any dimension k. Not intended to be called directly, but only through the RCM() function.

Value

A list with elements

converged

a vector of booleans of length k indicating if the algorithm converged for every dimension

rMat

(if not constrained) an n x k matrix with estimated row scores

cMat

a kxp matrix with estimated column scores

psis

a vector of length k with estimates for the importance parameters psi

thetas

a vector of length p with estimates for the overdispersion

rowRec

(if not constrained) an n x k x maxItOut array with a record of all rMat estimates through the iterations

colRec

a k x p x maxItOut array with a record of all cMat estimates through the iterations

psiRec

a k x maxItOut array with a record of all psi estimates through the iterations

thetaRec

a matrix of dimension pxmaxItOut with estimates for the overdispersion along the way

iter

number of iterations

Xorig

(if confounders provided) the original fitting matrix

X

the trimmed matrix if confounders provided, otherwise the original one

fit

type of fit, either 'RCM_NB' or 'RCM_NB_constr'

lambdaRow

(if not constrained) vector of Lagrange multipliers for the rows

lambdaCol

vector of Lagrange multipliers for the columns

rowWeights

(if not constrained) the row weights used

colWeights

the column weights used

alpha

(if constrained) the kxd matrix of environmental gradients

alphaRec

(if constrained) the kxdxmaxItOut array of alpha estimates along the iterations

covariates

(if constrained) the matrix of covariates

libSizes

a vector of length n with estimated library sizes

abunds

a vector of length p with estimated mean relative abundances

confounders

(if provided) the confounder matrix

confParams

the parameters used to filter out the confounders

nonParamRespFun

A list of the non parametric response functions

degree

The degree of the alternative parametric fit

NApresent

A boolean, were NA values present?

Note

Plotting is not supported for quadratic response functions

See Also

RCM

Examples

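library(RCM)       # provides RCM_NB() and the Zeller data
library(phyloseq)
data(Zeller)
# Subset to the first 100 taxa and 50 samples to keep the example fast
tmpPhy = prune_taxa(taxa_names(Zeller)[seq_len(100)],
    prune_samples(sample_names(Zeller)[seq_len(50)], Zeller))
mat = as(otu_table(tmpPhy), "matrix")
# Drop all-zero rows and columns
mat = mat[rowSums(mat) > 0, colSums(mat) > 0]
# RCM_NB() must be called directly on a matrix
zellerRCM = RCM_NB(mat, k = 2)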

CenterForStatistics-UGent/RCM documentation built on April 24, 2023, 8:26 p.m.