compute_optimal_encoding: Compute the optimal encoding for each state

compute_optimal_encoding    R Documentation

Compute the optimal encoding for each state

Description

Compute the optimal encoding for categorical functional data using an extension of multiple correspondence analysis to stochastic processes.

Usage

compute_optimal_encoding(
  data,
  basisobj,
  computeCI = TRUE,
  nBootstrap = 50,
  propBootstrap = 1,
  method = c("precompute", "parallel"),
  verbose = TRUE,
  nCores = max(1, ceiling(detectCores()/2)),
  ...
)

Arguments

data

data.frame with three columns: id (identifier of the trajectory), time (time at which a change occurs) and state (the associated state). All individuals must begin at the same time T0 and end at the same time Tmax (use cut_data).
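The layout described above can be sketched with a small hand-built data.frame. The values below are illustrative only, and the check at the end is a plain base R sketch of the "same T0, same Tmax" requirement, not the validation performed by the package.

```r
# Toy data: 3 trajectories observed on [0, 5] (values are illustrative only).
d <- data.frame(
  id = c(1, 1, 1, 2, 2, 3, 3, 3),
  time = c(0, 1.5, 5, 0, 5, 0, 2, 5),
  state = c("A", "B", "B", "C", "C", "A", "C", "C"),
  stringsAsFactors = FALSE
)

# First and last observed time of each trajectory.
starts <- tapply(d$time, d$id, min)
ends <- tapply(d$time, d$id, max)

# TRUE when all trajectories share the same start time and the same end time.
same_start <- length(unique(starts)) == 1
same_end <- length(unique(ends)) == 1
```

When trajectories end at different times, the source recommends cut_data to bring them to a common Tmax before computing the encoding.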

basisobj

basis created using the fda package (cf. create.basis).

computeCI

if TRUE, perform a bootstrap to estimate the variance of the encoding function coefficients

nBootstrap

number of bootstrap samples

propBootstrap

size of bootstrap samples relative to the number of individuals: propBootstrap * number of individuals
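As a rough illustration of how nBootstrap and propBootstrap interact, the sketch below resamples individual identifiers with replacement in base R. It is not the package's internal code: the identifiers are hypothetical and the rounding of the sample size is an assumption.

```r
set.seed(42)

ids <- 1:10            # identifiers of 10 hypothetical individuals
nBootstrap <- 3        # number of bootstrap samples
propBootstrap <- 0.8   # relative size of each bootstrap sample

# Size of each bootstrap sample: propBootstrap * number of individuals.
# Rounding is an assumption of this sketch.
sampleSize <- round(propBootstrap * length(ids))

# One vector of resampled identifiers (with replacement) per bootstrap run.
bootstrapIds <- lapply(seq_len(nBootstrap), function(b) {
  sample(ids, size = sampleSize, replace = TRUE)
})
```

With propBootstrap = 1 (the default shown in Usage), each bootstrap sample has as many individuals as the original data.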

method

computation method: "precompute" (compute all integrals in advance; efficient when the number of unique time values is low) or "parallel" (compute the integrals in parallel on nCores cores).

verbose

if TRUE, print progress information

nCores

number of cores used for parallelization (only if method == "parallel"). Default is half of the detected cores, rounded up (at least 1).
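The default shown in Usage can be reproduced outside the function, for instance to inspect the value before launching a long computation. The NA fallback below is an assumption of this sketch (detectCores() may return NA when the core count is unknown), not documented package behavior.

```r
library(parallel)

detected <- detectCores()  # may be NA when the count cannot be determined

# Fallback to a single core on NA: an assumption of this sketch.
if (is.na(detected)) detected <- 1

# Same expression as the default value of nCores in Usage.
nCores <- max(1, ceiling(detected / 2))
```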

...

additional parameters passed to the integrate function (see Details).

Details

See the vignette for the mathematical background: RShowDoc("cfda", package = "cfda")

Extra parameters (...) for the integrate function can be:

  • subdivisions: the maximum number of subintervals.

  • rel.tol: relative accuracy requested.

  • abs.tol: absolute accuracy requested.
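To see what these three parameters control, the sketch below calls stats::integrate directly on a simple integrand whose exact integral is known. The integrand and tolerance values are arbitrary choices for illustration.

```r
# Integrand with a known integral on [0, 1]: the exact value is 1/3.
f <- function(t) t^2

res <- integrate(
  f, lower = 0, upper = 1,
  subdivisions = 200L,  # maximum number of subintervals
  rel.tol = 1e-8,       # relative accuracy requested
  abs.tol = 1e-10       # absolute accuracy requested
)

res$value      # numerical estimate of the integral
res$abs.error  # estimate of the absolute error
```

According to the Arguments section, the same named parameters can be supplied through ... in a call to compute_optimal_encoding, for example subdivisions = 200.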

Value

A list containing:

  • eigenvalues: the eigenvalues

  • alpha: optimal encoding coefficients associated with each eigenvector

  • pc: principal components

  • F: matrix containing the F_{(x,i)(y,j)}

  • V: matrix containing the V_{(x,i)}

  • G: covariance matrix of V

  • basisobj: the basisobj input parameter

  • pt: output of the estimate_pt function

  • bootstrap: only if computeCI = TRUE; output of every bootstrap run

  • varAlpha: only if computeCI = TRUE; variance of the alpha parameters

  • runTime: total elapsed time
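Elements of the returned list are accessed by name with $. The sketch below uses a stand-in list with made-up eigenvalues (it is not real output of compute_optimal_encoding) to show one common way of summarizing them as normalized and cumulative shares.

```r
# Stand-in for the returned list; the eigenvalues are made-up numbers.
fmca_like <- list(eigenvalues = c(0.5, 0.3, 0.15, 0.05))

# Share of each eigenvalue in the total, and the running total of those shares.
normalized <- fmca_like$eigenvalues / sum(fmca_like$eigenvalues)
cumulative <- cumsum(normalized)
```

On a real result, the same expressions would apply to encoding$eigenvalues, where encoding is the object returned by compute_optimal_encoding.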

Author(s)

Cristian Preda, Quentin Grimonprez

References

  • Deville J.C. (1982) Analyse de données chronologiques qualitatives : comment analyser des calendriers ?, Annales de l'INSEE, No 45, p. 45-104.

  • Deville J.C. et Saporta G. (1980) Analyse harmonique qualitative, DIDAY et al. (editors), Data Analysis and Informatics, North Holland, p. 375-389.

  • Saporta G. (1981) Méthodes exploratoires d'analyse de données temporelles, Cahiers du B.U.R.O, Université Pierre et Marie Curie, 37-38, Paris.

  • Preda C, Grimonprez Q, Vandewalle V. Categorical Functional Data Analysis. The cfda R Package. Mathematics. 2021; 9(23):3074. https://doi.org/10.3390/math9233074

See Also

plot.fmca, print.fmca, summary.fmca, plotComponent, get_encoding

Other encoding functions: get_encoding(), plot.fmca(), plotComponent(), plotEigenvalues(), predict.fmca(), print.fmca(), summary.fmca()

Examples

library(cfda)
library(fda) # provides create.bspline.basis

# Simulate the Jukes-Cantor model of nucleotide replacement
K <- 4
Tmax <- 5
PJK <- matrix(1 / 3, nrow = K, ncol = K) - diag(rep(1 / 3, K))
lambda_PJK <- c(1, 1, 1, 1)
d_JK <- generate_Markov(
  n = 10, K = K, P = PJK, lambda = lambda_PJK, Tmax = Tmax,
  labels = c("A", "C", "G", "T")
)
d_JK2 <- cut_data(d_JK, Tmax)

# create basis object
m <- 5
b <- create.bspline.basis(c(0, Tmax), nbasis = m, norder = 4)

# compute encoding
encoding <- compute_optimal_encoding(d_JK2, b, computeCI = FALSE, nCores = 1)
summary(encoding)

# plot the optimal encoding
plot(encoding)

# plot the first two components
plotComponent(encoding, comp = c(1, 2))

# extract the optimal encoding
get_encoding(encoding, harm = 1)

modal-inria/cfda documentation built on Oct. 19, 2023, 10:03 a.m.