zi_fit_pms_choose_degree: Fits and chooses a Hurdle conditional model with pms...
In sqyu/ZiDAG: Directed Graphical Models and Causal Discovery for Zero-Inflated Data

Fits and chooses a Hurdle conditional model with pms parametrization of degree <= a maximum degree.

zi_fit_pms_choose_degree(
  V,
  Y,
  left,
  right,
  max_uniform_degree,
  extra_regressors = NULL,
  extra_reg_pen_factors = NULL,
  value_only = TRUE,
  tol = 1e-08,
  maxit = 1e+05,
  seed = NULL,
  penalize_decider = function(X) {     ncol(X) >= nrow(X)/2 },
  nfits = 10,
  runs = 2,
  print_best_degree = FALSE
)

`V`	A matrix of 0/1s, equal to Y != 0.
`Y`	A data matrix of the same size as `V`.
`left`	An integer between 1 and `ncol(Y)`. The index of the variable to be fit.
`right`	A vector of integers between 1 and `ncol(Y)` different from `left`. Indices of the "regressors".
`max_uniform_degree`	A positive integer, the maximum degree for the Hurdle polynomials.
`extra_regressors`	A matrix with the same number of rows as `V` and `Y`, extra regressors to be included in both regressions (conditional log odds/conditional mean). Defaults to `NULL`.
`extra_reg_pen_factors`	A vector of non-negative numbers, defaults to `NULL`. Penalty factors for `extra_regressors`. If the main design matrix has `d` columns, `c(rep(1, d), extra_reg_pen_factors)` will be passed as the `penalty.factor` argument to `glmnet::glmnet()`. If `intercept == TRUE`, a `0` will also be prepended.
`value_only`	If `TRUE`, returns the minimized negative log likelihood only. Defaults to `TRUE`.
`tol`	A number, tolerance. Defaults to `1e-8`. Passed to `stats::glm()` for unpenalized logistic regressions, or as the `thresh` argument to `glmnet::glmnet()` for both logistic and linear regressions if penalized.
`maxit`	An integer, the maximum number of iterations. Defaults to `100000`. Passed to `stats::glm()` for unpenalized logistic regressions, or to `glmnet::glmnet()` for both logistic and linear regressions if penalized.
`seed`	A number, the random seed passed to `zi_fit_lm()` for both regressions (conditional log odds/conditional mean).
`penalize_decider`	A logical or a function that takes a design matrix and returns a logical. Defaults to `function(X){ncol(X)>=nrow(X)/2}`. Used to decide whether to use l2-penalized (ridge) regression (if `TRUE`) when fitting each conditional distribution. Note that for either regression (conditional log odds/conditional mean), penalized regression is used automatically if the unpenalized fit is almost perfect.
`nfits`	A positive integer, defaults to `10`. Used for penalized regressions, as the number of folds if `CV_BIC == TRUE` (the `nfolds` argument to `glmnet::cv.glmnet()`, with `nlambda` set to `100`), or as the number of lambdas if `CV_BIC == FALSE` (the `nlambda` argument to `glmnet::glmnet()`).
`runs`	A positive integer, the number of reruns. The fit with the maximum likelihood will be returned. Defaults to `2`.
`print_best_degree`	A logical, whether to print the degree (1, ..., `max_uniform_degree`) that minimizes the BIC. Defaults to `FALSE`.
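
The inputs described above can be illustrated with a small base-R sketch. The toy matrix `Y`, the value `d = 4`, and the variable names `wide`, `tall`, and `penalty_factor` are made up for illustration. Only the relation `V = (Y != 0)`, the default `penalize_decider`, and the `c(rep(1, d), extra_reg_pen_factors)` layout come from the argument descriptions.

```r
# Toy zero-inflated data matrix (hypothetical values)
Y <- matrix(c(0,   1.2, 0,
              0.7, 0,   3.1,
              0,   0,   2.4,
              1.5, 0.3, 0), nrow = 4, byrow = TRUE)
V <- (Y != 0) * 1   # 0/1 indicator matrix of the same size as Y

# Default decision rule: penalize when the design matrix has at least
# half as many columns as rows
penalize_decider <- function(X) ncol(X) >= nrow(X) / 2
wide <- penalize_decider(matrix(0, nrow = 20, ncol = 10))    # TRUE:  10 >= 10
tall <- penalize_decider(matrix(0, nrow = 100, ncol = 10))   # FALSE: 10 <  50

# Penalty factors as described for extra_reg_pen_factors, assuming a
# main design matrix with d = 4 columns (d is a hypothetical value here)
d <- 4
extra_reg_pen_factors <- c(1, 2, 3, 4) / sum(c(1, 2, 3, 4))
penalty_factor <- c(rep(1, d), extra_reg_pen_factors)
```

Passing a plain logical such as `penalize_decider = TRUE` instead of a function forces the same decision for every design matrix.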

A Hurdle conditional model with pms parametrization for the left node given those in right has log density, with respect to the sum of the Lebesgue measure and a point mass at 0, equal to (in terms of y) log(1-p) if y == 0, or log(p)-log(2*pi*sigmasq)/2-(y-mu)^2/2/sigmasq otherwise. That is, y is nonzero with probability p, and given that it is nonzero it is Gaussian with conditional mean mu and conditional variance sigmasq. Here sigmasq is assumed constant, and the parameters log(p/(1-p)) and mu are Hurdle polynomials, i.e. polynomials in the values for right and their indicators. This function thus fits such a model using Y[,left], Y[,right] and V[,right] = (Y[,right] != 0), using a logistic regression for the log odds log(p/(1-p)) and a linear regression for mu.
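
As a rough illustration of this two-part structure, the sketch below evaluates the Hurdle log density (with the Gaussian normalizing constant included) and fits the two regressions with base R on simulated data. It is not the package's implementation: the single regressor `x`, the helper `hurdle_logdens()`, and all simulated parameter values are assumptions made for the example.

```r
set.seed(1)
n <- 500
x <- rnorm(n)                 # one hypothetical regressor
p <- plogis(0.5 + x)          # P(y != 0), linear on the log odds scale
mu <- 1 + 2 * x               # conditional mean of the nonzero part
sigmasq <- 0.25               # constant conditional variance
v <- rbinom(n, 1, p)          # indicator of y != 0
y <- v * rnorm(n, mean = mu, sd = sqrt(sigmasq))

# Log density w.r.t. Lebesgue measure plus a point mass at 0
hurdle_logdens <- function(y, p, mu, sigmasq) {
  ifelse(y == 0,
         log1p(-p),
         log(p) + dnorm(y, mean = mu, sd = sqrt(sigmasq), log = TRUE))
}

# Logistic regression for the log odds, linear regression for the mean
logit_fit <- glm(v ~ x, family = binomial())
lin_fit <- lm(y ~ x, subset = v == 1)

p_hat <- fitted(logit_fit)
mu_hat <- predict(lin_fit, newdata = data.frame(x = x))
sigmasq_hat <- mean(residuals(lin_fit)^2)   # maximum likelihood estimate

nll_hat <- -sum(hurdle_logdens(y, p_hat, mu_hat, sigmasq_hat))
nll_true <- -sum(hurdle_logdens(y, p, mu, sigmasq))
```

Because the likelihood factorizes into a Bernoulli part and a Gaussian part, the two regressions can be fit separately, and `nll_hat` is no larger than `nll_true`.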

Writing Yo <- Y[,right], a Hurdle polynomial in parents Yo is a polynomial in Yo and their 0/1 indicators Vo. The degree of a term in a Hurdle polynomial is the number of V terms plus the sum of the degrees of the Y terms. For example, Y1^2 * V2 * Y3^3 * V4 * V5 has degree equal to 2+1+3+1+1=8. Given a degree, the design matrix includes all possible terms with degree less than or equal to the specified degree. For example, if Vo and Yo have two columns and we choose degree 2, the design matrix has columns V1, V2, V1*V2, Y1, Y2, Y1*Y2, Y1^2, Y2^2, Y1*V2, Y2*V1. Note that terms like V1*Y1 are not included, as they are equivalent to Y1.
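
The enumeration rule above can be checked with a short base-R sketch that lists the term labels for a given number of parents and degree. `hurdle_terms()` is a hypothetical helper written for this illustration, not a function from the package. It lets each parent contribute nothing, its indicator V (degree 1), or a power Y^k (degree k), and keeps the products whose total degree lies between 1 and the chosen degree.

```r
hurdle_terms <- function(n_parents, degree) {
  # Degree contributed by each per-parent choice:
  # index 1 = absent, index 2 = V, index 3 = Y, index 4 = Y^2, ...
  degs <- c(0, 1, seq_len(degree))
  choices <- expand.grid(rep(list(seq_along(degs)), n_parents))
  total <- rowSums(matrix(degs[as.matrix(choices)], nrow = nrow(choices)))
  choices <- choices[total >= 1 & total <= degree, , drop = FALSE]
  labels <- apply(choices, 1, function(idx) {
    parts <- vapply(seq_len(n_parents), function(j) {
      pow <- degs[idx[j]]
      if (idx[j] == 1) ""
      else if (idx[j] == 2) paste0("V", j)
      else if (pow == 1) paste0("Y", j)
      else paste0("Y", j, "^", pow)
    }, character(1))
    paste(parts[parts != ""], collapse = "*")
  })
  unname(labels)
}

terms_deg2 <- hurdle_terms(2, 2)   # the ten columns listed above
terms_deg1 <- hurdle_terms(2, 1)   # V1, Y1, V2, Y2
```

Since a parent contributes either V or a power of Y but never both, products such as V1*Y1 never appear, which matches the note above.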

This function fits models using Hurdle polynomials with degrees 1, 2, ..., max_uniform_degree, and automatically chooses the degree that minimizes the BIC. It is equivalent to calling zi_fit_pms() with all degree arguments equal to d, with d in 1, ..., max_uniform_degree, and returning the one with the smallest BIC.
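
The selection loop can be sketched in base R as follows. This is only a stand-in: it uses an ordinary polynomial regression and `stats::BIC()` rather than the package's Hurdle fits, and the simulated data and the name `max_degree` are assumptions made for the example. How the package scales its own BIC is not shown here.

```r
set.seed(1)
n <- 300
x <- rnorm(n)
y <- 1 + x - 0.5 * x^2 + rnorm(n, sd = 0.5)   # data with a quadratic signal

max_degree <- 4

# Fit one model per candidate degree and record its BIC
bics <- vapply(seq_len(max_degree), function(d) {
  fit <- lm(y ~ poly(x, d, raw = TRUE))
  BIC(fit)
}, numeric(1))

# Keep the degree with the smallest BIC
best_degree <- which.min(bics)
```

The same pattern applies when each fit is a call to `zi_fit_pms()` with all degree arguments equal to `d`, which is what this function is described as automating.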

If value_only == TRUE, returns the minimized negative log likelihood only. Otherwise, returns

`nll`	A number, the minimized negative log likelihood.
`par`	A vector of length `4*length(right)+3`, the fitted parameters, in the order of: the intercept for the `a` (a scalar), linear coefficients on `V[,right]` for `a`, linear coefficients on `Y[,right]` for `a`, the intercept for the `b` (a scalar), linear coefficients on `V[,right]` for `b`, linear coefficients on `Y[,right]` for `b`.
`n`	An integer, the sample size.
`effective_df`	`4*length(right)+3`, the effective degrees of freedom.

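# Simulate n = 1000 samples on a complete DAG with m = 3 nodes ("pms" parametrization)
m <- 3; n <- 1000
adj_mat <- make_dag(m, "complete")
dat <- gen_zero_dat(1, "pms", adj_mat, n, k_mode=1, min_num=10, gen_uniform_degree=1)
# Four extra Gaussian regressors, with penalty factors normalized to sum to 1
extra_regressors <- matrix(rnorm(n * 4), nrow=n)
extra_reg_pen_factors <- c(1, 2, 3, 4) / sum(c(1, 2, 3, 4))
# Fit node 3 given nodes 1 and 2 with degrees 1 and 2; return only the negative log likelihood
zi_fit_pms_choose_degree(dat$V, dat$Y, 3, 1:2, max_uniform_degree=2L,
    extra_regressors=extra_regressors, extra_reg_pen_factors=extra_reg_pen_factors,
    value_only=TRUE, print_best_degree=TRUE)
# Same fit, returning the full result (nll, par, n, effective_df)
zi_fit_pms_choose_degree(dat$V, dat$Y, 3, 1:2, max_uniform_degree=2L,
    extra_regressors=extra_regressors, extra_reg_pen_factors=extra_reg_pen_factors,
    value_only=FALSE, print_best_degree=TRUE)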

sqyu/ZiDAG documentation built on Jan. 19, 2021, 4:11 p.m.
