zi_fit_pms: Fits a Hurdle conditional model with pms parametrization of...
In sqyu/ZiDAG: Directed Graphical Models and Causal Discovery for Zero-Inflated Data


Fits a Hurdle conditional model with pms parametrization of specified degree.

zi_fit_pms(
  V,
  Y,
  left,
  right,
  extra_regressors = NULL,
  extra_reg_pen_factors = NULL,
  p_V_degree = 1,
  p_Y_degree = 1,
  p_Y_V_degree = 1,
  mu_V_degree = 1,
  mu_Y_degree = 1,
  mu_Y_V_degree = 1,
  value_only = TRUE,
  tol = 1e-08,
  maxit = 1e+05,
  seed = NULL,
  penalize_decider = function(X) { ncol(X) >= nrow(X)/2 },
  nfits = 10,
  runs = 2
)

`V`	A matrix of 0/1s, equal to Y != 0.
`Y`	A data matrix of the same size as `V`.
`left`	An integer between 1 and `ncol(Y)`. The index of the variable to be fit.
`right`	A vector of integers between 1 and `ncol(Y)` different from `left`. Indices of the "regressors".
`extra_regressors`	A matrix with the same number of rows as `V` and `Y`, extra regressors to be included in both regressions (conditional log odds/conditional mean). Defaults to `NULL`.
`extra_reg_pen_factors`	A vector of non-negative numbers, defaults to `NULL`. Penalty factors for `extra_regressors`. If the main design matrix has `d` columns, `c(rep(1, d), extra_reg_pen_factors)` will be passed as the `penalty.factor` argument to `glmnet::glmnet()`. If `intercept == TRUE`, a `0` will also be prepended.
`p_V_degree`	A non-negative integer, the degree for the `Vo` in the Hurdle polynomial for the conditional log odds. Defaults to 1.
`p_Y_degree`	A non-negative integer, the degree for the `Yo` in the Hurdle polynomial for the conditional log odds. Defaults to 1.
`p_Y_V_degree`	A non-negative integer, the degree for interaction between `Vo` and `Yo` in the Hurdle polynomial for the conditional log odds. Defaults to 1. If equal to 1, no interaction will be included (since it would be either a pure `V` term or a pure `Y` term).
`mu_V_degree`	A non-negative integer, the degree for the `Vo` in the Hurdle polynomial for the conditional mean. Defaults to 1.
`mu_Y_degree`	A non-negative integer, the degree for the `Yo` in the Hurdle polynomial for the conditional mean. Defaults to 1.
`mu_Y_V_degree`	A non-negative integer, the degree for interaction between `Vo` and `Yo` in the Hurdle polynomial for the conditional mean. Defaults to 1. If equal to 1, no interaction will be included (since it would be either a pure `V` term or a pure `Y` term).
`value_only`	If `TRUE`, returns the minimized negative log likelihood only. Defaults to `TRUE`.
`tol`	A number, the convergence tolerance. Defaults to `1e-8`. Passed to `stats::glm()` for unpenalized logistic regressions, or as the `thresh` argument to `glmnet::glmnet()` for both logistic and linear regressions if penalized.
`maxit`	An integer, the maximum number of iterations. Defaults to `100000`. Passed to `stats::glm()` for unpenalized logistic regressions, or to `glmnet::glmnet()` for both logistic and linear regressions if penalized.
`seed`	A number, the random seed passed to `zi_fit_lm()` for both regressions (conditional log odds/conditional mean).
`penalize_decider`	A logical or a function that takes a design matrix and returns a logical. Defaults to `function(X){ncol(X)>=nrow(X)/2}`. Used to decide whether to use penalized l2 (ridge) regression (if `TRUE`) when fitting each conditional distribution. Note that for either regression (conditional log odds/conditional mean), if the fits for unpenalized regressions are almost perfect, penalized regressions will be automatically used.
`nfits`	A positive integer, defaults to `10`. Used for penalized regressions, as the number of folds if `CV_BIC == TRUE` (the `nfolds` argument to `glmnet::cv.glmnet()`, with `nlambda` set to `100`), or as the number of lambdas if `CV_BIC == FALSE` (the `nlambda` argument to `glmnet::glmnet()`).
`runs`	A positive integer, the number of reruns. The fit with the maximum likelihood will be returned. Defaults to `2`.
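
As a rough illustration of the `penalize_decider` and `extra_reg_pen_factors` arguments described above, the following base R sketch evaluates the default decider on two made-up design matrices and assembles the penalty factor vector in the way the argument description states. The matrices `X_tall` and `X_wide` and the value of `d` are assumptions for illustration only; how the package builds its design matrix internally is not shown here.

```r
# Default decider from the usage section: penalize when the design matrix
# has at least half as many columns as rows.
penalize_decider <- function(X) ncol(X) >= nrow(X) / 2

# Two made-up design matrices (illustrative only).
X_tall <- matrix(0, nrow = 100, ncol = 10)  # 10 <  100 / 2
X_wide <- matrix(0, nrow = 100, ncol = 60)  # 60 >= 100 / 2

use_ridge_tall <- penalize_decider(X_tall)
use_ridge_wide <- penalize_decider(X_wide)

# Penalty factors as described for extra_reg_pen_factors: ones for the d
# main columns, followed by the user-supplied factors for the extra regressors.
d <- ncol(X_tall)
extra_reg_pen_factors <- c(1, 2, 3, 4) / sum(c(1, 2, 3, 4))
penalty_factor <- c(rep(1, d), extra_reg_pen_factors)
```

A constant `TRUE` or `FALSE` can also be supplied in place of the function, since the argument accepts a logical.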

A Hurdle conditional model with pms parametrization for the left node given those in right has log density, with respect to the sum of the Lebesgue measure and a point mass at 0, equal to (in terms of y) log(1-p) if y == 0, or log(p)-log(2*pi*sigmasq)/2-(y-mu)^2/2/sigmasq otherwise. That is, whether y is nonzero is a Bernoulli variable with probability of success p, and a nonzero y is Gaussian with conditional mean mu and conditional variance sigmasq. Here sigmasq is assumed constant, and the parameters log(p/(1-p)) and mu are Hurdle polynomials, i.e. polynomials in the values for right and their indicators. This function thus fits such a model using Y[,left], Y[,right] and V[,right] = (Y[,right] != 0), using a logistic regression for the log odds log(p/(1-p)) and a linear regression for mu.
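
The log density above can be sketched in base R as follows. This is a minimal illustration with constant `p`, `mu` and `sigmasq` (an intercept-only model), not the package's implementation; the helper names `hurdle_logdens` and `hurdle_nll` are hypothetical. The Gaussian part is computed with `dnorm(..., log = TRUE)`, which includes the normalizing term, and the sketch sums over samples; whether `zi_fit_pms()` sums or averages is not stated in this page.

```r
# Log density of one hurdle observation: point mass at 0 with
# probability 1 - p, Gaussian(mu, sigmasq) with probability p.
hurdle_logdens <- function(y, p, mu, sigmasq) {
  ifelse(y == 0,
         log(1 - p),
         log(p) + dnorm(y, mean = mu, sd = sqrt(sigmasq), log = TRUE))
}

# Negative log likelihood, summed over samples (an assumption of this sketch).
hurdle_nll <- function(y, p, mu, sigmasq) {
  -sum(hurdle_logdens(y, p, mu, sigmasq))
}

# Simulate zero-inflated data with made-up parameters.
set.seed(1)
n <- 1000; p <- 0.7; mu <- 2; sigmasq <- 1.5
v <- rbinom(n, 1, p)
y <- v * rnorm(n, mu, sqrt(sigmasq))

# Closed-form maximum likelihood estimates for the intercept-only case.
p_hat <- mean(y != 0)
mu_hat <- mean(y[y != 0])
sigmasq_hat <- mean((y[y != 0] - mu_hat)^2)

nll_true <- hurdle_nll(y, p, mu, sigmasq)
nll_hat <- hurdle_nll(y, p_hat, mu_hat, sigmasq_hat)
```

In the intercept-only case the likelihood factorizes into a Bernoulli part and a Gaussian part, so `nll_hat` cannot exceed `nll_true`.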

Writing Yo <- Y[,right], a Hurdle polynomial in parents Yo is a polynomial in Yo and their 0/1 indicators Vo.

The V_degree of a term that is a product of some columns of Vo only is the number of parents that appear in it. For example, V1 * V2 * V3 has V_degree equal to 3. Note that V1^p is equal to V1 for any p >= 1, so it does not make sense to include a power.

The Y_degree of a term that is a product of powers of some columns of Yo only is the degree of a polynomial in its usual sense. For example, Y1^2 * Y2 * Y3^3 has Y_degree equal to 2+1+3=6.

The Y_V_degree of a term that involves both some columns of Vo and some of Yo is the sum of the V_degree of the V part and the Y_degree of the Y part. For example, Y1^2 * V2 * Y3^3 * V4 * V5 has Y_V_degree equal to 2+1+3+1+1=8.

The design matrix thus includes all possible terms with V_degree, Y_degree, Y_V_degree less than or equal to those specified. For example, if Vo and Yo have two columns and V_degree == 2, Y_degree == 2, Y_V_degree == 2, the design matrix has columns V1, V2, V1*V2, Y1, Y2, Y1*Y2, Y1^2, Y2^2, Y1*V2, Y2*V1. Note that terms like V1*Y1 are not included, as they are equivalent to Y1. Parameters p_V_degree, p_Y_degree, p_Y_V_degree, mu_V_degree, mu_Y_degree, and mu_Y_V_degree specify these degrees for the regressions for the log odds log(p/(1-p)) and the conditional mean mu, respectively.
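
To make the two-parent example concrete, the following base R sketch builds the ten listed columns by hand from a small simulated `Yo`. It is only an illustration of the term list in the text; the simulated data are made up, and the column order and names the package uses internally are not claimed here.

```r
# Small simulated parent matrix with exact zeros (made-up data).
set.seed(1)
n <- 6
Yo <- matrix(rnorm(n * 2) * rbinom(n * 2, 1, 0.6), nrow = n,
             dimnames = list(NULL, c("Y1", "Y2")))
Vo <- (Yo != 0) * 1                 # 0/1 indicators of nonzero entries
colnames(Vo) <- c("V1", "V2")

# All terms with V_degree <= 2, Y_degree <= 2, Y_V_degree <= 2.
design <- cbind(
  V1 = Vo[, 1], V2 = Vo[, 2], "V1*V2" = Vo[, 1] * Vo[, 2],
  Y1 = Yo[, 1], Y2 = Yo[, 2], "Y1*Y2" = Yo[, 1] * Yo[, 2],
  "Y1^2" = Yo[, 1]^2, "Y2^2" = Yo[, 2]^2,
  "Y1*V2" = Yo[, 1] * Vo[, 2], "Y2*V1" = Yo[, 2] * Vo[, 1]
)

# V1*Y1 adds nothing: Y1 is already zero wherever V1 is zero.
v1y1_is_y1 <- all(Vo[, 1] * Yo[, 1] == Yo[, 1])
```

The final check reflects why mixed terms in the same parent are dropped from the design matrix.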

For automatically choosing a uniform degree <= a specified maximum degree, please use zi_fit_pms_choose_degree().

If value_only == TRUE, returns the minimized negative log likelihood only. Otherwise, returns

`nll`	A number, the minimized negative log likelihood.
`par`	A vector of length `4*length(right)+3`, the fitted parameters, in the order of: the intercept for the `a` (a scalar), linear coefficients on `V[,right]` for `a`, linear coefficients on `Y[,right]` for `a`, the intercept for the `b` (a scalar), linear coefficients on `V[,right]` for `b`, linear coefficients on `Y[,right]` for `b`.
`n`	An integer, the sample size.
`effective_df`	`4*length(right)+3`, the effective degrees of freedom.
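
One common use of these components is a BIC-style score. The sketch below applies the standard BIC formula to a made-up list shaped like the returned value; the numbers are fabricated for illustration, the helper `bic_from_fit` is hypothetical, and whether `nll` is summed or averaged over samples is an assumption the caller must check (the `nll_is_mean` flag covers both cases).

```r
# Standard BIC: 2 * (summed negative log likelihood) + log(n) * df.
# nll_is_mean = TRUE rescales a per-sample average back to a sum.
bic_from_fit <- function(fit, nll_is_mean = FALSE) {
  total_nll <- if (nll_is_mean) fit$n * fit$nll else fit$nll
  2 * total_nll + log(fit$n) * fit$effective_df
}

# Made-up values shaped like the components above, with length(right) == 2,
# so effective_df = 4 * 2 + 3 = 11.
fit <- list(nll = 1234.5, n = 1000L, effective_df = 11)
bic <- bic_from_fit(fit)
```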

library(ZiDAG)

# Simulate n samples from a complete DAG on m nodes
m <- 3; n <- 1000
adj_mat <- make_dag(m, "complete")
dat <- gen_zero_dat(1, "pms", adj_mat, n, k_mode=1, min_num=10, gen_uniform_degree=1)

# Four extra regressors with penalty factors that sum to 1
extra_regressors <- matrix(rnorm(n * 4), nrow=n)
extra_reg_pen_factors <- c(1, 2, 3, 4) / sum(c(1, 2, 3, 4))

# Fit node 3 given nodes 1 and 2; return only the minimized negative log likelihood
zi_fit_pms(dat$V, dat$Y, 3, 1:2, extra_regressors=extra_regressors,
    extra_reg_pen_factors=extra_reg_pen_factors, p_V_degree=2, p_Y_degree=2,
    p_Y_V_degree=2, mu_V_degree=2, mu_Y_degree=2, mu_Y_V_degree=2, value_only=TRUE)

# Same fit, returning nll, par, n and effective_df
zi_fit_pms(dat$V, dat$Y, 3, 1:2, extra_regressors=extra_regressors,
    extra_reg_pen_factors=extra_reg_pen_factors, p_V_degree=2, p_Y_degree=2,
    p_Y_V_degree=2, mu_V_degree=2, mu_Y_degree=2, mu_Y_V_degree=2, value_only=FALSE)

sqyu/ZiDAG documentation built on Jan. 19, 2021, 4:11 p.m.
