samplingDistCalculation: samplingDistCalculation


View source: R/samplingDistCalculation.R

Description

Internal function that sets up the subsampling distribution used to execute the stochastic version of a stagewise approach. The subsampling is conducted at the cluster level, not the individual observation level. Sampling probabilities are first calculated or provided for each observation individually, and then the sampling probability for each cluster is taken to be the average probability across all observations in the cluster.
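The cluster-level averaging described above can be sketched as follows. This is a minimal illustration, not the package's internal code: `clusterProbSketch` is a hypothetical helper, and the final rescaling so that the cluster probabilities sum to 1 is an assumption made here for readability.

```r
# Hypothetical helper (not part of sgee): turn per-observation sampling
# probabilities into per-cluster probabilities by averaging within clusters.
clusterProbSketch <- function(obsProb, clusterID) {
  stopifnot(length(obsProb) == length(clusterID), all(obsProb >= 0))
  # Keep clusters in order of first appearance
  clusterFactor <- factor(clusterID, levels = unique(clusterID))
  # Mean observation-level probability within each cluster
  clusterMeans <- tapply(obsProb, clusterFactor, mean)
  # Assumption: rescale so the cluster-level values sum to 1
  as.vector(clusterMeans / sum(clusterMeans))
}

# Made-up example: six observations in three clusters
clusterID   <- c(1, 1, 2, 2, 2, 3)
obsProb     <- c(0.2, 0.4, 0.1, 0.1, 0.1, 0.6)
clusterProb <- clusterProbSketch(obsProb, clusterID)
```

The result has one entry per cluster rather than one per observation, which matches the shape described under Value.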

Usage

samplingDistCalculation(sampleProb, y, x, clusterID, waves, beta, beta0, phi,
  alpha, offset, meanLinkInv, varianceLink, corstr, mu.eta)

Arguments

sampleProb

A user-provided value for the probability associated with each observation. sampleProb can be provided as 1) a vector of fixed values of length equal to the response vector y, 2) a function that takes in a list of values (the full list is given in the Note) and returns a vector of length equal to the response vector y, or 3) the default value of NULL, which results in a uniform distribution.

y

The vector of the response values provided to the original stagewise function

x

The covariate matrix provided to the original stagewise function

clusterID

The vector of cluster ID numbers provided to the original stagewise function

waves

The waves parameter identifying the order of observations within the clusters that is provided to the original stagewise function

beta

The vector of the current estimates of the coefficients

beta0

The current estimate of the intercept

phi

Current estimate of the scale parameter

alpha

Current estimate of the parameter affecting the within cluster correlation

offset

The offset in the linear predictor provided to the original stagewise function

meanLinkInv

The link inverse function from the family object provided to the original stagewise function indicating what family of mean and variance structure is assumed

varianceLink

The variance link function from the family object provided to the original stagewise function indicating what family of mean and variance structure is assumed

corstr

The structure of the working correlation matrix that was provided to the original stagewise function

mu.eta

Derivative function of mu, the conditional mean of the response, with respect to eta, the linear predictor, from the family object provided to the original stagewise function indicating what family of mean and variance structure is assumed
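As an illustration of the three forms sampleProb can take (fixed vector, function, or NULL), the sketch below resolves each one to a vector of per-observation values. `resolveSampleProbSketch` is a hypothetical helper written for this page, and the validation checks are assumptions rather than documented behavior of the package.

```r
# Hypothetical helper (not part of sgee): resolve sampleProb to a numeric
# vector with one non-negative value per observation.
resolveSampleProbSketch <- function(sampleProb, values) {
  n <- length(values$y)
  if (is.null(sampleProb)) {
    # Default: uniform over observations
    obsProb <- rep(1 / n, n)
  } else if (is.function(sampleProb)) {
    # User function receives the list of current values
    obsProb <- sampleProb(values)
  } else {
    # Fixed vector supplied directly
    obsProb <- sampleProb
  }
  # Assumed checks: correct length and no negative entries
  stopifnot(is.numeric(obsProb), length(obsProb) == n, all(obsProb >= 0))
  obsProb
}

# Made-up inputs; only y is needed for this illustration
values <- list(y = c(2, 0, 1, 5))
pNull  <- resolveSampleProbSketch(NULL, values)
pVec   <- resolveSampleProbSketch(c(1, 2, 3, 4), values)
pFun   <- resolveSampleProbSketch(function(values) abs(values$y) + 1, values)
```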

Value

The sampling distribution probabilities to be used for the subsampling. The distribution is provided as a vector with length equal to the number of clusters.

Note

Internal function.

The function provided to sampleProb (through the sgee.control function) needs to calculate a probability for each observation in the response vector y. How these calculations are done is up to the user. The following values are provided to the sampleProb function as a list called values: y, x, clusterID, waves, beta, beta0, phi, alpha, offset, meanLinkInv, varianceLink, corstr, mu.eta. Additionally, all of the values produced by sampleProb need to be non-negative.
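A user-supplied sampleProb function might look like the sketch below, which weights each observation by the size of its current residual. `residualSampleProb` is a hypothetical name, the residual-based weighting is just one possible choice, and treating a NULL offset as zero is an assumption; only the list element names come from the list described above.

```r
# Hypothetical user-supplied function: larger residuals get larger weights.
# `values` is the list passed in by the stagewise routine.
residualSampleProb <- function(values) {
  # Assumption: treat a missing offset as zero
  offset <- if (is.null(values$offset)) 0 else values$offset
  # Linear predictor and fitted mean under the current estimates
  eta <- values$beta0 + as.vector(values$x %*% values$beta) + offset
  mu  <- values$meanLinkInv(eta)
  # Absolute residuals are non-negative; the small constant keeps every
  # observation eligible for sampling
  abs(values$y - mu) + 1e-8
}

# Stand-alone check with made-up inputs that mimic part of the list
fam <- gaussian()
values <- list(
  y = c(1, 2, 3, 4),
  x = matrix(c(0, 1, 2, 3), ncol = 1),
  beta = 1,
  beta0 = 0.5,
  offset = rep(0, 4),
  meanLinkInv = fam$linkinv
)
probs <- residualSampleProb(values)
```

The returned vector has one non-negative entry per element of y, as the Note requires. Such a function would be supplied through sgee.control rather than called directly.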

Author(s)

Gregory Vaughan


sgee documentation built on May 1, 2019, 7:10 p.m.