# SobolIndices-class: Estimating Sobol Indices Using Sensitivity Analysis (in SobolSensitivity: Computing the Sobol Sensitivity Indices)

## Description

Sobol proposed the Sobol indices as a measure of the importance of a single variable or of the interaction among multiple variables. The `SobolIndices` class implements formulas, derived by sensitivity analysis, for the main effect Sobol indices under GLMs with three link functions (identity, log and logit), and automates the whole computation.

## Usage

```r
SobolIndices(xdata, varinput=1, beta=0, link=c("identity","log","logit"))
summary(object)
```

## Arguments

`xdata`:

A data set of class 'matrix' or 'data.frame' that includes only the variables or features.

`varinput`:

A vector; the indices of the variables of interest, for which the single-variable or interaction (usually high-order) main effect Sobol indices are computed.

`beta`:

A vector; the intercept and the coefficients of the variables, as estimated from the GLM.

`link`:

A character; the link function used in the GLM.

`object`:

An object of the `SobolIndices` class.

## Details

The proposed algorithm computes the Sobol indices using a simple strategy under a GLM with independent or multivariate normal inputs:

g(E(Y|X))=β_0 + X β_1

where X is the data matrix of the variables or features, g(.) is the link function of the GLM, and β=(β_0, β_1) is the vector of the estimated intercept and coefficients of the GLM. Note that β_0=0 if the GLM is fitted without an intercept.
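As a minimal sketch of where `beta` might come from, the following fits a Poisson GLM with base R's `glm()` on simulated data and extracts the intercept and coefficients with `coef()`. The simulated data, the seed and the names `true_beta`, `eta` and `fit` are illustrative assumptions, not part of the package.

```r
set.seed(1)
n <- 200

# Simulated feature matrix (hypothetical data, three variables)
xdata <- matrix(rnorm(n * 3), ncol = 3)

# Hypothetical true intercept and slopes used only to simulate a response
true_beta <- c(0.2, 0.5, -0.3, 0.1)
eta <- true_beta[1] + drop(xdata %*% true_beta[-1])

# Poisson response with a log link: log(E(Y|X)) = beta_0 + X beta_1
y <- rpois(n, lambda = exp(eta))

# Fit the GLM and collect (intercept, coefficients) in one vector
fit <- glm(y ~ xdata, family = poisson(link = "log"))
beta <- unname(coef(fit))
```

The resulting `beta` has length `ncol(xdata) + 1`, with the intercept first, which matches the layout of β=(β_0, β_1) described above.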

We derive the conditional expectations of the response with respect to the input subsets, and then estimate the main effect Sobol' sensitivity indices directly, either from closed-form formulas or numerically (approximately) from empirical variance estimates, for a large number of GLMs:

S_P=Var(E(Y|X_P))/Var(Y)

where P is the index set for the subset of variables of interest.
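To make the ratio concrete, here is a minimal base R sketch for the identity link with independent normal inputs. It is not taken from the package's internals. It assumes the response is replaced by its conditional mean β_0 + X β_1 (no noise term), so that Var(Y) reduces to the variance of the linear predictor; the helper name `sobol_identity_indep` and all simulated values are hypothetical.

```r
# Closed form under the stated assumptions:
# S_P = sum_{j in P} beta_j^2 sigma_j^2 / sum_j beta_j^2 sigma_j^2
sobol_identity_indep <- function(beta1, sigma, P) {
  contrib <- beta1^2 * sigma^2
  sum(contrib[P]) / sum(contrib)
}

set.seed(42)
n <- 1e5
mu    <- c(0, 1, -1)    # input means (hypothetical)
sigma <- c(1, 2, 0.5)   # input standard deviations (hypothetical)
beta0 <- 0.3            # intercept (hypothetical)
beta1 <- c(1, -0.5, 2)  # coefficients (hypothetical)
P <- c(1, 2)            # subset of variables of interest

# Simulate independent normal inputs, one column per variable
X <- sapply(seq_along(mu), function(j) rnorm(n, mu[j], sigma[j]))

# Noise-free response: the conditional mean given all inputs
y_mean <- beta0 + drop(X %*% beta1)

# E(Y | X_P): variables outside P are replaced by their means
cond_mean <- beta0 + drop(X[, P, drop = FALSE] %*% beta1[P]) +
  sum(mu[-P] * beta1[-P])

s_closed    <- sobol_identity_indep(beta1, sigma, P)
s_empirical <- var(cond_mean) / var(y_mean)
```

With these values each variable contributes equally to the variance, so the closed form gives 2/3 for P = {1, 2}, and the empirical ratio should land close to it. With correlated (multivariate normal) inputs the conditional mean also depends on the covariance between X_P and the remaining variables, so this shortcut no longer applies.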

The results (the numerators of the Sobol indices) enable an ANOVA-type variance decomposition analysis on data with multicollinearity, not only under Gaussian regression but also under other types of GLMs such as Poisson and logistic regression. The resulting main effect Sobol indices for the variables of interest are stored in the `sobol.indices` slot.

## Value

The `SobolIndices` function computes the main effect Sobol indices for the variables of interest, then constructs and returns an object of the `SobolIndices` class.

## Objects from the Class

Objects should be created using the `SobolIndices` constructor.

## Slots

`xdata`:

A data set of class 'matrix' or 'data.frame' which only includes the variables or features.

`varinput`:

A vector containing the indices of the variables of interest, for which the main effect Sobol indices are computed.

`beta`:

A vector containing the coefficients of the variables in a regression model.

`link`:

A character giving the link function used in the GLM.

`sobol.indices`:

A numeric value: the Sobol index of the variable(s) of interest.

## Methods

summary

`(object="SobolIndices")`: ...

## Author(s)

Min Wang <wang.1807@mbi.osu.edu>

## References

 Sobol, I. M. (1990). On sensitivity estimation for nonlinear mathematical models, Matematicheskoe Modelirovanie, 2, 112-118.

 Lu, R., Wang, D., Wang, M. and Rempala, G. (2016). Identifying Gene-gene Interactions Using Sobol Sensitivity Indices, submitted.

## See Also

See `identitySIfunction`, `logSIfunction` and `logitSIfunction` for the functions that compute the Sobol indices under the different link functions.

## Examples

```r
library(SobolSensitivity)
showClass("SobolIndices")

# simulate xdata and beta (intercept plus one coefficient per column)
xdata <- matrix(rnorm(20*5, 1), ncol=5)
beta <- runif(6, min=-1, max=1)

# the interaction of variables 1 and 2 is of interest
varinput <- c(1,2)

# link function is the identity link
link <- "identity"

# apply the proposed method
si <- SobolIndices(xdata, varinput=varinput, beta=beta, link=link)

# review the results
summary(si)
```