# Forward-backward variable selection: picking variables to maximize explained variance

### Description

Attempts to find the set of variables that best explains a single target variable in a data set. The algorithm alternates between adding the next best variable to the set and removing the variable (if any) whose exclusion maximizes the overall score.

### Usage

```
fbvs(dataSet,one,maxv,linear)
```

### Arguments

| Argument | Description |
| --- | --- |
| `dataSet` | The n x m data frame representing n observations of m variables. |
| `one` | A string specifying the name of one variable in the dataset, for which the best explanatory set is required. Defaults to the name of the last variable in the dataset. |
| `maxv` | An integer limiting the maximum number of variables in the explanatory set. Defaults to m-1. |
| `linear` | A boolean flag which causes `fbvs` to use a linear model to estimate R^2 instead of matie to estimate A when running the selection algorithm. Defaults to `FALSE`. |

### Details

Variable names are only added to the explanatory set if their inclusion results in an increase in the association measure.
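The add-then-remove loop described above can be illustrated with a minimal sketch in base R. This is not the matie implementation: the function names `r_squared` and `fb_select_sketch`, the tolerance `tol`, and the use of the built-in `mtcars` data are all assumptions made for illustration, and the score is the plain linear-model R^2 rather than the matie association measure A.

```r
# Illustrative forward-backward selection scored by linear-model R^2.
# NOT the matie implementation; all names below are hypothetical.

# R^2 of lm(target ~ vars); an empty explanatory set scores 0.
r_squared <- function(data, target, vars) {
  if (length(vars) == 0) return(0)
  summary(lm(reformulate(vars, response = target), data = data))$r.squared
}

fb_select_sketch <- function(data, target = names(data)[ncol(data)],
                             maxv = ncol(data) - 1, tol = 1e-8) {
  candidates <- setdiff(names(data), target)
  best <- character(0)
  score <- 0
  repeat {
    improved <- FALSE

    # Forward step: add the variable that raises the score the most,
    # but only if the score actually increases.
    remaining <- setdiff(candidates, best)
    if (length(best) < maxv && length(remaining) > 0) {
      trial <- vapply(remaining,
                      function(v) r_squared(data, target, c(best, v)),
                      numeric(1))
      if (max(trial) > score + tol) {
        best <- c(best, remaining[which.max(trial)])
        score <- max(trial)
        improved <- TRUE
      }
    }

    # Backward step: drop a variable only if doing so raises the score.
    # With plain R^2 a removal cannot raise the score, so this branch
    # only matters for measures that are not monotone in the variable set.
    if (length(best) > 1) {
      trial <- vapply(best,
                      function(v) r_squared(data, target, setdiff(best, v)),
                      numeric(1))
      if (max(trial) > score + tol) {
        best <- setdiff(best, best[which.max(trial)])
        score <- max(trial)
        improved <- TRUE
      }
    }

    if (!improved) break
  }
  list(one = target, best = best, Rsq = score)
}

res <- fb_select_sketch(mtcars, target = "mpg", maxv = 2)
print(res)
```

The sketch returns a list shaped like the one documented under Value, so the effect of `maxv` can be explored by changing it and comparing the resulting `best` sets.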

### Value

Returns a list containing the following items:

| Item | Description |
| --- | --- |
| `one` | The name of the one variable that requires the explanatory set. |
| `best` | The best set of explanatory variables. |
| `Rsq` | An estimate for R^2 provided by the best set of explanatory variables. |

### Note

The data set can be of any dimension.

### Author(s)

Ben Murrell, Dan Murrell & Hugh Murrell.

### References

Discovering general multidimensional associations, http://arxiv.org/abs/1303.1828

### See Also

`ma`

`agram`

### Examples

```
# find the best explanatory set for Salary in the baseball dataset
library(matie)
data(baseballData)
fbvs(baseballData,one="Salary")
fbvs(baseballData,one="Salary",linear=TRUE)
# limit the explanatory set to at most two variables
fbvs(baseballData,one="Salary",maxv=2)
fbvs(baseballData,one="Salary",maxv=2,linear=TRUE)
```
