# R/learn.R

#' @title Learn a Regression Model from the Given Data Set
#' @description Use the \code{learnerSelectoR} package to apply a set of
#' learners to a set of data representations and pick the approach which
#' generalizes best.
#'
#' The data is represented by two vectors, \code{x} and \code{y}.
#'
#' Each learner must be a function with exactly four arguments named
#' \code{metric}, \code{transformation.x}, \code{transformation.y}, and
#' \code{metric.transformed}. Its parameter \code{metric} will be an instance
#' of \code{\link{RegressionQualityMetric}} which guides the search on the
#' actual, raw data. However, since we internally use the
#' \code{\link{Transformation.applyDefault2D}} method from the
#' \code{dataTransformeR} package by default to generate different
#' representations of the raw data, each model fitting procedure may take
#' place in two steps: first on a transformed representation of the data
#' (\code{metric.transformed}, based on \code{transformation.x} and
#' \code{transformation.y}), and then in a finalization step which fits
#' against the actual \code{metric}.
#'
#' A learner returns an instance of \code{\link{FittedModel}} which
#' represents the model it has fitted to its input data. Each learner
#' thus represents the process of adapting a specific model to some data.
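#'
#' For illustration, the shape of such a learner could be sketched as
#' follows; \code{fitInTransformedSpace} and \code{finalizeOnRawData} are
#' hypothetical placeholder functions, not part of this package:
#' \preformatted{
#' myLearner <- function(metric, transformation.x,
#'                       transformation.y, metric.transformed) {
#'   # step 1: fit a model guided by the metric on the transformed data
#'   model <- fitInTransformedSpace(metric.transformed);
#'   # step 2: finalize the fit against the actual, raw-data metric
#'   model <- finalizeOnRawData(model, metric);
#'   # return an instance of FittedModel, or NULL on failure
#'   return(model);
#' }
#' }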
#'
#' By default, this method uses all the learners which are generated by
#' \code{\link{regressoR.defaultLearners}}.
#'
#' The \code{metricGenerator} is a function which accepts two vectors \code{x}
#' and \code{y} and returns an instance of
#' \code{\link{RegressionQualityMetric}}. It will be used to generate the
#' quality metrics for guiding the model fitters. Since we internally use the
#' \code{learning.learn} method from the \code{learnerSelectoR}
#' package, the model may be chosen based on cross-validation and the metric
#' generator is then also used to generate quality metrics for the training
#' and test datasets used internally. If nothing else is specified, we use
#' \code{\link{RegressionQualityMetric.default}} to generate the quality
#' metrics.
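#'
#' A custom metric generator is thus any function with this signature; a
#' minimal sketch which simply delegates to the default implementation could
#' look like:
#' \preformatted{
#' myMetricGenerator <- function(x, y) {
#'   # produce a RegressionQualityMetric for the data (x, y)
#'   RegressionQualityMetric.default(x, y)
#' }
#' }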
#'
#' \code{representations} is a list of \code{\link{TransformedData2D}}
#' instances providing alternative views on the data, or \code{NULL} if only
#' the raw data should be considered. By default, we use
#' \code{\link{Transformation.applyDefault2D}} to get a set of representations
#' if nothing else is specified.
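#'
#' For example, to suppress all alternative representations and fit on the
#' raw data only, one could call (sketch):
#' \preformatted{
#' model <- regressoR.learn(x=x, y=y, representations=NULL)
#' }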
#'
#' The return value of this method will be an instance of
#' \code{\link{FittedModel}} or \code{NULL} if no learner could produce any
#' result.
#' @param x the \code{x} coordinates, i.e., the input values
#' @param y the \code{y} coordinates, i.e., the output values
#' @param learners the learners to apply
#' @param representations the list of data representations, or \code{NULL} if
#'   fitting should take place only on the raw data
#' @param metricGenerator the metric generator function
#' @param q the effort parameter: 0 = minimum effort = fast but low quality,
#'   1 = maximum effort = slow but highest quality
#' @return an instance of \code{\link{FittedModel}} which represents the
#'   relationship between the \code{x} and \code{y} values
#' @export regressoR.learn
#' @importFrom regressoR.base regressoR.applyLearners
#' @importFrom dataTransformeR Transformation.applyDefault2D
#' @importFrom regressoR.quality RegressionQualityMetric.default
#' @include defaultLearners.R
#' @include learnTrivial.R
#' @examples \dontrun{
#' dx <- rnorm(100);
#' dy <- rnorm(n=100, mean=50*dx*dx-33);
#' plot(dx, dy)
#' result <- regressoR.learn(x=dx, y=dy);
#' result@f
#' # function (x)
#' # -32.5442186855071 + (x * (0.776119279549966 + (x * 49.7907873618706)))
#' result@quality
#' # [1] 0.2455075
#' dx.sorted <- sort(dx)
#' lines(dx.sorted, result@f(dx.sorted), col="red")
#' }
regressoR.learn <- function(x, y, learners = regressoR.defaultLearners(),
                            representations=Transformation.applyDefault2D(x=x, y=y, addIdentity=TRUE),
                            metricGenerator=RegressionQualityMetric.default,
                            q=0.75) {
  # is there a trivial solution?
  res <- .regressoR.learnTrivial(x, y);
  if(!is.null(res)) {
    # yes, then no learning is necessary
    return(res);
  }
  # no, so let's learn
  return(regressoR.applyLearners(x=x, y=y, learners=learners,
                                 representations=representations,
                                 metricGenerator=metricGenerator,
                                 q=q));
}
# thomasWeise/regressoR documentation built on May 9, 2019, 8:12 p.m.