# R/FIT.R In FIT: Transcriptomic Dynamics Models in Field Conditions

#' FIT: a statistical modeling tool for transcriptome dynamics under fluctuating field conditions
#'
#'
#' Provides functionality for constructing statistical models of transcriptomic dynamics in field
#' conditions. It further offers the function to predict expression of a gene given the attributes
#' of samples and meteorological data. Nagano, A. J., Sato, Y., Mihara, M., Antonio, B. A.,
#' Motoyama, R., Itoh, H., Naganuma, Y., and Izawa, T. (2012). <doi:10.1016/j.cell.2012.10.048>.
#' Iwayama, K., Aisaka, Y., Kutsuna, N., and Nagano, A. J. (2017).
#' <doi:10.1093/bioinformatics/btx049>.
#'
#' @references
#' [Nagano et al.] A.J.~Nagano, et al.
#' Deciphering and prediction of transcriptome dynamics under fluctuating field conditions,''
#' Cell~151, 6, 1358--69 (2012)
#'
#' [Iwayama] K.~Iwayama, et al.
#' FIT: statistical modeling tool for transcriptome dynamics under fluctuating field conditions,''
#' Bioinformatics, btx049 (2017)
#'
#' @section Overview:
#' The \pkg{FIT} package is an \code{R} implementation
#' of a class of transcriptomic models that
#' relates gene expressions of plants and weather conditions to which
#' the plants are exposed.
#' (The reader is referred to [Nagano et al.] for the detail of
#' the class of models concerned.)
#'
#' By providing
#' (a) gene expression profiles of plants brought up in a field condition,
#' and (b) the relevant weather history (temperature etc.) of the said field,
#' the user of the package is able to
#' (1) construct optimized models (one for each gene) for their expressions,
#' and
#' (2) use them to predict the expressions for another weather history
#' (possibly in a different field).
#'
#' Below, we briefly explain
#' the construction of the optimized models (training phase'')
#' and the way to use them to make predictions (prediction phase'').
#'
#' \subsection{Model training phase}{
#'
#' The model of [Nagano et al.] belongs to the class of statistical models
#' called linear models''
#' and are specified by a set of parameters'' and
#' (linear regression) coefficients''.
#' The former are used to convert weather conditions to
#' the input variables'' for a regression, and the latter are then
#' multiplied to the input variables to form the expectation values
#' for the gene expressions.
#' The reader is referred to the original article [Nagano et al.]
#' for the formulas for the input variables.
#'
#' The training phase consists of three stages:
#' \enumerate{
#' \item \code{Init}: fixes the initial model parameters
#' \item \code{Optim}: optimizes the model parameters
#' \item \code{Fit}: fixes the linear regression coefficients
#' }
#' The user can configure the training phase
#' through a custom data structure (recipe''),
#' which can be constructed by using the utility function
#' \code{FIT::make.recipe()}.
#'
#' The role of the first stage \code{Init} is to fix the initial values
#' for the model parameters from which the parameter optimization is performed.
#' At the moment two methods, \code{'manual'} and \code{'gridsearch'},
#' are implemented.
#' With the \code{'manual'} method the user can simply specify the set of
#' initial values that he thinks is promising.
#' For the \code{'gridsearch'} method the user discretizes
#' the parameter space to a grid by providing
#' a finite number of candidate values for each parameter.
#' \pkg{FIT} then performs a search over the grid
#' for the best'' combinations of the initial parameters.
#' % In both cases relevant data are passed through \code{init.data}.
#'
#' The second stage \code{Optim} is the main step of the model training,
#' and \pkg{FIT} tries to gradually improve the model parameters
#'
#' This stage could be run one or more times where each can be run
#' using the method \code{'none'}, \code{'lm'} or \code{'lasso'}.
#' The \code{'none'} method passes the given parameter as-is
#' to the next method in the \code{Optim} pipeline or to the next stage \code{Fit}.
#' (Basically, the method is there so that the user can skip the entire
#' \code{Optim} stage, but the method could be used for slightly warming-up the CPU as well.)
#'
#' The \code{'lm'} method uses the a simple (weighted) linear regression to
#' guide the parameter optimization. That is, \pkg{FIT}
#' first computes the input variables'' from the current parameters and
#' associated weather data, and then finds the set of linear coefficients
#' that best explains the output variables'' (gene expressions).
#' Finally, the quadratic residual is used as the measure for the
#' error and is fed back to the Nelder-Mead method.
#'
#' The \code{'lasso'} method is similar to the \code{'lm'} method
#' but uses the (weighted) Lasso regression
#' (linear'' regression with an L1-regularization for the regression coefficients)
#' instead of the simple linear regression.
#' \pkg{FIT} uses the \pkg{glmnet} package to perform
#' the Lasso regression and the strength of the L1-regularization
#' is fixed via a cross validation. (See \code{cv.glmnet()} from the \pkg{glmnet}
#' package.
#' The Lasso regression is said to suppress irrelevant input variables automatically
#' and tends to create models with better prediction ability.
#' On the other hand, \code{'lasso'} runs considerably slower than \code{'lm'}.
#'
#' For example, passing a vector \code{c('lm', 'lasso')} to the
#' argument \code{optim} (of \code{make.recipe()}) creates a recipe
#' that instructs the \code{Optim} stage to
#' (1) first optimize using the \code{'lm'} method,
#' (2) and then fine tunes the parameters using the \code{'lasso'} method.
#'
#' After fixing the model parameters in the \code{Optim} stage,
#' the \code{Fit} stage can be used to fix the linear coefficients
#' of the models.
#' Here, either \code{'fit.lm'} or \code{'fit.lasso'} can be used
#' to find the best'' coefficients, the main difference being that
#' the coefficients are penalized by an L1-norm for the latter.
#' Note that it is perfectly okay to use \code{'fit.lasso'} for
#' the parameters optimized using \code{'lm'}.
#'
#' In order to prepare for the possibly huge variations
#' of expression data as measured by RNA-seq,
#' \pkg{FIT} provides a way to weight regression penalties from each sample
#' with different weights as in
#' \code{sum_{s in samples} (weight_s) (error_s)^2}.
#'
#' } % subsection model training
#'
#' \subsection{Prediction phase}{
#' For each gene, the trained model of the previous subsection
#' can be thought of as a black box that maps
#' the field conditions (weather data),
#' to which a plant containing the gene is exposed,
#' to its expected expression.
#' \pkg{FIT} provides a simple function
#' \code{FIT::predict()} that does just this.
#'
#' \code{FIT::predict()} takes as its argument
#' a list of pretrained models
#' as well as actual/hypothetical plant sample attributes and weather data,
#' and returns the predicted values of gene expressions.
#'
#' When there is a set of actually measured expressions,
#' an associated function \code{FIT::prediction.errors()})
#' can be used to check the validity of the predictions made by
#' the models.
#' } % subsection prediction phase
#'
#' @section Namespece contamination:
#' The \pkg{FIT} package exports fairly ubiquitous names
#' auch as \code{optim}, \code{predict} etc.\ as its API.
#' via \code{requireNamespace('FIT')} and use its API function with
#' a namaspace qualifier (e.g.~\code{FIT::optim()})
#'
#' @section Basic usage:
#' See vignettes for examples of actual scripts that use \pkg{FIT}.
#'
#' @examples
#' \dontrun{
#' # The following snippet shows the structure of a typical
#' # driver script of the FIT package.
#' # See vignettes for examples of actual scripts that use FIT.
#'
#' ##############
#' ## training ##
#' ##############
#' ## discretized parameter space (for 'gridsearch')
#' grid.coords <- list(
#'   clock.phase = seq(0, 23*60, 1*60),
#'   # :
#' )
#'
#' ## create a training recipe
#'                            init  = 'gridsearch',
#'                            init.data = grid.coords,
#'                            optim = c('lm'),
#'                            fit   = 'fit.lasso',
#'                            time.step = 10,
#'                            opts =
#'                              list(lm    = list(maxit = 900),
#'                              lasso = list(maxit = 1000))
#'                            )
#'
#' ## names of genes to construct models
#' genes <- c('Os12g0189300', 'Os02g0724000')
#'
#' }
#'
#'
#' \dontrun{
#' training.expression <- FIT::load.expression('expression.2008.dat', 'ex', genes)
#'
#' ## models will be a list of trained models (length: ngenes)
#' models <- FIT::train(training.expression,
#'                      training.attribute,
#'                      training.weather,
#'                      recipe)
#'
#' }
#'
#' ################
#' ## prediction ##
#' ################
#'
#' \dontrun{
#' prediction.expression <- FIT::load.expression('expression.2009.dat', 'ex', genes)
#'
#' ## predict
#' prediction.result <- FIT::predict(models[[1]],
#'                                  prediction.attribute,
#'                                  prediction.weather)
#'
#'
#'}
#'
#' @docType package
#' @name FIT
NULL
#> NULL

################################################################
###
#' @useDynLib FIT
#' @importFrom Rcpp sourceCpp
NULL
#> NULL

# cleanup
}

######################################################################
### Namespaces for submodules

# internal use only
Model <- new.env()
Train <- new.env()
IO    <- new.env()
Norm  <- new.env()
Jma   <- new.env()

######################################################################
### Enduser API functions for model construction (training) and prediction

#' Supported weather factors.
#' @examples
#' length(FIT::weather.entries)
#' @export
weather.entries <- c('wind', 'temperature', 'humidity',

#' Creates a recipe for training models.
#'
#' @param envs An array of weather factors to be taken into account
#'     during the construction of models.
#'     At the moment, the array \code{envs} can only contain a single weather factor
#'     from \code{weather.entries}, though there is a plan to remove the restriction
#'     in a future version.
#' @param init A string to specify the method to choose the initial parameters.
#'     (One of \code{'gridsearch'} or \code{'manual'}.)
#' @param optim A string to specify the method to be used for optimizing
#'     the model parameters.
#'     (One of \code{'none'}, \code{'lm'} or \code{'lasso'})
#' @param fit A string to specify the method to be used for fixing
#'     the linear regression coefficients.
#'     (One of \code{'fit.lm'} or \code{'fit.lasso'}.)
#' @param init.data Auxiliary data needed to perform the Init stage
#'     using the method specified by the \code{init} argument.
#'     When \code{init} is \code{'gridsearch'}, it should be a list representing
#'     a discretized parameter space.
#'     When \code{init} is \code{'manual'}, it should be a list of parameter
#'     values that is used as the initial values for the parameters in
#'     the Optim stage.
#' @param time.step An integer to specify the basic unit of time (in minute)
#'     for the transcriptomic models.
#'     Must be a multiple of the time step of weather data.
#' @param gate.open.min The minimum opning length in minutes of the gate function for
#'     environmental inputs.
#' @param opts An optional named list that specifies the arguments to be passed
#'     to methods that constitute each stage of the model training.
#'     Each key of the list corresponds to a name of a method.
#'
#'     See examples for the supported options.
#' @return An object representing the procedure to construct models.
#' @examples
#' \dontrun{
#' init.params <- .. # choose them wisely
#' # Defined in Train.R:
#' # default.opts <- list(
#' #   none  = list(),
#' #   lm    = list(maxit=1500, nfolds=-1), # nfolds for lm is simply ignored
#' #   lasso = list(maxit=1000, nfolds=10)
#' # )
#' recipe <- FIT::make.recipe(c('wind', 'temperature'),
#'                            init = 'manual',
#'                            init.data = init.params,
#'                            optim = c('lm', 'none', 'lasso'),
#'                            fit = 'fit.lasso',
#'                            time.step = 10,
#'                            opts =
#'                              list(lm    = list(maxit = 900),
#'                                   lasso = list(maxit = 1000)))
#' }
#'
#' @export
make.recipe <- function(envs, init, optim, fit, init.data, time.step,
gate.open.min = 0, opts = NULL) {
all <- weather.entries
if (!all(vapply(envs, function(o) o %in% all, TRUE))) stop('some envs are invalid: ', envs)
if (length(time.step) != 1) stop('multiple time.step forbidden: ', time.step)
if (init != 'gridsearch' && init != 'manual') stop('invalid init method: ', init)
if (!all(vapply(optim, function(o) o %in% c('none', 'lm', 'lasso'), TRUE)))
stop('some optim methods are invalid: ', optim)
if (fit != 'fit.lm' && fit != 'fit.lasso') stop('invalid fitting method: ', fit)
if (gate.open.min < 0 || gate.open.min > 1440) stop('invalid gate.open.min')
if (is.null(opts)) opts <- list()

Model$Recipe(envs=envs, init=init, optim=optim, fit=fit, init.data=init.data, time.step=as.integer(time.step), gate.open.min=gate.open.min, opts=opts) } ### Notation: env runs over envs; e runs over entries of an env #' Constructs models following a recipe. #' #' @param expression An object that represents gene expression data. #' The object can be created from a dumped/saved dataframe #' of size \code{nsamples * ngenes} #' using \code{FIT::load.expression()}. #' (At the moment it is an instance of a hidden class IO$Expression,
#'     but this may be subject to change.)
#' @param attribute An object that represents the attributes of
#'     microarray/RNA-seq data.
#'     The object can be created from a dumped/saved dataframe
#'     of size \code{nsamples * nattributes}
#'     (At the moment it is an instance of a hidden class IO$Attribute, #' but this may be subject to change.) #' @param weather An object that represents actual or hypothetical weather data #' with which the training of models are done. #' The object can be created from a dumped/saved dataframe #' of size \code{ntimepoints * nfactors} #' using \code{FIT::load.weather()}. #' (At the moment it is an instance of a hidden class IO$Weather,
#'     but this may be subject to change.)
#' @param recipe An object that represents the training protocol of models.
#'     A recipe can be created using \code{FIT::make.recipe()}.
#' @param weight An optional numerical matrix of size \code{nsamples * ngenes}
#'     that during regression penalizes errors from each sample
#'     using the formula
#'     \code{sum_{s in samples} (weight_s) (error_s)^2}.
#'
#'     This argument is optional for a historical reason,
#'     and when it is omitted, all samples are equally penalized.
#' @param min.expressed.rate A number used to
#'   A gene with \code{var(expr) < thres.expr} is regarded as unexpressed,
#'   and \pkg{FIT} sets its model as: \code{expr = log(offset) + 0*inputs}.
#' @return A collection of trained models.
#'
#' @examples
#' \dontrun{
#' # create recipe
#' recipe <- FIT::make.recipe(..)
#'
#  genes <- c('Os01g0182600', 'Os02g0618200')
#' training.expression <- FIT::load.expression('expression.2008.dat', 'ex', genes)
#' training.weight     <- FIT::load.weight('weight.2008.dat', 'weight', genes)
#'
#' # train models
#' models <- FIT::train(training.expression,
#'                      training.attribute,
#'                      training.weather,
#'                      recipe,
#'                      training.weight)
#' }
#' @export
train <- function(expression, attribute, weather, recipe, weight = NULL, min.expressed.rate = 0.01) {
genes <- expression$entries exprs <- expression$rawdata
if (!all(genes == colnames(exprs))) stop('inconsistent expression data')

samples.n <- nrow(exprs)
# genes.n   <- ncol(exprs)

if (nrow(attribute$data) != samples.n) stop('inconsistent attribute data') if (is.null(weight)) weight <- IO$trivialWeights(samples.n, genes)
if (!all(dim(expression$rawdata) == dim(weight$rawdata))) stop('inconsistnet weight data')
weights <- weight$rawdata cat('# * Training..\n') if (recipe$time.step %% weather$data.step != 0) stop('recipe$time.step (= ', recipe$time.step, ') must be an integral multiple of weather$data.step, (= ',
weather$data.step, ')') cat('# ** Prep+Init:\n') models <- Train$init(exprs, weights, attribute$data, weather$data,
recipe$envs, recipe$init, recipe$init.data, weather$data.step, recipe$time.step) cat('# ** Optim ('); cat(recipe$optim, sep=', '); cat('):\n')
os <- recipe$optim for (o in os) models <- Train$optim(exprs, weights, attribute$data, weather$data, models, o,
weather$data.step, recipe$time.step,
recipe$opts[[o]]$maxit, recipe$opts[[o]]$nfolds,
min.expressed.rate,
recipe$gate.open.min) cat('# ** Creating optimized models\n') models <- Train$fit(exprs, weights, attribute$data, weather$data, models, recipe$fit, weather$data.step, recipe$time.step) cat('# Done (training)\n') names(models) <- genes # should be unnecessary, but for document purpose.. models } ################################ ## From this layer down, ## (1) weights are non-optional, and ## (2) placed next to exprs ## ## Users are not recommended to use them directly. #' A raw API for initializing model parameters. #' #' Note: use \code{train()} unless the user is willing to #' accept breaking API changes in the future. #' #' @param expression An object that represents gene expression data. #' The object can be created from a dumped/saved dataframe #' of size \code{nsamples * ngenes} #' using \code{FIT::load.expression()}. #' (At the moment it is an instance of a hidden class IO$Attribute,
#'     but this may be subject to change.)
#' @param weight A matrix of size \code{nsamples * ngenes}
#'     that during regression penalizes errors from each sample
#'     using the formula
#'     \code{sum_{s in samples} (weight_s) (error_s)^2}.
#'
#'     Note that, unlike for \code{FIT::train()}, this argument
#'     is NOT optional.
#' @param attribute An object that represents the attributes of a
#'     microarray/RNA-seq data.
#'     The object can be created from a dumped/saved dataframe
#'     of size \code{nsamples * nattributes}
#'     (At the moment it is an instance of a hidden class IO$Attribute, #' but this may be subject to change.) #' @param weather An object that represents actual or hypothetical weather data #' with which the training of models are done. #' The object can be created from a dumped/saved dataframe #' of size \code{ntimepoints * nfactors} #' using \code{FIT::load.weather()}. #' (At the moment it is an instance of a hidden class IO$Weather,
#'     but this may be subject to change.)
#' @param recipe An object that represents the training protocol of models.
#'     A recipe can be created using \code{FIT::make.recipe()}.
#' @return A collection of models whose parameters are
#'     set by using the \code{'init'} method in the argument \code{recipe}.
#' @export
init <- function(expression, weight, attribute, weather, recipe) {
Train$init(expression$rawdata, weight$rawdata, attribute$data, weather$data, recipe$envs, recipe$init, recipe$init.data,
weather$data.step, recipe$time.step)
}

#' A raw API for optimizing model parameters.
#'
#' Note: use \code{train()} unless the user is willing to
#' accept breaking API changes in the future.
#'
#' @param expression An object that represents gene expression data.
#'     The object can be created from a dumped/saved dataframe
#'     of size \code{nsamples * ngenes}
#'     (At the moment it is an instance of a hidden class IO$Attribute, #' but this may be subject to change.) #' @param weight A matrix of size \code{nsamples * ngenes} #' that during regression penalizes errors from each sample #' using the formula #' \code{sum_{s in samples} (weight_s) (error_s)^2}. #' #' Note that, unlike for \code{FIT::train()}, this argument #' is NOT optional. #' @param attribute An object that represents the attributes of #' microarray/RNA-seq data. #' The object can be created from a dumped/saved dataframe #' of size \code{nsamples * nattributes} #' using \code{FIT::load.attribute()}. #' (At the moment it is an instance of a hidden class IO$Attribute,
#'     but this may be subject to change.)
#' @param weather An object that represents actual or hypothetical weather data
#'     with which the training of models are done.
#'     The object can be created from a dumped/saved dataframe
#'     of size \code{ntimepoints * nfactors}
#'     (At the moment it is an instance of a hidden class IO$Weather, #' but this may be subject to change.) #' @param recipe An object that represents the training protocol of models. #' A recipe can be created using \code{FIT::make.recipe()}. #' @param models A collection of models being trained as is returnd by #' \code{FIT::init()}. #' #' At this moment, it must be a list (genes) of a list (envs) of models #' and must contain at least one model. #' (THIS MIGHT CHANGE IN A FUTURE.) #' @param maxit An optional number that specifies #' the maximal number of times that the parameter optimization is performed. #' #' The user can control this parameter by using the \code{opts} argument #' for \code{FIT::train()}. #' @param nfolds An optional number that specifies the order of #' cross validation when \code{optim} method is \code{'lasso'}. #' This is simply ignored when \code{optim} method is \code{'lm'}. #' @return A collection of models whose parameters are #' optimized by using the \code{'optim'} pipeline #' in the argument \code{recipe}. #' @export optim <- function(expression, weight, attribute, weather, recipe, models, maxit = NULL, nfolds = NULL) { # DOC: 'models' must be a list (genes) of a list (envs) of models # and must contain at least one model. os <- recipe$optim
log <- models[[1]][[1]][['log']][-1]
log <- log[log != 'none']
if (length(log) < length(os)) {
Train$optim(expression$rawdata, weight$rawdata, attribute$data, weather$data, models, os[[length(log)+1]], weather$data.step, recipe$time.step, maxit, nfolds, gate.open.min = recipe$gate.open.min)
} else {
models
}
}

#' A raw API for fixing linear regression coefficients.
#'
#' Note: use \code{train()} unless the user is willing to
#' accept breaking API changes in the future.
#'
#' @param expression An object that represents gene expression data.
#'     The object can be created from a dumped/saved dataframe
#'     of size \code{nsamples * ngenes}
#'     (At the moment it is an instance of a hidden class IO$Attribute, #' but this may be subject to change.) #' @param weight A matrix of size \code{nsamples * ngenes} #' that during regression penalizes errors from each sample #' using the formula #' \code{sum_{s in samples} (weight_s) (error_s)^2}. #' #' Note that, unlike for \code{FIT::train()}, this argument #' is NOT optional. #' @param attribute An object that represents the attributes of #' microarray/RNA-seq data. #' The object can be created from a dumped/saved dataframe #' of size \code{nsamples * nattributes} #' using \code{FIT::load.attribute()}. #' (At the moment it is an instance of a hidden class IO$Attribute,
#'     but this may be subject to change.)
#' @param weather An object that represents actual or hypothetical weather data
#'     with which the training of models are done.
#'     The object can be created from a dumped/saved dataframe
#'     of size \code{ntimepoints * nfactors}
#'     (At the moment it is an instance of a hidden class IO$Weather, #' but this may be subject to change.) #' @param recipe An object that represents the training protocol of models. #' A recipe can be created using \code{FIT::make.recipe()}. #' @param models A collection of models being trained as is returnd by #' \code{FIT::optim()}. #' @return A collection of models whose parameters and regression coeffients #' are optimized. #' @export fit.models <- function(expression, weight, attribute, weather, recipe, models) { Train$fit(expression$rawdata, weight$rawdata, attribute$data, weather$data,
models, recipe$fit, weather$data.step, recipe$time.step) } ################################################################ #' Predicts gene expressions using pretrained models. #' #' @param models A list of trained models for the genes of interest. #' #' At the moment the collection of trained models returned #' by \code{FIT::train()} cannot be directly passed to \code{FIT::predict()}: #' the user has to explicitly convert it to an appropriate format by using #' \code{FIT::train.to.predict.adaptor()}. #' (This restriction might be removed in a future.) #' @param attribute An object that represents the attributes of #' microarray/RNA-seq data. #' The object can be created from a dumped/saved dataframe #' of size \code{nsamples * nattributes} #' using \code{FIT::load.attribute()}. #' (At the moment it is an instance of a hidden class IO$Attribute,
#'     but this may be subject to change.)
#' @param weather An object that represents actual or hypothetical weather data
#'     with which predictions of gene expressions are made.
#'     The object can be created from a dumped/saved dataframe
#'     of size \code{ntimepoints * nfactors}
#'     (At the moment it is an instance of a hidden class IO$Weather, #' but this may be subject to change.) #' @return A list of prediction results as returned by the models. #' #' @examples #' \dontrun{ #' # prepare models #' # NOTE: FIT::train() returns a nested list of models #' # so we have to flatten it using FIT::train.to.predict.adaptor() #' # before passing it to FIT::predict(). #' models <- FIT::train(..) #' models.flattened <- FIT::train.to.predict.adaptor(models) #' #' # load data used for prediction #' prediction.attribute <- FIT::load.attribute('attribute.2009.txt') #' prediction.weather <- FIT::load.weather('weather.2009.dat', 'weather') #' prediction.expression <- FIT::load.expression('expression.2009.dat', 'ex', genes) #' #' prediction.results <- FIT::predict(models.flattened, #' prediction.attribute, #' prediction.weather) #' } #' @export predict <- function(models, attribute, weather) { lapply(models, function(m) m$predict(attribute$data, weather$data, weather$data.step)) } #' Computes the prediction errors using the trained models. #' #' @param models A list of trained models for the genes of interest. #' #' At the moment the collection of trained models returned #' by \code{FIT::train()} cannot be directly passed to \code{FIT::predict()}: #' the user has to explicitly convert it to an appropriate format by using #' \code{FIT::train.to.predict.adaptor()}. #' (This restriction might be removed in a future.) #' @param expression An object that represents the actual measured data of #' gene expressions. #' The object can be created from a dumped/saved dataframe #' of size \code{nsamples * ngenes} #' using \code{FIT::load.expression()}. #' (At the moment it is an instance of a hidden class IO$Attribute,
#'     but this may be subject to change.)
#' @param attribute An object that represents the attributes of
#'     microarray/RNA-seq data.
#'     The object can be created from a dumped/saved dataframe
#'     (At the moment it is an instance of a hidden class IO$Attribute, #' but this may be subject to change.) #' @param weather An object that represents actual or hypothetical weather data #' with which predictions of gene expressions are made. #' The object can be created from a dumped/saved dataframe #' using \code{FIT::load.weather()}. #' (At the moment it is an instance of a hidden class IO$Weather,
#'     but this may be subject to change.)
#' @return A list of deviance (a measure of validity of predictions,
#'     as is defined by each model) between the prediction results
#'     and the measured results (as is provided by the user through
#'     \code{expression} argument).
#' @examples
#' \dontrun{
#' # see the usage of FIT::predict()
#' }
#' @export
prediction.errors <- function(models, expression, attribute, weather) {
lapply(models, function(m) m$deviance(expression$rawdata, attribute$data, weather$data, weather$data.step)) } ################################################################ ### reexport some IO stuff #' Converts attribute data from a dataframe into an object. #' #' @param data A dataframe of the attributes of microarray/RNA-seq data. #' @param sample An optional numeric array that designates #' the samples, that is rows, of the dataframe to be loaded. #' @return An object that represents the attributes of #' microarray/RNA-seq data. #' Internally, the object holds a dataframe whose number of entries #' (rows) equals that of the samples. #' @export convert.attribute <- function(data, sample = NULL) IO$Attribute$new(data, sample) #' Loads attribute data. #' #' @param path A path of a file that contains attribute data to be loaded. #' When the file is a loadable \code{.Rdata}, #' \code{name} of the dataframe object in the \code{.Rdata} #' (that actually contains the relevant data) #' has to be specified as well. #' @param variable An optional string that designates the name of a #' dataframe object that has been saved in an \code{.Rdata}. #' (See the description of \code{path}.) #' @param sample An optional numeric array that designates #' the samples, that is rows, of the dataframe to be loaded. #' @return An object that represents the attributes of #' microarray/RNA-seq data. #' Internally, the object holds a dataframe whose number of entries #' (rows) equals that of the samples. #' #' @export load.attribute <- function(path, variable = NULL, sample = NULL){ file.data <- IO$slurp(path, variable)
IO$Attribute$new(file.data, sample)
}

#' converts expression data from a dataframe into an object.
#'
#' @param data A dataframe of expression data to be loaded.
#' @param entries An optional string array that designates
#'     the entries of the dataframe to be loaded.
#' @return An object that represents the expression data of microarray/RNA-seq.
#'     Internally, the object holds a matrix of size
#'     \code{nsamples * ngenes}.
#' @export
convert.expression <- function(data, entries = NULL)
IO$Expression$new(data, entries)
#'
#' @param path A path of a file that contains attribute data to be loaded.
#'     When the file is a loadable \code{.Rdata},
#'     \code{name} of the dataframe object in the \code{.Rdata}
#'     (that actually contains the relevant data)
#'     has to be specified as well.
#' @param variable An optional string that designates the name of a
#'     dataframe object that has been saved in an \code{.Rdata}.
#'     (See the description of \code{path}.)
#' @param entries An optional string array that designates
#'     the entries of the dataframe to be loaded.
#' @return An object that represents the expression data of microarray/RNA-seq.
#'     Internally, the object holds a matrix of size
#'     \code{nsamples * ngenes}.
#' @export
load.expression <- function(path, variable = NULL, entries = NULL){
file.data <- IO$slurp(path, variable) IO$Expression$new(file.data, entries) } #' Converts weather data from a dataframe into an object. #' #' @param data A dataframe of weather data to be converted. #' @param entries An optional string array that designates #' the entries of the dataframe to be loaded. #' @return An object that reprents the timeseries data of weather factors. #' Internally, the object holds a dataframe of size #' \code{ntimepoints * nfactors}. #' @export convert.weather <- function(data, entries = IO$weather.entries)
IO$Weather$new(data, entries)

#'
#' @param path A path of a file that contains weather data to be loaded.
#'     When the file is a loadable \code{.Rdata},
#'     \code{name} of the dataframe object in the \code{.Rdata}
#'     (that actually contains the relevant data)
#'     has to be specified as well.
#' @param variable An optional string that designates the name of a
#'     dataframe object that has been saved in an \code{.Rdata}.
#'     (See the description of \code{path}.)
#' @param entries An optional string array that designates
#'     the entries of the dataframe to be loaded.
#' @return An object that reprents the timeseries data of weather factors.
#'     Internally, the object holds a dataframe of size
#'     \code{ntimepoints * nfactors}.
#' @export
load.weather <- function(path, variable = NULL, entries = IO$weather.entries){ file.data <- IO$slurp(path, variable)
IO$Weather$new(file.data, entries)
}

#' Converts regression weight data from a dataframe into an object.
#'
#' @param data A dataframe that contains weight data to be loaded.
#' @param entries An optional string array that designates
#'     the entries of the dataframe to be loaded.
#' @return An object that represents the weights
#'     Internally, the object holds a matrix of size
#'     \code{nsamples * ngenes}.
#' @export
convert.weight <- function(data, entries = NULL)
IO$Weights$new(data, entries)
#'
#' @param path A path of a file that contains weight data to be loaded.
#'     When the file is a loadable \code{.Rdata},
#'     \code{name} of the dataframe object in the \code{.Rdata}
#'     (that actually contains the relevant data)
#'     has to be specified as well.
#' @param variable An optional string that designates the name of a
#'     dataframe object that has been saved in an \code{.Rdata}.
#'     (See the description of \code{path}.)
#' @param entries An optional string array that designates
#'     the entries of the dataframe to be loaded.
#' @return An object that represents the weights
#'     Internally, the object holds a matrix of size
#'     \code{nsamples * ngenes}.
#' @export
load.weight <- function(path, variable = NULL, entries = NULL){
file.data <- IO$slurp(path, variable) IO$Weights$new(file.data, entries) } #' Makes trivial weight data #' #' @param samples.n A number of samples. #' @param genes A list of genes. #' @return An object that represens the trivial weights. #' Internally, the object holds an identity matrix of size #' \code{nsamples * ngenes}. #' @export make.trivial.weights <- function(samples.n, genes) IO$trivialWeights(samples.n, genes)


## Try the FIT package in your browser

Any scripts or data that you put into this service are public.

FIT documentation built on May 8, 2018, 9:04 a.m.