fit.analysis: Predictive analysis on clusters

View source: R/analysis.R

fit.analysis    R Documentation

Predictive analysis on clusters

Description

Fits a predictive model of an outcome (by default, cluster growth) to a cluster-level variable (by default, cluster size). The fit is performed separately for each cluster set. Multiple models can be supplied as a named list of functions, each taking cluster data as input (see Examples).

Usage

fit.analysis(
  cluster.data,
  predictor.transformations = list(),
  predictive.models = list(NullModel = function(x) {
    glm(Growth ~ Size, data = x, family = "poisson")
  })
)

Arguments

cluster.data:

data.table. The input set(s) of clusters, possibly spanning multiple ranges. The following columns are required:

  Size: The number of sequences in the cluster, not including new growth sequences.
  Growth: The number of new sequences added to the cluster.
  SetID: A unique identifier for a set of clusters (obtained under given criteria).
  RangeID:
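As a quick sanity check before fitting, the required columns listed above can be verified up front. The sketch below is only an illustration: check.cluster.columns is a hypothetical helper, not part of clustuneR, and a plain data.frame with made-up values stands in for the data.table that fit.analysis expects.

```r
# Hypothetical helper (not part of clustuneR): return the names of any
# required columns that are missing from the input.
check.cluster.columns <- function(cluster.data,
                                  required = c("Size", "Growth", "SetID", "RangeID")) {
  setdiff(required, names(cluster.data))
}

# Toy input with made-up values; a data.frame stands in for a data.table.
toy.clusters <- data.frame(
  Size   = c(5, 12, 3),
  Growth = c(1, 4, 0),
  SetID  = c(1, 1, 1)
)

# RangeID has not been added yet, so it is reported as missing.
missing.cols <- check.cluster.columns(toy.clusters)
print(missing.cols)

# Adding the column (the Examples section sets RangeID to 0) resolves it.
toy.clusters$RangeID <- 0
remaining <- check.cluster.columns(toy.clusters)
```

Because a data.table is also a data.frame, the same check should work unchanged on real input.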

predictor.transformations:

A named list of transformation functions, one for each predictor variable, e.g., list("Data" = sum). Because clustered metadata takes the form of a list, these functions are often necessary to obtain a single, cluster-level variable. Typical functions include mean and median.
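To illustrate why a transformation is needed, the sketch below collapses per-sequence metadata into one value per cluster. The metadata values are made up, and the way fit.analysis applies these functions internally is an assumption; only the shape of the named list follows the description above.

```r
# Toy clustered metadata: one vector of per-sequence values per cluster.
# The values are invented for illustration.
cluster.meta <- list(
  c(2010, 2012, 2014),
  c(2015, 2016),
  c(2011)
)

# A named list in the shape described for predictor.transformations.
predictor.transformations <- list("Data" = mean)

# Apply the transformation to each cluster to obtain a single
# cluster-level value. This mimics, but is not, the package's own logic.
transformed <- sapply(cluster.meta, predictor.transformations[["Data"]])
print(transformed)
```

Swapping mean for median or sum changes only the list entry, not the surrounding code.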

predictive.models:

A named list of functions, each of which fits a model to the input cluster data (x). The default is a "NullModel" example, in which Growth is predicted only by cluster size.
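A minimal sketch of what such a list might look like with more than one model follows. NullModel matches the default shown under Usage; LogSizeModel is a hypothetical alternative added purely for illustration, and the toy data are made up. The functions are applied directly here rather than through fit.analysis.

```r
# Toy cluster data with made-up values.
toy.clusters <- data.frame(
  Size   = c(2, 5, 8, 12, 20, 31),
  Growth = c(0, 1, 1, 3, 4, 7)
)

# Each entry takes cluster data (x) and returns a fitted model.
predictive.models <- list(
  # Default from the Usage section: Growth predicted by Size alone.
  NullModel = function(x) {
    glm(Growth ~ Size, data = x, family = "poisson")
  },
  # Hypothetical alternative: Growth predicted by log-transformed Size.
  LogSizeModel = function(x) {
    glm(Growth ~ log(Size), data = x, family = "poisson")
  }
)

# Apply every model function to the same data; names are preserved.
fits <- lapply(predictive.models, function(f) f(toy.clusters))

# Compare the fits by AIC (lower is better).
aic.values <- sapply(fits, AIC)
print(aic.values)
```

Naming each entry matters because the names are what identify the fitted models afterwards.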

Value

list. Each entry is labelled with its SetID (to link back to the parameter list). Entries contain S3 objects of class "glm" or "lm".
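Since the entries are ordinary "glm" or "lm" objects, the standard base R accessors should apply to them. The sketch below fits the default Poisson model to made-up data directly with glm, rather than through fit.analysis, to show what can be extracted from such an object.

```r
# Toy cluster data with made-up values.
toy.clusters <- data.frame(
  Size   = c(2, 5, 8, 12, 20, 31),
  Growth = c(0, 1, 1, 3, 4, 7)
)

# Same model form as the default NullModel.
fit <- glm(Growth ~ Size, data = toy.clusters, family = "poisson")

# A glm object carries both classes, so methods for either will dispatch.
fit.class <- class(fit)
print(fit.class)

# Coefficients are on the log scale for a Poisson model with log link.
print(coef(fit))

# AIC is a common basis for comparing models fitted to the same data.
print(AIC(fit))

# Expected growth for a hypothetical cluster of size 10, on the count scale.
expected.growth <- predict(fit, newdata = data.frame(Size = 10), type = "response")
print(expected.growth)
```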

Examples

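# Use the example cluster data and assign every cluster to a single range
cluster.data <- cluster.ex
cluster.data[, "RangeID" := 0]

# Fit the default NullModel (Growth ~ Size, Poisson family)
fit.result <- fit.analysis(cluster.data)

# Extract the NullModel result
mod.performance <- fit.result$NullModel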

PoonLab/clustuneR documentation built on Jan. 29, 2024, 2:40 a.m.