cluster_plot: Plot estimated functions for experimental units faceted by...

View source: R/gp_cluster_plot.R

cluster_plot R Documentation

Plot estimated functions for experimental units faceted by cluster versus data to assess fit.

Description

Takes as input the object returned by gpdpgrow() or gmrfdpgrow().

Usage

cluster_plot(
  object,
  N_clusters = NULL,
  time_points = NULL,
  units_name = "unit",
  units_label = NULL,
  date_field = NULL,
  x.axis.label = NULL,
  y.axis.label = NULL,
  smoother = TRUE,
  sample_rate = 1,
  single_unit = FALSE,
  credible = FALSE,
  num_plot = NULL
)

Arguments

object

A gpdpgrow or gmrfdpgrow object.

N_clusters

The number of largest clusters (ranked by membership size) to plot. Defaults to all clusters.

time_points

A vector of common time points at which the collection of functions was observed (with the possibility of intermittent missingness). The length of time_points should equal the number of columns in the data matrix, y. Defaults to time_points = 1:ncol(y).
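The length requirement can be checked before plotting. The following is a minimal sketch on a hypothetical toy matrix (not package data); coding the missing value as NA is an assumption made for illustration.

```r
## Hypothetical toy data: N = 4 units observed at T = 6 common time points
set.seed(1)
y <- matrix(rnorm(4 * 6), nrow = 4, ncol = 6)
y[2, 3] <- NA  # one intermittently missing observation (assumed coded as NA)

## One time point per column of y, matching the documented default
time_points <- seq_len(ncol(y))

## Guard against a length mismatch before passing time_points on
stopifnot(length(time_points) == ncol(y))
```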

units_name

The plot label for observation units. Defaults to units_name = "unit".

units_label

A vector of labels to apply to the observation units with length equal to the number of unique units. Defaults to sequential numeric values as input with data, y.

date_field

A vector of Date values for labeling the x-axis tick marks. Defaults to 1:T, where T is the number of time points.
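A vector of Date values with one entry per time point can be built with base R. This sketch assumes monthly observations covering 2008 through 2013 (72 columns), with each month labeled by its first day; both the start date and the first-of-month convention are illustrative assumptions.

```r
## Assumed: T = 72 monthly time points, January 2008 to December 2013
T_len <- 72

## seq() on a Date with by = "month" steps one calendar month at a time
date_field <- seq(from = as.Date("2008-01-01"),
                  by = "month",
                  length.out = T_len)

## date_field should be of class Date and have one entry per time point
stopifnot(inherits(date_field, "Date"), length(date_field) == T_len)
```

The resulting vector could then be supplied as date_field = date_field in the call to cluster_plot().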

x.axis.label

Text label for x-axis. Defaults to "time".

y.axis.label

Text label for y-axis. Defaults to "function values".

smoother

A scalar boolean indicating whether to co-plot a smoother line through the functions in each cluster. Defaults to smoother = TRUE.

sample_rate

A numeric value in (0,1] giving the proportion of functions to randomly sample within each cluster to reduce over-plotting. Defaults to 1.
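To illustrate what sampling a proportion of functions within each cluster means, here is a base R sketch on hypothetical cluster labels. It is not the package's internal code; in particular, rounding up and keeping at least one function per cluster are assumptions made for this example.

```r
## Hypothetical cluster assignment: 10 units in cluster "A", 4 in cluster "B"
set.seed(42)
cluster <- rep(c("A", "B"), times = c(10, 4))
unit_id <- seq_along(cluster)
sample_rate <- 0.5

## Within each cluster, keep a random subset of about sample_rate of the units
keep <- unlist(lapply(split(unit_id, cluster), function(ids) {
  n_keep <- max(1, ceiling(sample_rate * length(ids)))  # assumed rounding rule
  ids[sample.int(length(ids), n_keep)]                  # index-based, safe for length-1 ids
}), use.names = FALSE)

## Number of units retained per cluster
table(cluster[keep])
```

Sampling by index with sample.int() avoids the base R behavior in which sample(x, ...) treats a single number x as the range 1:x.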

single_unit

A scalar boolean indicating whether to plot the fitted versus data curve for only a single experimental unit (versus a random sample of 6). Defaults to single_unit = FALSE.

credible

A scalar boolean indicating whether to plot 95 percent credible intervals for estimated functions, bb, when plotting fitted functions versus data. Defaults to credible = FALSE.

num_plot

A scalar integer indicating how many randomly-selected functions to plot (each in its own plot panel) in the plot of functions versus the observed time series in the case that single_unit == TRUE. Defaults to num_plot = 6.

Value

A list object containing the plot of estimated functions, faceted by cluster, and the associated data.frame object.

p.cluster

A ggplot2 plot object.

dat.cluster

A data.frame object used to generate p.cluster.
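The two documented list elements can be inspected as sketched below. The sketch assumes the growfunctions package is installed and reuses the short MCMC settings from this page's example; the object names fit and fit_plots are hypothetical, and the whole block is skipped when the package is unavailable.

```r
if (requireNamespace("growfunctions", quietly = TRUE)) {
  library(growfunctions)
  data(cps)

  ## Same subset and short run as in the Examples section
  y_short <- cps$y[, cps$yr_label %in% 2008:2013]
  fit <- gmrfdpgrow(y = y_short, n.iter = 40, n.burn = 20, n.thin = 1)

  fit_plots <- cluster_plot(object = fit,
                            units_name = "state",
                            units_label = cps$st,
                            single_unit = FALSE,
                            credible = TRUE)

  ## p.cluster is a ggplot2 object, so print() renders it
  print(fit_plots$p.cluster)

  ## dat.cluster is the data.frame behind the plot
  str(fit_plots$dat.cluster)
}
```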

Author(s)

Terrance Savitsky tds151@gmail.com

See Also

gpdpgrow, gmrfdpgrow

Examples

{
library(growfunctions)

## Load the monthly employment count data for a collection of
## U.S. states from the Current Population Survey (cps)
data(cps)
## Subselect the columns of the N x T matrix, y, associated with
## the years 2008 - 2013 to examine the state-level employment
## levels during the "great recession"
y_short             <- cps$y[,(cps$yr_label %in% c(2008:2013))]

## Run the DP mixture of iGMRFs to estimate posterior
## distributions for model parameters under the default
## RW2(kappa) = order 2 trend precision term
res_gmrf            <- gmrfdpgrow(y = y_short, 
                                     n.iter = 40, 
                                     n.burn = 20, 
                                     n.thin = 1) 
                                     
## Two plots of estimated functions: 1. faceted by cluster;
## 2. fit versus data for a group of randomly-selected
## experimental units
fit_plots_gmrf      <- cluster_plot( object = res_gmrf, 
                                     units_name = "state", 
                                     units_label = cps$st, 
                                     single_unit = FALSE, 
                                     credible = TRUE )   
}

growfunctions documentation built on June 22, 2024, 11:49 a.m.