gen_pivot_data: Generate pivot data set
In camroach87/semisupervisr: Semi-supervised training functions


Generates a data set with pivot and non-pivot features for several domains. Pivot features have the same distribution in every domain. Non-pivot features preserve the class relationships, but their distribution means are shifted across domains (use the plot method to observe this).

gen_pivot_data(n_nonpivots, n_pivots, n_domains, n_classes, n,
  sd_class_means = 1, sd_np_means = 1, sd_obs = 1)

`n_nonpivots`	Number of non-pivot features.
`n_pivots`	Number of pivot features.
`n_domains`	Number of domains.
`n_classes`	Number of possible classes.
`n`	Number of observations. This is adjusted to the nearest value that allows a balanced data set.
`sd_class_means`	Standard deviation of class means. Smaller values will result in features with overlapping distributions.
`sd_np_means`	Standard deviation of the non-pivot feature means. This controls the distribution shift across domains for non-pivot features.
`sd_obs`	Standard deviation of the observations.
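
To make the roles of the three standard deviation arguments concrete, here is a minimal base-R sketch of one way such data could be simulated. It is not the package's implementation: the generative steps, the object names (`class_means`, `domain_shift`, `sketch`) and the choice of one pivot and one non-pivot feature are assumptions made for illustration only.

```r
# Hypothetical sketch, not the package's code: one plausible role per sd argument.
set.seed(1)
n_domains <- 2
n_classes <- 2
n_per_cell <- 50            # observations per class/domain combination (assumed)
sd_class_means <- 1
sd_np_means <- 1
sd_obs <- 1

# One mean per class, shared by all domains (spread set by sd_class_means)
class_means <- rnorm(n_classes, mean = 0, sd = sd_class_means)
# One offset per domain, applied to the non-pivot feature only (sd_np_means)
domain_shift <- rnorm(n_domains, mean = 0, sd = sd_np_means)

sketch <- expand.grid(obs = seq_len(n_per_cell),
                      Class = seq_len(n_classes),
                      Domain = seq_len(n_domains))

# Pivot feature: class mean plus observation noise, same in every domain
sketch$P_Feature_1 <- rnorm(nrow(sketch), class_means[sketch$Class], sd_obs)
# Non-pivot feature: same class structure, shifted by the domain offset
sketch$NP_Feature_1 <- rnorm(
  nrow(sketch),
  class_means[sketch$Class] + domain_shift[sketch$Domain],
  sd_obs
)

# Feature means by domain: only the non-pivot mean carries the domain offset
aggregate(cbind(P_Feature_1, NP_Feature_1) ~ Domain, data = sketch, FUN = mean)
```

Under these assumptions, a smaller `sd_class_means` pulls the class means together, and a larger `sd_np_means` pushes the domains further apart on the non-pivot feature.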

This function outputs a balanced data set (same number of observations for each class).
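
The adjustment of `n` can be pictured with a small base-R sketch. The helper `balance_n` is hypothetical and not part of the package. It assumes that balancing means rounding `n` to the nearest multiple of `n_classes`, and the package may use a different rule (for example, balancing within each domain as well).

```r
# Hypothetical helper, not part of semisupervisr: round n to the nearest
# multiple of n_classes so that every class receives the same number of rows.
balance_n <- function(n, n_classes) {
  per_class <- max(1, round(n / n_classes))
  per_class * n_classes
}

n_adj <- balance_n(200, 3)                     # 201, i.e. 67 per class
labels <- rep(seq_len(3), each = n_adj / 3)    # balanced class labels
table(labels)                                  # equal counts for every class
```

When `n` is already a multiple of the number of classes, as in `balance_n(200, 2)`, this rule leaves it unchanged.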

gen_pivot_data returns an object with classes "pivot_data" and "data.frame".

The plot method for this object produces a plot of domain densities, faceted by pivot and non-pivot features.
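
The following base-R sketch shows what a two-element class vector like this means in practice. It uses a toy class, `toy_pivot`, and a made-up `plot.toy_pivot` method so that nothing from the package is overwritten. Both names and the toy columns are assumptions for illustration.

```r
# Toy object whose class vector has the same shape as the one described above:
# a specific class first, then "data.frame". Column contents are made up.
toy <- structure(
  data.frame(Domain = c("a", "b"), x = c(0.1, 0.2)),
  class = c("toy_pivot", "data.frame")
)

inherits(toy, "toy_pivot")    # TRUE
inherits(toy, "data.frame")   # TRUE
nrow(toy)                     # data frame tools still apply

# Hypothetical S3 method: plot(toy) dispatches here because "toy_pivot"
# comes first in the class vector.
plot.toy_pivot <- function(x, ...) {
  message("plot method for toy_pivot called")
  invisible(x)
}
plot(toy)
```

Because "data.frame" remains in the class vector, the object can be passed directly to functions that expect a data frame, such as `ggplot()`.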

Cameron Roach

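# Generate 1 non-pivot and 1 pivot feature for 2 domains and 2 classes
pivot_data <- gen_pivot_data(1, 1, 2, 2, 200)
# Domain densities, faceted by pivot and non-pivot features
plot(pivot_data)

# Scatter plot of the two features, one panel per domain
library(ggplot2)
ggplot(pivot_data, aes(x = NP_Feature_1, y = P_Feature_1, colour = Class)) +
  geom_point() +
  facet_wrap(~Domain)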