gen_pivot_data: Generate pivot data set

Description

Generates a data set with pivot and non-pivot features for several domains. Pivot features have the same distribution in every domain. Non-pivot features preserve the class relationships, but the means of their distributions are shifted from one domain to the next (use the plot method to observe this).

Usage

gen_pivot_data(n_nonpivots, n_pivots, n_domains, n_classes, n,
  sd_class_means = 1, sd_np_means = 1, sd_obs = 1)

Arguments

n_nonpivots

Number of non-pivot features.

n_pivots

Number of pivot features.

n_domains

Number of domains.

n_classes

Number of possible classes.

n

Number of observations. This is adjusted to the nearest value that allows for a balanced data set.

sd_class_means

Standard deviation of class means. Smaller values place the class means closer together, so the feature distributions of different classes overlap more.

sd_np_means

Standard deviation of the non-pivot feature means. This controls the distribution shift across domains for non-pivot features.

sd_obs

Standard deviation of the observations.
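The roles of the three standard deviation arguments can be illustrated with a small base R simulation. This is only a sketch under assumed mechanics, not the code used by gen_pivot_data: the names n_per_cell, class_means_p, class_means_np, domain_shift and sim are hypothetical, and the assumption is that class means are drawn with sd_class_means, per-domain offsets for the non-pivot feature are drawn with sd_np_means, and observations scatter around their mean with sd_obs.

```r
# Illustrative sketch only: NOT the semisupervisr implementation.
# It mimics one plausible way the three sd_* arguments could interact.
set.seed(42)

n_domains <- 2
n_classes <- 2
n_per_cell <- 50        # hypothetical: observations per class within each domain
sd_class_means <- 1
sd_np_means <- 1
sd_obs <- 1

# One mean per class for each feature, shared by every domain.
class_means_p  <- rnorm(n_classes, mean = 0, sd = sd_class_means)
class_means_np <- rnorm(n_classes, mean = 0, sd = sd_class_means)

# One offset per domain, applied to the non-pivot feature only.
domain_shift <- rnorm(n_domains, mean = 0, sd = sd_np_means)

# Every combination of observation index, class and domain.
grid <- expand.grid(obs = seq_len(n_per_cell),
                    Class = seq_len(n_classes),
                    Domain = seq_len(n_domains))

sim <- data.frame(
  Domain = factor(grid$Domain),
  Class = factor(grid$Class),
  # Pivot feature: the mean depends on the class only.
  pivot = rnorm(nrow(grid), mean = class_means_p[grid$Class], sd = sd_obs),
  # Non-pivot feature: class mean plus a domain-specific shift.
  non_pivot = rnorm(nrow(grid),
                    mean = class_means_np[grid$Class] + domain_shift[grid$Domain],
                    sd = sd_obs)
)

# Compare the per-domain means of both features.
aggregate(cbind(pivot, non_pivot) ~ Domain, data = sim, FUN = mean)
```

Under these assumptions, the per-domain means of the pivot column differ only through sampling noise, while those of the non-pivot column also differ by the drawn domain offsets. Raising sd_np_means would widen that gap, and lowering sd_class_means would pull the classes together.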

Details

This function outputs a balanced data set (same number of observations for each class).
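One way such an adjustment of n could work is to round it to the nearest multiple of the number of groups being balanced. The helper below is a hypothetical sketch (balance_n is not part of semisupervisr), and the page does not state whether the package rounds in exactly this way or whether it balances over classes only or over classes within each domain.

```r
# Hypothetical helper, not part of semisupervisr: round n to the nearest
# total that splits evenly over n_groups (for example, the number of classes).
balance_n <- function(n, n_groups) {
  per_group <- max(1, round(n / n_groups))   # at least one observation per group
  c(per_group = per_group, total = per_group * n_groups)
}

balance_n(200, 2)   # 100 per group, total stays at 200
balance_n(250, 3)   # 83 per group, total becomes 249
```

With this rule the requested n is kept when it already divides evenly, and otherwise moves to the closest balanced total.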

Value

gen_pivot_data returns an object with classes "pivot_data" and "data.frame".

The plot method produces a plot of domain densities faceted by pivot and non-pivot features.

Author(s)

Cameron Roach

Examples

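# One non-pivot feature, one pivot feature, two domains, two classes, 200 observations
pivot_data <- gen_pivot_data(1, 1, 2, 2, 200)

# Domain densities for the pivot and non-pivot features
plot(pivot_data)

# Scatter plot of the two features, one panel per domain
library(ggplot2)
ggplot(pivot_data, aes(x = NP_Feature_1, y = P_Feature_1, colour = Class)) +
  geom_point() +
  facet_wrap(~Domain)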

camroach87/semisupervisr documentation built on May 13, 2019, 11:04 a.m.