GenerateSyntheticData: Generates synthetic data.


View source: R/clust_functions.R

Description

This function generates synthetic clustering data.

It generates the synthetic data in the following steps:

1. generates cluster centers with s informative features out of p total features: (a) generates an orthonormal matrix of dimension k by s; (b) multiplies the matrix by signal_strength; (c) appends (p - s) zero columns to the result.

2. randomly generates n by k one-hot cluster assignments.

3. generates the n by p signal matrix and adds scaled standard Gaussian or t2 noise.
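
The three steps above can be sketched in base R as follows. This is an illustrative re-implementation, not the package source: the function name sketch_synthetic_data and the noise_scale argument are hypothetical, "orthonormal" is assumed to mean orthonormal rows (obtained here from a QR decomposition, which requires k <= s), and "t2" is assumed to mean a Student t distribution with 2 degrees of freedom. The noise scale used by the package is not stated, so it is left as a parameter.

```r
# Hypothetical sketch of the documented steps; not the SCFS implementation.
sketch_synthetic_data <- function(n, p, s, k, signal_strength,
                                  noise_type = c("gaussian", "t2"),
                                  noise_scale = 1) {
  noise_type <- match.arg(noise_type)
  stopifnot(k <= s, s <= p)

  # Step 1: k x s matrix with orthonormal rows, scaled by signal_strength,
  # then padded with (p - s) zero columns to give k x p cluster centers.
  q <- qr.Q(qr(matrix(rnorm(s * k), nrow = s, ncol = k)))  # s x k, orthonormal columns
  centers <- cbind(signal_strength * t(q), matrix(0, nrow = k, ncol = p - s))

  # Step 2: random cluster ids and the matching n x k one-hot matrix.
  labels <- sample.int(k, n, replace = TRUE)
  assignments <- diag(k)[labels, , drop = FALSE]

  # Step 3: n x p signal matrix plus scaled noise.
  signal <- assignments %*% centers
  noise <- if (noise_type == "gaussian") {
    matrix(rnorm(n * p), nrow = n, ncol = p)
  } else {
    matrix(rt(n * p, df = 2), nrow = n, ncol = p)  # assumption: "t2" = t with 2 df
  }

  list(data = signal + noise_scale * noise, labels = labels)
}

set.seed(1)
sim <- sketch_synthetic_data(n = 10, p = 10, s = 5, k = 2,
                             signal_strength = 1, noise_type = "gaussian")
dim(sim$data)
table(sim$labels)
```

Under these assumptions every cluster center has Euclidean norm signal_strength and is zero outside the first s features, so only the informative features separate the clusters.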

Usage

GenerateSyntheticData(n, p, s, k, signal_strength, noise_type)

Arguments

n

int. Number of observations.

p

int. Number of features.

s

int. Number of informative features.

k

int. Number of clusters.

signal_strength

float. Signal strength.

noise_type

character. Noise type. Must be either "gaussian" or "t2".

Value

list. The result contains two elements: $data is the data matrix, and $labels contains the cluster ids.

References

T. Liu, Y. Lu, B. Zhu, H. Zhao (2021). High-dimensional Clustering via Feature Selection with Applications to Single Cell RNA-seq Data.

Examples

GenerateSyntheticData(n=10, p=10, s=5, k=2, signal_strength=1, noise_type="gaussian")

TerenceLiu4444/SCFS documentation built on Feb. 13, 2022, 9:18 a.m.