generate.sample3: Sample3 generator of synthetic data

generate.sample3R Documentation

Sample3 generator of synthetic data

Description

Multivariate normally distributed data synthetic generator. Data sets with 3 clusters are randomly generated. n examples for each class are generated. n 1000-dimensional examples for each class are generated. All classes (each one of n examples) has 300 no-noisy features and 700 noisy features. There is a certain overlap between classes and a full covariance matrix (equal for all classes is used). The first class (first n examples) has its no-noisy features centered in 0. The second class (second n examples) has its no-noisy features centered in m The third class (last n examples) has its no-noisy features centered in -m Covariance matrix Sigma = (B, Zero; Zero', I) where B is a 300X300 matrix s.t. B[i,i]=1, B[i,i+1]=B[i,i-1]=0.5 and B[i,j]=0.1 j!=i-1,i,i+1; Zero is a 300X700 zero matrix and Zero' its transpose; I is a 700X700 identity matrix.

Usage

generate.sample3(n = 2, m = 2)

Arguments

n

number of examples for each class

m

vector center of the second class

Value

a matrix with 1000 rows (variables) and n*3 columns (examples)

Author(s)

Giorgio Valentini valentini@di.unimi.it

Examples


generate.sample3()
# Generation of a data set with 60 1000-dimensional examples, 
# with the examples of the first class  centered in the 1000-dimensional 
# 0 vector, the second class is centered in the 1 vector, the third in -1. 
generate.sample3(n = 20, m = 1)


clusterv documentation built on June 8, 2025, 10:21 a.m.