DataGen_rare_group_usage"
In MUGS: Multisource Graph Synthesis with EHR Data

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introduction

The DataGen_rare_group function generates synthetic data for rare group analysis, simulating structured datasets for testing and validating algorithms. This vignette demonstrates how to use DataGen_rare_group with example inputs.

Load the Required Library

Ensure the MUGS package is loaded before running the example:

library(MUGS)

Generate Synthetic Data

Run the DataGen_rare_group function to generate the synthetic dataset:

# Generate data
seed =1
p = 5
n1 = 100
n2 = 100
n.common = 50
n.group = 30
sigma.eps.1 = 1
sigma.eps.2 = 3
ratio.delta = 0.05
network.k = 5
rho.beta = 0.5
rho.U0 = 0.4
rho.delta = 0.7
sigma.rare = 10
n.rare = 20
group.size = 5

DataGen.out <- DataGen_rare_group(seed, p, n1, n2, n.common, n.group, sigma.eps.1, sigma.eps.2, ratio.delta, network.k, rho.beta, rho.U0, rho.delta, sigma.rare, n.rare, group.size)

Examine the Output

Explore the structure and key components of the generated dataset:

# View structure of the output
str(DataGen.out)

# Print the first few rows and columns of the S.1 matrix
cat("\nFirst 5 rows and columns of S.1:\n")
print(DataGen.out$S.1[1:5, 1:5])

# Print the first few rows and columns of the S.2 matrix
cat("\nFirst 5 rows and columns of S.2:\n")
print(DataGen.out$S.2[1:5, 1:5])

Notes

Custom Parameters: Modify parameters like p, n1, n2, n.group, and others to test different scenarios.
Reproducibility: The seed parameter ensures reproducibility of results.
Applications: Use the generated data for testing rare group detection algorithms or performance benchmarking.

Summary

This vignette demonstrated how to use the DataGen_rare_group function to simulate structured data for rare group analysis. Adjust input parameters to suit specific use cases or experimental setups. For further details, refer to the package documentation.